Search results for: exogenous data
24735 An Embarrassingly Simple Semi-supervised Approach to Increase Recall in Online Shopping Domain to Match Structured Data with Unstructured Data
Authors: Sachin Nagargoje
Abstract:
Complete labeled data is often difficult to obtain in practice. Even if one manages to obtain the data, its quality is always in question. In the shopping vertical, offers are the input data, supplied by advertisers with or without good-quality information. In this paper, the author investigates the possibility of using a very simple semi-supervised learning approach to increase the recall of unhealthy offers (those with a badly written offer title or partial product details) in the shopping vertical domain. The author found that the semi-supervised learning method improved recall in the Smart Phone category by 30% in A/B testing on 10% of traffic and increased the YoY (year-over-year) number of impressions per month by 33% in production. It also produced a significant increase in revenue, which cannot be publicly disclosed.
Keywords: semi-supervised learning, clustering, recall, coverage
Procedia PDF Downloads 120
24734 Genodata: The Human Genome Variation Using BigData
Authors: Surabhi Maiti, Prajakta Tamhankar, Prachi Uttam Mehta
Abstract:
Since the completion of the Human Genome Project, there has been an unparalleled escalation in the sequencing of genomic data. This project was the first major leap in the field of medical research, especially in genomics. It won accolades by using a concept called Bigdata, which had earlier been used extensively to gain value for business. Bigdata makes use of data sets that are generally files of terabytes, petabytes, or exabytes in size; such data sets were traditionally used and managed with Excel sheets and RDBMS. The voluminous data made processing tedious and time consuming, and hence a stronger framework called Hadoop was introduced in the field of genetic sciences to make data processing faster and more efficient. This paper focuses on using SPARK, which is gaining momentum with the advancement of BigData technologies. Cloud storage is an effective medium for storing the large data sets generated by genetic research and the result sets produced by SPARK analysis.
Keywords: human genome project, Bigdata, genomic data, SPARK, cloud storage, Hadoop
Procedia PDF Downloads 258
24733 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons
Authors: Said Boularouk, Didier Josselin, Eitan Altman
Abstract:
In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. The platform, based on produsage, gives data producers the freedom to choose the descriptors of geocoded locations. Unfortunately, this freedom, also called folksonomy, complicates subsequent searches of the data. We try to solve this issue with a simple but usable method that extracts data from OSM databases and sends them to visually impaired people using Text To Speech technology. We focus on how to help people with a visual disability plan their itinerary and comprehend a map by querying a computer and getting information about the surrounding environment in a mono-modal human-computer dialogue.
Keywords: TTS, ontology, open street map, visually impaired
Procedia PDF Downloads 295
24732 Design and Development of a Platform for Analyzing Spatio-Temporal Data from Wireless Sensor Networks
Authors: Walid Fantazi
Abstract:
The development of sensor technology (such as microelectromechanical systems (MEMS), wireless communications, embedded systems, distributed processing and wireless sensor applications) has contributed to a broad range of WSN applications that are capable of collecting a large amount of spatiotemporal data in real time. These systems require real-time data processing to manage storage and to query the data they process. To cover these needs, we propose in this paper a Snapshot spatiotemporal data model based on object-oriented concepts. This model saves storage and reduces data redundancy, which makes it easier to execute spatiotemporal queries and saves analysis time. Further, to ensure the robustness of the system and to eliminate congestion in main memory, we propose a spatiotemporal indexing technique in RAM called Captree *. As a result, we offer an RIA (Rich Internet Application)-based SOA application architecture that allows remote monitoring and control.
Keywords: WSN, indexing data, SOA, RIA, geographic information system
Procedia PDF Downloads 252
24731 Prediction of Marine Ecosystem Changes Based on the Integrated Analysis of Multivariate Data Sets
Authors: Prozorkevitch D., Mishurov A., Sokolov K., Karsakov L., Pestrikova L.
Abstract:
The current body of knowledge about the marine environment and the dynamics of marine ecosystems includes a huge amount of heterogeneous data collected over decades. It generally includes a wide range of hydrological, biological and fishery data. Marine researchers collect these data and analyze how and why the ecosystem changes from past to present. Based on these historical records and linkages between the processes it is possible to predict future changes. Multivariate analysis of trends and their interconnection in the marine ecosystem may be used as an instrument for predicting further ecosystem evolution. A wide range of information about the components of the marine ecosystem for more than 50 years needs to be used to investigate how these arrays can help to predict the future.
Keywords: barents sea ecosystem, abiotic, biotic, data sets, trends, prediction
Procedia PDF Downloads 114
24730 Optical Fiber Data Throughput in a Quantum Communication System
Authors: Arash Kosari, Ali Araghi
Abstract:
A mathematical model for an optical-fiber communication channel is developed, which results in an expression for the throughput and loss of the corresponding link. The data are assumed to be transmitted using separate photons with different polarizations. The derived model also shows the dependency of data throughput on the length of the channel and the depolarization factor. It is observed that absorption of photons affects the throughput more strongly than depolarization does. In addition, the probability of depolarization and the absorption of radiated photons are obtained.
Keywords: absorption, data throughput, depolarization, optical fiber
Procedia PDF Downloads 284
24729 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network
Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi
Abstract:
Energy, delay and bandwidth are the prime issues of wireless sensor networks (WSN). Energy usage optimization and efficient bandwidth utilization are important issues in WSN. Event-triggered data aggregation facilitates such optimization for the event-affected area in a WSN. Reliable delivery of critical information to the sink node is also a major challenge of WSN. To tackle these issues, we propose an event-driven dynamic clustering and data aggregation scheme for WSN that enhances the lifetime of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever an event is triggered, the event-triggered node selects the cluster head. (2) The cluster head gathers data from sensor nodes within the cluster. (3) The cluster head node identifies and classifies the events in the collected data using a Bayesian classifier. (4) Aggregation of data is done using a statistical method. (5) The cluster head discovers the paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data is critical, the cluster head sends the aggregated data over multiple paths for reliable data communication. (7) Otherwise, the aggregated data is transmitted towards the sink node over the single path that has the most bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.
Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication
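Steps (4) to (7) of the scheme in the abstract above can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Path` fields and their units, the `path_score` formula, the use of the arithmetic mean as the "statistical method", and the choice of two paths for multipath delivery.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Path:
    """A candidate route from the cluster head to the sink (hypothetical fields)."""
    name: str
    residual_energy: float  # energy left on the weakest node of the path (assumed unit)
    bandwidth: float        # available bandwidth on the path (assumed unit)
    distance: float         # path length, e.g. in hops (assumed unit)


def aggregate(readings):
    """Step 4: aggregate sensor readings; the mean is an assumed stand-in."""
    return mean(readings)


def path_score(path):
    """Assumed score: favour energy and bandwidth, penalise distance."""
    return path.residual_energy * path.bandwidth / path.distance


def choose_paths(paths, critical, k=2):
    """Steps 5-7: rank the discovered paths, then pick k of them for
    critical data (multipath) or only the best one otherwise."""
    ranked = sorted(paths, key=path_score, reverse=True)
    return ranked[:k] if critical else ranked[:1]


paths = [
    Path("a", residual_energy=5.0, bandwidth=100.0, distance=4.0),
    Path("b", residual_energy=8.0, bandwidth=50.0, distance=2.0),
    Path("c", residual_energy=2.0, bandwidth=200.0, distance=5.0),
]
value = aggregate([21.0, 22.0, 23.0])
routes = choose_paths(paths, critical=value > 21.5)
```

Any monotone combination of the three path metrics could replace `path_score`; the abstract names the metrics but not how they are weighted.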
Procedia PDF Downloads 449
24728 Offshore Outsourcing: Global Data Privacy Controls and International Compliance Issues
Authors: Michelle J. Miller
Abstract:
In recent years, there has been a rise of two emerging issues that impact the global employment and business market and that the legal community must review more closely: offshore outsourcing and data privacy. These two issues intersect because employment opportunities are shifting due to offshore outsourcing, and in some states, such as the United States, anti-outsourcing legislation has been passed or presented to retain jobs within the country. In addition, the legal requirement to retain the privacy of data as a global employer extends to employees and third-party service providers, including services outsourced to offshore locations. For this reason, this paper will review the intersection of these two issues with a specific focus on data privacy.
Keywords: outsourcing, data privacy, international compliance, multinational corporations
Procedia PDF Downloads 409
24727 Weighted Data Replication Strategy for Data Grid Considering Economic Approach
Authors: N. Mansouri, A. Asadi
Abstract:
Data Grid is a geographically distributed environment that deals with data-intensive applications in scientific and enterprise computing. Data replication is a common method used to achieve efficient and fault-tolerant data access in Grids. In this paper, a dynamic data replication strategy called Enhanced Latest Access Largest Weight (ELALW) is proposed. This strategy is an enhanced version of the Latest Access Largest Weight strategy. However, replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replica replacement task. ELALW replaces replicas based on the number of requests in the future, the size of the replica, and the number of copies of the file. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection chooses the best replica location from among the many replicas based on response time, which is determined by considering the data transfer time, the storage access latency, the replica requests that are waiting in the storage queue and the distance between nodes. Simulation results using OptorSim show that our replication strategy achieves better overall performance than other strategies in terms of job execution time, effective network usage and storage resource usage.
Keywords: data grid, data replication, simulation, replica selection, replica placement
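The replica selection described in the abstract above picks the replica with the lowest estimated response time. The sketch below shows one way such an estimate could be assembled from the four factors the abstract names. The additive formula, the field names, and the `per_request_s` and `per_hop_s` constants are assumptions for illustration, not the ELALW definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One stored copy of a file (hypothetical fields and units)."""
    site: str
    bandwidth_mb_s: float     # link bandwidth to the requesting node
    storage_latency_s: float  # storage access latency at the site
    queued_requests: int      # requests already waiting in the storage queue
    hops: int                 # distance between the nodes


def estimated_response_time(replica, file_size_mb,
                            per_request_s=0.5, per_hop_s=0.05):
    """Assumed additive model: transfer + storage latency + queue wait + distance."""
    transfer_s = file_size_mb / replica.bandwidth_mb_s
    queue_s = replica.queued_requests * per_request_s
    distance_s = replica.hops * per_hop_s
    return transfer_s + replica.storage_latency_s + queue_s + distance_s


def select_replica(replicas, file_size_mb):
    """Return the replica with the smallest estimated response time."""
    return min(replicas, key=lambda r: estimated_response_time(r, file_size_mb))


replicas = [
    Replica("A", bandwidth_mb_s=50.0, storage_latency_s=0.2, queued_requests=4, hops=2),
    Replica("B", bandwidth_mb_s=20.0, storage_latency_s=0.1, queued_requests=0, hops=5),
    Replica("C", bandwidth_mb_s=40.0, storage_latency_s=0.3, queued_requests=1, hops=1),
]
best = select_replica(replicas, file_size_mb=100.0)
```

With these made-up numbers, site A has the fastest link but a long queue, so a less loaded site wins; that trade-off is the reason for including queue length in the estimate.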
Procedia PDF Downloads 260
24726 Evaluation of Satellite and Radar Rainfall Product over Seyhan Plain
Authors: Kazım Kaba, Erdem Erdi, M. Akif Erdoğan, H. Mustafa Kandırmaz
Abstract:
Rainfall is a crucial data source for many disciplines such as agriculture, hydrology and climate. Therefore, the rain rate should be well known both spatially and temporally for any area. Rainfall has traditionally been measured for many years using rain gauges at meteorological ground stations. At present, rainfall products are acquired from radar and satellite images with temporal and spatial continuity. In this study, we investigated the accuracy of these rainfall data against rain-gauge data. For this purpose, we used the Adana-Hatay radar hourly total precipitation product (RN1) and the Meteosat convective rainfall rate (CRR) product over Seyhan plain. We calculated daily rainfall values from the RN1 and CRR hourly precipitation products. We used the data of rainy days from four stations located within range of the radar from October 2013 to November 2015. In the study, we examined the two rainfall data sets over Seyhan plain, and the correlation between the rain-gauge data and the two raster rainfall data sets was observed to be low.
Keywords: meteosat, radar, rainfall, rain-gauge, Turkey
Procedia PDF Downloads 325
24725 Spatial Data Mining by Decision Trees
Authors: Sihem Oujdi, Hafida Belbachir
Abstract:
Existing data mining methods cannot be applied directly to spatial data because spatial data require spatial specificities, such as spatial relationships, to be taken into account. This paper focuses on classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches: join materialization and querying the different tables on the fly. Similar works have been done on these two main approaches: the first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, favors memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial index join that contains the possible spatial relationships between the objects in the target table and those in the neighbor table. The proposed algorithms are applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works on classification by spatial decision trees will be detailed.
Keywords: C4.5 algorithm, decision trees, S-CART, spatial data mining
Procedia PDF Downloads 611
24724 Data-Driven Dynamic Overbooking Model for Tour Operators
Authors: Kannapha Amaruchkul
Abstract:
We formulate a dynamic overbooking model for a tour operator, in which most reservations contain at least two people. The cancellation rate and the timing of the cancellation may depend on the group size. We propose two overbooking policies, namely economic- and service-based. In an economic-based policy, we want to minimize the expected oversold and underused cost, whereas, in a service-based policy, we ensure that the probability of an oversold situation does not exceed the pre-specified threshold. To illustrate the applicability of our approach, we use tour package data in 2016-2018 from a tour operator in Thailand to build a data-driven robust optimization model, and we tested the proposed overbooking policy in 2019. We also compare the data-driven approach to the conventional approach of fitting data into a probability distribution.
Keywords: applied stochastic model, data-driven robust optimization, overbooking, revenue management, tour operator
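The service-based policy in the abstract above caps the probability of an oversold situation at a pre-specified threshold. The sketch below illustrates that idea under strong simplifying assumptions that are not the paper's model: every reservation is a single traveller, travellers show up independently with the same probability, and the model is static rather than dynamic. It is closer to the conventional "fit a probability distribution" baseline than to the data-driven robust model.

```python
from math import comb


def prob_oversold(bookings, capacity, show_prob):
    """P(number of show-ups > capacity) when show-ups ~ Binomial(bookings, show_prob)."""
    return sum(
        comb(bookings, k) * show_prob**k * (1 - show_prob) ** (bookings - k)
        for k in range(capacity + 1, bookings + 1)
    )


def service_based_limit(capacity, show_prob, threshold):
    """Largest booking limit whose oversold probability stays within the threshold."""
    if not 0 < show_prob <= 1:
        raise ValueError("show_prob must be in (0, 1]")
    limit = capacity
    # The oversold probability grows with the number of bookings, so the
    # search can stop at the first limit that violates the threshold.
    while prob_oversold(limit + 1, capacity, show_prob) <= threshold:
        limit += 1
    return limit


limit = service_based_limit(capacity=2, show_prob=0.5, threshold=0.2)
```

Handling group reservations, as the paper does, would replace the binomial count with a sum of group sizes whose cancellation behaviour depends on the group.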
Procedia PDF Downloads 131
24723 Modeling and Statistical Analysis of a Soap Production Mix in Bejoy Manufacturing Industry, Anambra State, Nigeria
Authors: Okolie Chukwulozie Paul, Iwenofu Chinwe Onyedika, Sinebe Jude Ebieladoh, M. C. Nwosu
Abstract:
The research work is based on the statistical analysis of processing data. The aim is to analyze the data statistically and to generate a design model for the production mix of soap products at the Bejoy manufacturing company, Nkpologwu, Aguata Local Government Area, Anambra State, Nigeria. The statistical analysis shows the correlation of the data. The t-test, partial correlation and bivariate correlation were used to understand what the data portray. The design model developed was used to model the production yield data, and the correlation of the variables shows that R² is 98.7%. The results confirm that the data are fit for further analysis and modeling, as shown by the correlation and the R-squared.
Keywords: General Linear Model, correlation, variables, pearson, significance, T-test, soap, production mix and statistic
Procedia PDF Downloads 443
24722 Helping the Development of Public Policies with Knowledge of Criminal Data
Authors: Diego De Castro Rodrigues, Marcelo B. Nery, Sergio Adorno
Abstract:
The project aims to develop a framework for social data analysis, particularly by mobilizing criminal records and applying descriptive computational techniques, such as associative algorithms and extraction of tree decision rules, among others. The methods and instruments discussed in this work will enable the discovery of patterns, providing a guided means to identify similarities between recurring situations in the social sphere using descriptive techniques and data visualization. The study area has been defined as the city of São Paulo, with the structuring of social data as the central idea, with a particular focus on the quality of the information. Given this, a set of tools will be validated, including the use of a database and tools for visualizing the results. Among the main deliverables related to products and the development of articles are the discoveries made during the research phase. The effectiveness and utility of the results will depend on studies involving real data, validated both by domain experts and by identifying and comparing the patterns found in this study with other phenomena described in the literature. The intention is to contribute to evidence-based understanding and decision-making in the social field.
Keywords: social data analysis, criminal records, computational techniques, data mining, big data
Procedia PDF Downloads 84
24721 Optimization of Real Time Measured Data Transmission, Given the Amount of Data Transmitted
Authors: Michal Kopcek, Tomas Skulavik, Michal Kebisek, Gabriela Krizanova
Abstract:
The operation of nuclear power plants involves continuous monitoring of the environment in their area. This monitoring is performed using a complex data acquisition system, which collects status information about the system itself and the values of many important physical variables, e.g. temperature, humidity and dose rate. This paper describes a proposal for, and optimization of, the communication that takes place in a teledosimetric system between the central control server, responsible for data processing and storage, and the decentralized measuring stations, which measure the physical variables. Analyses of ongoing communication were performed, and the system architecture and communication were optimized accordingly.
Keywords: communication protocol, transmission optimization, data acquisition, system architecture
Procedia PDF Downloads 516
24720 The Duty of Application and Connection Providers Regarding the Supply of Internet Protocol by Court Order in Brazil to Determine Authorship of Acts Practiced on the Internet
Authors: João Pedro Albino, Ana Cláudia Pires Ferreira de Lima
Abstract:
Humanity has undergone a transformation from the physical to the virtual world, generating an enormous amount of data on the world wide web, known as big data. Many facts that occur in the physical world or in the digital world are proven through records made on the internet, such as digital photographs, posts on social media, contract acceptances by digital platforms, email, banking, and messaging applications, among others. These data recorded on the internet have been used as evidence in judicial proceedings. The identification of internet users is essential for the security of legal relationships. This research was carried out on scientific articles and materials from courses and lectures, with an analysis of Brazilian legislation and some judicial decisions on the request of static data from logs and Internet Protocols (IPs) from application and connection providers. In this article, we will address the determination of authorship of data processing on the internet by obtaining the IP address and the appropriate judicial procedure for this purpose under Brazilian law.
Keywords: IP address, digital forensics, big data, data analytics, information and communication technology
Procedia PDF Downloads 122
24719 Sourcing and Compiling a Maltese Traffic Dataset MalTra
Authors: Gabriele Borg, Alexei De Bono, Charlie Abela
Abstract:
There is a constant rise in the availability of high volumes of data gathered from multiple sources, resulting in an abundance of unprocessed information that can be used to monitor patterns and trends in user behaviour. Similarly, year after year, Malta is experiencing ongoing population growth and an increase in mobilization demand. This research takes advantage of data that is continuously being sourced and converts it into useful information related to the traffic problem on Maltese roads. The scope of this paper is to provide a methodology for creating a custom dataset (MalTra - Malta Traffic), compiled from multiple participants at various locations across the island, to identify the most common routes taken and expose the main areas of activity. This use of big data is seen in various technologies referred to as ITSs (Intelligent Transportation Systems), and it has been concluded that there is significant potential in utilising such sources of data on a nationwide scale.
Keywords: Big Data, vehicular traffic, traffic management, mobile data patterns
Procedia PDF Downloads 107
24718 Comparative Study of Accuracy of Land Cover/Land Use Mapping Using Medium Resolution Satellite Imagery: A Case Study
Authors: M. C. Paliwal, A. K. Jain, S. K. Katiyar
Abstract:
Classification of satellite imagery is very important for the assessment of its accuracy. To determine the accuracy of the classified image, the assumed-true data are usually derived from ground truth data collected using the Global Positioning System. The data collected from satellite imagery and the ground truth data are then compared to determine the accuracy of the data, and error matrices are prepared. Overall and individual accuracies are calculated using different methods. The study illustrates advanced classification and accuracy assessment of land use/land cover mapping using satellite imagery. IRS-1C-LISS IV data were used for classification of satellite imagery. The satellite image was classified using the software into fourteen classes, including water bodies, agricultural fields, forest land, urban settlement, barren land and unclassified area. Classification of satellite imagery and calculation of accuracy were done using ERDAS-Imagine software to find the best method. This study is based on the data collected for the Bhopal city boundaries of Madhya Pradesh State of India.
Keywords: resolution, accuracy assessment, land use mapping, satellite imagery, ground truth data, error matrices
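The overall and individual accuracies mentioned in the abstract above are standard quantities derived from an error (confusion) matrix. The sketch below computes them in plain Python. The row/column convention (rows are classified classes, columns are ground-truth classes) and the two-class counts are assumptions for illustration, not data from the study.

```python
def overall_accuracy(matrix):
    """Correctly classified samples (the diagonal) divided by all samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def users_accuracy(matrix):
    """Per class: diagonal / row total (how reliable the map label is)."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def producers_accuracy(matrix):
    """Per class: diagonal / column total (how well ground truth is captured)."""
    column_totals = [sum(col) for col in zip(*matrix)]
    return [matrix[i][i] / column_totals[i] for i in range(len(matrix))]


# Rows: classified as water, urban. Columns: ground truth water, urban.
# The counts are made up for the example.
error_matrix = [
    [50, 5],
    [10, 35],
]
overall = overall_accuracy(error_matrix)
per_class_user = users_accuracy(error_matrix)
per_class_producer = producers_accuracy(error_matrix)
```

A class with no classified or no reference samples would cause a division by zero here; a fuller implementation would need to handle that case explicitly.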
Procedia PDF Downloads 505
24717 Comparison of β-Cell Regenerative Potentials of Selected Sri Lankan Medicinal Plant Extracts in Alloxan-Induced Diabetic Rats
Authors: A. P. Attanayake, K. A. P. W. Jayatilaka, L. K. B. Mudduwa, C. Pathirana
Abstract:
Triggering of β-cell regeneration is a recognized therapeutic strategy for the treatment of type 1 diabetes mellitus. One approach to fostering the restoration and regeneration of β-cells is the use of exogenous natural extracts. The aim of the present study was to investigate and compare the β-cell regenerative potentials of the extracts of Spondias pinnata (Linn. f.) Kurz, Coccinia grandis (L.) Voigt and Gmelina arborea Roxb. in alloxan-induced diabetic rats. Wistar rats were divided into six groups (n=6): healthy untreated rats; alloxan-induced diabetic untreated rats (150 mg/kg, ip); diabetic rats receiving the extracts of S. pinnata (1.0 g/kg), C. grandis (0.75 g/kg) or G. arborea (1.00 g/kg); and diabetic rats receiving glibenclamide (0.5 mg/kg) for 30 days. Selected biochemical parameters, histopathology and immunohistochemistry in the pancreatic tissue were assessed on the 30th day. The reduction in the percentage of HbA1C was in the decreasing order of C. grandis (35%), G. arborea (31%) and S. pinnata (29%) in alloxan-induced diabetic rats (p < 0.05). The concentrations of serum fructosamine, insulin and C-peptide were decreased significantly in a decreasing order of C. grandis (30%, 72%, 51%), G. arborea (25%, 44%, 44%) and S. pinnata (27%, 34%, 24%) in alloxan-induced diabetic rats (p < 0.05). The extent of β-cell regeneration was in the decreasing order of C. grandis, G. arborea and S. pinnata, reflected in the increased percentage of insulin-secreting β-cells in alloxan-induced diabetic rats. The extract of C. grandis produced the highest degree of β-cell regeneration, demonstrated through an increase in the number of islets and in the percentage of insulin-secreting β-cells (75%) in the pancreas of diabetic rats (p < 0.05). Further, the C. grandis extract produced a significant increase in mean profile diameter in small (118%), average (10%), and large (13%) islets, respectively, compared with diabetic control rats. However, a statistically significant increase in the islet profile diameter was shown only in average (2%) and large (5%) islets in the G. arborea extract-treated rats and in large islets (5%) in the S. pinnata extract-treated diabetic rats (p < 0.05). The β-cell regeneration potency was in the decreasing order of C. grandis (0.75 g/kg), G. arborea (1.00 g/kg) and S. pinnata (1.00 g/kg) in alloxan-induced diabetic rats. The three plant extracts may be useful as natural agents for triggering β-cell regeneration in the management of type 1 diabetes mellitus.
Keywords: alloxan-induced diabetic rats, β-cell regeneration, histopathology, immunohistochemistry
Procedia PDF Downloads 240
24716 Effect of Genuine Missing Data Imputation on Prediction of Urinary Incontinence
Authors: Suzan Arslanturk, Mohammad-Reza Siadat, Theophilus Ogunyemi, Ananias Diokno
Abstract:
Missing data is a common challenge in statistical analyses of most clinical survey datasets. A variety of methods have been developed to enable analysis of survey data with missing values. Imputation is the most commonly used of these methods. However, in order to minimize the bias introduced by imputation, one must choose the right imputation technique and apply it to the correct type of missing data. In this paper, we have identified different types of missing values: missing data due to skip pattern (SPMD), undetermined missing data (UMD), and genuine missing data (GMD), and applied rough set imputation to only the GMD portion of the missing data. We have used rough set imputation to evaluate the effect of such imputation on prediction by generating several simulation datasets based on an existing epidemiological dataset (MESA). To measure how well each dataset lends itself to the prediction model (logistic regression), we have used p-values from the Wald test. To evaluate the accuracy of the prediction, we have considered the width of the 95% confidence interval for the probability of incontinence. Both imputed and non-imputed simulation datasets were fit to the prediction model, and both turned out to be significant (p-value < 0.05). However, the Wald score shows a better fit for the imputed than for the non-imputed datasets (28.7 vs. 23.4). The average confidence interval width decreased by 10.4% when the imputed dataset was used, meaning higher precision. The results show that using the rough set method for missing data imputation on GMD data improves the predictive capability of the logistic regression. Further studies are required to generalize this conclusion to other clinical survey datasets.
Keywords: rough set, imputation, clinical survey data simulation, genuine missing data, predictive index
Procedia PDF Downloads 168
24715 Database Management System for Orphanages to Help Track of Orphans
Authors: Srivatsav Sanjay Sridhar, Asvitha Raja, Prathit Kalra, Soni Gupta
Abstract:
Database management is a system that keeps track of details about a person in an organisation. Few orphanages these days are shifting to a computer and program-based system, and unfortunately most have only pen-and-paper records, which not only consume space but are also not eco-friendly. Viewing a person's record becomes a hassle, as one has to search through multiple records, which consumes time. This program organises all the data and can pull out any information about anyone whose data is entered. This is also a safe way of storage, as physical data degrades over time or, worse, is destroyed by natural disasters. In this developing world, it is only smart to shift all data to an electronic storage system. The program comes with all features, including creating, inserting, searching, and deleting the data, as well as printing them.
Keywords: database, orphans, programming, C++
Procedia PDF Downloads 153
24714 New Two-Way Map-Reduce Join Algorithm: Hash Semi Join
Authors: Marwa Hussein Mohamed, Mohamed Helmy Khafagy, Samah Ahmed Senbel
Abstract:
MapReduce is a programming model used to handle and support massive data sets. With the rapid increase in data size, analyzing big data is one of the most important issues today. MapReduce is used to analyze data and extract more helpful information using two simple functions, map and reduce, which are the only parts written by the programmer; the framework provides load balancing, fault tolerance and high scalability. Join is one of the most important operations in data analysis, but MapReduce does not directly support it. This paper explains two two-way MapReduce join algorithms, semi-join and per-split semi-join, and proposes a new algorithm, hash semi-join, which uses a hash table to increase performance by eliminating unused records as early as possible and, in the second phase, applies the join using the hash table rather than using the map function to match the join key against the other data table. Using hash tables does not affect memory size because only matched records from the second table are saved. Our experimental results show that the hash semi-join algorithm has higher performance than the two other algorithms as the data size increases from 10 million to 500 million records, and that running time increases with the size of the joined records between the two tables.
Keywords: map reduce, hadoop, semi join, two way join
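The core idea in the abstract above (discard non-matching records early, keep only matched records of the second table in a hash table, then join by probing that table) can be sketched on a single machine. This is an illustrative in-memory version and not the authors' MapReduce implementation; the function name, the three-phase split, and the sample tables are assumptions.

```python
from collections import defaultdict


def hash_semi_join(left, right, key):
    """Join two lists of dict records on `key` using a semi-join filter."""
    # Phase 1: collect the distinct join keys of the first table.
    left_keys = {row[key] for row in left}

    # Phase 2: keep only records of the second table whose key occurs in
    # the first table, so unused records are dropped as early as possible.
    matched = defaultdict(list)
    for row in right:
        if row[key] in left_keys:
            matched[row[key]].append(row)

    # Phase 3: probe the hash table with each record of the first table.
    joined = []
    for left_row in left:
        for right_row in matched.get(left_row[key], []):
            joined.append({**right_row, **left_row})
    return joined


customers = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
orders = [
    {"id": 1, "item": "x"},
    {"id": 3, "item": "y"},  # no matching customer, filtered out in phase 2
    {"id": 1, "item": "z"},
]
result = hash_semi_join(customers, orders, key="id")
```

In a distributed setting, phase 1 would typically ship only the key set to the workers holding the second table, which is what keeps the hash table small relative to a full in-memory join.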
Procedia PDF Downloads 511
24713 Using Implicit Data to Improve E-Learning Systems
Authors: Slah Alsaleh
Abstract:
In recent years, with the popularity of the internet and technology, e-learning has become a major part of most education systems. One of the advantages e-learning systems provide is the large amount of information available about students' behavior while they interact with the e-learning system. Such information is very rich and can be used to improve the capability and efficiency of e-learning systems. This paper discusses how e-learning can benefit from implicit data in different ways, including creating homogeneous groups of students, evaluating students' learning, creating behavior profiles for students, and identifying students through their behaviors.
Keywords: e-learning, implicit data, user behavior, data mining
Procedia PDF Downloads 305
24712 Enabling Quantitative Urban Sustainability Assessment with Big Data
Authors: Changfeng Fu
Abstract:
Sustainable urban development has been widely accepted a common sense in the modern urban planning and design. However, the measurement and assessment of urban sustainability, especially the quantitative assessment have been always an issue obsessing planning and design professionals. This paper will present an on-going research on the principles and technologies to develop a quantitative urban sustainability assessment principles and techniques which aim to integrate indicators, geospatial and geo-reference data, and assessment techniques together into a mechanism. It is based on the principles and techniques of geospatial analysis with GIS and statistical analysis methods. The decision-making technologies and methods such as AHP and SMART are also adopted to address overall assessment conclusions. The possible interfaces and presentation of data and quantitative assessment results are also described. This research is based on the knowledge, situations and data sources of UK, but it is potentially adaptable to other countries or regions. The implementation potentials of the mechanism are also discussed.Keywords: urban sustainability assessment, quantitative analysis, sustainability indicator, geospatial data, big data
Procedia PDF Downloads 354
24711 Development of Generalized Correlation for Liquid Thermal Conductivity of N-Alkane and Olefin
Authors: A. Ishag Mohamed, A. A. Rabah
Abstract:
The objective of this research is to develop a generalized correlation for the prediction of the thermal conductivity of n-Alkanes and Alkenes. Little research and few correlations for the thermal conductivity of liquids are available in the open literature. The available experimental data were collected, covering the groups of n-Alkanes and Alkenes. The data were assumed to correlate with temperature using the Filippov correlation. Nonparametric regression with the Grace Algorithm was used to develop the generalized correlation model. A spreadsheet program based on Microsoft Excel was used to plot the data and calculate the values of the coefficients. The results obtained were compared with the data found in Perry's Chemical Engineers' Handbook. The experimental data were correlated with temperature over the range 273.15 to 673.15 K, with R² = 0.99. The developed correlation reproduced experimental data that were not included in the regression with an absolute average percent deviation (AAPD) of less than 7%. Thus the spreadsheet was quite accurate and produces reliable data.
Keywords: N-Alkanes, N-Alkenes, nonparametric, regression
Procedia PDF Downloads 652
24710 Survey on Arabic Sentiment Analysis in Twitter
Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb
Abstract:
Large-scale data stream analysis has recently become an important business and research priority. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends from these data would aid understanding and decision-making. Multiple analysis techniques are deployed for English content. However, Arabic, one of the languages that produce a large amount of data over social networks, is among the least analyzed. This paper surveys the research efforts to analyze Arabic content on Twitter, focusing on the tools and methods used to extract sentiment from that content.
Keywords: big data, social networks, sentiment analysis, twitter
Procedia PDF Downloads 575
24709 Estimating Current Suicide Rates Using Google Trends
Authors: Ladislav Kristoufek, Helen Susannah Moat, Tobias Preis
Abstract:
Data on the number of people who have committed suicide tends to be reported with a substantial time lag of around two years. We examine whether online activity measured by Google searches can help us improve estimates of the number of suicide occurrences in England before official figures are released. Specifically, we analyse how data on the number of Google searches for the terms “depression” and “suicide” relate to the number of suicides between 2004 and 2013. We find that estimates drawing on Google data are significantly better than estimates using previous suicide data alone. We show that a greater number of searches for the term “depression” is related to fewer suicides, whereas a greater number of searches for the term “suicide” is related to more suicides. Data on suicide related search behaviour can be used to improve current estimates of the number of suicide occurrences.
Keywords: nowcasting, search data, Google Trends, official statistics
Procedia PDF Downloads 355
24708 On the Network Packet Loss Tolerance of SVM Based Activity Recognition
Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir
Abstract:
In this study, the data loss tolerance of a Support Vector Machine (SVM) based activity recognition model and its multi-activity classification performance when data are received over a lossy wireless sensor network are examined. Initially, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality of service performance on activity recognition success is measured with activity data acquired from a multi-hop wireless sensor network, which introduces high data loss. The effect of the number of nodes on reliability and multi-activity classification success is demonstrated in a simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on the activity detection success rate of an SVM based classification algorithm has not been studied before.
Keywords: activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss
Procedia PDF Downloads 475
24707 GIS Data Governance: GIS Data Submission Process for Build-in Project, Replacement Project at Oman Electricity Transmission Company
Authors: Rahma Saleh Hussein Al Balushi
Abstract:
Oman Electricity Transmission Company's (OETC) vision is to be a renowned world-class transmission grid by 2025, and one indication of achieving that vision is obtaining Asset Management ISO55001 certification, which requires documented Standard Operating Procedures (SOP). Hence, a documented SOP for the Geographical Information System (GIS) data process has been established. Also, to effectively manage and improve OETC power transmission, asset data and information need to be governed by the Asset Information & GIS department. This paper describes in detail the current GIS data submission process and the journey of developing it. The methodology used to develop the process is based on three main pillars: system and end-user requirements, risk evaluation, and data availability and accuracy. The output of this paper shows a dramatic change in the process used, which results in more efficient, accurate, and up-to-date data. Furthermore, due to this process, GIS has been, and is ready to be, integrated with other systems, and serves as the source of data for all OETC users. Some decisions related to issuing No Objection Certificates (NOC) for excavation permits and scheduling asset maintenance plans in the Computerized Maintenance Management System (CMMS) have been made as a consequence of GIS data availability. In addition, defining agreed and documented procedures for data collection, data systems update, data release/reporting and data alterations has contributed to reducing missing attributes and enhancing the data quality index of GIS transmission data. A considerable difference in Geodatabase (GDB) completeness percentage was observed between 2017 and 2022. Overall, the paper concludes that, through governance, the Asset Information & GIS department can control the GIS data process and collect, properly record, and manage asset data and information within the OETC network. This control extends to other applications and systems integrated with or related to GIS systems.
Keywords: asset management ISO55001, standard procedures process, governance, CMMS
Procedia PDF Downloads 123
24706 Effects of Data Correlation in a Sparse-View Compressive Sensing Based Image Reconstruction
Authors: Sajid Abas, Jon Pyo Hong, Jung-Ryun Le, Seungryong Cho
Abstract:
Computed tomography and laminography are heavily investigated in a compressive sensing based image reconstruction framework to reduce the dose to patients as well as to radiosensitive devices such as multilayer microelectronic circuit boards. Researchers are actively working on optimizing compressive sensing based iterative image reconstruction algorithms to obtain better quality images. However, the effects of the sampled data's properties on the reconstructed image's quality, particularly under insufficiently sampled data conditions, have not been explored in computed laminography. In this paper, we investigate the effects of two data properties, namely sampling density and data incoherence, on the reconstructed image obtained by conventional computed laminography and by a recently proposed method called the spherical sinusoidal scanning scheme. We have found that in a compressive sensing based image reconstruction framework, the image quality mainly depends upon the data incoherence when the data is uniformly sampled.
Keywords: computed tomography, computed laminography, compressive sensing, low-dose
Procedia PDF Downloads 463