Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 26447

Search results for: spatial data mining

26177 Smart in Performance: More to Practical Life than Hardware and Software

Abstract:

This paper promotes the importance of focusing on spatial aspects and affective factors that impact smart urbanism. This helps to better inform city governance, spatial planning, and policymaking to focus on what Smart does and what it can achieve for cities in terms of performance rather than on using the notion for prestige in a worldwide trend towards becoming a smart city. By illustrating how this style of practice compromises the social aspects and related elements of space making through an interdisciplinary comparative approach, the paper clarifies the impact of this compromise on the overall smart city performance. In response, this paper recognizes the importance of establishing a new meaning for urban progress by moving beyond improving basic services of the city to enhance the actual human experience which is essential for the development of authentic smart cities. The topic is presented under five overlooked areas that discuss the relation between smart cities’ potential and efficiency paradox, the social aspect, connectedness with nature, the human factor, and untapped resources. However, these themes are not meant to be discussed in silos, instead, they are presented to collectively examine smart cities in performance, arguing there is more to the practical life of smart cities than software and hardware inventions. The study is based on a case study approach, presenting Milton Keynes as a living example to learn from while engaging with various methods for data collection including multi-disciplinary semi-structured interviews, field observations, and data mining.

Keywords: smart design, the human in the city, human needs and urban planning, sustainability, smart cities, smart

Procedia PDF Downloads 93

26176 Geostatistical Analysis of Contamination of Soils in an Urban Area in Ghana

Authors: S. K. Appiah, E. N. Aidoo, D. Asamoah Owusu, M. W. Nuonabuor

Abstract:

Urbanization remains one of the unique predominant factors which is linked to the destruction of urban environment and its associated cases of soil contamination by heavy metals through the natural and anthropogenic activities. These activities are important sources of toxic heavy metals such as arsenic (As), cadmium (Cd), chromium (Cr), copper (Cu), iron (Fe), manganese (Mn), and lead (Pb), nickel (Ni) and zinc (Zn). Often, these heavy metals lead to increased levels in some areas due to the impact of atmospheric deposition caused by their proximity to industrial plants or the indiscriminately burning of substances. Information gathered on potentially hazardous levels of these heavy metals in soils leads to establish serious health and urban agriculture implications. However, characterization of spatial variations of soil contamination by heavy metals in Ghana is limited. Kumasi is a Metropolitan city in Ghana, West Africa and is challenged with the recent spate of deteriorating soil quality due to rapid economic development and other human activities such as “Galamsey”, illegal mining operations within the metropolis. The paper seeks to use both univariate and multivariate geostatistical techniques to assess the spatial distribution of heavy metals in soils and the potential risk associated with ingestion of sources of soil contamination in the Metropolis. Geostatistical tools have the ability to detect changes in correlation structure and how a good knowledge of the study area can help to explain the different scales of variation detected. To achieve this task, point referenced data on heavy metals measured from topsoil samples in a previous study, were collected at various locations. Linear models of regionalisation and coregionalisation were fitted to all experimental semivariograms to describe the spatial dependence between the topsoil heavy metals at different spatial scales, which led to ordinary kriging and cokriging at unsampled locations and production of risk maps of soil contamination by these heavy metals. Results obtained from both the univariate and multivariate semivariogram models showed strong spatial dependence with range of autocorrelations ranging from 100 to 300 meters. The risk maps produced show strong spatial heterogeneity for almost all the soil heavy metals with extremely risk of contamination found close to areas with commercial and industrial activities. Hence, ongoing pollution interventions should be geared towards these highly risk areas for efficient management of soil contamination to avert further pollution in the metropolis.

Keywords: coregionalization, heavy metals, multivariate geostatistical analysis, soil contamination, spatial distribution

Procedia PDF Downloads 292

26175 Model for Introducing Products to New Customers through Decision Tree Using Algorithm C4.5 (J-48)

Authors: Komol Phaisarn, Anuphan Suttimarn, Vitchanan Keawtong, Kittisak Thongyoun, Chaiyos Jamsawang

Abstract:

This article is intended to analyze insurance information which contains information on the customer decision when purchasing life insurance pay package. The data were analyzed in order to present new customers with Life Insurance Perfect Pay package to meet new customers’ needs as much as possible. The basic data of insurance pay package were collect to get data mining; thus, reducing the scattering of information. The data were then classified in order to get decision model or decision tree using Algorithm C4.5 (J-48). In the classification, WEKA tools are used to form the model and testing datasets are used to test the decision tree for the accurate decision. The validation of this model in classifying showed that the accurate prediction was 68.43% while 31.25% were errors. The same set of data were then tested with other models, i.e. Naive Bayes and Zero R. The results showed that J-48 method could predict more accurately. So, the researcher applied the decision tree in writing the program used to introduce the product to new customers to persuade customers’ decision making in purchasing the insurance package that meets the new customers’ needs as much as possible.

Keywords: decision tree, data mining, customers, life insurance pay package

Procedia PDF Downloads 422

26174 Gold, Power, Protest, Examining How Digital Media and PGIS are Used to Protest the Mining Industry in Colombia

Authors: Doug Specht

Abstract:

This research project sought to explore the links between digital media, PGIS and social movement organisations in Tolima, Colombia. The primary aim of the research was to examine how knowledge is created and disseminated through digital media and GIS in the region, and whether there exists the infrastructure to allow for this. The second strand was to ascertain if this has had a significant impact on the way grassroots movements work and produce collective actions. The third element is a hypothesis about how digital media and PGIS could play a larger role in activist activities, particularly in reference to the extractive industries. Three theoretical strands have been brought together to provide a basis for this research, namely (a) the politics of knowledge, (b) spatial management and inclusion, and (c) digital media and political engagement. Quantitative data relating to digital media and mobile internet use was collated alongside qualitative data relating to the likelihood of using digital media in activist campaigns, with particular attention being given to grassroots movements working against extractive industries in the Tolima region of Colombia. Through interviews, surveys and GIS analysis it has been possible to build a picture of online activism and the role of PPGIS within protest movement in the region of Tolima, Colombia. Results show a gap between the desires of social movements to use digital media and the skills and finances required to implement programs that utilise it. Maps and GIS are generally reserved for legal cases rather than for informing the lay person. However, it became apparent that the combination of digital/social media and PPGIS could play a significant role in supporting the work of grassroots movements.

Keywords: PGIS, GIS, social media, digital media, mining, colombia, social movements, protest

Procedia PDF Downloads 423

26173 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of the multiplied data growth, many computer science tools have been developed to process and analyze these Big Data. High-performance computing architectures have been designed to meet the treatment needs of Big Data (view transaction processing standpoint, strategic, and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures especially what has a relation with Analytics and Data Mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 514

26172 Study on the Layout of 15-Minute Community-Life Circle in the State of “Community Segregation” Based on Poi: Shengwei Community and Other Two Communities in Chongqing

Authors: Siyuan Cai

Abstract:

This paper takes community segregation during major infectious diseases as the background, based on the physiological needs and safety needs of citizens during home segregation, and based on the selection of convenient facilities and medical facilities as the main research objects. Based on the POI data of public facilities in Chongqing, the spatial distribution characteristics of the convenience and medical facilities in the 15-minute living circle centered on three neighborhoods in Shapingba, namely Shengwei Community, Anju Commmunity and Fengtian Garden Community, were explored by means of GIS spatial analysis. The results show that the spatial distribution of convenience and medical facilities in this area has significant clustering characteristics, with a point-like distribution pattern of "dense in the west and sparse in the east", and a grouped and multi-polar spatial structure. The spatial structure is multi-polar and has an obvious tendency to the intersections and residential areas with dense pedestrian flow. This study provides a preliminary exploration of the distribution of medical and convenience facilities within the 15-minute living circle of a segregated community, which makes up for the lack of spatial research in this area.

Keywords: ArcGIS, community segregation, convenient facilities; distribution pattern, medical facilities, POI, 15-minute community life circle

Procedia PDF Downloads 113

26171 Mood Recognition Using Indian Music

Authors: Vishwa Joshi

Abstract:

The study of mood recognition in the field of music has gained a lot of momentum in the recent years with machine learning and data mining techniques and many audio features contributing considerably to analyze and identify the relation of mood plus music. In this paper we consider the same idea forward and come up with making an effort to build a system for automatic recognition of mood underlying the audio song’s clips by mining their audio features and have evaluated several data classification algorithms in order to learn, train and test the model describing the moods of these audio songs and developed an open source framework. Before classification, Preprocessing and Feature Extraction phase is necessary for removing noise and gathering features respectively.

Keywords: music, mood, features, classification

Procedia PDF Downloads 491

26170 Tactile Cues and Spatial Navigation in Mice

Authors: Rubaiyea Uddin

Abstract:

The hippocampus, located in the limbic system, is most commonly known for its role in memory and spatial navigation (as cited in Brain Reward and Pathways). It maintains an especially important role in specifically episodic and declarative memory. The hippocampus has also recently been linked to dopamine, the reward pathway’s primary neurotransmitter. Since research has found that dopamine also contributes to memory consolidation and hippocampal plasticity, this neurotransmitter is potentially responsible for contributing to the hippocampus’s role in memory formation. In this experiment we tested to see the effect of tactile cues on spatial navigation for eight different mice. We used a radial arm that had one designated 'reward' arm containing sucrose. The presence or absence of bedding was our tactile cue. We attempted to see if the memory of that cue would enhance the mice’s memory of having received the reward in that arm. The results from our study showed there was no significant response from the use of tactile cues on spatial navigation on our 129 mice. Tactile cues therefore do not influence spatial navigation.

Keywords: mice, radial arm maze, memory, spatial navigation, tactile cues, hippocampus, reward, sensory skills, Alzheimer’s, neurodegnerative disease

Procedia PDF Downloads 642

26169 Multi-Class Text Classification Using Ensembles of Classifiers

Authors: Syed Basit Ali Shah Bukhari, Yan Qiang, Saad Abdul Rauf, Syed Saqlaina Bukhari

Abstract:

Text Classification is the methodology to classify any given text into the respective category from a given set of categories. It is highly important and vital to use proper set of pre-processing , feature selection and classification techniques to achieve this purpose. In this paper we have used different ensemble techniques along with variance in feature selection parameters to see the change in overall accuracy of the result and also on some other individual class based features which include precision value of each individual category of the text. After subjecting our data through pre-processing and feature selection techniques , different individual classifiers were tested first and after that classifiers were combined to form ensembles to increase their accuracy. Later we also studied the impact of decreasing the classification categories on over all accuracy of data. Text classification is highly used in sentiment analysis on social media sites such as twitter for realizing people’s opinions about any cause or it is also used to analyze customer’s reviews about certain products or services. Opinion mining is a vital task in data mining and text categorization is a back-bone to opinion mining.

Keywords: Natural Language Processing, Ensemble Classifier, Bagging Classifier, AdaBoost

Procedia PDF Downloads 227

26168 Managing Data from One Hundred Thousand Internet of Things Devices Globally for Mining Insights

Authors: Julian Wise

Abstract:

Newcrest Mining is one of the world’s top five gold and rare earth mining organizations by production, reserves and market capitalization in the world. This paper elaborates on the data acquisition processes employed by Newcrest in collaboration with Fortune 500 listed organization, Insight Enterprises, to standardize machine learning solutions which process data from over a hundred thousand distributed Internet of Things (IoT) devices located at mine sites globally. Through the utilization of software architecture cloud technologies and edge computing, the technological developments enable for standardized processes of machine learning applications to influence the strategic optimization of mineral processing. Target objectives of the machine learning optimizations include time savings on mineral processing, production efficiencies, risk identification, and increased production throughput. The data acquired and utilized for predictive modelling is processed through edge computing by resources collectively stored within a data lake. Being involved in the digital transformation has necessitated the standardization software architecture to manage the machine learning models submitted by vendors, to ensure effective automation and continuous improvements to the mineral process models. Operating at scale, the system processes hundreds of gigabytes of data per day from distributed mine sites across the globe, for the purposes of increased improved worker safety, and production efficiency through big data applications.

Keywords: mineral technology, big data, machine learning operations, data lake

Procedia PDF Downloads 105

26167 Social Media Mining with R. Twitter Analyses

Authors: Diana Codat

Abstract:

Tweets' analysis is part of text mining. Each document is a written text. It's possible to apply the usual text search techniques, in particular by switching to the bag-of-words representation. But the tweets induce peculiarities. Some may enrich the analysis. Thus, their length is calibrated (at least as far as public messages are concerned), special characters make it possible to identify authors (@) and themes (#), the tweet and retweet mechanisms make it possible to follow the diffusion of the information. Conversely, other characteristics may disrupt the analyzes. Because space is limited, authors often use abbreviations, emoticons to express feelings, and they do not pay much attention to spelling. All this creates noise that can complicate the task. The tweets carry a lot of potentially interesting information. Their exploitation is one of the main axes of the analysis of the social networks. We show how to access Twitter-related messages. We will initiate a study of the properties of the tweets, and we will follow up on the exploitation of the content of the messages. We will work under R with the package 'twitteR'. The study of tweets is a strong focus of analysis of social networks because Twitter has become an important vector of communication. This example shows that it is easy to initiate an analysis from data extracted directly online. The data preparation phase is of great importance.

Keywords: data mining, language R, social networks, Twitter

Procedia PDF Downloads 179

26166 Individual Differences and Paired Learning in Virtual Environments

Authors: Patricia M. Boechler, Heather M. Gautreau

Abstract:

In this research study, postsecondary students completed an information learning task in an avatar-based 3D virtual learning environment. Three factors were of interest in relation to learning; 1) the influence of collaborative vs. independent conditions, 2) the influence of the spatial arrangement of the virtual environment (linear, random and clustered), and 3) the relationship of individual differences such as spatial skill, general computer experience and video game experience to learning. Students completed pretest measures of prior computer experience and prior spatial skill. Following the premeasure administration, students were given instruction to move through the virtual environment and study all the material within 10 information stations. In the collaborative condition, students proceeded in randomly assigned pairs, while in the independent condition they proceeded alone. After this learning phase, all students individually completed a multiple choice test to determine information retention. The overall results indicated that students in pairs did not perform any better or worse than independent students. As far as individual differences, only spatial ability predicted the performance of students. General computer experience and video game experience did not. Taking a closer look at the pairs and spatial ability, comparisons were made on pairs high/matched spatial ability, pairs low/matched spatial ability and pairs that were mismatched on spatial ability. The results showed that both high/matched pairs and mismatched pairs outperformed low/matched pairs. That is, if a pair had even one individual with strong spatial ability they would perform better than pairs with only low spatial ability individuals. This suggests that, in virtual environments, the specific individuals that are paired together are important for performance outcomes. The paper also includes a discussion of trends within the data that have implications for virtual environment education.

Keywords: avatar-based, virtual environment, paired learning, individual differences

Procedia PDF Downloads 111

26165 Automatic Lead Qualification with Opinion Mining in Customer Relationship Management Projects

Authors: Victor Radich, Tania Basso, Regina Moraes

Abstract:

Lead qualification is one of the main procedures in Customer Relationship Management (CRM) projects. Its main goal is to identify potential consumers who have the ideal characteristics to establish a profitable and long-term relationship with a certain organization. Social networks can be an important source of data for identifying and qualifying leads since interest in specific products or services can be identified from the users’ expressed feelings of (dis)satisfaction. In this context, this work proposes the use of machine learning techniques and sentiment analysis as an extra step in the lead qualification process in order to improve it. In addition to machine learning models, sentiment analysis or opinion mining can be used to understand the evaluation that the user makes of a particular service, product, or brand. The results obtained so far have shown that it is possible to extract data from social networks and combine the techniques for a more complete classification.

Keywords: lead qualification, sentiment analysis, opinion mining, machine learning, CRM, lead scoring

Procedia PDF Downloads 77

26164 Heart Failure Identification and Progression by Classifying Cardiac Patients

Authors: Muhammad Saqlain, Nazar Abbas Saqib, Muazzam A. Khan

Abstract:

Heart Failure (HF) has become the major health problem in our society. The prevalence of HF has increased as the patient’s ages and it is the major cause of the high mortality rate in adults. A successful identification and progression of HF can be helpful to reduce the individual and social burden from this syndrome. In this study, we use a real data set of cardiac patients to propose a classification model for the identification and progression of HF. The data set has divided into three age groups, namely young, adult, and old and then each age group have further classified into four classes according to patient’s current physical condition. Contemporary Data Mining classification algorithms have been applied to each individual class of every age group to identify the HF. Decision Tree (DT) gives the highest accuracy of 90% and outperform all other algorithms. Our model accurately diagnoses different stages of HF for each age group and it can be very useful for the early prediction of HF.

Keywords: decision tree, heart failure, data mining, classification model

Procedia PDF Downloads 398

26163 Spatial Disparity in Education and Medical Facilities: A Case Study of Barddhaman District, West Bengal, India

Authors: Amit Bhattacharyya

Abstract:

The economic scenario of any region does not show the real picture for the measurement of overall development. Therefore, economic development must be accompanied by social development to be able to make an assessment to measure the level of development. The spatial variation with respect to social development has been discussed taking into account the quality of functioning of a social system in a specific area. In this paper, an attempt has been made to study the spatial distribution of social infrastructural facilities and analyze the magnitude of regional disparities at inter- block level in Barddhman district. It starts with the detailed account of the selection process of social infrastructure indicators and describes the methodology employed in the empirical analysis. Analyzing the block level data, this paper tries to identify the disparity among the blocks in the levels of social development. The results have been subsequently explained using both statistical analysis and geo spatial technique. The paper reveals that the social development is not going on at the same rate in every part of the district. Health facilities and educational facilities are concentrated at some selected point. So overall development activities come to be concentrated in a few centres and the disparity is seen over the blocks.

Keywords: disparity, inter-block, social development, spatial variation

Procedia PDF Downloads 165

26162 Intelligent Process Data Mining for Monitoring for Fault-Free Operation of Industrial Processes

Authors: Hyun-Woo Cho

Abstract:

The real-time fault monitoring and diagnosis of large scale production processes is helpful and necessary in order to operate industrial process safely and efficiently producing good final product quality. Unusual and abnormal events of the process may have a serious impact on the process such as malfunctions or breakdowns. This work try to utilize process measurement data obtained in an on-line basis for the safe and some fault-free operation of industrial processes. To this end, this work evaluated the proposed intelligent process data monitoring framework based on a simulation process. The monitoring scheme extracts the fault pattern in the reduced space for the reliable data representation. Moreover, this work shows the results of using linear and nonlinear techniques for the monitoring purpose. It has shown that the nonlinear technique produced more reliable monitoring results and outperforms linear methods. The adoption of the qualitative monitoring model helps to reduce the sensitivity of the fault pattern to noise.

Keywords: process data, data mining, process operation, real-time monitoring

Procedia PDF Downloads 633

26161 Application of Knowledge Discovery in Database Techniques in Cost Overruns of Construction Projects

Authors: Mai Ghazal, Ahmed Hammad

Abstract:

Cost overruns in construction projects are considered as worldwide challenges since the cost performance is one of the main measures of success along with schedule performance. To overcome this problem, studies were conducted to investigate the cost overruns' factors, also projects' historical data were analyzed to extract new and useful knowledge from it. This research is studying and analyzing the effect of some factors causing cost overruns using the historical data from completed construction projects. Then, using these factors to estimate the probability of cost overrun occurrence and predict its percentage for future projects. First, an intensive literature review was done to study all the factors that cause cost overrun in construction projects, then another review was done for previous researcher papers about mining process in dealing with cost overruns. Second, a proposed data warehouse was structured which can be used by organizations to store their future data in a well-organized way so it can be easily analyzed later. Third twelve quantitative factors which their data are frequently available at construction projects were selected to be the analyzed factors and suggested predictors for the proposed model.

Keywords: construction management, construction projects, cost overrun, cost performance, data mining, data warehousing, knowledge discovery, knowledge management

Procedia PDF Downloads 365

26160 An Experimental Study for Assessing Email Classification Attributes Using Feature Selection Methods

Authors: Issa Qabaja, Fadi Thabtah

Abstract:

Email phishing classification is one of the vital problems in the online security research domain that have attracted several scholars due to its impact on the users payments performed daily online. One aspect to reach a good performance by the detection algorithms in the email phishing problem is to identify the minimal set of features that significantly have an impact on raising the phishing detection rate. This paper investigate three known feature selection methods named Information Gain (IG), Chi-square and Correlation Features Set (CFS) on the email phishing problem to separate high influential features from low influential ones in phishing detection. We measure the degree of influentially by applying four data mining algorithms on a large set of features. We compare the accuracy of these algorithms on the complete features set before feature selection has been applied and after feature selection has been applied. After conducting experiments, the results show 12 common significant features have been chosen among the considered features by the feature selection methods. Further, the average detection accuracy derived by the data mining algorithms on the reduced 12-features set was very slight affected when compared with the one derived from the 47-features set.

Keywords: data mining, email classification, phishing, online security

Procedia PDF Downloads 427

26159 Evaluation of Satellite and Radar Rainfall Product over Seyhan Plain

Authors: Kazım Kaba, Erdem Erdi, M. Akif Erdoğan, H. Mustafa Kandırmaz

Abstract:

Rainfall is crucial data source for very different discipline such as agriculture, hydrology and climate. Therefore rain rate should be known well both spatial and temporal for any area. Rainfall is measured by using rain-gauge at meteorological ground stations traditionally for many years. At the present time, rainfall products are acquired from radar and satellite images with a temporal and spatial continuity. In this study, we investigated the accuracy of these rainfall data according to rain-gauge data. For this purpose, we used Adana-Hatay radar hourly total precipitation product (RN1) and Meteosat convective rainfall rate (CRR) product over Seyhan plain. We calculated daily rainfall values from RN1 and CRR hourly precipitation products. We used the data of rainy days of four stations located within range of the radar from October 2013 to November 2015. In the study, we examined two rainfall data over Seyhan plain and the correlation between the rain-gauge data and two raster rainfall data was observed lowly.

Keywords: meteosat, radar, rainfall, rain-gauge, Turkey

Procedia PDF Downloads 321

26158 A Study of Soil Heavy Metal Pollution in the Manganese Mining in Drama, Greece

Authors: A. Argiri, A. Molla, Tzouvalekas, E. Skoufogianni, N. Danalatos

Abstract:

The release of heavy metals into the environment has increased over the last years. In this study, 25 soil samples (0-15 cm) from the fields near the mining area in Drama region were selected. The samples were analyzed in the laboratory for their physicochemical properties and for seven “pseudo-total’’ heavy metals content, namely Pb, Zn, Cd, Cr, Cu, Ni, and Mn. The total metal concentrations (Pb, Zn, Cd, Cr, Cu, Ni and Mn) in digests were determined by using the atomic absorption spectrophotometer. According to the results, the mean concentration of the listed heavy metals in 25 soil samples are Cd 1.1 mg/kg, Cr 15 mg/kg, Cu 21.7 mg/kg, Ni 30.1 mg/kg, Pd 50.8 mg/kg, Zn 99.5 mg/kg and Mn 815.3 mg/kg. The results show that the heavy metals remain in the soil even if the mining closed many years ago.

Keywords: Greece, heavy metals, mining, pollution

Procedia PDF Downloads 121

26157 Focus-Latent Dirichlet Allocation for Aspect-Level Opinion Mining

Authors: Mohsen Farhadloo, Majid Farhadloo

Abstract:

Aspect-level opinion mining that aims at discovering aspects (aspect identification) and their corresponding ratings (sentiment identification) from customer reviews have increasingly attracted attention of researchers and practitioners as it provides valuable insights about products/services from customer's points of view. Instead of addressing aspect identification and sentiment identification in two separate steps, it is possible to simultaneously identify both aspects and sentiments. In recent years many graphical models based on Latent Dirichlet Allocation (LDA) have been proposed to solve both aspect and sentiment identifications in a single step. Although LDA models have been effective tools for the statistical analysis of document collections, they also have shortcomings in addressing some unique characteristics of opinion mining. Our goal in this paper is to address one of the limitations of topic models to date; that is, they fail to directly model the associations among topics. Indeed in many text corpora, it is natural to expect that subsets of the latent topics have higher probabilities. We propose a probabilistic graphical model called focus-LDA, to better capture the associations among topics when applied to aspect-level opinion mining. Our experiments on real-life data sets demonstrate the improved effectiveness of the focus-LDA model in terms of the accuracy of the predictive distributions over held out documents. Furthermore, we demonstrate qualitatively that the focus-LDA topic model provides a natural way of visualizing and exploring unstructured collection of textual data.

Keywords: aspect-level opinion mining, document modeling, Latent Dirichlet Allocation, LDA, sentiment analysis

Procedia PDF Downloads 91

26156 Multi-Scale Urban Spatial Evolution Analysis Based on Space Syntax: A Case Study in Modern Yangzhou, China

Authors: Dai Zhimei, Hua Chen

Abstract:

The exploration of urban spatial evolution is an important part of urban development research. Therefore, the evolutionary modern Yangzhou urban spatial texture was taken as the research object, and Spatial Syntax was used as the main research tool, this paper explored Yangzhou spatial evolution law and its driving factors from the urban street network scale, district scale and street scale. The study has concluded that at the urban scale, Yangzhou urban spatial evolution is the result of a variety of causes, including physical and geographical condition, policy and planning factors, and traffic conditions, and the evolution of space also has an impact on social, economic, environmental and cultural factors. At the district and street scales, changes in space will have a profound influence on the history of the city and the activities of people. At the end of the article, the matters needing attention during the evolution of urban space were summarized.

Keywords: block, space syntax and methodology, street, urban space, Yangzhou

Procedia PDF Downloads 174

26155 Assessing Carbon Stock and Sequestration of Reforestation Species on Old Mining Sites in Morocco Using the DNDC Model

Authors: Nabil Elkhatri, Mohamed Louay Metougui, Ngonidzashe Chirinda

Abstract:

Mining activities have left a legacy of degraded landscapes, prompting urgent efforts for ecological restoration. Reforestation holds promise as a potent tool to rehabilitate these old mining sites, with the potential to sequester carbon and contribute to climate change mitigation. This study focuses on evaluating the carbon stock and sequestration potential of reforestation species in the context of Morocco's mining areas, employing the DeNitrification-DeComposition (DNDC) model. The research is grounded in recognizing the need to connect theoretical models with practical implementation, ensuring that reforestation efforts are informed by accurate and context-specific data. Field data collection encompasses growth patterns, biomass accumulation, and carbon sequestration rates, establishing an empirical foundation for the study's analyses. By integrating the collected data with the DNDC model, the study aims to provide a comprehensive understanding of carbon dynamics within reforested ecosystems on old mining sites. The major findings reveal varying sequestration rates among different reforestation species, indicating the potential for species-specific optimization of reforestation strategies to enhance carbon capture. This research's significance lies in its potential to contribute to sustainable land management practices and climate change mitigation strategies. By quantifying the carbon stock and sequestration potential of reforestation species, the study serves as a valuable resource for policymakers, land managers, and practitioners involved in ecological restoration and carbon management. Ultimately, the study aligns with global objectives to rejuvenate degraded landscapes while addressing pressing climate challenges.

Keywords: carbon stock, carbon sequestration, DNDC model, ecological restoration, mining sites, Morocco, reforestation, sustainable land management.

Procedia PDF Downloads 71

26154 Cirrhosis Mortality Prediction as Classification using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and novel data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. To the best of our knowledge, this is the first work to apply modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia PDF Downloads 132

26153 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damages to health and the environment, economic losses and material damages. Studies about traditional road traffic accidents in urban zones represents very high inversion of time and money, additionally, the result are not current. However, nowadays in many countries, the crowdsourced GPS based traffic and navigation apps have emerged as an important source of information to low cost to studies of road traffic accidents and urban congestion caused by them. In this article we identified the zones, roads and specific time in the CDMX in which the largest number of road traffic accidents are concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Discovery of knowledge in the database (KDD) for the discovery of patterns in the accidents reports. Furthermore, using data mining techniques with the help of Weka. The selected algorithms was the Maximization of Expectations (EM) to obtain the number ideal of clusters for the data and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.

Keywords: data mining, k-means, road traffic accidents, Waze, Weka

Procedia PDF Downloads 404

26152 Spatial Time Series Models for Rice and Cassava Yields Based on Bayesian Linear Mixed Models

Authors: Panudet Saengseedam, Nanthachai Kantanantha

Abstract:

This paper proposes a linear mixed model (LMM) with spatial effects to forecast rice and cassava yields in Thailand at the same time. A multivariate conditional autoregressive (MCAR) model is assumed to present the spatial effects. A Bayesian method is used for parameter estimation via Gibbs sampling Markov Chain Monte Carlo (MCMC). The model is applied to the rice and cassava yields monthly data which have been extracted from the Office of Agricultural Economics, Ministry of Agriculture and Cooperatives of Thailand. The results show that the proposed model has better performance in most provinces in both fitting part and validation part compared to the simple exponential smoothing and conditional auto regressive models (CAR) from our previous study.

Keywords: Bayesian method, linear mixed model, multivariate conditional autoregressive model, spatial time series

Procedia PDF Downloads 388

26151 Quantification of GHGs Emissions from Electricity and Diesel Fuel Consumption in Basalt Mining Industry in Thailand

Authors: S. Kittipongvises, A. Dubsok

Abstract:

The mineral and mining industry is necessary for countries to have an adequate and reliable supply of materials to meet their socio-economic development. Despite its importance, the environmental impacts from mineral exploration are hugely significant. This study aimed to investigate and quantify the amount of GHGs emissions emitted from both electricity and diesel vehicle fuel consumption in basalt mining in Thailand. Plant A, located in the northeastern region of Thailand, was selected as a case study. Results indicated that total GHGs emissions from basalt mining and operation (Plant A) were approximately 2,501,086 kgCO₂e and 1,997,412 kgCO₂e in 2014 and 2015, respectively. The estimated carbon intensity ranged between 1.824 kgCO₂e to 2.284 kgCO₂e per ton of rock product. Scope 1 (direct emissions) was the dominant driver of its total GHGs compared to scope 2 (indirect emissions). As such, transport related combustion of diesel fuels generated the highest GHGs emission (65%) compared to emissions from purchased electricity (35%). Some of the potential implications for mining entities were also presented.

Keywords: basalt mining, diesel fuel, electricity, GHGs emissions, Thailand

Procedia PDF Downloads 260

26150 A General Strategy for Noise Assessment in Open Mining Industries

Authors: Diego Mauricio Murillo Gomez, Enney Leon Gonzalez Ramirez, Hugo Piedrahita, Jairo Yate

Abstract:

This paper proposes a methodology for the management of noise in open mining industries based on an integral concept, which takes into consideration occupational and environmental noise as a whole. The approach relies on the characterization of sources, the combination of several measurements’ techniques and the use of acoustic prediction software. A discussion about the difference between frequently used acoustic indicators such as Leq and LAV is carried out, aiming to establish common ground for homologation. The results show that the correct integration of this data not only allows for a more robust technical analysis but also for a more strategic route of intervention as several departments of the company are working together. Noise control measurements can be designed to provide a healthy acoustic surrounding in which the exposure workers but also the outdoor community is benefited.

Keywords: environmental noise, noise control, occupational noise, open mining

Procedia PDF Downloads 257

26149 Landscape Classification in North of Jordan by Integrated Approach of Remote Sensing and Geographic Information Systems

Authors: Taleb Odeh, Nizar Abu-Jaber, Nour Khries

Abstract:

The southern part of Wadi Al Yarmouk catchment area covers north of Jordan. It locates within latitudes 32° 20’ to 32° 45’N and longitudes 35° 42’ to 36° 23’ E and has an area of about 1426 km2. However, it has high relief topography where the elevation varies between 50 to 1100 meter above sea level. The variations in the topography causes different units of landforms, climatic zones, land covers and plant species. As a results of these different landscapes units exists in that region. Spatial planning is a major challenge in such a vital area for Jordan which could not be achieved without determining landscape units. However, an integrated approach of remote sensing and geographic information Systems (GIS) is an optimized tool to investigate and map landscape units of such a complicated area. Remote sensing has the capability to collect different land surface data, of large landscape areas, accurately and in different time periods. GIS has the ability of storage these land surface data, analyzing them spatially and present them in form of professional maps. We generated a geo-land surface data that include land cover, rock units, soil units, plant species and digital elevation model using ASTER image and Google Earth while analyzing geo-data spatially were done by ArcGIS 10.2 software. We found that there are twenty two different landscape units in the study area which they have to be considered for any spatial planning in order to avoid and environmental problems.

Keywords: landscape, spatial planning, GIS, spatial analysis, remote sensing

Procedia PDF Downloads 522

26148 Investigating Data Normalization Techniques in Swarm Intelligence Forecasting for Energy Commodity Spot Price

Authors: Yuhanis Yusof, Zuriani Mustaffa, Siti Sakira Kamaruddin

Abstract:

Data mining is a fundamental technique in identifying patterns from large data sets. The extracted facts and patterns contribute in various domains such as marketing, forecasting, and medical. Prior to that, data are consolidated so that the resulting mining process may be more efficient. This study investigates the effect of different data normalization techniques, which are Min-max, Z-score, and decimal scaling, on Swarm-based forecasting models. Recent swarm intelligence algorithms employed includes the Grey Wolf Optimizer (GWO) and Artificial Bee Colony (ABC). Forecasting models are later developed to predict the daily spot price of crude oil and gasoline. Results showed that GWO works better with Z-score normalization technique while ABC produces better accuracy with the Min-Max. Nevertheless, the GWO is more superior that ABC as its model generates the highest accuracy for both crude oil and gasoline price. Such a result indicates that GWO is a promising competitor in the family of swarm intelligence algorithms.

Keywords: artificial bee colony, data normalization, forecasting, Grey Wolf optimizer

Procedia PDF Downloads 469