Search results for: distributed data stream mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26098

25828 Heart Failure Identification and Progression by Classifying Cardiac Patients

Authors: Muhammad Saqlain, Nazar Abbas Saqib, Muazzam A. Khan

Abstract:

Heart failure (HF) has become a major health problem in our society. The prevalence of HF increases with patient age, and it is a major cause of the high mortality rate in adults. Successful identification of HF and of its progression can help reduce the individual and social burden of this syndrome. In this study, we use a real data set of cardiac patients to propose a classification model for the identification and progression of HF. The data set is divided into three age groups, namely young, adult, and old, and each age group is further classified into four classes according to the patient’s current physical condition. Contemporary data mining classification algorithms are applied to each class of every age group to identify HF. The Decision Tree (DT) gives the highest accuracy, 90%, and outperforms all other algorithms. Our model accurately diagnoses different stages of HF for each age group and can be very useful for the early prediction of HF.

Keywords: decision tree, heart failure, data mining, classification model

Procedia PDF Downloads 377
25827 Mood Recognition Using Indian Music

Authors: Vishwa Joshi

Abstract:

The study of mood recognition in music has gained considerable momentum in recent years, with machine learning and data mining techniques and many audio features contributing to the analysis and identification of the relation between mood and music. In this paper, we carry the same idea forward and build a system for automatic recognition of the mood underlying audio song clips by mining their audio features. We evaluated several data classification algorithms to learn, train, and test a model describing the moods of these songs, and developed an open source framework. Before classification, preprocessing and feature extraction phases are necessary for removing noise and gathering features, respectively.

Keywords: music, mood, features, classification

Procedia PDF Downloads 471
25826 Multi-Class Text Classification Using Ensembles of Classifiers

Authors: Syed Basit Ali Shah Bukhari, Yan Qiang, Saad Abdul Rauf, Syed Saqlaina Bukhari

Abstract:

Text classification is the task of assigning a given text to the appropriate category from a given set of categories. Using a proper set of pre-processing, feature selection, and classification techniques is vital for this purpose. In this paper, we use different ensemble techniques, along with varying feature selection parameters, to observe the change in the overall accuracy of the result and in some per-class measures, including the precision of each individual text category. After subjecting our data to pre-processing and feature selection, we first tested different individual classifiers and then combined them into ensembles to increase their accuracy. We also studied the impact of decreasing the number of classification categories on the overall accuracy. Text classification is widely used in sentiment analysis on social media sites such as Twitter to gauge people’s opinions about a cause, and to analyze customers’ reviews of products or services. Opinion mining is a vital task in data mining, and text categorization is the backbone of opinion mining.

Keywords: Natural Language Processing, Ensemble Classifier, Bagging Classifier, AdaBoost

Procedia PDF Downloads 202
25825 Social Media Mining with R. Twitter Analyses

Authors: Diana Codat

Abstract:

Tweet analysis is part of text mining. Each document is a written text, so the usual text search techniques can be applied, in particular by switching to the bag-of-words representation. But tweets have peculiarities. Some may enrich the analysis: their length is capped (at least as far as public messages are concerned), special characters make it possible to identify authors (@) and themes (#), and the tweet and retweet mechanisms make it possible to follow the diffusion of information. Conversely, other characteristics may disrupt the analyses. Because space is limited, authors often use abbreviations and emoticons to express feelings, and they do not pay much attention to spelling. All this creates noise that can complicate the task. Tweets carry a lot of potentially interesting information, and their exploitation is one of the main axes of social network analysis. We show how to access Twitter-related messages. We begin with a study of the properties of the tweets and then move on to exploiting the content of the messages. We work in R with the package 'twitteR'. The study of tweets is a strong focus of social network analysis because Twitter has become an important vector of communication. This example shows that it is easy to start an analysis from data extracted directly online. The data preparation phase is of great importance.
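
The tweet properties described in the abstract above (author markers, theme markers, retweets, and a bag-of-words body) can be illustrated with a minimal sketch. It is written in Python rather than with the R package 'twitteR' that the authors use, it does not access Twitter, and the sample tweets, the `parse_tweet` helper, and the regular expressions are assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical sample tweets, invented for illustration only.
TWEETS = [
    "RT @alice: Loving the new #rstats release :) thx @bob",
    "Text mining tweets is fun #rstats #textmining",
]

MENTION = re.compile(r"@(\w+)")   # authors are marked with @
HASHTAG = re.compile(r"#(\w+)")   # themes are marked with #
TOKEN = re.compile(r"[a-z']+")    # crude word tokenizer (an assumption)


def parse_tweet(text):
    """Split one tweet into mentions, hashtags, a retweet flag, and a bag of words."""
    mentions = MENTION.findall(text)
    hashtags = [h.lower() for h in HASHTAG.findall(text)]
    is_retweet = text.startswith("RT ")
    # Remove the markers before tokenizing so they do not pollute the vocabulary.
    body = MENTION.sub(" ", HASHTAG.sub(" ", text))
    if is_retweet:
        body = body[3:]  # drop the leading "RT "
    bag = Counter(TOKEN.findall(body.lower()))
    return {"mentions": mentions, "hashtags": hashtags, "retweet": is_retweet, "bag": bag}
```

Emoticons and abbreviations such as "thx" are exactly the noise the abstract mentions: this tokenizer silently drops the emoticon and keeps the abbreviation as an ordinary word, which is why the data preparation phase matters.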

Keywords: data mining, language R, social networks, Twitter

Procedia PDF Downloads 147
25824 Improving Grade Control Turnaround Times with In-Pit Hyperspectral Assaying

Authors: Gary Pattemore, Michael Edgar, Andrew Job, Marina Auad, Kathryn Job

Abstract:

As critical commodities become scarcer, significant time and resources have been devoted to better understanding complicated ore bodies and extracting their full potential. These challenging ore bodies present several pain points for geologists and engineers to overcome; poor handling of these issues flows downstream to the processing plant, affecting throughput rates and recovery. Many open cut mines utilise blast hole drilling to extract additional information to feed back into the modelling process. This method requires samples to be collected during or after blast hole drilling. Samples are then sent for assay, with turnaround times varying from 1 to 12 days. This method is time consuming and costly, requires human exposure on the bench, and collects elemental data only. To address this challenge, research has been undertaken to utilise hyperspectral imaging across a broad spectrum to scan samples and collars or to take down-hole measurements of mineral and moisture content and grade abundances. Automation of this process using unmanned vehicles and on-board processing reduces human in-pit exposure, supporting ongoing safety. On-board processing allows data to be integrated into modelling workflows with immediacy. The preliminary results demonstrate numerous direct and indirect benefits from this new technology, including rapid and accurate estimates of grade, moisture content, and mineralogy. These benefits allow for faster geo-modelling updates, better informed mine scheduling, and improved downstream blending and processing practices. The paper presents recommendations for implementation of the technology in open cut mining environments.

Keywords: grade control, hyperspectral scanning, artificial intelligence, autonomous mining, machine learning

Procedia PDF Downloads 78
25823 Improved FP-Growth Algorithm with Multiple Minimum Supports Using Maximum Constraints

Authors: Elsayeda M. Elgaml, Dina M. Ibrahim, Elsayed A. Sallam

Abstract:

Association rule mining is one of the most important fields of data mining and knowledge discovery. In this paper, we propose an efficient multiple-support frequent pattern growth algorithm, which we call “MSFP-growth”, that enhances the FP-growth algorithm by adding an infrequent child node pruning step with multiple minimum supports using maximum constraints. The algorithm is implemented and compared with other common algorithms: Apriori with multiple minimum supports using maximum constraints, and FP-growth. The experimental results show that the rules mined by the proposed algorithm are interesting and that our algorithm achieves better performance than the other algorithms without sacrificing accuracy.
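
The idea of multiple minimum supports under a maximum constraint can be sketched as follows. This is not the authors' MSFP-growth algorithm and builds no FP-tree; it is a brute-force enumeration that assumes the maximum constraint means an itemset's threshold is the largest of its items' individual minimum supports. The transactions, the per-item thresholds, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical transactions and per-item minimum supports (as fractions).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
MIN_SUPPORT = {"bread": 0.5, "milk": 0.5, "butter": 0.25}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Return itemsets whose support meets the maximum of their items' minimum supports."""
    items = sorted(min_support)
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            # Assumed maximum constraint: the strictest item threshold applies.
            threshold = max(min_support[i] for i in combo)
            s = support(set(combo), transactions)
            if s >= threshold:
                result[combo] = s
    return result
```

With these toy values, `("bread", "butter")` is kept because its support of 0.5 meets the larger of the two item thresholds, while `("butter", "milk")` is rejected even though butter alone has a low threshold.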

Keywords: association rules, FP-growth, multiple minimum supports, Weka tool

Procedia PDF Downloads 446
25822 Automatic Lead Qualification with Opinion Mining in Customer Relationship Management Projects

Authors: Victor Radich, Tania Basso, Regina Moraes

Abstract:

Lead qualification is one of the main procedures in Customer Relationship Management (CRM) projects. Its main goal is to identify potential consumers who have the ideal characteristics to establish a profitable and long-term relationship with a certain organization. Social networks can be an important source of data for identifying and qualifying leads since interest in specific products or services can be identified from the users’ expressed feelings of (dis)satisfaction. In this context, this work proposes the use of machine learning techniques and sentiment analysis as an extra step in the lead qualification process in order to improve it. In addition to machine learning models, sentiment analysis or opinion mining can be used to understand the evaluation that the user makes of a particular service, product, or brand. The results obtained so far have shown that it is possible to extract data from social networks and combine the techniques for a more complete classification.

Keywords: lead qualification, sentiment analysis, opinion mining, machine learning, CRM, lead scoring

Procedia PDF Downloads 37
25821 Analysis of Changes Being Done of the Mine Legislation of Turkey: Mining Operation Activity Process

Authors: Taşkın Deniz Yıldız, Mustafa Topaloğlu, Orhan Kural

Abstract:

The right to operate a fairly long periods of prior periods and after the 3213 Mining Law has been observed to be shortened in Turkey. Permit the realization of business activities (or concession) requested the purchase of the mine operated "found mine" position, as well as the financial and technical capability to have the owner of the right to operate the mines as well as the principle of equality is important in terms of assessing the best way be. In particular, in this context, license fields "negligence" (downsizing) have noted that the current arrangement for all periods. However, in the period after 3213 Mining Act and a permit to operate more effectively within the framework of implementation of negligence is laid down.

Keywords: mining legislation, operation, permit, Turkey

Procedia PDF Downloads 373
25820 An Experimental Study for Assessing Email Classification Attributes Using Feature Selection Methods

Authors: Issa Qabaja, Fadi Thabtah

Abstract:

Email phishing classification is one of the vital problems in the online security research domain and has attracted several scholars due to its impact on the payments users perform online every day. One way for detection algorithms to reach good performance on the email phishing problem is to identify the minimal set of features that significantly raise the phishing detection rate. This paper investigates three known feature selection methods, namely Information Gain (IG), Chi-square, and Correlation Features Set (CFS), on the email phishing problem to separate highly influential features from less influential ones in phishing detection. We measure the degree of influence by applying four data mining algorithms to a large set of features. We compare the accuracy of these algorithms on the complete feature set before feature selection and after feature selection has been applied. The experimental results show that 12 common significant features were chosen from the considered features by the feature selection methods. Further, the average detection accuracy derived by the data mining algorithms on the reduced 12-feature set was only very slightly affected when compared with the one derived from the 47-feature set.
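
Of the three feature selection methods named above, Information Gain is the simplest to illustrate. The sketch below ranks features by how much knowing each one reduces the entropy of the class label. The toy labels, the two feature names, and the helper functions are hypothetical and are not taken from the paper's 47-feature data set.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def rank_features(features, labels):
    """Feature names sorted from most to least informative."""
    return sorted(features, key=lambda name: information_gain(features[name], labels), reverse=True)


# Hypothetical toy data: four emails, two binary features.
LABELS = ["phish", "phish", "ham", "ham"]
FEATURES = {
    "has_ip_url": [1, 1, 0, 0],    # separates the classes perfectly
    "has_greeting": [1, 0, 1, 0],  # carries no information about the class
}
```

A reduced feature set would then be formed by keeping the top-ranked names, in the spirit of the 12-feature subset the abstract reports.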

Keywords: data mining, email classification, phishing, online security

Procedia PDF Downloads 398
25819 Investigating Data Normalization Techniques in Swarm Intelligence Forecasting for Energy Commodity Spot Price

Authors: Yuhanis Yusof, Zuriani Mustaffa, Siti Sakira Kamaruddin

Abstract:

Data mining is a fundamental technique for identifying patterns in large data sets. The extracted facts and patterns contribute to various domains such as marketing, forecasting, and medicine. Prior to mining, data are consolidated so that the resulting mining process may be more efficient. This study investigates the effect of different data normalization techniques, namely Min-Max, Z-score, and decimal scaling, on swarm-based forecasting models. The recent swarm intelligence algorithms employed include the Grey Wolf Optimizer (GWO) and Artificial Bee Colony (ABC). Forecasting models are then developed to predict the daily spot price of crude oil and gasoline. Results show that GWO works better with the Z-score normalization technique, while ABC produces better accuracy with Min-Max. Nevertheless, GWO is superior to ABC, as its model generates the highest accuracy for both crude oil and gasoline prices. Such a result indicates that GWO is a promising competitor in the family of swarm intelligence algorithms.
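
The three normalization techniques compared in the abstract can be written down in a few lines. The sketch below uses the textbook definitions; the price series is hypothetical, and the choice of the population standard deviation for the Z-score is an assumption, since the abstract does not say which variant the authors used.

```python
import statistics


def min_max(values):
    """Rescale to [0, 1]; assumes the series is not constant."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # assumption: population, not sample, deviation
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by the smallest power of ten that brings every |value| below 1."""
    largest = max(abs(v) for v in values)
    j = 0
    while largest / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


# Hypothetical daily spot prices, for illustration only.
PRICES = [40.0, 50.0, 60.0]
```

For this series, Min-Max maps the prices onto the unit interval, the Z-score centers them on zero, and decimal scaling divides by 100 because the largest value has two digits before the decimal point.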

Keywords: artificial bee colony, data normalization, forecasting, Grey Wolf optimizer

Procedia PDF Downloads 443
25818 Ground Water Pollution Investigation around Çorum Stream Basin in Turkey

Authors: Halil Bas, Unal Demiray, Sukru Dursun

Abstract:

Water and groundwater pollution is an important problem in most countries. Sources of water pollution must be investigated to protect fresh water, because fresh water sources are very limited and current sources are not enough for the world's growing population. In this study, the pollution factors affecting the quality of the groundwater in the Çorum Stream Basin in Turkey were investigated. The effect of the geological structure of the region and the interaction between the stream and groundwater were examined. For the investigation, stream and groundwater sampling was performed in the rainy and dry seasons to see whether the quality parameters change. The results were evaluated with computer programs, and graphics and distribution maps were then prepared to assess the degree of quality and pollution. According to the analysis results, because the results for the stream and the groundwater are not close to each other, we can say that there is no interaction between the stream and the groundwater. As irrigation water, the stream waters generally fall in the C3S1 region, and the groundwaters generally fall between the C3S1 and C4S2 regions, according to the US Salinity Laboratory diagram. According to the Wilcox diagram, stream waters are generally good to permissible, while groundwaters range from good to permissible, through doubtful to unsuitable, to unsuitable. Groundwaters in particular are of the doubtful-to-unsuitable and unsuitable types in the dry season, which may be attributed to a relative increase in the concentration of salt minerals. Samples from groundwater wells bored close to gypsum-bearing units in particular have high hardness, electrical conductivity, and salinity values; these waters are therefore deemed unsuitable for drinking and irrigation. These studies indicate that the groundwater was affected mainly by lithological contamination rather than by anthropogenic or other types of pollution. Because the alluvium is covered by silt and clay lithology, it is not affected by anthropogenic and other external factors. The results for the solid waste disposal site leachate indicate that this site may pose a pollution risk in the future. Although the parameters did not exceed the maximum hazardous values, this does not mean that they will not be dangerous in the future, and this must be taken into account.

Keywords: Çorum, environment, groundwater, hydrogeology, geology, pollution, quality, stream

Procedia PDF Downloads 467
25817 Leveraging Power BI for Advanced Geotechnical Data Analysis and Visualization in Mining Projects

Authors: Elaheh Talebi, Fariba Yavari, Lucy Philip, Lesley Town

Abstract:

The mining industry generates vast amounts of data, necessitating robust data management systems and advanced analytics tools to achieve better decision-making processes in the development of mining production and maintaining safety. This paper highlights the advantages of Power BI, a powerful intelligence tool, over traditional Excel-based approaches for effectively managing and harnessing mining data. Power BI enables professionals to connect and integrate multiple data sources, ensuring real-time access to up-to-date information. Its interactive visualizations and dashboards offer an intuitive interface for exploring and analyzing geotechnical data. Advanced analytics is a collection of data analysis techniques to improve decision-making. Leveraging some of the most complex techniques in data science, advanced analytics is used to do everything from detecting data errors and ensuring data accuracy to directing the development of future project phases. However, while Power BI is a robust tool, specific visualizations required by geotechnical engineers may have limitations. This paper studies the capability to use Python or R programming within the Power BI dashboard to enable advanced analytics, additional functionalities, and customized visualizations. This dashboard provides comprehensive tools for analyzing and visualizing key geotechnical data metrics, including spatial representation on maps, field and lab test results, and subsurface rock and soil characteristics. Advanced visualizations like borehole logs and Stereonet were implemented using Python programming within the Power BI dashboard, enhancing the understanding and communication of geotechnical information. Moreover, the dashboard's flexibility allows for the incorporation of additional data and visualizations based on the project scope and available data, such as pit design, rock fall analyses, rock mass characterization, and drone data. 
This further enhances the dashboard's usefulness in future projects, including operation, development, closure, and rehabilitation phases. Additionally, this helps in minimizing the necessity of utilizing multiple software programs in projects. This geotechnical dashboard in Power BI serves as a user-friendly solution for analyzing, visualizing, and communicating both new and historical geotechnical data, aiding in informed decision-making and efficient project management throughout various project stages. Its ability to generate dynamic reports and share them with clients in a collaborative manner further enhances decision-making processes and facilitates effective communication within geotechnical projects in the mining industry.

Keywords: geotechnical data analysis, power BI, visualization, decision-making, mining industry

Procedia PDF Downloads 49
25816 An Analysis on Gravel of Sand-Gravel Bar at Gneiss or Granite Area of the Upper Hongcheon River in South Korea

Authors: Man Kyu Kim, Hansu Shin

Abstract:

This study analyzes the gravel of sand-gravel bars that stretch variously through the Duchon and Naechon stream basins, which lie in the basin of the Hongcheon River (a river with well-developed sand-gravel bars in its upper reaches) in Korea. The Naechon stream flows mostly through a granite zone, whereas the Duchon stream flows mostly through a gneiss zone. The characteristics of the gravel in the sand-gravel bars of these two branches of the upper Hongcheon River were analyzed in order to understand the geomorphic development of streams depending on differences in bedrock. Through the analysis of the roundness and flatness of the gravel, we found an irregular trend following the increase in the supply of granite gravel and gneiss gravel as we traveled downstream. The result shows that the two basins have the condition of uppermost small basins reflecting a mountain valley environment, although an equivalent comparison with other roundness research in Korea or in Europe may be difficult. Unlike previous studies, which mostly examined large rivers, this study analyzed gravel found in small-scale streams. The research offers basic data for continued comparative research on various small basins.

Keywords: flatness, geology, roundness, sand-gravel bar

Procedia PDF Downloads 333
25815 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damage to health and the environment, economic losses, and material damage. Traditional studies of road traffic accidents in urban zones require a very high investment of time and money; additionally, the results are not current. Nowadays, however, in many countries, crowdsourced GPS-based traffic and navigation apps have emerged as an important low-cost source of information for studies of road traffic accidents and the urban congestion they cause. In this article, we identify the zones, roads, and specific times in Mexico City (CDMX) in which the largest number of road traffic accidents were concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Knowledge Discovery in Databases (KDD), used to discover patterns in the accident reports, together with data mining techniques applied with the help of Weka. The selected algorithms were Expectation-Maximization (EM), to obtain the ideal number of clusters for the data, and k-means as the grouping method. Finally, the results were visualized with the geographic information system QGIS.
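
The grouping step mentioned in the abstract can be sketched with a plain implementation of k-means (Lloyd's algorithm). The authors ran their analysis in Weka and chose the number of clusters with EM; neither is reproduced here. The sketch assumes k is already known, uses the first k points as initial centroids so that the result is deterministic, and treats longitude and latitude as planar coordinates, which is only a rough approximation. The report coordinates are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with the first k points as initial centroids."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, assignment


# Hypothetical (longitude, latitude) pairs of accident reports.
REPORTS = [
    (-99.13, 19.43),
    (-99.20, 19.30),
    (-99.14, 19.44),
    (-99.21, 19.31),
    (-99.12, 19.42),
]
```

Calling `kmeans(REPORTS, 2)` separates the three reports around (-99.13, 19.43) from the two around (-99.205, 19.305); the resulting centroids are the kind of accident concentration zones that could then be plotted in a GIS.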

Keywords: data mining, k-means, road traffic accidents, Waze, Weka

Procedia PDF Downloads 375
25814 Hydraulic Characteristics of the Tidal River Dongcheon in Busan City

Authors: Young Man Cho, Sang Hyun Kim

Abstract:

Even though various management practices such as sediment dredging were attempted to improve the water quality of Dongcheon, located in Busan, the environmental condition of this stream has deteriorated. Busan Metropolitan City has therefore pumped and diverted sea water to the upstream end of Dongcheon for several years. This study explored the hydraulic characteristics of Dongcheon to configure the best management practice for ecological restoration and water quality improvement of a man-made urban stream. Intensive field investigation indicates that average flow velocities at depths of 20% and 80% from the water surface ranged from 5 to 10 cm/s and from 2 to 5 cm/s, respectively. Concentrations of dissolved oxygen at all depths were less than 0.25 mg/l during the low tidal period. Even though a density difference can be found along the stream depth, density currents seem to be rarely generated in Dongcheon. The short duration of the high-tide portion and the shallow depths are responsible for the well-mixed nature of Dongcheon.

Keywords: hydraulic, tidal river, density current, sea water

Procedia PDF Downloads 186
25813 Cirrhosis Mortality Prediction as Classification using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and novel data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis were collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedures, and laboratory tests is analyzed. A temporal pattern mining technique called Frequent Subgraph Mining (FSM) is used. The Model for End-Stage Liver Disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement in the area under the curve (AUC). The FSM technique by itself does not improve the model significantly, but FSM together with a machine learning technique called an ensemble further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk of higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. To the best of our knowledge, this is the first work to apply modern machine learning algorithms and data analysis methods to predicting the one-year mortality of cirrhotic patients and to build a model that predicts one-year mortality significantly more accurately than the MELD score. We have also tested the potential of FSM and provided a new perspective on the importance of clinical features.
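
The area under the curve used above to compare models against the MELD score has a simple rank interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. The labels and risk scores are hypothetical, and this pairwise loop is meant for illustration on small samples, not as the authors' evaluation code.

```python
def auc(scores, labels):
    """AUC as the share of positive-negative pairs ranked correctly (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = died within one year) and model risk scores.
OUTCOMES = [1, 1, 0, 0]
RISK_SCORES = [0.9, 0.4, 0.5, 0.1]
```

Here three of the four positive-negative pairs are ordered correctly, so `auc(RISK_SCORES, OUTCOMES)` is 0.75; a model that ranked every deceased patient above every survivor would score 1.0.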

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia PDF Downloads 106
25812 The Results of Longitudinal Water Quality Monitoring of the Brandywine River, Chester County, Pennsylvania by High School Students

Authors: Dina L. DiSantis

Abstract:

Strengthening a sense of responsibility while relating global sustainability concepts such as water quality and pollution to a local water system can be achieved by teaching students to conduct and interpret water quality monitoring tests. When students conduct their own research, they become better stewards of the environment. Providing outdoor learning and place-based opportunities for students helps connect them to the natural world. By conducting stream studies and collecting data, students are able to better understand how the natural environment is a place where everything is connected. Students have been collecting physical, chemical and biological data along the West and East Branches of the Brandywine River, in Pennsylvania for over ten years. The stream studies are part of the advanced placement environmental science and aquatic science courses that are offered as electives to juniors and seniors at the Downingtown High School West Campus in Downingtown, Pennsylvania. Physical data collected includes: temperature, turbidity, width, depth, velocity, and volume of flow or discharge. The chemical tests conducted are: dissolved oxygen, carbon dioxide, pH, nitrates, alkalinity and phosphates. Macroinvertebrates are collected with a kick net, identified and then released. Students collect the data from several locations while traveling by canoe. In the classroom, students prepare a water quality data analysis and interpretation report based on their collected data. The summary of the results from longitudinal water quality data collection by students, as well as the strengths and weaknesses of student data collection will be presented.

Keywords: place-based, student data collection, sustainability, water quality monitoring

Procedia PDF Downloads 123
25811 A Query Optimization Strategy for Autonomous Distributed Database Systems

Authors: Dina K. Badawy, Dina M. Ibrahim, Alsayed A. Sallam

Abstract:

A distributed database is a collection of logically related databases that cooperate in a transparent manner. Query processing, which uses a communication network to transmit data between sites, is one of the challenges in the database world. The development of sophisticated query optimization technology is the reason for the commercial success of database systems, and its complexity and cost increase with the number of relations in the query. Mariposa, query trading, and query trading with processing task-trading are strategies developed for autonomous distributed database systems, but they incur a high optimization cost because all nodes are involved in generating an optimal plan. In this paper, we propose a modification of the autonomous strategy K-QTPT that gives the seller nodes with the lowest cost gradually higher priorities in order to reduce the optimization time. We implement our proposed strategy and present the results and an analysis based on those results.

Keywords: autonomous strategies, distributed database systems, high priority, query optimization

Procedia PDF Downloads 490
25810 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining

Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrêa, Ronaldo G. Morato

Abstract:

Environmental changes and major natural disasters have become more prevalent in the world due to the damage that humanity has caused to nature, and this damage directly affects the lives of animals. Thus, the study of animal behavior and of animals' interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine the patterns of animal behavior, and with technological advances the ability to track animals and, consequently, behavioral studies have expanded. There is a lot of research on animal movement and behavior, but we note the lack of a proposal that combines resources, allows for exploratory analysis of animal movement, and provides statistical measures of individual animal behavior and its interaction with the environment. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provide a first step in answering questions about individual animal behavior and animals' interactions with other animals over time and space. We evaluated the framework using data from monitored jaguars in the city of Miranda, Pantanal, Brazil, in order to verify whether AniMoveMineR makes it possible to identify the level of interaction between these jaguars. The results were positive and provided indications about the individual behavior of the jaguars and about which jaguars have the highest or lowest correlation.

Keywords: data mining, data science, trajectory, animal behavior

Procedia PDF Downloads 111
25809 Assessing Carbon Stock and Sequestration of Reforestation Species on Old Mining Sites in Morocco Using the DNDC Model

Authors: Nabil Elkhatri, Mohamed Louay Metougui, Ngonidzashe Chirinda

Abstract:

Mining activities have left a legacy of degraded landscapes, prompting urgent efforts for ecological restoration. Reforestation holds promise as a potent tool to rehabilitate these old mining sites, with the potential to sequester carbon and contribute to climate change mitigation. This study focuses on evaluating the carbon stock and sequestration potential of reforestation species in the context of Morocco's mining areas, employing the DeNitrification-DeComposition (DNDC) model. The research is grounded in recognizing the need to connect theoretical models with practical implementation, ensuring that reforestation efforts are informed by accurate and context-specific data. Field data collection encompasses growth patterns, biomass accumulation, and carbon sequestration rates, establishing an empirical foundation for the study's analyses. By integrating the collected data with the DNDC model, the study aims to provide a comprehensive understanding of carbon dynamics within reforested ecosystems on old mining sites. The major findings reveal varying sequestration rates among different reforestation species, indicating the potential for species-specific optimization of reforestation strategies to enhance carbon capture. This research's significance lies in its potential to contribute to sustainable land management practices and climate change mitigation strategies. By quantifying the carbon stock and sequestration potential of reforestation species, the study serves as a valuable resource for policymakers, land managers, and practitioners involved in ecological restoration and carbon management. Ultimately, the study aligns with global objectives to rejuvenate degraded landscapes while addressing pressing climate challenges.

Keywords: carbon stock, carbon sequestration, DNDC model, ecological restoration, mining sites, Morocco, reforestation, sustainable land management

Procedia PDF Downloads 33
25808 Focus-Latent Dirichlet Allocation for Aspect-Level Opinion Mining

Authors: Mohsen Farhadloo, Majid Farhadloo

Abstract:

Aspect-level opinion mining, which aims at discovering aspects (aspect identification) and their corresponding ratings (sentiment identification) from customer reviews, has increasingly attracted the attention of researchers and practitioners, as it provides valuable insights about products and services from the customers' point of view. Instead of addressing aspect identification and sentiment identification in two separate steps, it is possible to identify both aspects and sentiments simultaneously. In recent years, many graphical models based on Latent Dirichlet Allocation (LDA) have been proposed to solve both aspect and sentiment identification in a single step. Although LDA models have been effective tools for the statistical analysis of document collections, they also have shortcomings in addressing some unique characteristics of opinion mining. Our goal in this paper is to address one of the limitations of topic models to date: they fail to directly model the associations among topics. Indeed, in many text corpora, it is natural to expect that subsets of the latent topics have higher probabilities. We propose a probabilistic graphical model called focus-LDA to better capture the associations among topics when applied to aspect-level opinion mining. Our experiments on real-life data sets demonstrate the improved effectiveness of the focus-LDA model in terms of the accuracy of the predictive distributions over held-out documents. Furthermore, we demonstrate qualitatively that the focus-LDA topic model provides a natural way of visualizing and exploring unstructured collections of textual data.

Keywords: aspect-level opinion mining, document modeling, Latent Dirichlet Allocation, LDA, sentiment analysis

Procedia PDF Downloads 69
25807 Application of Data Driven Based Models as Early Warning Tools of High Stream Flow Events and Floods

Authors: Mohammed Seyam, Faridah Othman, Ahmed El-Shafie

Abstract:

The early warning of high stream flow (HSF) events and floods is an important aspect of the management of surface water and river systems. This process can be performed using either process-based models or data-driven models such as artificial intelligence (AI) techniques. The main goal of this study is to develop an efficient AI-based model for predicting the real-time hourly stream flow (Q) and to apply it as an early warning tool for HSF and floods in the downstream area of the Selangor River basin, taken here as a paradigm of humid tropical rivers in Southeast Asia. The performance of the AI-based models has been improved through the integration of lag time (Lt) estimation in the modelling process. A total of 8753 patterns of hourly Q, water level, and rainfall (RF) records representing a one-year period (2011) were utilized in the modelling process. Six hydrological scenarios have been arranged through hypothetical cases of input variables to investigate how changes in RF intensity at upstream stations can lead to the formation of floods. The initial SF was changed for each scenario in order to include a wide range of hydrological situations in this study. The performance evaluation of the developed AI-based model shows that a high correlation coefficient (R) between the observed and predicted Q is achieved. The AI-based model has been successfully employed in early warning through the advance detection of hydrological conditions that could lead to the formation of floods and HSF, which are represented by three levels of severity (i.e., alert, warning, and danger). Based on the results of the scenarios, reaching the danger level in the downstream area required high RF intensity in at least two upstream areas. According to the results of the applications, it can be concluded that AI-based models are beneficial tools for local authorities for flood control and awareness.
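
The abstract does not say how the lag time (Lt) is estimated. As a rough illustration only, the sketch below assumes one common approach: pick the shift, in hours, that maximizes the Pearson correlation between an upstream rainfall series and the downstream stream flow. The function names and the synthetic series are hypothetical and are not taken from the paper.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def estimate_lag(rain, flow, max_lag):
    """Return the shift k (0..max_lag) that best aligns rain[t] with flow[t + k]."""
    return max(
        range(max_lag + 1),
        key=lambda k: pearson(rain[: len(rain) - k], flow[k:]),
    )


# Synthetic hourly data (assumption): flow responds to rainfall three hours later.
rain = [0, 0, 5, 9, 3, 0, 0, 0, 1, 4, 0, 0]
flow = [10, 10, 10] + [10 + 2 * r for r in rain[:-3]]

lag_hours = estimate_lag(rain, flow, max_lag=6)
print(lag_hours)
```

With real records, the estimated lag would typically be used to shift the upstream inputs before they are fed to the prediction model; whether the authors did this is not stated in the abstract.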

Keywords: floods, stream flow, hydrological modelling, hydrology, artificial intelligence

Procedia PDF Downloads 217
25806 Solubility of Water in CO2 Mixtures at Pipeline Operation Conditions

Authors: Mohammad Ahmad, Sander Gersen, Erwin Wilbers

Abstract:

Carbon capture, transport and underground storage have become a major solution to reduce CO2 emissions from power plants and other large CO2 sources. A large part of this captured CO2 stream is transported at high-pressure, dense-phase conditions and stored in depleted offshore underground oil and gas fields. CO2 is also transported in offshore pipelines to be used for enhanced oil and gas recovery. The captured CO2 stream with impurities may contain water, which causes severe corrosion problems and flow assurance failure and might damage valves and instrumentation. Thus, free water formation should be strictly prevented. The purpose of this work is to study the solubility of water in pure CO2 and in CO2 mixtures under real pipeline operating conditions of pressure (90-150 bar) and temperature (5-35°C). An experimental setup was constructed to generate the data. The results show that the solubility of water in CO2 mixtures increases with increasing temperature and/or pressure. A drop in water solubility in CO2 is observed in the presence of impurities. The data generated were then used to assess the capabilities of two mixture models: the GERG-2008 model and the EOS-CG model. By generating the solubility data, this study contributes to determining the maximum allowable water content in CO2 pipelines.

Keywords: carbon capture and storage, water solubility, equations of state, fluids engineering

Procedia PDF Downloads 264
25805 Heavy Minerals Distribution in the Recent Stream Sediments of Diyala River Basin, Northeastern Iraq

Authors: Abbas R. Ali, Daroon Hasan Khorsheed

Abstract:

Twenty-one samples of stream sediments were collected from the Diyala River Basin (DRB), which represents one of three major tributaries of the Tigris River in northeastern Iraq. This study is concerned with the heavy minerals (HM) analysis of the +63 μm fraction of the Diyala River sediments and their distribution pattern in the various river basin sectors, as well as with comparing the present results with previous works. The metastable heavy minerals (epidote, staurolite, garnet) represent more than 30%, whereas the unstable heavy minerals (pyroxene and amphibole) make up only about 19%. Opaques are present in high proportions, reaching about 29% on average. The ultrastable heavy minerals (zircon, tourmaline, rutile) are the minor constituents (7%) of the sediments. According to the laboratory analytical data on heavy mineral distributions, the studied sediments are derived from mafic and ultramafic rocks found in northeastern Iraq, which represent the Walash – Nawpordan Series and Mawat complexes in the Zagros zones. The presence of zircon and tourmaline in trace amounts may indicate a weak role of acidic rocks in the source area, whereas the epidote group minerals indicate the role of metamorphic rocks.

Keywords: heavy minerals, mineral distribution, recent stream sediment, Diyala river, northeastern Iraq

Procedia PDF Downloads 482
25804 Application of Granular Computing Paradigm in Knowledge Induction

Authors: Iftikhar U. Sikder

Abstract:

This paper illustrates an application of a granular computing approach, namely rough set theory, in data mining. The paper outlines the formalism of granular computing and elucidates the mathematical underpinning of rough set theory, which has been widely used by the data mining and machine learning communities. A real-world application is illustrated, and the classification performance is compared with that of other contending machine learning algorithms. The predictive performance of the rough set rule induction model compares favorably with that of the other contending algorithms.
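
The core construct of rough set theory, the lower and upper approximation of a concept, can be sketched in a few lines. The example below is a minimal illustration, not the paper's implementation: objects that share the same values on the chosen attributes form an equivalence class (a granule), the lower approximation collects the classes wholly inside the target set, and the upper approximation collects the classes that overlap it. The toy table and all names are hypothetical.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper) approximations of `target` under `attributes`.

    objects: dict mapping object name -> dict of attribute values
    target:  set of object names that belong to the concept
    """
    # Group objects that are indiscernible on the chosen attributes.
    classes = defaultdict(set)
    for name, row in objects.items():
        classes[tuple(row[a] for a in attributes)].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the concept
            lower |= block
        if block & target:    # class overlaps the concept
            upper |= block
    return lower, upper


# Hypothetical decision table.
table = {
    "o1": {"temp": "high", "cough": "yes"},
    "o2": {"temp": "high", "cough": "yes"},
    "o3": {"temp": "normal", "cough": "no"},
    "o4": {"temp": "high", "cough": "no"},
}
concept = {"o1", "o4"}

lower, upper = approximations(table, ["temp", "cough"], concept)
print(sorted(lower), sorted(upper))  # ['o4'] ['o1', 'o2', 'o4']
```

Objects in the upper but not the lower approximation (here o1 and o2) form the boundary region, where the available attributes cannot decide membership.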

Keywords: concept approximation, granular computing, reducts, rough set theory, rule induction

Procedia PDF Downloads 492
25803 Hydrogeological Appraisal of Karacahisar Coal Field (Western Turkey): Impacts of Mining on Groundwater Resources Utilized for Water Supply

Authors: Sukran Acikel, Mehmet Ekmekci, Otgonbayar Namkhai

Abstract:

Lignite coal fields in western Turkey generally occur in tensional Neogene basins bordered by major faults. The Karacahisar coal field in the Mugla province of western Turkey is a large Neogene basin filled with alternating siliceous and calcareous layers. The basement of the basin is composed mainly of karstified carbonate rocks of Mesozoic age and schists of Paleozoic age. The basement rocks are exposed in the highlands surrounding the basin. The basin fill deposits form shallow, low-yield, local aquifers, whereas the karstic carbonate rock masses form the major aquifer in the region. The karstic aquifer discharges through a spring zone issuing at the intersection of two major faults. Municipal water demand in Bodrum city, a tourist destination, is almost totally supplied by boreholes tapping the karstic aquifer. A well field has been constructed on the eastern edge of the coal basin, which forms a ridge separating two Neogene basins. A major concern was raised about the plausible impact of mining activities on the groundwater system in general and on the water supply well field in particular. The hydrogeological studies carried out in the area revealed that the coal seam is located below the groundwater level. Mining operations will be affected by groundwater inflow to the pits, which will require dewatering measures. Dewatering activities at mine sites have two-sided effects: a) they lower the groundwater level at and around the pit for a safe and effective mining operation, and b) continuous dewatering causes the cone of depression to expand until it reaches a spring, stream and/or well utilized by local people, capturing their water. The plausible effect of mining operations on the flow of the spring zone was another issue of concern. Therefore, a detailed representative hydrogeological conceptual model of the site was developed on the basis of available data and field work.
According to the hydrogeological conceptual model, dewatering of the Neogene layers will not hydraulically affect the water supply wells; however, the ultimate perimeter of the open pit will expand to intersect the well field. According to the conceptual model, the coal seam is separated from the bottom by a thick impervious clay layer sitting on the carbonate basement. Therefore, the hydrostratigraphy does not allow a hydraulic interaction between the mine pit and the karstic carbonate rock aquifer. However, the structural setting of the basin suggests that deep faults intersecting the basement and the Neogene sequence will most probably carry the deep groundwater up to a level above the bottom of the pit. This will require taking the necessary measures to lower the piezometric level of the carbonate rock aquifer along the faults. Dewatering the carbonate rock aquifer will reduce the flow to the spring zone. All findings were put together to recommend a strategy for a safe and effective mining operation.

Keywords: conceptual model, dewatering, groundwater, mining operation

Procedia PDF Downloads 366
25802 Presenting a Model for Predicting the State of Being Accident-Prone of Passages According to Neural Network and Spatial Data Analysis

Authors: Hamd Rezaeifar, Hamid Reza Sahriari

Abstract:

Accidents are considered to be one of the challenges of modern life. Because the number of accident victims and the volume of domestic transportation in Iran are increasing day by day, studying the factors that influence accidents and identifying suitable models and parameters for this issue are absolutely essential. The main purpose of this research has been to study the factors and spatial data affecting accidents in Mashhad during 2007-2008. In the first step, spatial layers are overlaid on each other and matched with the accident locations; accident landmarks are added, together with special fields recording the presence or absence of phenomena that influence accidents, so that the existing accident databases are completed. In the next step, data mining tools and neural network analysis are used to evaluate the relationships among these data and to design a logical model for predicting accident-prone spots with minimum error. The model presented in this article predicts low-accident spots very accurately, yet it produces more errors in accident-prone regions owing to a lack of primary data.

Keywords: accident, data mining, neural network, GIS

Procedia PDF Downloads 8
25801 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases where entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets that need analyzing become ever larger. One type of such symbolic data is the histogram, which makes it possible to store huge amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, analyzing histograms is complicated. This paper proposes a method that allows two histogram-valued variables to be compared and therefore a dissimilarity between two histograms to be found. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which allows histograms with different numbers of bins and bin widths (so-called general histograms) to be compared. The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, together with the concept of cluster compactness to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.
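
To make the difficulty of comparing general histograms concrete, the sketch below uses a deliberately simple stand-in, not the Ichino-Yaguchi-based measure the paper proposes: both histograms are re-binned onto the union of their bin edges, assuming uniform density inside each bin, and half the L1 distance between the resulting bin masses is returned (0 for identical distributions, 1 for disjoint ones). All function names are hypothetical, and bin probabilities are assumed to sum to 1.

```python
from bisect import bisect_right


def cdf(edges, probs, x):
    """CDF at x of a histogram with uniform density inside each bin."""
    if x <= edges[0]:
        return 0.0
    if x >= edges[-1]:
        return 1.0
    i = bisect_right(edges, x) - 1                     # index of the bin containing x
    frac = (x - edges[i]) / (edges[i + 1] - edges[i])  # share of that bin below x
    return sum(probs[:i]) + frac * probs[i]


def rebinned_l1(edges_a, probs_a, edges_b, probs_b):
    """Half the L1 distance between two histograms on their common refined grid."""
    grid = sorted(set(edges_a) | set(edges_b))
    total = 0.0
    for lo, hi in zip(grid, grid[1:]):
        mass_a = cdf(edges_a, probs_a, hi) - cdf(edges_a, probs_a, lo)
        mass_b = cdf(edges_b, probs_b, hi) - cdf(edges_b, probs_b, lo)
        total += abs(mass_a - mass_b)
    return total / 2.0


# Two bins vs. one bin describing the same uniform distribution on [0, 2].
d_same = rebinned_l1([0, 1, 2], [0.5, 0.5], [0, 2], [1.0])
# Histograms with non-overlapping support.
d_disjoint = rebinned_l1([0, 1, 2], [0.5, 0.5], [2, 3], [1.0])
print(d_same, d_disjoint)  # 0.0 1.0
```

A pairwise matrix of such values could be passed to any hierarchical clustering routine; the paper's own measure and its weighted-average linkage are designed specifically for this purpose and will behave differently.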

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 129
25800 Condition Monitoring of Railway Earthworks using Distributed Rayleigh Sensing

Authors: Andrew Hall, Paul Clarkson

Abstract:

Climate change is predicted to increase the number of extreme weather events, intensifying the strain on railway earthworks. This paper describes the use of Distributed Rayleigh Sensing to monitor low-frequency activity on a vulnerable earthworks section prone to landslides alongside a railway line in Northern Spain. The vulnerable slope is instrumented with conventional slope stability sensors, allowing an assessment of Distributed Rayleigh Sensing as an earthwork condition monitoring tool to enhance the resilience of railway networks.

Keywords: condition monitoring, railway earthworks, distributed rayleigh sensing, climate change

Procedia PDF Downloads 166
25799 Measurement of Natural Radioactivity and Health Hazard Index Evaluation in Major Soils of Tin Mining Areas of Perak

Authors: Habila Nuhu

Abstract:

Natural radionuclides in the environment can significantly contribute to human exposure to ionizing radiation. Knowledge of their levels in an environment can help radiological protection agencies in policymaking. Measurement of natural radioactivity in major soils in the tin mining state of Perak, Malaysia, has been conducted using an HPGe detector. Seventy (70) soil samples were collected at widely distributed locations in the state. Six major soil types were sampled, and thirteen districts around the state were covered. The results for the 226Ra (238U), 228Ra (232Th), and 40K activity in the soil samples were as follows. 226Ra (238U) has a mean activity concentration of 191.83 Bq kg⁻¹, more than five times the UNSCEAR reference value of 35 Bq kg⁻¹. The mean activity concentration of 228Ra (232Th), 232.41 Bq kg⁻¹, is over seven times the UNSCEAR reference value of 30 Bq kg⁻¹. The mean activity concentration of 40K was 275.24 Bq kg⁻¹, which is less than the UNSCEAR reference value of 400 Bq kg⁻¹. The external hazard index (Hₑₓ) values ranged from 1.03 to 2.05, while the internal hazard index (Hin) ranged from 1.48 to 3.08. Hₑₓ and Hin should be less than one for minimal external and internal radiation threats and for the secure use of soil material in building construction. The Hₑₓ and Hin results generally indicate that care must be taken when using these soil types and their derivatives as building materials in the study area.
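
The abstract does not give the formulas behind the hazard indices. The sketch below assumes the widely used definitions, H_ex = A_Ra/370 + A_Th/259 + A_K/4810 and H_in = A_Ra/185 + A_Th/259 + A_K/4810 (activities in Bq kg⁻¹), and applies them to the reported mean concentrations as a plausibility check. Whether the authors used exactly these coefficients is an assumption, and the function names are hypothetical.

```python
def external_hazard_index(a_ra, a_th, a_k):
    """External hazard index from activity concentrations in Bq/kg (assumed formula)."""
    return a_ra / 370.0 + a_th / 259.0 + a_k / 4810.0


def internal_hazard_index(a_ra, a_th, a_k):
    """Internal hazard index; the 226Ra term is doubled to account for radon (assumed formula)."""
    return a_ra / 185.0 + a_th / 259.0 + a_k / 4810.0


# Mean activity concentrations reported in the abstract (Bq/kg).
mean_ra, mean_th, mean_k = 191.83, 232.41, 275.24

h_ex = external_hazard_index(mean_ra, mean_th, mean_k)
h_in = internal_hazard_index(mean_ra, mean_th, mean_k)
print(f"H_ex = {h_ex:.2f}, H_in = {h_in:.2f}")  # H_ex = 1.47, H_in = 1.99
```

Under these assumed formulas, both values computed from the means fall inside the reported ranges (1.03 to 2.05 and 1.48 to 3.08) and exceed the limit of one, consistent with the abstract's conclusion.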

Keywords: activity concentration, hazard index, soil samples, tin mining

Procedia PDF Downloads 75