Search results for: data stream mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24926

24716 Implementation of Lean Tools (Value Stream Mapping and ECRS) in an Oil Refinery

Authors: Ronita Singh, Yaman Pattanaik, Soham Lalwala

Abstract:

In today’s highly competitive business environment, every organization is striving towards lean manufacturing systems to achieve shorter production lead times, lower costs, less inventory and an overall improvement in supply chain efficiency. Based on a similar idea, this paper presents the practical application of the Value Stream Mapping (VSM) tool and the ECRS (Eliminate, Combine, Reduce, and Simplify) technique in the receipt section of the material management center of an oil refinery. A value stream is the set of all actions (value added as well as non-value added) that are required to bring a product through the essential flows, starting with raw material and ending with the customer. To draw the current state value stream map, all relevant data of the receipt cycle were collected and analyzed. The current state map was then analyzed to determine the type and quantum of waste at every stage, which helped in ascertaining how far the warehouse is from the concept of lean manufacturing. From the results of the current VSM, it was observed that two processes, Preparation of GRN (Goods Receipt Number) and Preparation of UD (Usage Decision), are both bottleneck operations with higher cycle times. This root cause analysis of various types of waste helped in designing a strategy for step-wise implementation of lean tools. The future state thus created a lean flow of materials at the warehouse center, reducing the lead time of the receipt cycle from 11 days to 7 days and increasing overall efficiency by 27.27%.

Keywords: current VSM, ECRS, future VSM, receipt cycle, supply chain, VSM

Procedia PDF Downloads 263
24715 Managing Data from One Hundred Thousand Internet of Things Devices Globally for Mining Insights

Authors: Julian Wise

Abstract:

Newcrest Mining is one of the world’s top five gold and rare earth mining organizations by production, reserves and market capitalization. This paper elaborates on the data acquisition processes employed by Newcrest in collaboration with the Fortune 500 listed organization Insight Enterprises to standardize machine learning solutions which process data from over a hundred thousand distributed Internet of Things (IoT) devices located at mine sites globally. Through the use of software architecture, cloud technologies and edge computing, these technological developments enable standardized machine learning processes to influence the strategic optimization of mineral processing. Target objectives of the machine learning optimizations include time savings on mineral processing, production efficiencies, risk identification, and increased production throughput. The data acquired and used for predictive modelling are processed through edge computing by resources collectively stored within a data lake. Involvement in the digital transformation has necessitated the standardization of software architecture to manage the machine learning models submitted by vendors, to ensure effective automation and continuous improvements to the mineral process models. Operating at scale, the system processes hundreds of gigabytes of data per day from distributed mine sites across the globe, for the purposes of improved worker safety and production efficiency through big data applications.

Keywords: mineral technology, big data, machine learning operations, data lake

Procedia PDF Downloads 83
24714 Impacts of Land Use and Land Cover Change on Stream Flow and Sediment Yield of Genale Dawa Dam III Watershed, Ethiopia

Authors: Aklilu Getahun Sulito

Abstract:

Land use and land cover (LULC) change dynamics are the result of complex interactions between several biophysical and socio-economic conditions. The impacts of land cover change on stream flow and sediment yield were analyzed statistically using the hydrological model SWAT. The Genale Dawa Dam III watershed is highly affected by deforestation, overgrazing, and agricultural land expansion. This study used the SWAT model to assess the impacts of land use and land cover change on sediment yield, to evaluate stream flow in the wet and dry seasons, and to determine the spatial distribution of sediment yield from the sub-basins of the Genale Dawa Dam III watershed. LULC maps of 2000, 2008 and 2016 were used with the same corresponding climate data. During the study period, most parts of the forest, dense evergreen forest and grassland changed to cultivated land. The cultivated land increased by 26.2%, but forest land, evergreen forest land and grassland decreased by 21.33%, 11.59% and 7.28% respectively; following that, the mean annual sediment yield of the watershed increased by 7.37 ton/ha over the 16-year period (2000 to 2016). The analysis of stream flow for the wet and dry seasons showed that stream flow increased by 25.5% during the wet season but decreased by 29.6% in the dry season. The average annual spatial distribution of sediment yield increased by 7.73 ton/ha yr⁻¹ from 2000 to 2016. The calibration results for both stream flow and sediment yield showed good agreement between observed and simulated data, with coefficients of determination of 0.87 and 0.84, Nash-Sutcliffe efficiencies of 0.83 and 0.78, and percentage biases of -7.39% and -10.90% respectively. The validation results for both stream flow and sediment yield were also good, with coefficients of determination of 0.83 and 0.80, Nash-Sutcliffe efficiencies of 0.78 and 0.75, and percentage biases of 7.09% and 3.95%.
The model results obtained with the above method show that the mean annual sediment load at the Genale Dawa Dam III watershed increased from 2000 to 2016 because of land use change. So, to use the Genale Dawa Dam III, land use management practices are needed in the future to prevent a further increase in the sediment yield of the watershed.
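The three goodness-of-fit statistics quoted above (coefficient of determination, Nash-Sutcliffe efficiency and percentage bias) can be sketched as follows. This is a minimal illustration, not the authors' workflow: the function names are hypothetical, and the sign convention for percentage bias (positive when the model underestimates) is one common choice that the abstract does not confirm.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def percent_bias(observed, simulated):
    """Percentage bias; ASSUMED convention: positive means the model underestimates."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


def r_squared(observed, simulated):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


if __name__ == "__main__":
    # Made-up monthly flows for illustration only; not data from the study.
    observed = [12.0, 30.0, 55.0, 41.0, 18.0]
    simulated = [14.0, 27.0, 50.0, 44.0, 20.0]
    print(nash_sutcliffe(observed, simulated))
    print(percent_bias(observed, simulated))
    print(r_squared(observed, simulated))
```

Reporting all three together is useful because a simulation can correlate well with observations (high coefficient of determination) while still being systematically biased, which the percentage bias exposes.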

Keywords: Genale Dawa Dam III watershed, land use land cover change, SWAT, spatial distribution, sediment yield, stream flow

Procedia PDF Downloads 16
24713 Application of Knowledge Discovery in Database Techniques in Cost Overruns of Construction Projects

Authors: Mai Ghazal, Ahmed Hammad

Abstract:

Cost overruns in construction projects are considered a worldwide challenge, since cost performance is one of the main measures of success along with schedule performance. To overcome this problem, studies have been conducted to investigate the factors behind cost overruns, and projects' historical data have been analyzed to extract new and useful knowledge. This research studies and analyzes the effect of some factors causing cost overruns using historical data from completed construction projects, and then uses these factors to estimate the probability of cost overrun occurrence and predict its percentage for future projects. First, an intensive literature review was done to study all the factors that cause cost overruns in construction projects; then another review was done of previous research papers on the mining process in dealing with cost overruns. Second, a data warehouse structure was proposed which organizations can use to store their future data in a well-organized way so that it can be easily analyzed later. Third, twelve quantitative factors whose data are frequently available at construction projects were selected as the analyzed factors and suggested predictors for the proposed model.

Keywords: construction management, construction projects, cost overrun, cost performance, data mining, data warehousing, knowledge discovery, knowledge management

Procedia PDF Downloads 332
24712 Intelligent Process Data Mining for Monitoring for Fault-Free Operation of Industrial Processes

Authors: Hyun-Woo Cho

Abstract:

The real-time fault monitoring and diagnosis of large-scale production processes is helpful and necessary in order to operate industrial processes safely and efficiently while producing good final product quality. Unusual and abnormal events may have a serious impact on the process, such as malfunctions or breakdowns. This work tries to utilize process measurement data obtained on an on-line basis for the safe and fault-free operation of industrial processes. To this end, this work evaluated the proposed intelligent process data monitoring framework based on a simulation process. The monitoring scheme extracts the fault pattern in the reduced space for reliable data representation. Moreover, this work shows the results of using linear and nonlinear techniques for the monitoring purpose. It is shown that the nonlinear technique produced more reliable monitoring results and outperformed linear methods. The adoption of the qualitative monitoring model helps to reduce the sensitivity of the fault pattern to noise.

Keywords: process data, data mining, process operation, real-time monitoring

Procedia PDF Downloads 605
24711 Heart Failure Identification and Progression by Classifying Cardiac Patients

Authors: Muhammad Saqlain, Nazar Abbas Saqib, Muazzam A. Khan

Abstract:

Heart Failure (HF) has become a major health problem in our society. The prevalence of HF increases with patient age, and it is a major cause of the high mortality rate in adults. Successful identification of HF and its progression can help reduce the individual and social burden of this syndrome. In this study, we use a real data set of cardiac patients to propose a classification model for the identification and progression of HF. The data set is divided into three age groups, namely young, adult, and old, and each age group is further classified into four classes according to the patient’s current physical condition. Contemporary data mining classification algorithms are applied to each individual class of every age group to identify HF. Decision Tree (DT) gives the highest accuracy of 90% and outperforms all other algorithms. Our model accurately diagnoses different stages of HF for each age group and can be very useful for the early prediction of HF.

Keywords: decision tree, heart failure, data mining, classification model

Procedia PDF Downloads 379
24710 Mood Recognition Using Indian Music

Authors: Vishwa Joshi

Abstract:

The study of mood recognition in music has gained a lot of momentum in recent years, with machine learning and data mining techniques and many audio features contributing considerably to analyzing and identifying the relation between mood and music. In this paper, we take the same idea forward and build a system for automatic recognition of the mood underlying audio song clips by mining their audio features. We evaluated several data classification algorithms in order to learn, train and test the model describing the moods of these audio songs, and developed an open source framework. Before classification, preprocessing and feature extraction phases are necessary for removing noise and gathering features, respectively.

Keywords: music, mood, features, classification

Procedia PDF Downloads 472
24709 Multi-Class Text Classification Using Ensembles of Classifiers

Authors: Syed Basit Ali Shah Bukhari, Yan Qiang, Saad Abdul Rauf, Syed Saqlaina Bukhari

Abstract:

Text classification is the methodology of classifying any given text into its respective category from a given set of categories. It is vital to use a proper set of pre-processing, feature selection and classification techniques to achieve this purpose. In this paper, we have used different ensemble techniques along with variation in feature selection parameters to see the change in the overall accuracy of the result and in some other individual class-based measures, which include the precision value of each individual category of the text. After subjecting our data to pre-processing and feature selection techniques, different individual classifiers were tested first, and after that the classifiers were combined to form ensembles to increase their accuracy. Later we also studied the impact of decreasing the number of classification categories on overall accuracy. Text classification is widely used in sentiment analysis on social media sites such as Twitter for understanding people’s opinions about any cause, and it is also used to analyze customers’ reviews of certain products or services. Opinion mining is a vital task in data mining, and text categorization is a backbone of opinion mining.
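As a rough illustration of combining classifiers and then checking per-category precision, the sketch below uses plain majority voting. This is a simpler stand-in for the bagging and AdaBoost ensembles named in the keywords, not the authors' method; the function names, category labels and predictions are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine several classifiers' label lists by taking the most common label per document."""
    combined = []
    for votes in zip(*predictions_per_classifier):
        # most_common(1) returns the label with the highest vote count;
        # ties go to the label that was counted first.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def precision_per_class(true_labels, predicted_labels):
    """Precision of each predicted category: correct predictions / all predictions of it."""
    predicted_counts = Counter(predicted_labels)
    correct_counts = Counter(
        p for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {
        label: correct_counts[label] / count
        for label, count in predicted_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical outputs of three base classifiers on three documents.
    classifier_a = ["sports", "politics", "tech"]
    classifier_b = ["sports", "tech", "tech"]
    classifier_c = ["politics", "politics", "tech"]
    ensemble = majority_vote([classifier_a, classifier_b, classifier_c])
    truth = ["sports", "tech", "tech"]
    print(ensemble)
    print(precision_per_class(truth, ensemble))
```

Per-class precision matters here because overall accuracy can hide a category that the ensemble predicts often but rarely gets right.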

Keywords: Natural Language Processing, Ensemble Classifier, Bagging Classifier, AdaBoost

Procedia PDF Downloads 203
24708 Social Media Mining with R. Twitter Analyses

Authors: Diana Codat

Abstract:

Tweet analysis is part of text mining. Each document is a written text, so it is possible to apply the usual text search techniques, in particular by switching to the bag-of-words representation. But tweets have peculiarities. Some may enrich the analysis: their length is calibrated (at least as far as public messages are concerned), special characters make it possible to identify authors (@) and themes (#), and the tweet and retweet mechanisms make it possible to follow the diffusion of information. Conversely, other characteristics may disrupt the analyses. Because space is limited, authors often use abbreviations and emoticons to express feelings, and they do not pay much attention to spelling. All this creates noise that can complicate the task. Tweets carry a lot of potentially interesting information, and their exploitation is one of the main axes of social network analysis. We show how to access Twitter-related messages. We initiate a study of the properties of the tweets and follow up on the exploitation of the content of the messages. We work under R with the package 'twitteR'. The study of tweets is a strong focus of social network analysis because Twitter has become an important vector of communication. This example shows that it is easy to initiate an analysis from data extracted directly online. The data preparation phase is of great importance.
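The preparation step described above (pulling out @ authors and # themes, then reducing the remaining text to a bag of words) might look roughly like this. The abstract works in R with 'twitteR'; this sketch uses Python's standard library only, does not call any Twitter API, and its function name, regular expressions and sample tweet are illustrative assumptions.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")      # authors or users referenced with @
HASHTAG = re.compile(r"#(\w+)")      # themes marked with #
URL = re.compile(r"https?://\S+")    # links carry no bag-of-words signal here
TOKEN = re.compile(r"[a-z']+")       # crude word tokenizer (assumption)


def parse_tweet(text):
    """Split one tweet into mentions, hashtags and a bag-of-words Counter."""
    mentions = MENTION.findall(text)
    hashtags = HASHTAG.findall(text)
    # Remove URLs, mentions and hashtags before counting ordinary words.
    cleaned = URL.sub(" ", text)
    cleaned = MENTION.sub(" ", cleaned)
    cleaned = HASHTAG.sub(" ", cleaned)
    words = TOKEN.findall(cleaned.lower())
    return {"mentions": mentions, "hashtags": hashtags, "bag": Counter(words)}


if __name__ == "__main__":
    # Made-up tweet for illustration.
    parsed = parse_tweet("RT @alice: loving #rstats, loving it http://t.co/x")
    print(parsed["mentions"], parsed["hashtags"], parsed["bag"])
```

A real pipeline would also need to handle the noise the abstract mentions (abbreviations, emoticons, misspellings), which this tokenizer simply drops or passes through unchanged.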

Keywords: data mining, language R, social networks, Twitter

Procedia PDF Downloads 149
24707 Improving Grade Control Turnaround Times with In-Pit Hyperspectral Assaying

Authors: Gary Pattemore, Michael Edgar, Andrew Job, Marina Auad, Kathryn Job

Abstract:

As critical commodities become more scarce, significant time and resources have been used to better understand complicated ore bodies and extract their full potential. These challenging ore bodies present several pain points for geologists and engineers to overcome; poor handling of these issues flows downstream to the processing plant, affecting throughput rates and recovery. Many open cut mines utilise blast hole drilling to extract additional information to feed back into the modelling process. This method requires samples to be collected during or after blast hole drilling. Samples are then sent for assay, with turnaround times varying from 1 to 12 days. This method is time consuming and costly, requires human exposure on the bench, and collects elemental data only. To address this challenge, research has been undertaken to utilise hyperspectral imaging across a broad spectrum to scan samples and collars or take down-hole measurements for minerals, moisture content and grade abundances. Automation of this process using unmanned vehicles and on-board processing reduces human in-pit exposure to ensure ongoing safety. On-board processing allows data to be integrated into modelling workflows with immediacy. The preliminary results demonstrate numerous direct and indirect benefits from this new technology, including rapid and accurate estimates of grade, moisture content and mineralogy. These benefits allow for faster geo-modelling updates, better informed mine scheduling and improved downstream blending and processing practices. The paper presents recommendations for implementation of the technology in open cut mining environments.

Keywords: grade control, hyperspectral scanning, artificial intelligence, autonomous mining, machine learning

Procedia PDF Downloads 78
24706 Improved FP-Growth Algorithm with Multiple Minimum Supports Using Maximum Constraints

Authors: Elsayeda M. Elgaml, Dina M. Ibrahim, Elsayed A. Sallam

Abstract:

Association rule mining is one of the most important fields of data mining and knowledge discovery. In this paper, we propose an efficient multiple support frequent pattern growth algorithm, which we call “MSFP-growth”, that enhances the FP-growth algorithm by adding an infrequent child node pruning step with multiple minimum supports using maximum constraints. The algorithm is implemented and compared with other common algorithms: Apriori with multiple minimum supports using maximum constraints, and FP-growth. The experimental results show that the rules mined by the proposed algorithm are interesting and that our algorithm achieves better performance than the other algorithms without sacrificing accuracy.
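To make the idea of multiple minimum supports under a maximum constraint concrete, the sketch below assumes the usual reading of that constraint: each item has its own minimum support, and an itemset's threshold is the largest of its items' thresholds. It enumerates candidates by brute force, so it is neither FP-growth nor the authors' MSFP-growth; all names and the toy transactions are hypothetical.

```python
from itertools import combinations


def itemset_min_support(itemset, item_min_support):
    """ASSUMED maximum constraint: threshold is the largest per-item minimum support."""
    return max(item_min_support[item] for item in itemset)


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for transaction in transactions if itemset <= transaction)
    return hits / len(transactions)


def frequent_itemsets(transactions, item_min_support, max_size=3):
    """Brute-force search for itemsets that meet their own minimum support."""
    items = sorted(item_min_support)
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            s = support(candidate, transactions)
            if s >= itemset_min_support(candidate, item_min_support):
                result[candidate] = s
    return result


if __name__ == "__main__":
    # Toy data: a rare item ("caviar") gets a lower threshold than common ones.
    transactions = [
        {"bread", "milk"},
        {"bread", "caviar"},
        {"bread", "milk", "caviar"},
        {"milk"},
    ]
    thresholds = {"bread": 0.5, "milk": 0.5, "caviar": 0.25}
    for itemset, s in frequent_itemsets(transactions, thresholds).items():
        print(sorted(itemset), s)
```

Per-item thresholds let rare but interesting items survive on their own, while the maximum constraint keeps the threshold of a combined itemset from dropping below that of its most demanding item.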

Keywords: association rules, FP-growth, multiple minimum supports, Weka tool

Procedia PDF Downloads 447
24705 Analysis of Changes Being Done of the Mine Legislation of Turkey: Mining Operation Activity Process

Authors: Taşkın Deniz Yıldız, Mustafa Topaloğlu, Orhan Kural

Abstract:

In Turkey, the period of the right to operate a mine, which was fairly long in earlier periods, has been observed to be shortened after Mining Law No. 3213. When an operation permit (or concession) is requested, the "found mine" status of the mine to be operated, the financial and technical capability of the holder of the operating right, and the principle of equality are all important for assessing the request in the best way. In this context in particular, "negligence" (downsizing) of license fields is noted in the arrangements of all periods. However, in the period after Mining Law No. 3213, the implementation of negligence is laid down within a framework intended to make operation permits more effective.

Keywords: mining legislation, operation, permit, Turkey

Procedia PDF Downloads 375
24704 Automatic Lead Qualification with Opinion Mining in Customer Relationship Management Projects

Authors: Victor Radich, Tania Basso, Regina Moraes

Abstract:

Lead qualification is one of the main procedures in Customer Relationship Management (CRM) projects. Its main goal is to identify potential consumers who have the ideal characteristics to establish a profitable and long-term relationship with a certain organization. Social networks can be an important source of data for identifying and qualifying leads since interest in specific products or services can be identified from the users’ expressed feelings of (dis)satisfaction. In this context, this work proposes the use of machine learning techniques and sentiment analysis as an extra step in the lead qualification process in order to improve it. In addition to machine learning models, sentiment analysis or opinion mining can be used to understand the evaluation that the user makes of a particular service, product, or brand. The results obtained so far have shown that it is possible to extract data from social networks and combine the techniques for a more complete classification.

Keywords: lead qualification, sentiment analysis, opinion mining, machine learning, CRM, lead scoring

Procedia PDF Downloads 39
24703 An Experimental Study for Assessing Email Classification Attributes Using Feature Selection Methods

Authors: Issa Qabaja, Fadi Thabtah

Abstract:

Email phishing classification is one of the vital problems in the online security research domain and has attracted several scholars due to its impact on the payments users perform online every day. One way for detection algorithms to reach good performance on the email phishing problem is to identify the minimal set of features that significantly raise the phishing detection rate. This paper investigates three known feature selection methods, Information Gain (IG), Chi-square and Correlation Features Set (CFS), on the email phishing problem to separate highly influential features from less influential ones in phishing detection. We measure the degree of influence by applying four data mining algorithms to a large set of features. We compare the accuracy of these algorithms on the complete feature set before feature selection and on the reduced set after feature selection. The experimental results show that 12 common significant features were chosen among the considered features by the feature selection methods. Further, the average detection accuracy derived by the data mining algorithms on the reduced 12-feature set was only very slightly affected when compared with the one derived from the 47-feature set.
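Of the three feature selection methods named above, Information Gain is the simplest to show in a few lines. The following is a minimal sketch for discrete features; the feature names (`has_link`, `has_greeting`) and the four toy emails are hypothetical and are not taken from the paper's 47-feature set.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(
        len(group) / total * entropy(group) for group in groups.values()
    )
    return entropy(labels) - conditional


def rank_features(features, labels):
    """Return (name, information gain) pairs, most informative first."""
    scored = [
        (name, information_gain(values, labels))
        for name, values in features.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy data: four emails, two hypothetical binary features.
    labels = ["phish", "phish", "ham", "ham"]
    features = {
        "has_link": [1, 1, 0, 0],
        "has_greeting": [1, 0, 1, 0],
    }
    for name, gain in rank_features(features, labels):
        print(name, gain)
```

Keeping only the top-ranked features and re-running the classifiers is the kind of before/after comparison the abstract describes.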

Keywords: data mining, email classification, phishing, online security

Procedia PDF Downloads 400
24702 Investigating Data Normalization Techniques in Swarm Intelligence Forecasting for Energy Commodity Spot Price

Authors: Yuhanis Yusof, Zuriani Mustaffa, Siti Sakira Kamaruddin

Abstract:

Data mining is a fundamental technique for identifying patterns in large data sets. The extracted facts and patterns contribute to various domains such as marketing, forecasting, and medicine. Prior to mining, data are consolidated so that the resulting mining process may be more efficient. This study investigates the effect of different data normalization techniques, namely Min-max, Z-score, and decimal scaling, on swarm-based forecasting models. The recent swarm intelligence algorithms employed include the Grey Wolf Optimizer (GWO) and Artificial Bee Colony (ABC). Forecasting models are then developed to predict the daily spot price of crude oil and gasoline. Results showed that GWO works better with the Z-score normalization technique, while ABC produces better accuracy with Min-max. Nevertheless, GWO is superior to ABC, as its model generates the highest accuracy for both crude oil and gasoline prices. Such a result indicates that GWO is a promising competitor in the family of swarm intelligence algorithms.
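The three normalization techniques compared in this abstract have standard textbook definitions, sketched below. The function names and the sample prices are made up for illustration, and the Z-score here uses the population standard deviation, which is an assumption since the abstract does not say which variant was used.

```python
from math import sqrt


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    return [(v - lowest) / (highest - lowest) for v in values]


def z_score(values):
    """Center on the mean and scale by the (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by the smallest power of ten that brings every |value| below 1."""
    largest = max(abs(v) for v in values)
    exponent = 0
    while largest / 10 ** exponent >= 1:
        exponent += 1
    return [v / 10 ** exponent for v in values]


if __name__ == "__main__":
    # Hypothetical daily spot prices, not data from the study.
    prices = [40.0, 50.0, 60.0]
    print(min_max(prices))
    print(z_score(prices))
    print(decimal_scaling(prices))
```

Min-max and decimal scaling preserve the shape of the series within a fixed range, whereas Z-score expresses each value relative to the spread of the data, which may be one reason different optimizers respond differently to them.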

Keywords: artificial bee colony, data normalization, forecasting, Grey Wolf optimizer

Procedia PDF Downloads 445
24701 Ground Water Pollution Investigation around Çorum Stream Basin in Turkey

Authors: Halil Bas, Unal Demiray, Sukru Dursun

Abstract:

Water and groundwater pollution is an important problem in most countries. Sources of water pollution must be investigated in order to protect fresh water, because fresh water sources are very limited and current sources are not enough for the increasing world population. In this study, an investigation was carried out on the pollution factors affecting the quality of the groundwater in the Çorum Stream Basin in Turkey. The effect of the geological structure of the region and the interaction between the stream and the groundwater were researched. For the investigation, stream and groundwater sampling was performed in the rainy and dry seasons to see whether the quality parameters change. The results were evaluated with computer programs, and graphics and distribution maps were then prepared in order to understand the degree of quality and pollution. According to the analysis results, because the results for the stream and the groundwater are not close to each other, we can say that there is no interaction between the stream and the groundwater. As irrigation water, the stream waters are generally in the C3S1 region and the groundwaters are generally between the C3S1 and C4S2 regions according to the US Salinity Laboratory diagram. According to the Wilcox diagram, the stream waters are generally good-permissible, and the groundwaters are generally of the good-permissible, doubtful-to-unsuitable and unsuitable types. The groundwaters are of the doubtful-to-unsuitable and unsuitable types especially in the dry season, which may be assumed to result from the relative increase in the concentration of salt minerals. In particular, samples from groundwater wells bored close to gypsum-bearing units have high hardness, electrical conductivity and salinity values. Thus, these waters are determined to be unsuitable for drinking and irrigation.
As a result of these studies, it is understood that the groundwater in particular was affected by lithological contamination rather than by anthropogenic or other types of pollution. Because the alluvium is covered by silt and clay lithology, it is not affected by anthropogenic and other foreign factors. The results for the solid waste disposal site leachate indicate that this site has a potential risk of pollution in the future. Although the parameters did not exceed the maximum dangerous values, this does not mean that they will not be dangerous in the future, and this case must be taken into account.

Keywords: Çorum, environment, groundwater, hydrogeology, geology, pollution, quality, stream

Procedia PDF Downloads 469
24700 Leveraging Power BI for Advanced Geotechnical Data Analysis and Visualization in Mining Projects

Authors: Elaheh Talebi, Fariba Yavari, Lucy Philip, Lesley Town

Abstract:

The mining industry generates vast amounts of data, necessitating robust data management systems and advanced analytics tools to achieve better decision-making processes in the development of mining production and maintaining safety. This paper highlights the advantages of Power BI, a powerful intelligence tool, over traditional Excel-based approaches for effectively managing and harnessing mining data. Power BI enables professionals to connect and integrate multiple data sources, ensuring real-time access to up-to-date information. Its interactive visualizations and dashboards offer an intuitive interface for exploring and analyzing geotechnical data. Advanced analytics is a collection of data analysis techniques to improve decision-making. Leveraging some of the most complex techniques in data science, advanced analytics is used to do everything from detecting data errors and ensuring data accuracy to directing the development of future project phases. However, while Power BI is a robust tool, specific visualizations required by geotechnical engineers may have limitations. This paper studies the capability to use Python or R programming within the Power BI dashboard to enable advanced analytics, additional functionalities, and customized visualizations. This dashboard provides comprehensive tools for analyzing and visualizing key geotechnical data metrics, including spatial representation on maps, field and lab test results, and subsurface rock and soil characteristics. Advanced visualizations like borehole logs and Stereonet were implemented using Python programming within the Power BI dashboard, enhancing the understanding and communication of geotechnical information. Moreover, the dashboard's flexibility allows for the incorporation of additional data and visualizations based on the project scope and available data, such as pit design, rock fall analyses, rock mass characterization, and drone data. 
This further enhances the dashboard's usefulness in future projects, including operation, development, closure, and rehabilitation phases. Additionally, this helps in minimizing the necessity of utilizing multiple software programs in projects. This geotechnical dashboard in Power BI serves as a user-friendly solution for analyzing, visualizing, and communicating both new and historical geotechnical data, aiding in informed decision-making and efficient project management throughout various project stages. Its ability to generate dynamic reports and share them with clients in a collaborative manner further enhances decision-making processes and facilitates effective communication within geotechnical projects in the mining industry.

Keywords: geotechnical data analysis, power BI, visualization, decision-making, mining industry

Procedia PDF Downloads 52
24699 An Analysis on Gravel of Sand-Gravel Bar at Gneiss or Granite Area of the Upper Hongcheon River in South Korea

Authors: Man Kyu Kim, Hansu Shin

Abstract:

This study analyzes the gravel of the sand-gravel bars that stretch variously across the Duchon and Naechon stream basins, which are situated in the Hongcheon River basin in Korea, where sand-gravel bars are well developed in the upstream reaches. The Naechon stream mostly flows through a granite zone, whereas the Duchon stream mostly flows through a gneiss zone. The characteristics of gravel in the sand-gravel bars of these two branches of the upper Hongcheon River were analyzed in order to understand the geomorphic development of streams depending on differences in bedrock. Through the analysis of the roundness and flatness of gravel, we found an irregular trend following the increase in the supply of granite gravel and gneiss gravel as we traveled downstream. The result shows that the two basins have the condition of uppermost small basins reflecting the mountain valley environment, although an equivalent comparison with other roundness studies in Korea or in Europe may be difficult. Unlike previous studies, which mostly examined large rivers, this study analyzed gravel found in small-scale streams. The research offers basic data for continued comparative research on various small basins.

Keywords: flatness, geology, roundness, sand-gravel bar

Procedia PDF Downloads 336
24698 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damage to health and the environment, economic losses and material damage. Traditional studies of road traffic accidents in urban zones require a very high investment of time and money, and their results are not current. Nowadays, however, in many countries crowdsourced GPS-based traffic and navigation apps have emerged as an important low-cost source of information for studies of road traffic accidents and the urban congestion they cause. In this article, we identified the zones, roads and specific times in Mexico City (CDMX) in which the largest number of road traffic accidents were concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Knowledge Discovery in Databases (KDD) for the discovery of patterns in the accident reports, using data mining techniques with the help of Weka. The selected algorithms were Expectation-Maximization (EM), to obtain the ideal number of clusters for the data, and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.
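The grouping step can be illustrated with a small k-means (Lloyd's algorithm) sketch. The study ran its clustering in Weka and chose the number of clusters with EM; the code below is a standalone approximation that takes the initial centroids as given, and its function names and coordinates are invented rather than real accident locations.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    assignment = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(len(centroids)),
                key=lambda i: squared_distance(point, centroids[i]))
            for point in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the result is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, assignment


if __name__ == "__main__":
    # Made-up (x, y) accident report positions forming two obvious groups.
    reports = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    centers, labels = k_means(reports, [(0.0, 0.0), (10.0, 10.0)])
    print(centers, labels)
```

Because k-means needs the number of clusters up front, pairing it with a method that estimates that number first, as the abstract does with EM, is a natural design.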

Keywords: data mining, k-means, road traffic accidents, Waze, Weka

Procedia PDF Downloads 376
24697 Hydraulic Characteristics of the Tidal River Dongcheon in Busan City

Authors: Young Man Cho, Sang Hyun Kim

Abstract:

Even though various management practices such as sediment dredging were attempted to improve the water quality of Dongcheon, located in Busan, the environmental condition of this stream deteriorated. Therefore, Busan metropolitan city has pumped and diverted sea water to the upstream of Dongcheon for several years. This study explored the hydraulic characteristics of Dongcheon to configure the best management practice for ecological restoration and water quality improvement of a man-made urban stream. Intensive field investigation indicates that average flow velocities at depths of 20% and 80% from the water surface ranged from 5 to 10 cm/s and from 2 to 5 cm/s, respectively. Concentrations of dissolved oxygen at all depths were less than 0.25 mg/l during the low tidal period. Even though a density difference can be found along the stream depth, density currents seem rarely to be generated in Dongcheon. The short period of the high tidal portion and the shallow depths are responsible for the well-mixed nature of Dongcheon.

Keywords: hydraulic, tidal river, density current, sea water

Procedia PDF Downloads 189
24696 Cirrhosis Mortality Prediction as Classification using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and novel data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis were collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedures and laboratory tests is analyzed. A temporal pattern mining technique called Frequent Subgraph Mining (FSM) is used. The Model for End-Stage Liver Disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement in the area under the curve (AUC). The FSM technique itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. To the best of our knowledge, this is the first work to apply modern machine learning algorithms and data analysis methods to predicting the one-year mortality of cirrhotic patients and to build a model that predicts one-year mortality significantly more accurately than the MELD score. We have also tested the potential of FSM and provided a new perspective on the importance of clinical features.
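For readers unfamiliar with the AUC metric used for the comparison above, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are invented for illustration and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    # Count positive/negative pairs ranked correctly; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = died within one year) and model risk scores.
outcomes = [1, 0, 1, 0, 1]
risk_scores = [0.9, 0.8, 0.7, 0.3, 0.6]
model_auc = auc(outcomes, risk_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so an improvement in AUC means more patient pairs are ordered correctly by risk.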

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia PDF Downloads 107
24695 The Results of Longitudinal Water Quality Monitoring of the Brandywine River, Chester County, Pennsylvania by High School Students

Authors: Dina L. DiSantis

Abstract:

Strengthening a sense of responsibility while relating global sustainability concepts such as water quality and pollution to a local water system can be achieved by teaching students to conduct and interpret water quality monitoring tests. When students conduct their own research, they become better stewards of the environment. Providing outdoor learning and place-based opportunities for students helps connect them to the natural world. By conducting stream studies and collecting data, students are able to better understand how the natural environment is a place where everything is connected. Students have been collecting physical, chemical and biological data along the West and East Branches of the Brandywine River, in Pennsylvania for over ten years. The stream studies are part of the advanced placement environmental science and aquatic science courses that are offered as electives to juniors and seniors at the Downingtown High School West Campus in Downingtown, Pennsylvania. Physical data collected includes: temperature, turbidity, width, depth, velocity, and volume of flow or discharge. The chemical tests conducted are: dissolved oxygen, carbon dioxide, pH, nitrates, alkalinity and phosphates. Macroinvertebrates are collected with a kick net, identified and then released. Students collect the data from several locations while traveling by canoe. In the classroom, students prepare a water quality data analysis and interpretation report based on their collected data. The summary of the results from longitudinal water quality data collection by students, as well as the strengths and weaknesses of student data collection will be presented.

Keywords: place-based, student data collection, sustainability, water quality monitoring

Procedia PDF Downloads 125
24694 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining

Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrêa, Ronaldo G. Morato

Abstract:

Environmental changes and major natural disasters are increasingly prevalent in the world due to the damage that humanity has caused to nature, and this damage directly affects the lives of animals. Thus, the study of animal behavior and of animals' interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine patterns of animal behavior, and with technological advances the ability to track animals and, consequently, to conduct behavioral studies has expanded. There is a lot of research on animal movement and behavior, but we note the lack of a proposal that combines resources, allows for exploratory analysis of animal movement, and provides statistical measures of individual animal behavior and its interaction with the environment. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provide a first step in answering questions about individual animal behavior and interactions with other animals over time and space. We evaluated the framework using monitored jaguar data from the city of Miranda, Pantanal, Brazil, in order to verify whether AniMoveMineR makes it possible to identify the interaction level between these jaguars. The results were positive and provided indications about the individual behavior of the jaguars and about which jaguars have the highest or lowest correlation.

Keywords: data mining, data science, trajectory, animal behavior

Procedia PDF Downloads 112
24693 Assessing Carbon Stock and Sequestration of Reforestation Species on Old Mining Sites in Morocco Using the DNDC Model

Authors: Nabil Elkhatri, Mohamed Louay Metougui, Ngonidzashe Chirinda

Abstract:

Mining activities have left a legacy of degraded landscapes, prompting urgent efforts for ecological restoration. Reforestation holds promise as a potent tool to rehabilitate these old mining sites, with the potential to sequester carbon and contribute to climate change mitigation. This study focuses on evaluating the carbon stock and sequestration potential of reforestation species in the context of Morocco's mining areas, employing the DeNitrification-DeComposition (DNDC) model. The research is grounded in recognizing the need to connect theoretical models with practical implementation, ensuring that reforestation efforts are informed by accurate and context-specific data. Field data collection encompasses growth patterns, biomass accumulation, and carbon sequestration rates, establishing an empirical foundation for the study's analyses. By integrating the collected data with the DNDC model, the study aims to provide a comprehensive understanding of carbon dynamics within reforested ecosystems on old mining sites. The major findings reveal varying sequestration rates among different reforestation species, indicating the potential for species-specific optimization of reforestation strategies to enhance carbon capture. This research's significance lies in its potential to contribute to sustainable land management practices and climate change mitigation strategies. By quantifying the carbon stock and sequestration potential of reforestation species, the study serves as a valuable resource for policymakers, land managers, and practitioners involved in ecological restoration and carbon management. Ultimately, the study aligns with global objectives to rejuvenate degraded landscapes while addressing pressing climate challenges.

Keywords: carbon stock, carbon sequestration, DNDC model, ecological restoration, mining sites, Morocco, reforestation, sustainable land management

Procedia PDF Downloads 34
24692 Focus-Latent Dirichlet Allocation for Aspect-Level Opinion Mining

Authors: Mohsen Farhadloo, Majid Farhadloo

Abstract:

Aspect-level opinion mining, which aims at discovering aspects (aspect identification) and their corresponding ratings (sentiment identification) from customer reviews, has increasingly attracted the attention of researchers and practitioners, as it provides valuable insights about products and services from the customers' point of view. Instead of addressing aspect identification and sentiment identification in two separate steps, it is possible to identify both aspects and sentiments simultaneously. In recent years, many graphical models based on Latent Dirichlet Allocation (LDA) have been proposed to solve both aspect and sentiment identification in a single step. Although LDA models have been effective tools for the statistical analysis of document collections, they also have shortcomings in addressing some unique characteristics of opinion mining. Our goal in this paper is to address one of the limitations of topic models to date; that is, they fail to directly model the associations among topics. Indeed, in many text corpora, it is natural to expect that subsets of the latent topics have higher probabilities. We propose a probabilistic graphical model called focus-LDA to better capture the associations among topics when applied to aspect-level opinion mining. Our experiments on real-life data sets demonstrate the improved effectiveness of the focus-LDA model in terms of the accuracy of the predictive distributions over held-out documents. Furthermore, we demonstrate qualitatively that the focus-LDA topic model provides a natural way of visualizing and exploring unstructured collections of textual data.

Keywords: aspect-level opinion mining, document modeling, Latent Dirichlet Allocation, LDA, sentiment analysis

Procedia PDF Downloads 72
24691 Application of Data Driven Based Models as Early Warning Tools of High Stream Flow Events and Floods

Authors: Mohammed Seyam, Faridah Othman, Ahmed El-Shafie

Abstract:

The early warning of high stream flow events (HSF) and floods is an important aspect of the management of surface water and river systems. This process can be performed using either process-based models or data-driven models such as artificial intelligence (AI) techniques. The main goal of this study is to develop an efficient AI-based model for predicting the real-time hourly stream flow (Q) and to apply it as an early warning tool for HSF and floods in the downstream area of the Selangor River basin, taken here as a paradigm of humid tropical rivers in Southeast Asia. The performance of the AI-based models has been improved through the integration of lag time (Lt) estimation in the modelling process. A total of 8753 patterns of hourly Q, water level, and rainfall (RF) records representing a one-year period (2011) were utilized in the modelling process. Six hydrological scenarios have been arranged through hypothetical cases of input variables to investigate how changes in RF intensity at upstream stations can lead to the formation of floods. The initial stream flow (SF) was changed for each scenario in order to include a wide range of hydrological situations in this study. The performance evaluation of the developed AI-based model shows that a high correlation coefficient (R) between the observed and predicted Q is achieved. The AI-based model has been successfully employed in early warning through the advance detection of the hydrological conditions that could lead to the formation of floods and HSF, which are represented by three levels of severity (i.e., alert, warning, and danger). Based on the results of the scenarios, reaching the danger level in the downstream area required high RF intensity in at least two upstream areas. According to the results of the applications, it can be concluded that AI-based models are beneficial tools for local authorities for flood control and awareness.
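The abstract does not say how the lag time (Lt) was estimated. One common approach, sketched here purely as an assumption and not as the authors' method, is to shift the upstream rainfall series against the downstream flow series and keep the shift with the highest Pearson correlation. The two hourly series below are invented for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def estimate_lag(rainfall, flow, max_lag):
    """Return (lag, r): the shift in time steps at which rainfall best matches later flow."""
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Pair rainfall at time t with flow at time t + lag.
        r = pearson(rainfall[: len(rainfall) - lag], flow[lag:])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


# Hypothetical hourly records: the flow responds to rainfall three hours later.
rainfall = [0, 0, 5, 12, 3, 0, 0, 0, 0, 0]
flow = [2, 2, 2, 2, 2, 7, 14, 5, 2, 2]
lag_hours, correlation = estimate_lag(rainfall, flow, max_lag=5)
```

With an estimated lag in hand, rainfall observed at an upstream station can be fed to a predictive model as an input for the flow expected that many hours later downstream.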

Keywords: floods, stream flow, hydrological modelling, hydrology, artificial intelligence

Procedia PDF Downloads 218
24690 Solubility of Water in CO2 Mixtures at Pipeline Operation Conditions

Authors: Mohammad Ahmad, Sander Gersen, Erwin Wilbers

Abstract:

Carbon capture, transport and underground storage have become a major solution to reduce CO2 emissions from power plants and other large CO2 sources. A large part of this captured CO2 stream is transported at high-pressure dense-phase conditions and stored in offshore underground depleted oil and gas fields. CO2 is also transported in offshore pipelines to be used for enhanced oil and gas recovery. The captured CO2 stream with impurities may contain water, which causes severe corrosion problems and flow assurance failure and might damage valves and instrumentation. Thus, free water formation should be strictly prevented. The purpose of this work is to study the solubility of water in pure CO2 and in CO2 mixtures under real pipeline operating conditions of pressure (90-150 bar) and temperature (5-35°C). An experimental setup was constructed to generate the data. The results show that the solubility of water in CO2 mixtures increases with increasing temperature and/or pressure. A drop in water solubility in CO2 is observed in the presence of impurities. The data generated were then used to assess the capabilities of two mixture models: the GERG-2008 model and the EOS-CG model. By generating the solubility data, this study contributes to determining the maximum allowable water content in CO2 pipelines.

Keywords: carbon capture and storage, water solubility, equation of states, fluids engineering

Procedia PDF Downloads 265
24689 Heavy Minerals Distribution in the Recent Stream Sediments of Diyala River Basin, Northeastern Iraq

Authors: Abbas R. Ali, Daroon Hasan Khorsheed

Abstract:

Twenty-one samples of stream sediments were collected from the Diyala River Basin (DRB), which represents one of three major tributaries of the Tigris River in northeastern Iraq. This study is concerned with the heavy mineral (HM) analysis of the +63 μm fraction of the Diyala River sediments and the distribution pattern in the various river basin sectors, as well as with comparing the present results with previous works. The metastable heavy minerals (epidote, staurolite, garnet) represent more than 30%, whereas the unstable heavy minerals (pyroxene and amphibole) make up only about 19%. Opaques are present in high proportions, reaching about 29% on average. The ultrastable heavy minerals (zircon, tourmaline, rutile) are the minor constituents (7%) of the sediments. According to the laboratory analytical data on heavy mineral distributions, the studied sediments are derived from mafic and ultramafic rocks found in northeastern Iraq, which represent the Walash – Nawpordan Series and Mawat complexes in the Zagros zones. The presence of zircon and tourmaline in trace amounts may indicate a weak role of acidic rocks in the source area, whereas the epidote group minerals indicate a role of metamorphic rocks.

Keywords: heavy minerals, mineral distribution, recent stream sediment, Diyala river, northeastern Iraq

Procedia PDF Downloads 486
24688 Application of Granular Computing Paradigm in Knowledge Induction

Authors: Iftikhar U. Sikder

Abstract:

This paper illustrates an application of a granular computing approach, namely rough set theory, in data mining. The paper outlines the formalism of granular computing and elucidates the mathematical underpinnings of rough set theory, which has been widely used by the data mining and machine learning communities. A real-world application is illustrated, and the classification performance is compared with that of other contending machine learning algorithms. The predictive performance of the rough set rule induction model is comparable to that of the other contending algorithms.
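The core idea of rough set theory mentioned above can be sketched in a few lines: objects that share the same values on the chosen attributes form indiscernibility classes (granules), and a target concept is bracketed by a lower approximation (granules entirely inside the concept) and an upper approximation (granules that overlap it). The toy decision table below is invented for illustration and is not the paper's application.

```python
from collections import defaultdict


def approximations(table, attributes, target):
    """Return the (lower, upper) rough-set approximations of `target`.

    table: maps object id -> {attribute: value}
    attributes: attributes used to decide which objects are indiscernible
    target: set of object ids forming the concept to approximate
    """
    # Group objects that are indistinguishable on the chosen attributes.
    granules = defaultdict(set)
    for obj, row in table.items():
        granules[tuple(row[a] for a in attributes)].add(obj)

    lower, upper = set(), set()
    for granule in granules.values():
        if granule <= target:  # certainly in the concept
            lower |= granule
        if granule & target:  # possibly in the concept
            upper |= granule
    return lower, upper


# Hypothetical decision table: o1 and o2 look identical but differ in outcome.
patients = {
    "o1": {"fever": "yes", "cough": "yes"},
    "o2": {"fever": "yes", "cough": "yes"},
    "o3": {"fever": "no", "cough": "yes"},
    "o4": {"fever": "yes", "cough": "no"},
}
has_flu = {"o1", "o4"}
lower, upper = approximations(patients, ["fever", "cough"], has_flu)
```

Objects in the lower approximation support certain rules, while those in the upper approximation but not the lower one (the boundary region) support only possible rules; this distinction is what rule induction builds on.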

Keywords: concept approximation, granular computing, reducts, rough set theory, rule induction

Procedia PDF Downloads 493
24687 Hydrogeological Appraisal of Karacahisar Coal Field (Western Turkey): Impacts of Mining on Groundwater Resources Utilized for Water Supply

Authors: Sukran Acikel, Mehmet Ekmekci, Otgonbayar Namkhai

Abstract:

Lignite coal fields in western Turkey generally occur in tensional Neogene basins bordered by major faults. The Karacahisar coal field in Mugla province of western Turkey is a large Neogene basin filled with alternating siliceous and calcareous layers. The basement of the basin is composed mainly of karstified carbonate rocks of Mesozoic age and schists of Paleozoic age. The basement rocks are exposed in the highlands surrounding the basin. The basin fill deposits form shallow, low-yield, local aquifers, whereas the karstic carbonate rock masses form the major aquifer in the region. The karstic aquifer discharges through a spring zone issuing at the intersection of two major faults. Municipal water demand in the city of Bodrum, a tourist destination, is almost totally supplied by boreholes tapping the karstic aquifer. A well field has been constructed on the eastern edge of the coal basin, which forms a ridge separating two Neogene basins. A major concern was raised about the plausible impact of mining activities on the groundwater system in general and on the water supply well field in particular. The hydrogeological studies carried out in the area revealed that the coal seam is located below the groundwater level. Mining operations will be affected by groundwater inflow to the pits, which will require dewatering measures. Dewatering activities at mine sites have two-sided effects: a) they lower the groundwater level at and around the pit for a safe and effective mining operation, and b) continuous dewatering causes the cone of depression to expand until it reaches a spring, stream and/or well utilized by local people, capturing their water. The plausible effect of mining operations on the flow of the spring zone was another issue of concern. Therefore, a detailed, representative hydrogeological conceptual model of the site was developed on the basis of available data and field work.
According to the hydrogeological conceptual model, dewatering of the Neogene layers will not hydraulically affect the water supply wells; however, the ultimate perimeter of the open pit will expand to intersect the well field. According to the conceptual model, the coal seam is underlain by a thick impervious clay layer sitting on the carbonate basement. Therefore, the hydrostratigraphy does not allow hydraulic interaction between the mine pit and the karstic carbonate rock aquifer. However, the structural setting of the basin suggests that deep faults intersecting the basement and the Neogene sequence will most probably carry the deep groundwater up to a level above the bottom of the pit. This will require taking the necessary measures to lower the piezometric level of the carbonate rock aquifer along the faults. Dewatering the carbonate rock aquifer will reduce the flow to the spring zone. All findings were put together to recommend a strategy for a safe and effective mining operation.

Keywords: conceptual model, dewatering, groundwater, mining operation

Procedia PDF Downloads 368