Search results for: mining wastewater

472 Effects of Upflow Liquid Velocity on Performance of Expanded Granular Sludge Bed (EGSB) System

Authors: Seni Karnchanawong, Wachara Phajee

Abstract:

The effects of upflow liquid velocity (ULV) on the performance of an expanded granular sludge bed (EGSB) system were investigated. The EGSB reactor, made from galvanized steel pipe 0.10 m in diameter and 5 m in height, was used to treat piggery wastewater that had first passed through an acidification tank. It had a working volume of 39.3 l in the reaction zone and 122 l in the sedimentation zone at the upper part. The reactor was seeded with anaerobically digested sludge and operated consecutively at ULVs of 4, 8, 12 and 16 m/h, corresponding to organic loading rates of 9.6 – 13.0 kg COD/(m3.d). The average COD concentrations in the influent were 9,601 – 13,050 mg/l. COD removal did not differ significantly between runs, i.e. 93.0% - 94.0%, except at a ULV of 12 m/h, where SS in the influent was exceptionally high and VSS washout occurred, leading to low COD removal. The FCOD and VFA concentrations in the effluent were similar in all experiments, indicating the same range of treatment performance. Biogas production decreased at higher ULVs, and a ULV of 4 m/h is suggested as the design criterion for EGSB systems.
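
As a rough cross-check of the reactor geometry quoted in the abstract above, the sketch below derives the column cross-section from the 0.10 m diameter, the reaction-zone height implied by the 39.3 l working volume, and the total upflow needed to reach each ULV. The function names are illustrative. The relation Q = ULV × A is the standard definition of superficial velocity and is an assumption here, not a formula taken from the paper. The resulting flow would include any recirculation, which the abstract does not specify.

```python
import math

# Values quoted in the abstract.
DIAMETER_M = 0.10
REACTION_VOLUME_L = 39.3
ULVS_M_PER_H = (4, 8, 12, 16)


def cross_section_area(diameter_m):
    """Cross-sectional area of a circular column in square meters."""
    return math.pi * (diameter_m / 2) ** 2


def upflow_rate_l_per_h(ulv_m_per_h, diameter_m=DIAMETER_M):
    """Total upward flow (feed plus any recirculation) for a given ULV.

    Assumes ULV is the superficial velocity, so Q = ULV * A.
    """
    return ulv_m_per_h * cross_section_area(diameter_m) * 1000.0


area_m2 = cross_section_area(DIAMETER_M)
# Height of the reaction zone implied by its working volume.
reaction_height_m = (REACTION_VOLUME_L / 1000.0) / area_m2

print(f"Implied reaction-zone height: {reaction_height_m:.2f} m")
for ulv in ULVS_M_PER_H:
    print(f"ULV {ulv:>2} m/h needs about {upflow_rate_l_per_h(ulv):.1f} L/h of upflow")
```

The implied reaction-zone height comes out close to 5 m, which is consistent with the pipe height given in the abstract.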

Keywords: Expanded granular sludge bed system, piggery wastewater, upflow liquid velocity

Downloads: 2747

471 BIDENS: Iterative Density Based Biclustering Algorithm With Application to Gene Expression Analysis

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

Biclustering is a very useful data mining technique for identifying patterns in which different genes are co-related over a subset of conditions in gene expression analysis. Association rule mining is an efficient approach to biclustering, as in the BIMODULE algorithm, but it is sensitive to the values given to its input parameters and to the discretization procedure used in the preprocessing step. Moreover, when noise is present, classical association rule miners discover multiple small fragments of the true bicluster but miss the true bicluster itself. This paper formally presents a generalized noise-tolerant bicluster model, termed μBicluster. An iterative algorithm based on the proposed model, termed BIDENS, is introduced that can discover a set of k possibly overlapping biclusters simultaneously. Our model uses a more flexible method to partition the dimensions in order to preserve meaningful and significant biclusters. The proposed algorithm can discover biclusters that are hard for BIMODULE to discover. An experimental study on yeast and human gene expression data and on several artificial datasets shows that our algorithm offers substantial improvements over several previously proposed biclustering algorithms.

Keywords: Machine learning, biclustering, bi-dimensional clustering, gene expression analysis, data mining.

Downloads: 1916

470 Towards Clustering of Web-based Document Structures

Authors: Matthias Dehmer, Frank Emmert Streib, Jürgen Kilian, Andreas Zulauf

Abstract:

Methods for organizing web data into groups, in order to analyze web-based hypertext data and facilitate data availability, are very important given the number of documents available online. The task of clustering web-based document structures therefore has many applications, e.g., improving information retrieval on the web, better understanding user navigation behavior, improving the servicing of web users' requests, and increasing web information accessibility. In this paper we investigate a new approach for clustering web-based hypertexts on the basis of their graph structures. The hypertexts are represented as so-called generalized trees, which are more general than the usual directed rooted trees, e.g., DOM-Trees. As an important preprocessing step, we measure the structural similarity between the generalized trees on the basis of a similarity measure d. Then, we apply agglomerative clustering to the resulting similarity matrix in order to create clusters of hypertext graph patterns representing navigation structures. We run our approach on a data set of hypertext structures and obtain good results in Web Structure Mining. Furthermore, we outline the application of our approach to Web Usage Mining as future work.
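
To illustrate the clustering step named in the abstract above, here is a minimal sketch of agglomerative clustering applied to a precomputed similarity matrix. It is not the authors' implementation. The similarity values are made up, average linkage is an assumed choice, and converting similarity s to distance as 1 - s is an assumption (the abstract does not define the measure d).

```python
def agglomerative(dist, n_clusters):
    """Average-linkage agglomerative clustering on a symmetric distance matrix.

    Returns a list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def average_distance(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        # Find the closest pair of clusters.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Merge cluster y into cluster x (y > x, so deleting y is safe).
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical pairwise similarities in [0, 1] between four hypertext graphs.
similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
# Assumed conversion from similarity to distance.
distance = [[1.0 - s for s in row] for row in similarity]

print(agglomerative(distance, 2))  # prints [[0, 1], [2, 3]]
```

With these illustrative values, graphs 0 and 1 merge first, then graphs 2 and 3, leaving two clusters of structurally similar hypertexts.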

Keywords: Clustering methods, graph-based patterns, graph similarity, hypertext structures, web structure mining

Downloads: 1468

469 Using Data Mining Techniques for Estimating Minimum, Maximum and Average Daily Temperature Values

Authors: S. Kotsiantis, A. Kostoulas, S. Lykoudis, A. Argiriou, K. Menagias

Abstract:

Estimates of temperature values at a specific time of day, derived from daytime and daily profiles, are needed for a number of environmental, ecological, agricultural and technical applications, ranging from natural hazard assessment and crop growth forecasting to the design of solar energy systems. The scope of this research is to investigate the efficiency of data mining techniques in estimating minimum, maximum and mean temperature values. To this end, a number of experiments were conducted with well-known regression algorithms using temperature data from the city of Patras in Greece. The performance of these algorithms was evaluated using standard statistical indicators, such as the correlation coefficient and the root mean squared error.
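
The two indicators named in the abstract above can be computed as in the following sketch. The temperature values are invented for illustration and are not from the Patras data set. The correlation coefficient is assumed to be Pearson's r.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical daily maximum temperatures in degrees Celsius.
observed = [12.0, 14.5, 17.0, 15.5]
predicted = [11.0, 15.0, 16.0, 16.5]

print(f"RMSE: {rmse(observed, predicted):.3f}")
print(f"Correlation coefficient: {pearson_r(observed, predicted):.3f}")
```

A lower RMSE and a correlation coefficient closer to 1 both indicate that a regression model tracks the observed temperatures more closely.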

Keywords: regression algorithms, supervised machine learning.

Downloads: 3368

468 Mining Genes Relations in Microarray Data Combined with Ontology in Colon Cancer Automated Diagnosis System

Authors: A. Gruzdz, A. Ihnatowicz, J. Siddiqi, B. Akhgar

Abstract:

The MATCH project [1] entails the development of an automatic diagnosis system that aims to support the treatment of colon cancer by discovering mutations that occur in tumour suppressor genes (TSGs) and contribute to the development of cancerous tumours. The system is built on a) colon cancer clinical data and b) biological information that will be derived by data mining techniques from genomic and proteomic sources. The core mining module will consist of popular, well-tested hybrid feature extraction methods and new combined algorithms designed especially for the project. Elements of rough sets, evolutionary computing, cluster analysis, self-organizing maps and association rules will be used to discover the annotations between genes and their influence on tumours [2]-[11]. The methods used to process the data have to address its high complexity, potential inconsistency and missing values. They must integrate all the useful information necessary to answer the expert's question. For this purpose, the system has to learn from data, or allow a domain specialist to specify interactively, the part of the knowledge structure it needs to answer a given query. The program should also take into account the importance or rank of the particular parts of the data it analyses and adjust the algorithms used accordingly.

Keywords: Bioinformatics, gene expression, ontology, self-organizing maps.

Downloads: 1940

467 Synthesis and Properties of Chitosan-Graft Polyacrylamide/Gelatin Superabsorbent Composites for Wastewater Purification

Authors: H. Ferfera-Harrar, N. Aiouaz, N. Dairi

Abstract:

Superabsorbent polymers have received much attention and are used in many fields because their properties are superior to those of traditional absorbents such as sponge and cotton. Preparing superabsorbents that swell both highly and quickly is therefore very important but challenging. A reliable, efficient and low-cost technique for removing heavy metal ions from wastewater is adsorption using bio-adsorbents obtained from biological materials, such as polysaccharide-based superabsorbent hydrogels. In this study, novel multi-functional superabsorbent composites of the semi-interpenetrating polymer network (Semi-IPN) type were prepared via graft polymerization of acrylamide onto a chitosan backbone in the presence of gelatin (CTS-g-PAAm/Ge), using potassium persulfate and N,N’-methylene bisacrylamide as initiator and crosslinker, respectively. These hydrogels were also partially hydrolyzed to obtain superabsorbents with ampholytic properties and the highest swelling capacity. The formation of the grafted network was evidenced by Fourier Transform Infrared Spectroscopy (ATR-FTIR) and Thermogravimetric Analysis (TGA). The porous structures were observed by Scanning Electron Microscopy (SEM). TGA showed that incorporating Ge into the CTS-g-PAAm network marginally affected its thermal stability. The effect of gelatin content on the swelling capacity of these superabsorbent composites was examined in various media (distilled water, saline and pH solutions). Water absorbency was enhanced by adding Ge to the network, with the optimum value reached at 2 wt. % of Ge. Hydrolysis not only greatly increased their absorption capacity but also improved the swelling kinetics. These materials also showed reswelling ability. We believe that these superabsorbent materials would be very effective for the adsorption of harmful metal ions from wastewater.

Keywords: Chitosan, gelatin, superabsorbent, water absorbency.

Downloads: 2841

466 Generating Concept Trees from Dynamic Self-organizing Map

Authors: Norashikin Ahmad, Damminda Alahakoon

Abstract:

The self-organizing map (SOM) provides both clustering and visualization capabilities in data mining. Dynamic self-organizing maps such as the Growing Self-organizing Map (GSOM) have been developed to overcome the problem of the fixed structure in SOM and to enable better representation of the discovered patterns. However, when mining large datasets or historical data, the hierarchical structure of the data is also useful for viewing cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of trees from different spread factor values of the GSOM is also investigated and the quality of the trees is analyzed. The results show that concept trees can be generated from the GSOM, thus eliminating the need to re-cluster the data from scratch to obtain a hierarchical view of the data under study.

Keywords: dynamic self-organizing map, concept formation, clustering.

Downloads: 1419

465 Decision Support System Based on Data Warehouse

Authors: Yang Bao, LuJing Zhang

Abstract:

A typical Intelligent Decision Support System is built on four components: Data Warehouse, Online Analytical Processing, Data Mining and model-based decision support. Such a system is called a Decision Support System Based on Data Warehouse (DSSBDW). This approach takes ETL, OLAP and DM as its means of implementation and integrates the traditional model-driven DSS and the data-driven DSS into a whole. This paper analyzes the DSSBDW architecture and the DW model, and discusses the following key issues: ETL design and realization; metadata management using XML; SQL implementation, performance optimization and data mapping in OLAP. Lastly, it illustrates the design principles and methods of the DW in DSSBDW.

Keywords: Decision Support System, Data Warehouse, Data Mining.

Downloads: 3819

464 Hydrogeological Risk and Mining Tunnels: the Fontane-Rodoretto Mine Turin (Italy)

Authors: Paola Gattinoni, Laura Scesi, Elena Cerino Adbin, Daniele Cremonesi

Abstract:

The interaction of tunneling or mining with groundwater has become a very relevant problem, not only because of the need to guarantee the safety of workers and to ensure the efficiency of tunnel drainage systems, but also to safeguard water resources from impoverishment and pollution risk. It is therefore very important to forecast the drainage processes (i.e., to evaluate the drained discharge and the drawdown caused by the excavation). The aim of this study was to better understand the system and to quantify the flow drained from the Fontane mines, located in Val Germanasca (Turin, Italy). This made it possible to understand local hydrogeological changes over time. The work was structured as follows: reconstruction of the conceptual model through geological, hydrogeological and geological-structural study; calculation of the tunnel inflows (through the use of structural methods) and comparison with the measured flow rates; and the water balance at the basin scale. In this way it was possible to understand the relationships between rainfall, groundwater level variations and the effect of the tunnels as a means of draining water. Subsequently, the effects produced by the excavation of the mining tunnels were quantified through numerical modeling. In particular, the modeling made it possible to observe how drawdown varies as a function of the number and excavation depth of the tunnels and of the different mine linings.

Keywords: Groundwater, Italy, numerical model, tunneling.

Downloads: 1874

463 Utilization of Process Mapping Tool to Enhance Production Drilling in Underground Metal Mining Operations

Authors: Sidharth Talan, Sanjay Kumar Sharma, Eoin Joseph Wallace, Nikita Agrawal

Abstract:

Underground mining is at the core of the rapidly evolving metals and minerals sector owing to increasing mineral consumption globally. Even though surface mines are still more abundant, the scales of the industry are slowly tipping towards underground mining because of the increasing depth and complexity of orebodies. The efficient and productive functioning of underground operations therefore depends significantly on the synchronized performance of key elements such as the operating site, mining equipment, manpower and mine services. Production drilling is the process of drilling long holes that are then charged and blasted to produce ore in underground metal mines. Production drilling is thus a crucial segment of the underground metal mining value chain. This paper presents a process mapping tool to evaluate the production drilling process in an underground metal mining operation by dividing the process into three segments, namely Input, Process and Output. The three segments are further divided into factors and sub-factors. According to the study, the major input factors crucial for the efficient functioning of the production drilling process are power, drilling water, geotechnical support of the drilling site, skilled drilling operators, a services installation crew, oils and drill accessories for the drilling machine, survey markings at the drill site, proper housekeeping, regular maintenance of the drill machine, suitable transportation for reaching the drilling site and, finally, proper ventilation. The major outputs of the production drilling process are ore, waste as a result of dilution, timely reporting and investigation of unsafe practices, optimized process time and, finally, well-fragmented blasted material within the specifications set by the mining company.
The paper also presents the drilling loss matrix, which is used to appraise the loss in planned production meters per day in a mine on account of availability loss due to machine breakdowns, underutilization of the machine, and productivity loss, measured in drilling meters per percussion hour relative to the machine's planned productivity for the day. These three losses are essential for detecting the bottlenecks in the process map of the production drilling operation, so that an action plan can be initiated to suppress or prevent the causes of the operational performance deficiency. The tool helps mine management focus on the critical factors negatively impacting the production drilling operation and design the necessary operational and maintenance strategies to mitigate them.
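
The abstract above names three losses but does not give their formulas. The sketch below shows one plausible way to split the gap between planned and actual drilled meters into availability, utilization and productivity losses. The decomposition, the function name and all input figures are assumptions for illustration and should not be read as the paper's drilling loss matrix.

```python
def drilling_losses(planned_hours, planned_rate, breakdown_hours,
                    idle_hours, actual_rate):
    """Split one day's shortfall in drilled meters into three losses.

    Rates are in drilled meters per percussion hour. Assumed model:
    hours lost to breakdowns or idling are valued at the planned rate,
    and the remaining percussion hours are valued at the rate shortfall.
    """
    percussion_hours = planned_hours - breakdown_hours - idle_hours
    planned_meters = planned_hours * planned_rate
    actual_meters = percussion_hours * actual_rate
    return {
        "planned_meters": planned_meters,
        "actual_meters": actual_meters,
        "availability_loss": breakdown_hours * planned_rate,
        "utilization_loss": idle_hours * planned_rate,
        "productivity_loss": percussion_hours * (planned_rate - actual_rate),
    }


# Hypothetical day: 10 planned hours at 20 m per percussion hour,
# 2 h of breakdowns, 1 h idle, and 18 m per percussion hour achieved.
result = drilling_losses(10, 20, 2, 1, 18)
for name, meters in result.items():
    print(f"{name}: {meters} m")
```

Under this assumed model the three losses add up exactly to the difference between planned and actual meters, which makes it easy to see which loss dominates on a given day.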

Keywords: Process map, drilling loss matrix, availability, utilization, productivity, percussion rate.

Downloads: 1035

462 Evaluation of Cigarette Filters Rods as a Biofilm Carrier in Integrated Fixed Film Activated Sludge Process

Authors: A. Sabzali, M. Nikaeen, B. Bina

Abstract:

The purpose of the experiments described in this article was to compare the integrated fixed film activated sludge (IFAS) system with the activated sludge (AS) system. The IFAS system used cigarette filter rods (waste filters from tobacco factories) as the biofilm carrier. The comparison with activated sludge was performed using two parallel treatment lines. Removal of organic substances, ammonia and TP was investigated over a four-month period. Synthetic wastewater was prepared with ordinary tap water and glucose as the main source of carbon and energy, plus balanced macro- and micronutrients. COD removal percentages of 94.55% and 81.62% were achieved for the IFAS and activated sludge systems, respectively. Ammonia concentration also decreased significantly with increasing HRT in both systems. Average ammonia removals of 97.40% and 96.34% were achieved for the IFAS and activated sludge systems, respectively. The removal efficiency of total phosphorus (TP-P) was 60.64% for IFAS, higher than the 56.63% achieved by the AS process.

Keywords: Wastewater, biofilm carrier, cigarette filters rods, Activated Sludge, IFAS, nitrification.

Downloads: 2013

461 Discovering Complex Regularities: from Tree to Semi-Lattice Classifications

Authors: A. Faro, D. Giordano, F. Maiorana

Abstract:

Data mining uses a variety of techniques, each of which is useful for some particular task. It is important to have a deep understanding of each technique and to be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network in order to perform unsupervised clustering and to support the entire data mining process up to the visualization of results. A graphical representation helps the user find a strategy to optimize classification by adding, moving or deleting a neuron in order to change the number of classes. The tool can automatically suggest a strategy to optimize the number of classes, and it also supports both tree classifications and semi-lattice organizations of the classes, giving users the possibility of passing from one class to the ones with which it has some aspects in common. Examples of tree and semi-lattice classifications are given to illustrate advantages and problems. The tool is applied to classify macroeconomic data reporting the imports and exports of the most developed countries. It is possible to classify the countries based on their economic behaviour and to use the tool to characterize the commercial behaviour of a country in a selected class by analysing the positive and negative features that contribute to class formation. Possible interrelationships between the classes and their meaning are also discussed.

Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, Cluster interpretation.

Downloads: 1509

460 Influenza Pattern Analysis System through Mining Weblogs

Authors: Pei Lin Khoo, Yunli Lee

Abstract:

Weblogs are a social resource for discovering and tracking the various types of information written by bloggers. In this paper, we propose a weblog mining technique for identifying influenza trends from blogs in which bloggers have shared their opinions on the disease. To identify the trends, a web crawler performs a search and generates a list of visited links based on a set of influenza keywords. This information is used to implement an analytics report system for monitoring and analyzing the patterns and trends of influenza (H1N1). Statistical and graphical analysis reports are generated. Both types of report give satisfactory results that reflect Malaysians' awareness of the influenza outbreak as expressed through blogs.

Keywords: H1N1, Weblogs, Web Crawler, Analytics Report System.

Downloads: 2426

459 Analysis of Road Repairs in Undermined Areas

Authors: Tomáš Seidler, Marek Mihola, Denisa Cihlarova

Abstract:

The article presents the results of an analysis of maps of expected subsidence in undermined areas for road repair management. The analysis was carried out in the Karvina district of the Czech Republic and covered undermined areas with ongoing or finished deep mining activities in the years 2003 - 2009. The article discusses the options available to local road maintenance authorities for determining, with limited data, the areas that will need the most repairs in the future. A new map of surface curvature was calculated from the expected subsidence maps. Combining it with road maps and historical repair data yielded five main categories of undermined areas, providing a very simple tool for management.

Keywords: GIS, Map of Subsidence, Road, Undermined Area

Downloads: 1270

458 A Novel and Green Approach to Produce Nano- Porous Materials Zeolite A and MCM-41 from Coal Fly Ash and their Applications in Environmental Protection

Authors: K. S. Hui, K. N. Hui, Seong Kon Lee

Abstract:

Zeolite A and MCM-41 have extensive applications in basic science, petrochemical science, energy conservation/storage, medicine, chemical sensors, air purification, environmentally benign composite structures and waste remediation. However, the use of zeolite A and MCM-41 in these areas, especially environmental remediation, is restricted by prohibitive production costs. Efficient recycling of and resource recovery from coal fly ash has been a major topic of current international research interest, aimed at achieving sustainable development of human society from the viewpoints of energy, economy, and environmental strategy. This project reported an original, green and fast method to produce nano-porous zeolite A and MCM-41 materials from coal fly ash. For zeolite A, this novel production method halves the total production time while maintaining a high degree of crystallinity in zeolite A, which exhibits a narrower particle size distribution. For MCM-41, this green approach, being an environmentally friendly process that reduces the generation of toxic waste, can produce pure and long-range ordered MCM-41 materials from coal fly ash. The approach took 24 h at 25 °C to produce 9 g of MCM-41 materials from 30 g of coal fly ash, which is the shortest time and lowest reaction temperature required to produce pure and ordered MCM-41 materials (having the largest internal surface area) compared with the values reported in the literature. The performance of the produced zeolite A and MCM-41 materials in wastewater treatment and air pollution control was evaluated. The residual fly ash was also converted to zeolite Na-P1, which showed good performance in the removal of multiple metal ions from wastewater. In wastewater treatment, compared with commercial-grade zeolite A, adsorbents produced from coal fly ash were effective in removing multiple heavy metal ions from water and could be an alternative material for the treatment of wastewater.
In methane emission abatement, the zeolite A produced from coal fly ash achieved a methane removal efficiency similar to that of zeolite A prepared from pure chemicals. This report provides guidance for the cost-effective production of zeolite A and MCM-41 from coal fly ash, which opens up potential applications of these materials in the environmental industry. Finally, the environmental and economic aspects of producing zeolite A and MCM-41 from coal fly ash were discussed.

Keywords: Metal ions, waste water, methane, volatile organic compounds

Downloads: 2210

457 Integrated Water Management for Lafarge Cement-Jordan

Authors: Azzam Hamaideh, Abbas Al-Omari, Michael Sturm

Abstract:

This study aims to apply integrated water resources management principles at the Al-Fuhais plant of Lafarge Cement Jordan. This was accomplished by conducting water audits at all water-consuming units in the plant. Based on the findings of the water audit, an action plan to improve water use efficiency in the plant was proposed. Its main elements are installing water saving devices, re-using the treated wastewater, water harvesting, raising employee awareness, and linking the plant to the water demand management unit at the Ministry of Water and Irrigation.

The analysis showed that by implementing the proposed action plan, it is expected that the industrial water demand can be satisfied from non-conventional resources including treated wastewater and harvested water. As a consequence, fresh water can be used to increase the supply to Al-Fuhais city which is expected to reflect positively on the relationship between the factory and the city.

Keywords: Integrated water resources management, non-conventional water resources, water awareness, water demand management, water harvesting, water saving devices.

Downloads: 2578

456 Impact of Coal Mining on River Sediment Quality in the Sydney Basin, Australia

Authors: A. Ali, V. Strezov, P. Davies, I. Wright, T. Kan

Abstract:

The environmental impacts of mining activities affect air, water, and soil quality. These impacts may result in unexpected and adverse environmental outcomes. This study reports on the impact of coal production on sediment in the Sydney region of Australia. Sediment samples were taken upstream and downstream of the discharge points of three mines, and 80 parameters were tested. The results were assessed for sediment quality based on the presence of metals. The study revealed an increase in metal content in the sediment downstream of the reference locations. In many cases, the sediment exceeded the Australian and New Zealand Environment and Conservation Council and international sediment quality guidelines value (SQGV). The main exceedances of the guidelines were for nickel (Ni) and zinc (Zn).

Keywords: Coal mine, environmental impact, produced water, sediment quality guidelines value.

Downloads: 1369

455 Flocculation on the Treatment of Olive Oil Mill Wastewater: Pretreatment

Authors: G. Hodaifa, J. A. Páez, C. Agabo, E. Ramos, J. C. Gutiérrez, A. Rosal

Abstract:

Currently, the continuous two-phase decanter process is the most widespread olive oil production process internationally. The wastewaters generated by this industry (OMW) are a real environmental problem because of their high organic load. Among the treatments proposed for these wastewaters, advanced oxidation technologies (Fenton, ozone, photo-Fenton, etc.) are the most favourable. However, the direct application of these processes is somewhat expensive. A preliminary stage based on a flocculation-sedimentation operation is therefore of high importance. In this research, five commercial flocculants (three cationic and two anionic) were used to achieve phase separation (clarified liquid and sludge). For each flocculant, different concentrations (0-1000 mg/L) were studied. In these experiments, the sludge volume formed and the final water quality were determined. The final removal percentages of total phenols (11.3-25.1%), COD (5.6-20.4%), total carbon (2.3-26.5%), total organic carbon (1.50-23.8%), total nitrogen (1.45-24.8%), and turbidity (27.9-61.4%) were determined. The variation in the percentage reduction of electrical conductivity (1-8%) was also determined. Finally, the flocculants with the highest removal percentages were identified (QG2001 and Flocudex CS49).

Keywords: Flocculants, flocculation, olive oil mill wastewater, water quality.

Downloads: 2509

454 Using Data Mining Techniques for Finding Cardiac Outlier Patients

Authors: Farhan Ismaeel Dakheel, Raoof Smko, K. Negrat, Abdelsalam Almarimi

Abstract:

In this paper we use data mining techniques to identify outlier patients who use large amounts of drugs over a long period of time. Any healthcare or health insurance system has to deal with the quantities of drugs used by patients with chronic diseases. In the Kingdom of Bahrain, about 20% of the health budget is spent on medications. Managers of healthcare systems do not have enough information about how patients with chronic diseases use drugs, whether there is any misuse, or whether there are outlier patients. In this work, carried out in cooperation with the information department of the Bahrain Defence Force hospital, we selected the data for cardiac patients in the period from 1/1/2008 to 31/12/2008 as the data for the model. We used three techniques to characterize drug utilization by cardiac patients. First we applied a clustering technique, then we measured clustering validity, and finally we applied a decision tree as the classification algorithm. The clustering partitions the 1603 patients, who received 15,806 prescriptions during this period, into three groups according to drug utilization; 23 patients (2.59%), who received 1316 prescriptions (8.32%), are classified as outliers. The classification algorithm shows that average drug utilization and the age and gender of the patient can be considered the main predictive factors in the induced model.

Keywords: Data Mining, Clustering, Classification, Drug Utilization.

Downloads: 1861

453 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Owing to the development of information technology and wireless Internet technology, a wide variety of data is being generated in many fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it easy to test various sensors, and sensor data can be collected directly using database tools such as MySQL. These directly collected data can be used for various kinds of research and can be useful as data for data mining. However, using a board to collect data involves many difficulties, especially when the user is not a computer programmer or is using the board for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads 1967

452 A Novel Approach to Optimal Cutting Tool Replacement

Authors: Cem Karacal, Sohyung Cho, William Yu

Abstract:

In metal cutting industries, mathematical/statistical models are typically used to predict tool replacement time. These off-line methods usually result in a less-than-optimal replacement time, thereby either wasting resources or causing quality problems. The few online real-time methods proposed use indirect measurement techniques and are prone to similar errors. Our idea is based on identifying the optimal replacement time using an electronic nose to detect the airborne compounds released when tool wear reaches a chemical substrate doped into the tool material during fabrication. The study investigates the feasibility of the idea and possible doping materials and methods, along with data stream mining techniques for detecting and monitoring different phases of tool wear.

Keywords: Tool condition monitoring, cutting tool replacement, data stream mining, e-Nose.

Downloads 1851

451 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damage to health and the environment, economic losses and material damage. Traditional studies of road traffic accidents in urban zones require a very high investment of time and money; additionally, their results are not current. Nowadays, however, crowdsourced GPS-based traffic and navigation apps have emerged in many countries as an important low-cost source of information for studies of road traffic accidents and the urban congestion they cause. In this article, we identified the zones, roads and specific times in Mexico City (CDMX) in which the largest number of road traffic accidents were concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Knowledge Discovery in Databases (KDD) for the discovery of patterns in the accident reports, using data mining techniques with the help of Weka. The selected algorithms were Expectation-Maximization (EM), to obtain the ideal number of clusters for the data, and k-means as the grouping method. Finally, the results were visualized with the Geographic Information System QGIS.
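
As an illustration of the grouping step only, the following is a minimal sketch of k-means on latitude/longitude points. It is not the authors' Weka workflow: the coordinates, the choice of k = 2, and the function name `kmeans` are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Hypothetical accident reports as (latitude, longitude) pairs.
reports = [
    (19.43, -99.13), (19.44, -99.14), (19.42, -99.12),
    (19.30, -99.20), (19.31, -99.21), (19.29, -99.19),
]
centroids, labels = kmeans(reports, k=2)
```

In a workflow like the one described, the number of clusters would come from the EM step rather than being fixed by hand as it is here.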

Keywords: Data mining, K-means, road traffic accidents, Waze, Weka.

Downloads 1150

450 Design of Personal Job Recommendation Framework on Smartphone Platform

Authors: Chayaporn Kaensar

Abstract:

Recently, job recommender systems have gained much attention in industry since they solve the problem of information overload on recruiting websites. Therefore, we proposed an Extended Personalized Job System that is capable of providing appropriate jobs for job seekers and recommending suitable information to them using data mining techniques and a dynamic user profile. On the other hand, companies can also interact with the system to publish and update job information. The system supports various platforms, such as a web application and an Android mobile application. In this paper, user profiles, implicit user actions, user feedback, and clustering techniques from the WEKA libraries were applied and implemented. In addition, open source tools such as the Yii Web Application Framework, the Bootstrap Front End Framework and Android Mobile Technology were also applied.

Keywords: Recommendation, user profile, data mining, web technology, mobile technology.

Downloads 2112

449 Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences

Authors: Chien-Hua Wang, Chin-Tzong Pang

Abstract:

In data mining, association rules are used to search for relations among the items in a transaction database. Once the data are collected and stored, valuable rules can be found through association rule mining, helping managers carry out marketing strategies and plan the market framework. In this paper, we apply fuzzy partition methods to determine the membership function of the quantitative values of each transaction item. In addition, managers can express the importance of items as linguistic terms, which are transformed into fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth (FWFP-Growth) is used to complete the data mining process. The method is expected to improve on the Apriori algorithm in terms of the efficiency of the overall association rule mining. An example is given to clearly illustrate the proposed approach.
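
To make the idea of fuzzy partitions and weighted supports concrete, here is a minimal sketch. It does not implement FWFP-Growth. The triangular membership functions, the three regions in `REGIONS`, the sample transactions, and the single crisp weight standing in for a linguistic importance term are all assumptions made for illustration.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy partition of purchased quantities into three regions.
REGIONS = {"low": (0, 3, 6), "middle": (3, 6, 9), "high": (6, 9, 12)}


def fuzzy_support(transactions, item, region):
    """Average membership of `item` in `region` over all transactions."""
    a, b, c = REGIONS[region]
    total = sum(triangular(t.get(item, 0), a, b, c) for t in transactions)
    return total / len(transactions)


# Hypothetical transactions: item -> purchased quantity.
transactions = [
    {"milk": 3, "bread": 8},
    {"milk": 4},
    {"milk": 6, "bread": 9},
    {"bread": 7},
]

milk_low = fuzzy_support(transactions, "milk", "low")

# Assumed crisp weight for a term such as "important"; the paper instead
# transforms linguistic terms into fuzzy sets of weights.
weighted_milk_low = 0.8 * milk_low
```

A fuzzy item such as ("milk", "low") would then be kept as frequent only if its weighted support reaches a minimum support threshold, which the paper also expresses linguistically.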

Keywords: Association Rule, Fuzzy Partition Methods, FWFP-Growth, Apriori algorithm

Downloads 1607

448 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining

Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrêa, Ronaldo G. Morato

Abstract:

Environmental changes and major natural disasters are more prevalent in the world due to the damage that humanity has caused to nature, and this damage directly affects the lives of animals. Thus, the study of animal behavior and animals' interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine patterns of animal behavior, and technological advances have expanded the ability to track animals and, consequently, behavioral studies. There is a lot of research on animal movement and behavior, but a proposal is still missing that combines resources, allows for exploratory analysis of animal movement, and provides statistical measures of individual animal behavior and its interaction with the environment. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provide a first step in answering questions about individual animal behavior and animals' interactions with other animals over time and space. We evaluated the framework using data from monitored jaguars in the city of Miranda Pantanal, Brazil, in order to verify whether AniMoveMineR makes it possible to identify the interaction level between these jaguars. The results were positive and provided indications about the individual behavior of jaguars and about which jaguars have the highest or lowest correlation.

Keywords: Data mining, data science, trajectory, animal behavior.

Downloads 835

447 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset

Authors: N. Poolsawad, C. Kambhampati, J. G. F. Cleland

Abstract:

In this paper, we investigated the characteristics of a clinical dataset using feature selection and classification measurements that deal with the missing values problem, and we proposed appropriate techniques to achieve the aim of this research, which is to find features that have a high effect on mortality and the mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we proposed a data mining process to cope with that complexity (missing values, high dimensionality, and the prediction problem) by using methods of missing value replacement, feature selection, and classification. The experimental results will be extended to develop a prediction model for cardiology.
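
The abstract does not say which missing value replacement method was used. As one common possibility, the sketch below shows column-wise mean imputation; the function name `impute_mean` and the sample records (age, systolic blood pressure, creatinine) are hypothetical.

```python
from statistics import mean


def impute_mean(rows):
    """Replace None in each column with the mean of that column's observed values."""
    columns = list(zip(*rows))
    # Raises statistics.StatisticsError if a column has no observed value.
    fills = [mean(v for v in col if v is not None) for col in columns]
    return [
        [fills[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical patient records: [age, systolic blood pressure, creatinine].
records = [
    [63, 140.0, None],
    [71, None, 1.2],
    [58, 120.0, 0.8],
]
completed = impute_mean(records)
```

Mean imputation is simple but shrinks the variance of the imputed column, which is one reason a study of this kind would compare several replacement methods before feature selection.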

Keywords: feature selection, missing values, classification, clinical dataset, heart failure.

Downloads 3173

446 Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa

Authors: Joshua N. Edokpayi, John O. Odiyo, Patience P. Shikwambana

Abstract:

Water is a very scarce natural resource in South Africa. The Ga-Selati River is used for both domestic and industrial purposes. This study was carried out to assess the water quality of the Ga-Selati River in a mining area of Limpopo Province (Phalaborwa). The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter, while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river from eight different sampling sites was 8.00 and 9.38 in the wet and dry seasons, respectively. Higher EC values were determined in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in the dry season (929.29 mg/L) than in the wet season (640.72 mg/L). These values exceeded the recommended guideline of the South Africa Department of Water Affairs and Forestry (DWAF) for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mg/L), respectively. Turbidity varied between 1.78-5.20 NTU and 0.95-2.37 NTU in the wet and dry seasons, respectively. Total hardness of 312.50 mg/L and 297.75 mg/L as the concentration of CaCO₃ was computed for the river in the wet and the dry seasons, respectively, and the river water was categorised as very hard. Mean concentrations of the metals studied in the wet and the dry seasons, respectively, were: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L). Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.

Keywords: Contamination, mining activities, surface water, trace metals.

Downloads 1925

445 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode, we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm, we analyze the problem of working set selection for large data sets and the influence of the working set sizes on the scalability of the parallel decomposition scheme. Our modifications and settings lead to improved support vector learning performance and thus allow extensive parameter search methods to be used to optimize classification accuracy.

Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data

Downloads 1543

444 Optimization of the Co-Precipitation of Industrial Waste Metals in a Continuous Reactor System

Authors: Thomas S. Abia II, Citlali Garcia-Saucedo

Abstract:

A continuous copper precipitation treatment (CCPT) system was conceived at the Intel Chandler Site to serve as a first-of-kind (FOK) facility-scale waste copper (Cu), nickel (Ni), and manganese (Mn) co-precipitation facility. The process was designed to treat highly variable wastewater discharged from a substrate packaging research factory. The paper discusses metals co-precipitation induced by internal changes for manufacturing facilities that lack the capacity for hardware expansion due to real estate restrictions, aggressive schedules, or budgetary constraints. Herein, operating parameters such as pH and oxidation reduction potential (ORP) were examined to analyze the ability of the CCPT System to immobilize various waste metals. Additionally, influential factors such as influent concentrations and retention times were investigated to quantify the environmental variability against system performance. A total of 2,027 samples were analyzed and statistically evaluated to measure the performance of the CCPT, which was internally retrofitted for Mn abatement to meet environmental regulations. In order to enhance the consistency of the influent, a separate holding tank was cannibalized from another system to collect and slow-feed the segregated Mn wastewater from the factory into the CCPT. As a result, the baseline influent Mn decreased from 17.2 ± 18.7 mg·L⁻¹ at pre-pilot to 5.15 ± 8.11 mg·L⁻¹ at post-pilot (70.1% reduction). Likewise, the pre-trial and post-trial average influent Cu values to the CCPT were 52.0 ± 54.6 mg·L⁻¹ and 33.9 ± 12.7 mg·L⁻¹, respectively (34.8% reduction). However, the raw Ni content of 0.97 ± 0.39 mg·L⁻¹ at pre-pilot increased to 1.06 ± 0.17 mg·L⁻¹ at post-pilot. The average Mn output declined from 10.9 ± 11.7 mg·L⁻¹ at pre-pilot to 0.44 ± 1.33 mg·L⁻¹ at post-pilot (96.0% reduction) as a result of the pH and ORP operating setpoint changes. In similar fashion, the output Cu quality improved from 1.60 ± 5.38 mg·L⁻¹ to 0.55 ± 1.02 mg·L⁻¹ (65.6% reduction), while the Ni output sustained a 50% enhancement during the pilot study (0.22 ± 0.19 mg·L⁻¹ reduced to 0.11 ± 0.06 mg·L⁻¹). pH and ORP were shown to be significantly instrumental to the precipitative versatility of the CCPT System.
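
The percentage reductions quoted in the abstract follow from the pre-pilot and post-pilot means. The short sketch below reproduces that arithmetic; the helper name `percent_reduction` and the dictionary labels are illustrative, and only the mean values are taken from the abstract.

```python
def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, as a percentage."""
    return (before - after) / before * 100.0


# Mean concentrations (mg/L) from the abstract: (pre-pilot, post-pilot).
means = {
    "influent Mn": (17.2, 5.15),
    "influent Cu": (52.0, 33.9),
    "output Mn": (10.9, 0.44),
    "output Cu": (1.60, 0.55),
    "output Ni": (0.22, 0.11),
}

# Round to one decimal place, matching the precision used in the abstract.
reductions = {
    name: round(percent_reduction(before, after), 1)
    for name, (before, after) in means.items()
}
```

The rounded results agree with the figures reported above: 70.1%, 34.8%, 96.0%, 65.6% and 50.0%.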

Keywords: Copper, co-precipitation, industrial wastewater treatment, manganese, optimization, pilot study.

Downloads 948

443 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

A challenging task in educational institutions is to maximize the performance of students and minimize the failure rate of poor-performing students. An effective way to approach this task is to identify student learning patterns and their most influential factors, and to obtain an early prediction of student learning outcomes in time to set up policies for improvement. Educational data mining (EDM) is an emerging discipline combining data mining, statistics, and machine learning, concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The aim of this work is to propose techniques in EDM and integrate them into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high-performing models are developed to achieve higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. In the context of intervention and improving learning outcomes, a feature selection method, MICHI, which combines the mutual information (MI) and chi-square (CHI) algorithms based on ranked feature scores, is introduced to select a dominant feature set that improves prediction performance; the obtained dominant set is also used as information for intervention. Using the proposed EDM techniques, an academic performance prediction system (APPS) is subsequently developed for educational stakeholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals intervene and improve student performance.
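
The abstract does not spell out how MICHI combines the two scores. The sketch below is therefore only one plausible reading: it computes mutual information and the Pearson chi-square statistic for categorical features and orders the features by the sum of their two ranks. The combination rule, the function names, and the sample student records are all assumptions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def chi_square(xs, ys):
    """Pearson chi-square statistic of the contingency table of xs vs ys."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    stat = 0.0
    for x in px:
        for y in py:
            expected = px[x] * py[y] / n
            stat += (pxy[(x, y)] - expected) ** 2 / expected
    return stat


def rank_positions(scores):
    """Map each feature name to its rank (0 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered)}


def combined_ranking(features, labels):
    """Order features by the sum of their MI rank and chi-square rank."""
    mi = {name: mutual_information(col, labels) for name, col in features.items()}
    chi = {name: chi_square(col, labels) for name, col in features.items()}
    mi_rank, chi_rank = rank_positions(mi), rank_positions(chi)
    # Assumed combination rule; the paper's MICHI may combine scores differently.
    return sorted(features, key=lambda name: mi_rank[name] + chi_rank[name])


# Hypothetical student records: each feature is a list of categorical values.
features = {
    "attendance": ["high", "high", "low", "low", "high", "low"],
    "commute": ["far", "near", "far", "near", "near", "far"],
}
labels = ["pass", "pass", "fail", "fail", "pass", "fail"]
ranking = combined_ranking(features, labels)
```

A dominant feature set would then be the top entries of `ranking`, with the cut-off chosen by validating prediction performance.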

Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.

Downloads 918