Search results for: Data mining and Information Extraction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8028

7758 A Novel Approach to Improve Users Search Goal in Web Usage Mining

Authors: R. Lokeshkumar, P. Sengottuvelan

Abstract:

Web mining aims to discover and extract useful information. Different users may have different search goals when they submit queries to a search engine. Inferring and analyzing user search goals can be very useful for improving the results returned for a user's search query. In this project, we propose a novel approach to infer user search goals by analyzing search engine query logs. First, feedback sessions are constructed from user click-through logs, which efficiently reflect the users' information needs. Second, we propose a preprocessing technique to remove unnecessary data from the web log file (feedback session). Third, we propose a technique to generate pseudo-documents that represent the feedback sessions for clustering. Finally, we implement the k-medoids clustering algorithm to discover different user search goals and to provide better results for a search query based on the user's feedback sessions.
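
The clustering step described in this abstract can be illustrated with a minimal sketch. It assumes each pseudo-document is reduced to a set of terms and that Jaccard distance is the dissimilarity measure; both are assumptions made for illustration, as are the function names, and none of this is taken from the paper itself.

```python
from typing import List, Set, Tuple


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Dissimilarity between two term sets (0.0 means identical)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def assign(dist: List[List[float]], medoids: List[int]) -> List[int]:
    """Label every item with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(docs: List[Set[str]], k: int, max_iter: int = 100) -> Tuple[List[int], List[int]]:
    """Alternating k-medoids: assign items, then re-pick each cluster's medoid."""
    n = len(docs)
    dist = [[jaccard_distance(docs[i], docs[j]) for j in range(n)] for i in range(n)]
    medoids = list(range(k))  # deterministic start: the first k documents
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster emptied out
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to the others.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Hypothetical pseudo-documents for the ambiguous query "jaguar".
sessions = [
    {"jaguar", "car", "price"},
    {"jaguar", "animal", "habitat"},
    {"jaguar", "car", "dealer"},
    {"jaguar", "animal", "zoo"},
]
medoids, labels = k_medoids(sessions, k=2)
```

Unlike k-means, the cluster centre here is always an actual session, which is convenient when the items are term sets rather than numeric vectors.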

Keywords: Data Preprocessing, Session Identification, Web log mining, Web Personalization.

Downloads: 1975
7757 Edge-end Pixel Extraction for Edge-based Image Segmentation

Authors: Mahinda P. Pathegama, Özdemir Göl

Abstract:

Extraction of edge-end pixels is an important step in the edge linking process used to achieve edge-based image segmentation. This paper presents an algorithm to extract edge-end pixels together with their directional sensitivities, as an augmentation to the currently available mathematical models. The algorithm is implemented in the Java environment because of its inherent compatibility with web interfaces, since its main use is envisaged to be remote image analysis on a virtual instrumentation platform.
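
As a rough illustration of what edge-end pixel extraction involves, the sketch below uses a common working definition: on a thinned binary edge map, an edge-end pixel is an edge pixel with exactly one 8-connected edge neighbour. The definition, the direction convention, and the function name are assumptions for illustration, not the authors' algorithm, and the sketch is in Python rather than the Java used in the paper.

```python
from typing import List, Tuple


def edge_end_pixels(edges: List[List[int]]) -> List[Tuple[int, int, Tuple[int, int]]]:
    """Return (row, col, direction) for every edge pixel with one edge neighbour.

    `direction` is the (d_row, d_col) step pointing away from the single
    neighbour, i.e. the direction in which the edge would continue.
    """
    rows, cols = len(edges), len(edges[0])
    ends = []
    for r in range(rows):
        for c in range(cols):
            if not edges[r][c]:
                continue
            neighbours = [
                (dr, dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < rows
                and 0 <= c + dc < cols
                and edges[r + dr][c + dc]
            ]
            if len(neighbours) == 1:
                dr, dc = neighbours[0]
                ends.append((r, c, (-dr, -dc)))
    return ends


# A single horizontal edge segment: both of its tips are edge-end pixels.
edge_map = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
ends = edge_end_pixels(edge_map)
```

An edge linker could then search from each end pixel along its direction for a nearby end pixel to join.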

Keywords: edge-end pixels, image processing, image segmentation, pixel extraction

Downloads: 2106
7756 Study on Extraction of Ceric Oxide from Monazite Concentrate

Authors: Lwin Thuzar Shwe, Nwe Nwe Soe, Kay Thi Lwin

Abstract:

Cerium oxide is to be recovered from monazite, which contains about 27.35% CeO2. The principal objective of this study is to extract cerium oxide from monazite from the Moemeik Myitsone area. The treatment of monazite in this study involves three main steps: extraction of cerium hydroxide from monazite, solvent extraction of cerium hydroxide, and precipitation with oxalic acid followed by calcination of cerium oxalate.

Keywords: Calcination, Digestion, Precipitation, Solvent Extraction

Downloads: 2541
7755 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in data-driven analysis in major areas of medicine, such as clinical decision support systems, survival analysis, patient similarity analysis, and image analytics. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature. Such data can be found in the form of discharge summaries, clinical notes, and procedural notes, which are written in human narrative format and have neither a relational model nor a standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique based on a string matching algorithm indexed on curated knowledge sources, which is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concept retrieval, and present the advantages the former displays over the latter.
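
To make the idea of string matching against an indexed knowledge source concrete, here is a small sketch of a greedy longest-match lookup over a phrase dictionary. It is not Q-Map's actual algorithm; the tokenizer, the matching strategy, and the concept identifiers (`C001` and so on) are hypothetical placeholders rather than real UMLS codes.

```python
import re
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(terms: Dict[str, str]) -> Dict[Tuple[str, ...], str]:
    """Index each phrase by its token tuple for constant-time lookup."""
    return {tuple(tokenize(phrase)): concept for phrase, concept in terms.items()}


def match_concepts(text: str, index: Dict[Tuple[str, ...], str]) -> List[Tuple[str, str]]:
    """Scan left to right, always taking the longest phrase found in the index."""
    tokens = tokenize(text)
    max_len = max((len(key) for key in index), default=0)
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in index:
                matches.append((" ".join(key), index[key]))
                i += n
                break
        else:
            i += 1  # no phrase starts at this token
    return matches


# Hypothetical mini-dictionary; a real system would load a curated source.
index = build_index({"heart attack": "C001", "heart": "C002", "aspirin": "C003"})
found = match_concepts("Patient had a heart attack; given aspirin.", index)
```

Preferring the longest match keeps "heart attack" from being reported as the more general "heart".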

Keywords: Information retrieval (IR), unified medical language system (UMLS), Syntax Based Analysis, natural language processing (NLP), medical informatics.

Downloads: 720
7754 Using the Combined Model of PROMETHEE and Fuzzy Analytic Network Process for Determining Question Weights in Scientific Exams through Data Mining Approach

Authors: Hassan Haleh, Amin Ghaffari, Parisa Farahpour

Abstract:

The need for an appropriate system for evaluating students' educational development is a key problem in achieving predefined educational goals. The volume of related papers in recent years that try to prove or disprove the necessity and adequacy of student assessment corroborates this. Some of these studies tried to increase the precision of determining question weights in scientific examinations. However, all of them attempt to adjust the initial question weights, while the accuracy and precision of those initial weights remain in question. Thus, in order to increase the precision of the assessment of students' educational development, the present study proposes a new method for determining the initial question weights by considering question factors such as difficulty, importance, and complexity, and by implementing a combined method of PROMETHEE and the fuzzy analytic network process, using a data mining approach to improve the model's inputs. The results of the implemented case study demonstrate an improvement in the performance and precision of the proposed model.

Keywords: Assessing students, Analytic network process, Clustering, Data mining, Fuzzy sets, Multi-criteria decision making, and Preference function.

Downloads: 1530
7753 BIDENS: Iterative Density Based Biclustering Algorithm With Application to Gene Expression Analysis

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

Biclustering is a very useful data mining technique for identifying patterns in which different genes are correlated over a subset of conditions in gene expression analysis. Association rule mining is an efficient approach to biclustering, as in the BIMODULE algorithm, but it is sensitive to the values given to its input parameters and to the discretization procedure used in the preprocessing step. Moreover, when noise is present, classical association rule miners discover multiple small fragments of the true bicluster but miss the true bicluster itself. This paper formally presents a generalized noise-tolerant bicluster model, termed μBicluster. An iterative algorithm based on the proposed model, termed BIDENS, is introduced that can discover a set of k possibly overlapping biclusters simultaneously. Our model uses a more flexible method to partition the dimensions in order to preserve meaningful and significant biclusters. The proposed algorithm allows the discovery of biclusters that are hard for BIMODULE to discover. An experimental study on yeast and human gene expression data and several artificial datasets shows that our algorithm offers substantial improvements over several previously proposed biclustering algorithms.

Keywords: Machine learning, biclustering, bi-dimensional clustering, gene expression analysis, data mining.

Downloads: 1905
7752 Quantification of GHGs Emissions from Electricity and Diesel Fuel Consumption in Basalt Mining Industry in Thailand

Authors: S. Kittipongvises, A. Dubsok

Abstract:

The mineral and mining industry is necessary for countries to have an adequate and reliable supply of materials to meet their socio-economic development. Despite its importance, the environmental impacts of mineral exploration are hugely significant. This study aimed to investigate and quantify the GHG emissions from both electricity and diesel vehicle fuel consumption in basalt mining in Thailand. Plant A, located in the northeastern region of Thailand, was selected as a case study. Results indicated that total GHG emissions from basalt mining and operation (Plant A) were approximately 2,501,086 kgCO2e and 1,997,412 kgCO2e in 2014 and 2015, respectively. The estimated carbon intensity ranged from 1.824 kgCO2e to 2.284 kgCO2e per ton of rock product. Scope 1 (direct emissions) was the dominant driver of total GHG emissions compared to scope 2 (indirect emissions). As such, transport-related combustion of diesel fuels generated the highest share of GHG emissions (65%), compared to emissions from purchased electricity (35%). Some of the potential implications for mining entities were also presented.

Keywords: Basalt mining, diesel fuel, electricity, GHGs emissions, Thailand.

Downloads: 1005
7751 Comparative Study of Decision Trees and Rough Sets Theory as Knowledge Extraction Tools for Design and Control of Industrial Processes

Authors: Marcin Perzyk, Artur Soroczynski

Abstract:

General requirements for knowledge representation in the form of logic rules, applicable to design and control of industrial processes, are formulated. Characteristic behavior of decision trees (DTs) and rough sets theory (RST) in rules extraction from recorded data is discussed and illustrated with simple examples. The significance of the models' drawbacks was evaluated using simulated and industrial data sets. It is concluded that performance of DTs may be considerably poorer in several important aspects, compared to RST, particularly when not only a characterization of a problem is required, but also detailed and precise rules are needed, according to actual, specific problems to be solved.

Keywords: Knowledge extraction, decision trees, rough sets theory, industrial processes.

Downloads: 1584
7750 Human Digital Twin for Personal Conversation Automation Using Supervised Machine Learning Approaches

Authors: Aya Salama

Abstract:

Digital Twin has emerged as a compelling research area, capturing the attention of scholars over the past decade. It finds applications across diverse fields, including smart manufacturing and healthcare, offering significant time and cost savings. Notably, it often intersects with other cutting-edge technologies such as Data Mining, Artificial Intelligence, and Machine Learning. However, the concept of a Human Digital Twin (HDT) is still in its infancy and requires further demonstration of its practicality. HDT takes the notion of Digital Twin a step further by extending it to living entities, notably humans, who are vastly different from inanimate physical objects. The primary objective of this research was to create an HDT capable of automating real-time human responses by simulating human behavior. To achieve this, the study delved into various areas, including clustering, supervised classification, topic extraction, and sentiment analysis. The paper successfully demonstrated the feasibility of HDT for generating personalized responses in social messaging applications. Notably, the proposed approach achieved an overall accuracy of 63%, a highly promising result that could pave the way for further exploration of the HDT concept. The methodology employed Random Forest for clustering the question database and matching new questions, while K-nearest neighbor was utilized for sentiment analysis.

Keywords: Human Digital twin, sentiment analysis, topic extraction, supervised machine learning, unsupervised machine learning, classification and clustering.

Downloads: 103
7749 Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering

Authors: Thanh Nguyen, Andrei Doncescu, Pierre Siegel

Abstract:

Classification is an important data mining technique and can be used for data filtering in artificial intelligence. Because classification applies broadly to all kinds of data, it is used in nearly every field of modern life. Classification helps us to group different items according to the features judged interesting and useful. In this paper, we compare two classification methods, Naive Bayes and ADTree, used to detect spam e-mail. This choice is motivated by the fact that the Naive Bayes algorithm is based on probability calculus while the ADTree algorithm is based on decision trees. The parameter settings of these classifiers are chosen to maximize the true positive rate and minimize the false positive rate. The experimental results present classification accuracy and cost analysis with a view to choosing the optimal classifier for spam detection. The number of attributes is also examined to obtain a trade-off between the number of attributes and classification accuracy.
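
The two rates used to tune the classifiers have standard definitions, which the following sketch computes from paired true and predicted labels. The labels are made-up illustrative data (1 = spam, 0 = legitimate), not results from the paper, and the function name is an assumption.

```python
from typing import Sequence, Tuple


def tpr_fpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # share of spam that was caught
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # share of legitimate mail flagged
    return tpr, fpr


# Illustrative labels: 3 spam messages and 5 legitimate ones.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
tpr, fpr = tpr_fpr(actual, predicted)
```

In spam filtering a false positive (a legitimate message sent to the spam folder) is usually the costlier error, which is why both rates are tracked rather than accuracy alone.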

Keywords: Classification, data mining, spam filtering, naive Bayes, decision tree.

Downloads: 1443
7748 Synthesis and Use of Thiourea Derivative (1-Phenyl-3-Benzoyl-2-Thiourea) for Extraction of Cadmium Ion

Authors: Abdulfattah M. Alkherraz, Zaineb I. Lusta, Ahmed E. Zubi

Abstract:

Environmental pollution by heavy metals has become more problematic in recent years. To address the problem of cadmium accumulation in human organs, which leads to dangerous effects on human health, and to determine its concentration, the organic ligand 1-phenyl-3-benzoyl-2-thiourea was used to extract cadmium ions from solution. This ligand, a thiourea derivative, was successfully synthesized. The ligand was characterized by NMR and CHN elemental analysis, and was used to extract cadmium from its solutions by formation of a stable complex at neutral pH. The complex was characterized by elemental analysis and melting point. The concentrations of cadmium ions before and after extraction were determined using an Atomic Absorption Spectrophotometer (AAS). The data show that the percentage extracted was more than 98.7% of the cadmium concentration used in the study.

Keywords: Thiourea derivatives, cadmium extraction.

Downloads: 7120
7747 n-Butanol as an Extractant for Lactic Acid Recovery

Authors: Kanungnit Chawong, Panarat Rattanaphanee

Abstract:

Extraction of lactic acid from aqueous solution using n-butanol as an extractant was studied. The effects of mixing time, pH of the aqueous solution, initial lactic acid concentration, and volume ratio between the organic and the aqueous phase were investigated. The distribution coefficient and degree of lactic acid extraction were found to increase when the pH of the aqueous solution was decreased. The pH effect was substantially pronounced when the pH of the aqueous solution was less than 1. Initial lactic acid concentration and organic-to-aqueous volume ratio appeared to have a positive effect on the distribution coefficient and the degree of extraction. Because n-butanol is partially miscible in water, incorporation of aqueous solution into the organic phase was observed in extractions with a large organic-to-aqueous volume ratio.
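
The two quantities reported here are linked by a simple mass balance, sketched below using the usual textbook definitions: D = C_org / C_aq at equilibrium, and degree of extraction E = 100 · D·R / (1 + D·R), where R is the organic-to-aqueous volume ratio. The function names and sample concentrations are illustrative assumptions, and the formula assumes the phase volumes do not change on mixing, which the abstract notes does not strictly hold for n-butanol.

```python
def distribution_coefficient(c_org: float, c_aq: float) -> float:
    """D = equilibrium concentration in the organic phase / in the aqueous phase."""
    return c_org / c_aq


def degree_of_extraction(d: float, volume_ratio: float) -> float:
    """Percentage of solute transferred to the organic phase.

    volume_ratio is V_org / V_aq. Derived from the mass balance
    n_org / (n_org + n_aq) with n = C * V and C_org = D * C_aq.
    """
    dr = d * volume_ratio
    return 100.0 * dr / (1.0 + dr)


# Illustrative equilibrium concentrations in mol/L (not data from the paper).
d = distribution_coefficient(c_org=0.3, c_aq=0.1)
e_equal_volumes = degree_of_extraction(d, volume_ratio=1.0)
e_double_organic = degree_of_extraction(d, volume_ratio=2.0)
```

For a fixed D, the formula shows directly why a larger organic-to-aqueous ratio raises the degree of extraction.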

Keywords: Lactic acid, liquid-liquid extraction, n-Butanol, Solvating extractant.

Downloads: 3110
7746 Optimization of Process Parameters using Response Surface Methodology for the Removal of Zinc(II) by Solvent Extraction

Authors: B. Guezzen, M.A. Didi, B. Medjahed

Abstract:

A factorial design of experiments and a response surface methodology were implemented to investigate the liquid-liquid extraction of zinc(II) from acetate medium using 1-butyl-imidazolium di(2-ethylhexyl) phosphate [BIm+][D2EHP-]. The optimization of extraction parameters such as initial pH (2.5, 4.5, and 6.6), ionic liquid concentration (1, 5.5, and 10 mM), and salt effect (0.01, 5, and 10 mM) was carried out using a three-level full factorial design (3³). The results of the factorial design demonstrate that all these factors are statistically significant, including the square effects of pH and ionic liquid concentration. The order of significance was: IL concentration > salt effect > initial pH. Analysis of variance (ANOVA), showing a high coefficient of determination (R² = 0.91) and low probability values (P < 0.05), supports the validity of the predicted second-order quadratic model for Zn(II) extraction. The optimum conditions for the extraction of zinc(II) at constant temperature (20 °C), initial Zn(II) concentration (1 mM), and an A/O ratio of unity were: initial pH (4.8), extractant concentration (9.9 mM), and NaCl concentration (8.2 mM). Under the optimized conditions, the metal ion could be quantitatively extracted.
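
A three-level full factorial design with three factors enumerates every combination of levels, giving 3³ = 27 runs. The sketch below builds that run list from the levels quoted in the abstract; the factor names and the function name are labels chosen for illustration.

```python
from itertools import product
from typing import Dict, List, Sequence

# Factor levels as quoted in the abstract; the key names are illustrative.
LEVELS: Dict[str, Sequence[float]] = {
    "initial_pH": (2.5, 4.5, 6.6),
    "ionic_liquid_mM": (1, 5.5, 10),
    "salt_mM": (0.01, 5, 10),
}


def full_factorial(levels: Dict[str, Sequence[float]]) -> List[Dict[str, float]]:
    """Return one dict per experimental run, covering every level combination."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*(levels[n] for n in names))]


runs = full_factorial(LEVELS)
```

In practice the run order would normally be randomized before the experiments are carried out.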

Keywords: Ionic liquid, response surface methodology, solvent extraction, zinc acetate.

Downloads: 1103
7745 A Spatial Point Pattern Analysis to Recognize Fail Bit Patterns in Semiconductor Manufacturing

Authors: Youngji Yoo, Seung Hwan Park, Daewoong An, Sung-Shick Kim, Jun-Geol Baek

Abstract:

The yield management system is very important for producing high-quality semiconductor chips in the semiconductor manufacturing process. In order to improve the quality of semiconductors, various tests are conducted in the post-fabrication (FAB) process. During the test process, a large amount of data is collected, and the data include a lot of information about defects. In general, defects on the wafer are the main cause of yield loss. Therefore, analyzing the defect data is necessary to improve the performance of yield prediction. The wafer bin map (WBM) is one of the data types collected in the test process and includes defect information such as the fail bit patterns. The fail bits have the characteristics of spatial point patterns. Therefore, this paper proposes a feature extraction method using spatial point pattern analysis. Actual data obtained from the semiconductor process are used for experiments, and the experimental results show that the proposed method recognizes the fail bit patterns more accurately.

Keywords: Semiconductor, wafer bin map (WBM), feature extraction, spatial point patterns, contour map.

Downloads: 2450
7744 Benefits and Issues of Open-Cut Coal Mining on the Socio-Economic Environment - The Iban Community in Mukah, Sarawak, Malaysia

Authors: Edward Lim

Abstract:

This paper deals principally with the socio-economic impact on the local Iban community in Mukah Division, Sarawak, of the open-cut coal mining industry, which commenced in 2003. To date, no studies have been carried out by either the public or the private sector to analyze how the Iban community is coping with the large influx of cash into their society. The Iban community has traditionally practiced shifting cultivation and the farming of domesticated animals, with a portion of the younger generation working as laborers and professionals. This paper represents the views and observations of the author, supported by some statistical facts extracted from published articles and unpublished reports. The paper deals primarily with the following areas:
• Background of the coal mining industry in Mukah Division, Sarawak;
• Benefits of the coal mining industry for the Iban community;
• Issues and problems arising in the Iban community because of the presence of the coal mining industry; and
• Possible actions that need to be taken to overcome these issues and problems.

Keywords: Coal Mining, Iban Community, Malaysia, Sub-Bituminous Coal.

Downloads: 2396
7743 An Application for Web Mining Systems with Services Oriented Architecture

Authors: Thiago M. R. Dias, Gray F. Moita, Paulo E. M. Almeida

Abstract:

Although the World Wide Web is considered the largest source of information that exists today, its inherently dynamic characteristics can make the task of finding useful and qualified information a very frustrating experience. This study presents research on information mining systems for the Web and proposes an implementation of these systems by means of components that can be built using Web services technology. This implies that they can encompass features offered by a services oriented architecture (SOA), and that specific components may be used by other tools, independent of platform or programming language. Hence, the main objective of this work is to provide an architecture for Web mining systems, divided into stages, where each step is a component that incorporates the characteristics of SOA. The separation of these steps was designed based upon the existing literature. Interesting results were obtained and are shown here.

Keywords: Web Mining, Service Oriented Architecture, Web Services.

Downloads: 1419
7742 Analysis of Road Repairs in Undermined Areas

Authors: Tomáš Seidler, Marek Mihola, Denisa Cihlarova

Abstract:

The article presents the results of an analysis of maps of expected subsidence in undermined areas for road repair management. The analysis was done in the Karvina district of the Czech Republic, including undermined areas with ongoing or finished deep mining activities in the years 2003 - 2009. The article discusses the possibilities for local road maintenance authorities to determine, with limited data available, the areas that will need the most repairs in the future. Using the expected subsidence maps, a new map of surface curvature was calculated. Combined with road maps and historical repair data, this yielded five main categories of undermined areas, providing a very simple tool for management.

Keywords: GIS, Map of Subsidence, Road, Undermined Area

Downloads: 1260
7741 The Use of Classifiers in Image Analysis of Oil Wells Profiling Process and the Automatic Identification of Events

Authors: Jaqueline M. R. Vieira

Abstract:

Different strategies and tools are available in the oil and gas industry for detecting and analyzing tension and possible fractures in borehole walls. Most of these techniques are based on manual observation of the captured borehole images. While this strategy may be feasible and convenient with small images and little data, it may become difficult and susceptible to errors when large databases of images must be processed. Moreover, the patterns may differ across the image area, depending on many characteristics (drilling strategy, rock components, rock strength, etc.). In this work we propose the inclusion of data-mining classification strategies in order to create a knowledge database of the segmented curves. After some time spent manually marking the parts of borehole images that correspond to tension regions and breakout areas, these classifiers allow the system to indicate and suggest new candidate regions automatically, with higher accuracy. We suggest the use of different classifier methods in order to achieve different knowledge dataset configurations.

Keywords: Brazil, classifiers, data-mining, Image Segmentation, oil well visualization.

Downloads: 2501
7740 A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

An extensive amount of work has been done in data clustering research under the unsupervised learning technique in data mining during the past two decades. Moreover, several approaches and methods have emerged focusing on clustering diverse data types, features of cluster models, and similarity rates of clusters. However, no single clustering algorithm performs best at extracting efficient clusters. Consequently, in order to rectify this issue, a new and challenging technique called the Cluster Ensemble method emerged. This new approach tends to be the alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate diverse clustering solutions in such a way as to attain accuracy and to improve on the quality of the individual clustering algorithms. Due to the massive and rapid development of new methods in the world of data mining, it is essential to critically analyze existing techniques and future directions. This paper presents a comparative analysis of different cluster ensemble methods along with their methodologies and salient features. This analysis should be very useful for the community of clustering experts and should also help in deciding on the most appropriate method to resolve the problem at hand.

Keywords: Clustering, Cluster Ensemble Methods, Co-association matrix, Consensus Function, Median Partition.

Downloads: 2066
7739 Response Surface Modeling of Lactic Acid Extraction by Emulsion Liquid Membrane: Box-Behnken Experimental Design

Authors: A. Thakur, P. S. Panesar, M. S. Saini

Abstract:

Extraction of lactic acid by emulsion liquid membrane (ELM) technology, using n-trioctyl amine (TOA) in n-heptane as the carrier within the organic membrane and sodium carbonate as the acceptor phase, was optimized by using response surface methodology (RSM). A three-level Box-Behnken design was employed for experimental design, for analysis of the results, and to depict the combined effect on lactic acid extraction efficiency of five independent variables, viz. lactic acid concentration in the aqueous phase (cl), sodium carbonate concentration in the stripping phase (cs), carrier concentration in the membrane phase (ψ), treat ratio, and batch extraction time (τ), with equal volumes of organic and external aqueous phase. The maximum lactic acid extraction efficiency (ηext) of 98.21% from the aqueous phase in a batch reactor using ELM was found at the optimized values of the test variables cl, cs, ψ, treat ratio, and τ of 0.06 [M], 0.18 [M], 4.72 (%, v/v), 1.98 (v/v), and 13.36 min, respectively.

Keywords: Emulsion liquid membrane, extraction, lactic acid, n-trioctylamine, response surface methodology.

Downloads: 2282
7738 Exploring the Correlation between Population Distribution and Urban Heat Island under Urban Data: Taking Shenzhen Urban Heat Island as an Example

Authors: Wang Yang

Abstract:

Shenzhen is a modern city that grew out of China's reform and opening-up policy, and the development of its urban morphology has been shaped by the administration of the Chinese government. The city's planning paradigm is primarily affected by spatial structure and human behavior. The subjective urban agglomeration center is divided into several groups and centers. In comparison with this effect, the laws of city development have tended to be neglected. With the continuous development of the internet, extensive data technology has been introduced in China. Data mining and data analysis have become important tools in municipal research. Data mining has been utilized to improve data cleaning, for example of business data, traffic data, and population data. Prior to data mining, government data were collected by traditional means and then analyzed using city-relationship research, delaying the timeliness of urban development, especially for the contemporary city. Internet-based data are updated very quickly. The city's points of interest (POIs) obtained through data mining serve as a data source affecting city design, while satellite remote sensing is used as a reference object. City analysis is thus conducted in both directions, the administrative paradigm of government is broken, and urban research is restored. Therefore, the use of data mining in urban analysis is very important. Satellite remote sensing data of Shenzhen from July 2018 were measured by the satellite MODIS sensor and can be used to perform land surface temperature inversion and to analyze the urban heat island distribution of Shenzhen. This article acquired and classified data from Shenzhen using data crawler technology. Data on the Shenzhen heat island and points of interest were simulated and analyzed on a GIS platform to discover the main features of the influence of functional equivalent distribution. Shenzhen is located in the east-west area of China. The city's main streets are also determined according to the direction of city development. Therefore, the functional areas of the city are also distributed in the east-west direction. The urban heat island can be expressed as a heat map according to the functional urban area, and it corresponds to regional POIs. The research result clearly shows that the distribution of the urban heat island and the distribution of urban POIs are in one-to-one correspondence. The urban heat island is primarily influenced by the properties of the underlying surface, avoiding the impact of urban climate. Using urban POIs as the object of analysis, the distribution of municipal POIs and population aggregation are closely connected, so the distribution of the population corresponds with the distribution of the urban heat island.

Keywords: POI, satellite remote sensing, the population distribution, urban heat island thermal map.

Downloads: 848
7737 Carrageenan Properties Extracted From Eucheuma cottonii, Indonesia

Authors: Sperisa Distantina, Wiratni, Moh. Fahrurrozi, Rochmadi

Abstract:

The effect of extraction solvent upon the properties of carrageenan from Eucheuma cottonii was studied. Distilled water and KOH solution (concentration 0.1-0.5 N) were used as solvents. The extraction process was carried out in a water bath equipped with a stirrer at a constant speed of 275 rpm, with a constant ratio of seaweed weight to solvent volume (1:50 g/mL), at 86 °C for 45 minutes. The extract was then precipitated in 3 volumes of 90% ethanol and oven dried at 60 °C. Based on the experimental data, alkali significantly influenced the yield and properties of the extracted carrageenan. The extracted carrageenan was found to have essentially identical FTIR spectra to the reference samples of kappa-carrageenan. Increasing the KOH concentration led to carrageenan with lower sulfate content and lower intrinsic viscosity. The gel strength increased with increasing KOH concentration. The decrease in intrinsic viscosity indicates that polymer degradation occurs during alkali extraction.

Keywords: gel strength, sulfate, intrinsic viscosity, Eucheuma cottonii

Downloads: 5959
7736 PIELG: A Protein Interaction Extraction System using a Link Grammar Parser from Biomedical Abstracts

Authors: Rania A. Abul Seoud, Nahed H. Solouma, Abou-Baker M. Youssef, Yasser M. Kadah

Abstract:

Due to the ever-growing number of publications about protein-protein interactions, information extraction from text is increasingly recognized as one of the crucial technologies in bioinformatics. This paper presents a Protein Interaction Extraction System using a Link Grammar Parser from biomedical abstracts (PIELG). PIELG uses the linkage given by the Link Grammar Parser to start a case-based analysis of the contents of various syntactic roles as well as their linguistically significant and meaningful combinations. The system uses phrasal-prepositional verb patterns to overcome preposition combination problems. The recall and precision are 74.4% and 62.65%, respectively. Experimental evaluations against two other state-of-the-art extraction systems indicate that the PIELG system achieves better performance. For further evaluation, the system is augmented with a graphical package (Cytoscape) for extracting protein interaction information from sequence databases. The result shows that the performance is remarkably promising.
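The abstract reports recall and precision separately. A common way to combine the two is the F1 score (their harmonic mean). The abstract does not report an F1 value, so the sketch below is only illustrative arithmetic on the two reported figures, and the function name `f1_score` is an assumption made here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract: recall 74.4%, precision 62.65%.
reported_recall = 0.744
reported_precision = 0.6265

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")
```

The harmonic mean penalizes a large gap between precision and recall more than an arithmetic mean would, which is why it is often preferred when comparing extraction systems.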

Keywords: Link Grammar Parser, Interaction extraction, protein-protein interaction, Natural language processing.

Downloads 2192
7735 Anomaly Based On Frequent-Outlier for Outbreak Detection in Public Health Surveillance

Authors: Zalizah Awang Long, Abdul Razak Hamdan, Azuraliza Abu Bakar

Abstract:

Public health surveillance systems focus on outbreak detection and the data sources used. Variation or aberration in the frequency distribution of health data, compared to historical data, is often used to detect outbreaks. It is important that new techniques be developed to improve the detection rate, thereby reducing wastage of resources in public health. Thus, the objective is to develop a technique that applies frequent mining and outlier mining to outbreak detection. Fourteen datasets from the UCI were tested with the proposed technique. The effectiveness of each technique was measured with a t-test. The overall performance shows that DTK can be used to detect outliers within frequent datasets. In conclusion, the anomaly-based frequent-outlier technique for outbreak detection can be used to identify outliers within frequent datasets.
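The general idea of comparing current counts with historical data can be sketched with a simple threshold rule. This is a generic baseline, not the frequent-outlier technique proposed in the paper; the function name, the `k` parameter, and the case counts are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def flag_aberrations(historical, current, k=2.0):
    """Return indices of current counts that exceed mean + k * stdev of the historical counts."""
    baseline = mean(historical)
    spread = stdev(historical)  # sample standard deviation
    threshold = baseline + k * spread
    return [i for i, count in enumerate(current) if count > threshold]


# Hypothetical weekly case counts, invented purely for illustration.
historical_counts = [10, 12, 11, 9, 13, 10, 11, 12]
current_counts = [11, 12, 25, 10]

flagged = flag_aberrations(historical_counts, current_counts)
print(flagged)
```

A larger `k` makes the rule more conservative, trading missed aberrations for fewer false alarms.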

Keywords: Outlier detection, frequent-outlier, outbreak, anomaly, surveillance, public health

Downloads 2223
7734 Automatic Extraction of Roads from High Resolution Aerial and Satellite Images with Heavy Noise

Authors: Yan Li, Ronald Briggs

Abstract:

Aerial and satellite images are information rich. They are also complex to analyze. For GIS systems, many features require fast and reliable extraction of roads and intersections. In this paper, we study efficient and reliable automatic extraction algorithms to address some difficult issues that are commonly seen in high resolution aerial and satellite images yet are not well addressed in existing solutions, such as blurring, broken or missing road boundaries, lack of road profiles, heavy shadows, and interfering surrounding objects. The new scheme is based on a new method, namely the reference circle, which properly identifies the pixels that belong to the same road and uses this information to recover the whole road network. This feature is invariant to the shape and direction of roads and tolerates heavy noise and disturbances. Road extraction based on reference circles is much more noise tolerant and flexible than previous edge-detection based algorithms. The scheme is able to extract roads reliably from images with complex contents and heavy obstructions, such as the high resolution aerial/satellite images available from Google Maps.

Keywords: Automatic road extraction, Image processing, Feature extraction, GIS update, Remote sensing, Geo-referencing

Downloads 1649
7733 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

In the past, a great deal of work has been done on data clustering as an unsupervised learning technique in data mining. Several algorithms and methods have been proposed that focus on clustering different data types, the representation of cluster models, and the accuracy rates of the clusters. However, no single clustering algorithm has proved to be the most efficient in providing the best results. To address this issue, a new technique called the cluster ensemble method emerged. The cluster ensemble is a good alternative approach to the cluster analysis problem. Its main aim is to merge different clustering solutions in such a way as to achieve accuracy and to improve on the quality of the individual data clusterings. The substantial and unremitting development of new methods in the sphere of data mining, together with the incessant interest in inventing new algorithms, makes a critical analysis of the existing techniques and of future innovations necessary. This paper presents a comparative study of different cluster ensemble methods along with their features, their systematic working process, and the average accuracy and error rates of each ensemble method. This speculative and comprehensive analysis should be very useful for the community of clustering practitioners and should help in choosing the most suitable method for the problem at hand.
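One widely used way to merge clustering solutions, and one that the keywords below mention, is a co-association matrix. The sketch that follows is a minimal illustration of that idea and not any specific method from the survey: the function names, the 0.5 threshold, and the three base clusterings are hypothetical, and the consensus step (connected components of the thresholded matrix) is just one simple choice of consensus function.

```python
from itertools import combinations


def co_association(partitions):
    """Build an n x n matrix whose (i, j) entry is the fraction of
    partitions that place items i and j in the same cluster."""
    n = len(partitions[0])
    matrix = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                matrix[i][j] += 1.0
                matrix[j][i] += 1.0
    m = len(partitions)
    for i in range(n):
        for j in range(n):
            matrix[i][j] /= m
        matrix[i][i] = 1.0  # an item always co-occurs with itself
    return matrix


def consensus_clusters(matrix, threshold=0.5):
    """Label items by connected components of the graph that links
    items whose co-association exceeds the threshold."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical base clusterings of five items (label values are arbitrary).
base_partitions = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
coassoc = co_association(base_partitions)
consensus = consensus_clusters(coassoc)
print(consensus)
```

Because the matrix only records whether two items share a cluster, it does not matter that the base clusterings use different label values for the same group.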

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Downloads 2549
7732 Kinetic and Removable of Amoxicillin Using Aliquat336 as a Carrier via a HFSLM

Authors: Teerapon Pirom, Ura Pancharoen

Abstract:

Amoxicillin is an antibiotic widely used to treat various infections in both humans and animals. However, amoxicillin released into the environment is a major problem: it promotes bacterial resistance to these drugs and the failure of antibiotic treatment. Liquid membranes are of great interest as a promising method for the separation and recovery of target ions from aqueous solutions because carriers are used in the transport mechanism, resulting in high selectivity and rapid transport of the desired metal ions. The simultaneous extraction and stripping in a single unit operation of a liquid membrane system is very attractive. Therefore, it is practical to apply liquid membranes, particularly the HFSLM, in industrial applications, as the HFSLM has proved to be a separation process with lower capital and operating costs, low energy and extractant requirements, a long lifetime, high selectivity, and high fluxes compared with solid membranes. Its simple design is amenable to scaling up for industrial applications. The extraction and recovery of amoxicillin through a hollow fiber supported liquid membrane (HFSLM) using Aliquat336 as a carrier were explored with experimental data. The important variables affecting the transport of amoxicillin, viz. extractant concentration and operating time, were investigated. The highest amoxicillin extraction of 85.35% and stripping of 80.04% were achieved under the best conditions of 6 mmol/L Aliquat336 and an operating time of 100 min. The extraction reaction order (n) and the extraction reaction rate constant (kf) were found to be 1.00 and 0.0344 min⁻¹, respectively.

Keywords: Aliquat336, amoxicillin, HFSLM, kinetic.

Downloads 1654
7731 Improving University Operations with Data Mining: Predicting Student Performance

Authors: Mladen Dragičević, Mirjana Pejić Bach, Vanja Šimičević

Abstract:

The purpose of this paper is to develop models that enable the prediction of student success. These models could improve the allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. To collect data, an anonymous survey was carried out among final-year undergraduate students using random sampling. Decision trees were created, and the two most successful in predicting student success were chosen based on two criteria: Grade Point Average (GPA) and the time a student needs to finish the undergraduate program (time-to-degree). Decision trees have been shown to be a good method for classifying student success, and they could be improved further by increasing the survey sample and developing specialized decision trees for each type of college. Methods of this type have great potential for use in decision support systems.

Keywords: Data mining, knowledge discovery in databases, prediction models, student success.

Downloads 2449
7730 Optimization of Air Pollution Control Model for Mining

Authors: Zunaira Asif, Zhi Chen

Abstract:

Sustainable air quality management is recognized as one of the most serious environmental concerns in mining regions. Mining operations emit various types of pollutants which have significant impacts on the environment. This study presents a stochastic control strategy by developing an air pollution control model to achieve a cost-effective solution. The optimization method is formulated to predict the cost of treatment using linear programming with an objective function and multiple constraints. The constraints mainly focus on two factors: the production of metal should not exceed the available resources, and air quality should meet the standard criteria for each pollutant. The applicability of this model is explored through a case study of an open pit metal mine in Utah, USA. The method also uses meteorological data as a dispersion transfer function to reflect practical local conditions. The probabilistic analysis and the uncertainties in the meteorological conditions are handled by Monte Carlo simulation. Reasonable results have been obtained for selecting the optimized treatment technology for PM2.5, PM10, NOx, and SO2. Additional comparison analysis shows that the baghouse is the least-cost option compared to the electrostatic precipitator and wet scrubbers for particulate matter, whereas non-selective catalytic reduction and dry flue gas desulfurization are suitable for NOx and SO2 reduction, respectively. Thus, this model can help planners reduce these pollutants at a marginal cost by suggesting pollution control devices, while accounting for dynamic meteorological conditions and mining activities.
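The abstract describes a linear program with a cost objective and two families of constraints but does not give the formulation. A generic form consistent with that description might read as follows. All symbols are assumptions introduced here for illustration and are not taken from the paper: $x_j$ is the activity level under control option $j$, $c_j$ its unit treatment cost, $a_j$ its resource use per unit, $R$ the available resources, $t_{pj}$ a dispersion transfer coefficient giving the contribution of option $j$ to the ambient concentration of pollutant $p$, and $S_p$ the air quality standard for that pollutant.

```latex
\begin{aligned}
\min_{x \ge 0} \quad & \sum_{j} c_j x_j \\
\text{subject to} \quad & \sum_{j} a_j x_j \le R, \\
& \sum_{j} t_{pj} x_j \le S_p \quad \text{for each pollutant } p .
\end{aligned}
```

In a stochastic setting such as the one the abstract mentions, the coefficients $t_{pj}$ would be sampled repeatedly (for example by Monte Carlo simulation of meteorological conditions) and the program solved for each sample.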

Keywords: Air pollution, linear programming, mining, optimization, treatment technologies.

Downloads 1528
7729 Rapid Expansion Supercritical Solution (RESS) Carbon Dioxide as an Environmental Friendly Method for Ginger Rhizome Solid Oil Particles Formation

Authors: N. A. Zainuddin, I. Norhuda, I. S. Adeib, A. N. Mustapa, S. H. Sarijo

Abstract:

Recently, the RESS (Rapid Expansion Supercritical Solution) method has been used by researchers to produce fine particles of pharmaceutical drug substances. Since RESS technology offers many benefits compared to conventional methods of ginger extraction, it is suggested to use this method to explore particle formation of bioactive compounds from powdered ginger. The objective of this research is to form solid oil particles directly from ginger rhizome, which contains valuable compounds, using the RESS-CO2 process. RESS experiments were carried out at extraction pressures of 3000, 4000, 5000, 6000 and 7000 psi and extraction temperatures of 40, 45, 50, 55, 60, 65 and 70°C, with an extraction time of 40 minutes and a constant flow rate (24 ml/min). The studies found that at an extraction pressure of 5000 psi and a temperature of 40°C, the smallest particle size obtained was 2.22 μm, a 99% reduction from the original size of 370 μm.
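The reported size reduction can be checked with simple arithmetic from the two particle sizes given in the abstract. The helper name `percent_reduction` is an assumption made here for illustration.

```python
def percent_reduction(original: float, final: float) -> float:
    """Relative decrease from original to final, expressed in percent."""
    return (original - final) / original * 100.0


# Particle sizes reported in the abstract, in micrometres.
original_size_um = 370.0
smallest_size_um = 2.22

reduction = percent_reduction(original_size_um, smallest_size_um)
print(f"{reduction:.1f}% reduction")
```

The result is consistent with the roughly 99% reduction stated in the abstract.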

Keywords: Particle size, RESS, solid oil particle, supercritical carbon dioxide.

Downloads 911