Search results for: Data mining andInformation Extraction

7785 Study of Features for Hand-printed Recognition

Abstract:

The feature extraction method(s) used to recognize hand-printed characters play an important role in ICR applications. In order to achieve high recognition rate for a recognition system, the choice of a feature that suits for the given script is certainly an important task. Even if a new feature required to be designed for a given script, it is essential to know the recognition ability of the existing features for that script. Devanagari script is being used in various Indian languages besides Hindi the mother tongue of majority of Indians. This research examines a variety of feature extraction approaches, which have been used in various ICR/OCR applications, in context to Devanagari hand-printed script. The study is conducted theoretically and experimentally on more that 10 feature extraction methods. The various feature extraction methods have been evaluated on Devanagari hand-printed database comprising more than 25000 characters belonging to 43 alphabets. The recognition ability of the features have been evaluated using three classifiers i.e. k-NN, MLP and SVM.

Keywords: Features, Hand-printed, Devanagari, Classifier, Database

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1719

7784 Educational Data Mining: The Case of Department of Mathematics and Computing in the Period 2009-2018

Authors: M. Sitoe, O. Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.

Keywords: Evasion and retention, cross validation, bagging, stacking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 103

7783 PmSPARQL: Extended SPARQL for Multi-paradigm Path Extraction

Authors: Thabet Slimani, Boutheina Ben Yaghlane, Khaled Mellouli

Abstract:

In the last few years, the Semantic Web gained scientific acceptance as a means of relationships identification in knowledge base, widely known by semantic association. Query about complex relationships between entities is a strong requirement for many applications in analytical domains. In bioinformatics for example, it is critical to extract exchanges between proteins. Currently, the widely known result of such queries is to provide paths between connected entities from data graph. However, they do not always give good results while facing the user need by the best association or a set of limited best association, because they only consider all existing paths but ignore the path evaluation. In this paper, we present an approach for supporting association discovery queries. Our proposal includes (i) a query language PmSPRQL which provides a multiparadigm query expressions for association extraction and (ii) some quantification measures making easy the process of association ranking. The originality of our proposal is demonstrated by a performance evaluation of our approach on real world datasets.

Keywords: Association extraction, query Language, relationships, knowledge base, multi-paradigm query.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1435

7782 Extraction of Forest Plantation Resources in Selected Forest of San Manuel, Pangasinan, Philippines Using LiDAR Data for Forest Status Assessment

Authors: Mark Joseph Quinto, Roan Beronilla, Guiller Damian, Eliza Camaso, Ronaldo Alberto

Abstract:

Forest inventories are essential to assess the composition, structure and distribution of forest vegetation that can be used as baseline information for management decisions. Classical forest inventory is labor intensive and time-consuming and sometimes even dangerous. The use of Light Detection and Ranging (LiDAR) in forest inventory would improve and overcome these restrictions. This study was conducted to determine the possibility of using LiDAR derived data in extracting high accuracy forest biophysical parameters and as a non-destructive method for forest status analysis of San Manual, Pangasinan. Forest resources extraction was carried out using LAS tools, GIS, Envi and .bat scripts with the available LiDAR data. The process includes the generation of derivatives such as Digital Terrain Model (DTM), Canopy Height Model (CHM) and Canopy Cover Model (CCM) in .bat scripts followed by the generation of 17 composite bands to be used in the extraction of forest classification covers using ENVI 4.8 and GIS software. The Diameter in Breast Height (DBH), Above Ground Biomass (AGB) and Carbon Stock (CS) were estimated for each classified forest cover and Tree Count Extraction was carried out using GIS. Subsequently, field validation was conducted for accuracy assessment. Results showed that the forest of San Manuel has 73% Forest Cover, which is relatively much higher as compared to the 10% canopy cover requirement. On the extracted canopy height, 80% of the tree’s height ranges from 12 m to 17 m. CS of the three forest covers based on the AGB were: 20819.59 kg/20x20 m for closed broadleaf, 8609.82 kg/20x20 m for broadleaf plantation and 15545.57 kg/20x20m for open broadleaf. Average tree counts for the tree forest plantation was 413 trees/ha. As such, the forest of San Manuel has high percent forest cover and high CS.

Keywords: Carbon stock, forest inventory, LiDAR, tree count.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1273

7781 Dynamic Features Selection for Heart Disease Classification

Authors: Walid MOUDANI

Abstract:

The healthcare environment is generally perceived as being information rich yet knowledge poor. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. In fact, valuable knowledge can be discovered from application of data mining techniques in healthcare system. In this study, a proficient methodology for the extraction of significant patterns from the Coronary Heart Disease warehouses for heart attack prediction, which unfortunately continues to be a leading cause of mortality in the whole world, has been presented. For this purpose, we propose to enumerate dynamically the optimal subsets of the reduced features of high interest by using rough sets technique associated to dynamic programming. Therefore, we propose to validate the classification using Random Forest (RF) decision tree to identify the risky heart disease cases. This work is based on a large amount of data collected from several clinical institutions based on the medical profile of patient. Moreover, the experts- knowledge in this field has been taken into consideration in order to define the disease, its risk factors, and to establish significant knowledge relationships among the medical factors. A computer-aided system is developed for this purpose based on a population of 525 adults. The performance of the proposed model is analyzed and evaluated based on set of benchmark techniques applied in this classification problem.

Keywords: Multi-Classifier Decisions Tree, Features Reduction, Dynamic Programming, Rough Sets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2526

7780 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features

Authors: Kyi Pyar Zaw, Zin Mar Kyu

Abstract:

Character recognition is the process of converting a text image file into editable and searchable text file. Feature Extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed paper, 25 features for one character are used in character recognition. Basically, there are three steps of character recognition such as character segmentation, feature extraction and classification. In segmentation step, horizontal cropping method is used for line segmentation and vertical cropping method is used for character segmentation. In the Feature extraction step, features are extracted in two ways. The first way is that the 8 features are extracted from the entire input character using eight direction chain code frequency extraction. The second way is that the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a feature for one block. Therefore, 16 features are extracted from that 16 blocks in the second way. We use the number of holes feature to cluster the similar characters. We can recognize the almost Myanmar common characters with various font sizes by using these features. All these 25 features are used in both training part and testing part. In the classification step, the characters are classified by matching the all features of input character with already trained features of characters.

Keywords: Chain code frequency, character recognition, feature extraction, features matching, segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 746

7779 Observations about the Principal Components Analysis and Data Clustering Techniques in the Study of Medical Data

Authors: Cristina G. Dascâlu, Corina Dima Cozma, Elena Carmen Cotrutz

Abstract:

The medical data statistical analysis often requires the using of some special techniques, because of the particularities of these data. The principal components analysis and the data clustering are two statistical methods for data mining very useful in the medical field, the first one as a method to decrease the number of studied parameters, and the second one as a method to analyze the connections between diagnosis and the data about the patient-s condition. In this paper we investigate the implications obtained from a specific data analysis technique: the data clustering preceded by a selection of the most relevant parameters, made using the principal components analysis. Our assumption was that, using the principal components analysis before data clustering - in order to select and to classify only the most relevant parameters – the accuracy of clustering is improved, but the practical results showed the opposite fact: the clustering accuracy decreases, with a percentage approximately equal with the percentage of information loss reported by the principal components analysis.

Keywords: Data clustering, medical data, principal components analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1493

7778 Extraction of Squalene from Lebanese Olive Oil

Authors: Henri El Zakhem, Christina Romanos, Charlie Bakhos, Hassan Chahal, Jessica Koura

Abstract:

Squalene is a valuable component of the oil composed of 30 carbon atoms and is mainly used for cosmetic materials. The main concern of this article is to study the Squalene composition in the Lebanese olive oil and to compare it with foreign oil results. To our knowledge, extraction of Squalene from the Lebanese olive oil has not been conducted before. Three different techniques were studied and experiments were performed on three brands of olive oil, Al Wadi Al Akhdar, Virgo Bio and Boulos. The techniques performed are the Fractional Crystallization, the Soxhlet and the Esterification. By comparing the results, it is found that the Lebanese oil contains squalene and Soxhlet method is the most effective between the three methods extracting about 6.5E-04 grams of Squalene per grams of olive oil.

Keywords: Squalene, extraction, crystallization, Soxhlet.‎

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2295

7777 Laboratory Scale Extraction of Sugar Cane using High Electric Field Pulses

Authors: M. N. Eshtiaghi, N. Yoswathana

Abstract:

The aim of this study was to extract sugar from sugarcane using high electric field pulse (HELP) as a non-thermal cell permeabilization method. The result of this study showed that it is possible to permeablize sugar cane cells using HELP at very short times (less than 10 sec.) and at room temperature. Increasing the field strength (from 0.5kV/cm to 2kV/cm) and pulse number (1 to 12) led to increasing the permeabilization of sugar cane cells. The energy consumption during HELP treatment of sugar cane (2.4 kJ/kg) was about 100 times less compared to thermal cell disintegration at 85 <=C (about 271.7 kJ/kg). In addition, it was possible to extract sugar cane at a moderate temperature (45 <=C) using HELP pretreatment. With combination of HELP pretreatment followed by thermal extraction at 75 <=C, extraction resulted in up to 3% more sugar (on the basis of total extractable sugar) compared to samples without HELP pretreatment.

Keywords: Cell permeabilization, High electric field pulses, Non-thermal processing, Sugar cane extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2741

7776 Performance Evaluation of Data Mining Techniques for Predicting Software Reliability

Authors: Pradeep Kumar, Abdul Wahid

Abstract:

Accurate software reliability prediction not only enables developers to improve the quality of software but also provides useful information to help them for planning valuable resources. This paper examines the performance of three well-known data mining techniques (CART, TreeNet and Random Forest) for predicting software reliability. We evaluate and compare the performance of proposed models with Cascade Correlation Neural Network (CCNN) using sixteen empirical databases from the Data and Analysis Center for Software. The goal of our study is to help project managers to concentrate their testing efforts to minimize the software failures in order to improve the reliability of the software systems. Two performance measures, Normalized Root Mean Squared Error (NRMSE) and Mean Absolute Errors (MAE), illustrate that CART model is accurate than the models predicted using Random Forest, TreeNet and CCNN in all datasets used in our study. Finally, we conclude that such methods can help in reliability prediction using real-life failure datasets.

Keywords: Classification, Cascade Correlation Neural Network, Random Forest, Software reliability, TreeNet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1834

7775 Water Budget in High Drought-Borne Area in Jaffna District, Sri Lanka during Dry Season

Authors: R. Kandiah, K. Miyamoto

Abstract:

In Sri Lanka, the Jaffna area is a high drought affected area and depends mainly on groundwater aquifers for water needs. Water for daily activities is extracted from wells. As households manually extract water from the wells, it is not drawn from mid evening to early morning. The water inflow at night provides the maximum water level that decreases during the daytime due to extraction. The storage volume of water in wells is limited or at its lowest level during the dry season. This study analyzes the domestic water budget during the dry season in the Jaffna area. In order to evaluate the water inflow rate into wells, storage volume and extraction volume from wells over time, water pressure is measured at the bottom of three wells, which are located in coastal area denoted as well A, in nonspecific area denoted as well B, and agricultural area denoted as well C. The water quality at the wells A, B, and C, are mostly fresh, modest fresh, and saline respectively. From the monitoring, we can find that the daily inflow amount of water into the wells and daily water extraction depend on each other, that is, higher extraction yields higher inflow. And, in the dry season, the daily inflow volume and the daily extraction volume of each well are almost in balance.

Keywords: Domestic water, water balance, water budget, ground water, shallow well.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1216

7774 Texture Feature Extraction using Slant-Hadamard Transform

Authors: M. J. Nassiri, A. Vafaei, A. Monadjemi

Abstract:

Random and natural textures classification is still one of the biggest challenges in the field of image processing and pattern recognition. In this paper, texture feature extraction using Slant Hadamard Transform was studied and compared to other signal processing-based texture classification schemes. A parametric SHT was also introduced and employed for natural textures feature extraction. We showed that a subtly modified parametric SHT can outperform ordinary Walsh-Hadamard transform and discrete cosine transform. Experiments were carried out on a subset of Vistex random natural texture images using a kNN classifier.

Keywords: Texture Analysis, Slant Transform, Hadamard, DCT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2664

7773 Phenolic Content and Antioxidant Activity Determination in Broccoli and Lamb’s Lettuce

Authors: C. P. Parente, M. J. Reis Lima, E. Teixeira-Lemos, M. M. Moreira, Aquiles A. Barros, Luís F. Guido

Abstract:

Broccoli has been widely recognized as a wealthy vegetable which contains multiple nutrients with potent anti-cancer properties. Lamb’s lettuce has been used as food for many centuries but only recently became commercially available and literature is therefore exiguous concerning these vegetables. The aim of this work was to evaluate the influence of the extraction conditions on the yield of phenolic compounds and the corresponding antioxidant capacity of broccoli and lamb’s lettuce. The results indicate that lamb’s lettuce, compared to broccoli, contains simultaneously a large amount of total polyphenols as well as high antioxidant activity. It is clearly demonstrated that extraction solvent significantly influences the antioxidant activity. Methanol is the solvent that can globally maximize the antioxidant extraction yield. The results presented herein prove lamb’s lettuce as a very interesting source of polyphenols, and thus a potential health-promoting food.

Keywords: Broccoli, lamb’s lettuce, extraction, antioxidant activity, phenolic compounds.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3812

7772 Redesigning Business Processes: A Method Based on Simulation and Process Mining Techniques

Authors: Zahra Mohammadnazari, Fateme Rostambeygi, Fatemeh Dehrouyeh, Hwang Ki-Soon, Amir Aghsami

Abstract:

Corporations have always prioritized efforts to examine and improve processes. Various metrics, such as the cost and time required to implement the process and can be specified in this regard. Process improvement can be defined as an improvement of these indicators. This is accomplished by looking at prospective adjustments to the current executive process model or the resources allotted to it. Research has been conducted in this paper to the improve the procurement process and aims to explore assessment prospects in the project using a combination of process mining and simulation (benefiting from Play-In and Play-Out methodologies). To run the simulation, we will need to complete the control flow diagram, institution settings, resource settings, and activity settings. The process of mining event logs yields the process control flow. However, both the entry of institutions and the distribution of resources must be modeled. The rate of admission of institutions and the distribution of time for the implementation of activities will be determined in the next step.

Keywords: Business reengineering, Petri net, process-based simulation, process mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 469

7771 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First the dataset is mapped into a binary image plane. The synthesized image is then processed utilizing efficient image processing techniques to cluster the data in the dataset. Henceforth, the algorithm avoids exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify contained clusters. Compared to available data clustering techniques, the proposed algorithm produces similar quality results and outperforms them in execution time and storage requirements.

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1493

7770 Moving Vehicles Detection Using Automatic Background Extraction

Authors: Saad M. Al-Garni, Adel A. Abdennour

Abstract:

Vehicle detection is the critical step for highway monitoring. In this paper we propose background subtraction and edge detection technique for vehicle detection. This technique uses the advantages of both approaches. The practical applications approved the effectiveness of this method. This method consists of two procedures: First, automatic background extraction procedure, in which the background is extracted automatically from the successive frames; Second vehicles detection procedure, which depend on edge detection and background subtraction. Experimental results show the effective application of this algorithm. Vehicles detection rate was higher than 91%.

Keywords: Image processing, Automatic background extraction, Moving vehicle detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2417

7769 Effects of Pressure and Temperature on the Extraction of Benzyl Isothiocyanate by Supercritical Fluids from Tropaeolum majus L. Leaves

Authors: Espinoza S. Clara, Gamarra Q. Flor, Marianela F. Ramos Quispe S. Miguel, Flores R. Omar

Abstract:

Tropaeolum majus L. is a native plant to South and Central America, used since ancient times by our ancestors to combat different diseases. Glucotropaeolonin is one of its main components, which when hydrolyzed, forms benzyl isothiocyanate (BIT) that promotes cellular apoptosis (programmed cell death in cancer cells). Therefore, the present research aims to evaluate the effect of the pressure and temperature of BIT extraction by supercritical CO₂ from Tropaeolum majus L. The extraction was carried out in a supercritical fluid extractor equipment Speed SFE BASIC Brand: Poly science, the leaves of Tropaeolum majus L. were ground for one hour and lyophilized until obtaining a humidity of 6%. The extraction with supercritical CO₂ was carried out with pressures of 200 bar and 300 bar, temperatures of 50°C, 60°C and 70°C, obtained by the conjugation of these six treatments. BIT was identified by thin layer chromatography using 98% BIT as the standard, and as the mobile phase hexane: dichloromethane (4:2). Subsequently, BIT quantification was performed by high performance liquid chromatography (HPLC). The highest yield of oleoresin by supercritical CO₂ extraction was obtained pressure 300 bar and temperature at 60°C; and the higher content of BIT at pressure 200 bar and 70°C for 30 minutes to obtain 113.615 ± 0.03 mg BIT/100 g dry matter was obtained.

Keywords: Tropaeolum majus L., supercritical fluids, benzyl isothiocyanate.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 872

7768 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the rate of using technology that help discovering the diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. It has great potential to provide accurate medical diagnosis, to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied on such micro-array datasets to devise methods that can predict the occurrence of Leukemia disease. In this study, we compared the classification accuracy and response time among eleven decision tree methods and six rule classifier methods using five performance criteria. The experiment results show that the performance of Random Tree is producing better result. Also it takes lowest time to build model in tree classifier. The classification rules algorithms such as nearest- neighbor-like algorithm (NNge) is the best algorithm due to the high accuracy and it takes lowest time to build model in classification.

Keywords: Data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2550

7767 CoP-Networks: Virtual Spaces for New Faculty’s Professional Development in the 21st Higher Education

Authors: Eman AbuKhousa, Marwan Z. Bataineh

Abstract:

The 21^st century higher education and globalization challenge new faculty members to build effective professional networks and partnership with industry in order to accelerate their growth and success. This creates the need for community of practice (CoP)-oriented development approaches that focus on cognitive apprenticeship while considering individual predisposition and future career needs. This work adopts data mining, clustering analysis, and social networking technologies to present the CoP-Network as a virtual space that connects together similar career-aspiration individuals who are socially influenced to join and engage in a process for domain-related knowledge and practice acquisitions. The CoP-Network model can be integrated into higher education to extend traditional graduate and professional development programs.

Keywords: Clustering analysis, community of practice, data mining, higher education, new faculty challenges, social networks, social influence, professional development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 968

7766 Evaluation of the Performance of ACTIFLO® Clarifier in the Treatment of Mining Wastewaters: Case Study of Costerfield Mining Operations, Victoria, Australia

Authors: Seyed Mohsen Samaei, Shirley Gato-Trinidad

Abstract:

A pre-treatment stage prior to reverse osmosis (RO) is very important to ensure the long-term performance of the RO membranes in any wastewater treatment using RO. This study aims to evaluate the application of the Actiflo^® clarifier as part of a pre-treatment unit in mining operations. It involves performing analytical testing on RO feed water before and after installation of Actiflo^® unit. Water samples prior to RO plant stage were obtained on different dates from Costerfield mining operations in Victoria, Australia. Tests were conducted in an independent laboratory to determine the concentration of various compounds in RO feed water before and after installation of Actiflo^® unit during the entire evaluated period from December 2015 to June 2018. Water quality analysis shows that the quality of RO feed water has remarkably improved since installation of Actiflo^® clarifier. Suspended solids (SS) and turbidity removal efficiencies has been improved by 91 and 85 percent respectively in pre-treatment system since the installation of Actiflo^®. The Actiflo^®clarifier proved to be a valuable part of pre-treatment system prior to RO. It has the potential to conveniently condition the mining wastewater prior to RO unit, and reduce the risk of RO physical failure and irreversible fouling. Consequently, reliable and durable operation of RO unit with minimum requirement for RO membrane replacement is expected with Actiflo^® in use.

Keywords: Actiflo® clarifier, membrane, mining wastewater, reverse osmosis, wastewater treatment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1185

7765 Determination of the Bank's Customer Risk Profile: Data Mining Applications

Authors: Taner Ersoz, Filiz Ersoz, Seyma Ozbilge

Abstract:

In this study, the clients who applied to a bank branch for loan were analyzed through data mining. The study was composed of the information such as amounts of loans received by personal and SME clients working with the bank branch, installment numbers, number of delays in loan installments, payments available in other banks and number of banks to which they are in debt between 2010 and 2013. The client risk profile was examined through Classification and Regression Tree (CART) analysis, one of the decision tree classification methods. At the end of the study, 5 different types of customers have been determined on the decision tree. The classification of these types of customers has been created with the rating of those posing a risk for the bank branch and the customers have been classified according to the risk ratings.

Keywords: Client classification, loan suitability, risk rating, CART analysis, decision tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1066

7764 Precious and Rare Metals in Overburden Carbonaceous Rocks: Methods of Extraction

Authors: Tatyana Alexandrova, Alexandr Alexandrov, Nadezhda Nikolaeva

Abstract:

A problem of complex mineral resources development is urgent and priority, it is aimed at realization of the processes of their ecologically safe development, one of its components is revealing the influence of the forms of element compounds in raw materials and in the processing products. In view of depletion of the precious metal reserves at the traditional deposits in the XXI century the large-size open cast deposits, localized in black shale strata begin to play the leading role. Carbonaceous (black) shales carry a heightened metallogenic potential. Black shales with high content of carbon are widely distributed within the scope of Bureinsky massif. According to academician Hanchuk`s data black shales of Sutirskaya series contain generally PGEs native form. The presence of high absorptive towards carbonaceous matter gold and PGEs compounds in crude ore results in decrease of valuable components extraction because of their sorption into dissipated carbonaceous matter.

Keywords: Сarbonaceous rocks, bitumens, precious metals, concentration, extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1642

7763 Clinical Decision Support for Disease Classification based on the Tests Association

Authors: Sung Ho Ha, Seong Hyeon Joo, Eun Kyung Kwon

Abstract:

Until recently, researchers have developed various tools and methodologies for effective clinical decision-making. Among those decisions, chest pain diseases have been one of important diagnostic issues especially in an emergency department. To improve the ability of physicians in diagnosis, many researchers have developed diagnosis intelligence by using machine learning and data mining. However, most of the conventional methodologies have been generally based on a single classifier for disease classification and prediction, which shows moderate performance. This study utilizes an ensemble strategy to combine multiple different classifiers to help physicians diagnose chest pain diseases more accurately than ever. Specifically the ensemble strategy is applied by using the integration of decision trees, neural networks, and support vector machines. The ensemble models are applied to real-world emergency data. This study shows that the performance of the ensemble models is superior to each of single classifiers.

Keywords: Diagnosis intelligence, ensemble approach, data mining, emergency department

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1628

7762 Leaching Characteristics of Upgraded Copper Flotation Tailings

Authors: Mercy M. Ramakokovhu, Henry Kasaini, Richard K.K. Mbaya

Abstract:

The copper flotation tailings from Konkola Copper mine in Nchanga, Zambia were used in the study. The purpose of this study was to determine the leaching characteristics of the tailings material prior and after the physical beneficiation process is employed. The Knelson gravity concentrator (KC-MD3) was used for the beneficiation process. The copper leaching efficiencies and impurity co-extraction percentages in both the upgraded and the raw feed material were determined at different pH levels and temperature. It was observed that the copper extraction increased with an increase in temperature and a decrease in pH levels. In comparison to the raw feed sample, the upgraded sample reported a maximum copper extraction of 69% which was 9%, higher than raw feed % extractions. The impurity carry over was reduced from 18% to 4 % on the upgraded sample. The reduction in impurity co-extraction was as a result of the removal of the reactive gangue elements during the upgrading process, this minimized the number of side reaction occurring during leaching.

Keywords: Atmospheric leaching, Copper, Iron, Knelson concentrator

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2898

7761 Scale-Space Volume Descriptors for Automatic 3D Facial Feature Extraction

Authors: Daniel Chen, George Mamic, Clinton Fookes, Sridha Sridharan

Abstract:

An automatic method for the extraction of feature points for face based applications is proposed. The system is based upon volumetric feature descriptors, which in this paper has been extended to incorporate scale space. The method is robust to noise and has the ability to extract local and holistic features simultaneously from faces stored in a database. Extracted features are stable over a range of faces, with results indicating that in terms of intra-ID variability, the technique has the ability to outperform manual landmarking.

Keywords: Scale space volume descriptor, feature extraction, 3D facial landmarking

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1499

7760 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush

Abstract:

Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.

Keywords: Data Stream, Classification, Concept Shift, History.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1275

7759 Representing Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building up a prediction model. The main objective in the time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit low dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques in dealing with uncertain data specifically those which solve uncertain data condition by minimizing the loss of compression properties.

Keywords: Compression properties, uncertainty, uncertain time series, mining technique, weather prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1614

7758 Bamboo Fibre Extraction and Its Reinforced Polymer Composite Material

Authors: P. Zakikhani, R. Zahari, M. T. H. Sultan, D. L. Majid

Abstract:

Natural plant fibres reinforced polymeric composite materials have been used in many fields of our lives to save the environment. Especially, bamboo fibres due to its environmental sustainability, mechanical properties, and recyclability have been utilized as reinforced polymer matrix composite in construction industries. In this review study bamboo structure and three different methods such as mechanical, chemical and combination of mechanical and chemical to extract fibres from bamboo are summarized. Each extraction method has been done base on the application of bamboo. In addition Bamboo fibre is compared with glass fibre from various aspects and in some parts it has advantages over the glass fibre.

Keywords: Bamboo fibres, natural fibres, mechanical extraction, glass fibres.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10323

7757 An Efficient Graph Query Algorithm Based on Important Vertices and Decision Features

Authors: Xiantong Li, Jianzhong Li

Abstract:

Graph has become increasingly important in modeling complicated structures and schemaless data such as proteins, chemical compounds, and XML documents. Given a graph query, it is desirable to retrieve graphs quickly from a large database via graph-based indices. Different from the existing methods, our approach, called VFM (Vertex to Frequent Feature Mapping), makes use of vertices and decision features as the basic indexing feature. VFM constructs two mappings between vertices and frequent features to answer graph queries. The VFM approach not only provides an elegant solution to the graph indexing problem, but also demonstrates how database indexing and query processing can benefit from data mining, especially frequent pattern mining. The results show that the proposed method not only avoids the enumeration method of getting subgraphs of query graph, but also effectively reduces the subgraph isomorphism tests between the query graph and graphs in candidate answer set in verification stage.

Keywords: Decision Feature, Frequent Feature, Graph Dataset, Graph Query

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1863

7756 Key Frame Based Video Summarization via Dependency Optimization

Authors: Janya Sainui

Abstract:

As a rapid growth of digital videos and data communications, video summarization that provides a shorter version of the video for fast video browsing and retrieval is necessary. Key frame extraction is one of the mechanisms to generate video summary. In general, the extracted key frames should both represent the entire video content and contain minimum redundancy. However, most of the existing approaches heuristically select key frames; hence, the selected key frames may not be the most different frames and/or not cover the entire content of a video. In this paper, we propose a method of video summarization which provides the reasonable objective functions for selecting key frames. In particular, we apply a statistical dependency measure called quadratic mutual informaion as our objective functions for maximizing the coverage of the entire video content as well as minimizing the redundancy among selected key frames. The proposed key frame extraction algorithm finds key frames as an optimization problem. Through experiments, we demonstrate the success of the proposed video summarization approach that produces video summary with better coverage of the entire video content while less redundancy among key frames comparing to the state-of-the-art approaches.

Keywords: Video summarization, key frame extraction, dependency measure, quadratic mutual information, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 958