Search results for: unsupervised feature extraction.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1506

Search results for: unsupervised feature extraction.

1266 Study on Extraction of Ceric Oxide from Monazite Concentrate

Authors: Lwin Thuzar Shwe, Nwe Nwe Soe, Kay Thi Lwin

Abstract:

Cerium oxide is to be recovered from monazite, which contains about 27.35% CeO2. The principal objective of this study is to be able to extract cerium oxide from monazite of Moemeik Myitsone Area. The treatment of monazite in this study involves three main steps; extraction of cerium hydroxide from monazite, solvent extraction of cerium hydroxide, and precipitation with oxalic acid and calcination of cerium oxalate.

Keywords: Calcination, Digestion, Precipitation, SolventExtraction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2547
1265 Unsupervised Segmentation by Hidden Markov Chain with Bi-dimensional Observed Process

Authors: Abdelali Joumad, Abdelaziz Nasroallah

Abstract:

In unsupervised segmentation context, we propose a bi-dimensional hidden Markov chain model (X,Y) that we adapt to the image segmentation problem. The bi-dimensional observed process Y = (Y 1, Y 2) is such that Y 1 represents the noisy image and Y 2 represents a noisy supplementary information on the image, for example a noisy proportion of pixels of the same type in a neighborhood of the current pixel. The proposed model can be seen as a competitive alternative to the Hilbert-Peano scan. We propose a bayesian algorithm to estimate parameters of the considered model. The performance of this algorithm is globally favorable, compared to the bi-dimensional EM algorithm through numerical and visual data.

Keywords: Image segmentation, Hidden Markov chain with a bi-dimensional observed process, Peano-Hilbert scan, Bayesian approach, MCMC methods, Bi-dimensional EM algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1560
1264 Optimal Classifying and Extracting Fuzzy Relationship from Query Using Text Mining Techniques

Authors: Faisal Alshuwaier, Ali Areshey

Abstract:

Text mining techniques are generally applied for classifying the text, finding fuzzy relations and structures in data sets. This research provides plenty text mining capabilities. One common application is text classification and event extraction, which encompass deducing specific knowledge concerning incidents referred to in texts. The main contribution of this paper is the clarification of a concept graph generation mechanism, which is based on a text classification and optimal fuzzy relationship extraction. Furthermore, the work presented in this paper explains the application of fuzzy relationship extraction and branch and bound (BB) method to simplify the texts.

Keywords: Extraction, Max-Prod, Fuzzy Relations, Text Mining, Memberships, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2136
1263 Exploiting Global Self Similarity for Head-Shoulder Detection

Authors: Lae-Jeong Park, Jung-Ho Moon

Abstract:

People detection from images has a variety of applications such as video surveillance and driver assistance system, but is still a challenging task and more difficult in crowded environments such as shopping malls in which occlusion of lower parts of human body often occurs. Lack of the full-body information requires more effective features than common features such as HOG. In this paper, new features are introduced that exploits global self-symmetry (GSS) characteristic in head-shoulder patterns. The features encode the similarity or difference of color histograms and oriented gradient histograms between two vertically symmetric blocks. The domain-specific features are rapid to compute from the integral images in Viola-Jones cascade-of-rejecters framework. The proposed features are evaluated with our own head-shoulder dataset that, in part, consists of a well-known INRIA pedestrian dataset. Experimental results show that the GSS features are effective in reduction of false alarmsmarginally and the gradient GSS features are preferred more often than the color GSS ones in the feature selection.

Keywords: Pedestrian detection, cascade of rejecters, feature extraction, self-symmetry, HOG.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2356
1262 A Family of Distributions on Learnable Problems without Uniform Convergence

Authors: César Garza

Abstract:

In supervised binary classification and regression problems, it is well-known that learnability is equivalent to uniform convergence of the hypothesis class, and if a problem is learnable, it is learnable by empirical risk minimization. For the general learning setting of unsupervised learning tasks, there are non-trivial learning problems where uniform convergence does not hold. We present here the task of learning centers of mass with an extra feature that “activates” some of the coordinates over the unit ball in a Hilbert space. We show that the learning problem is learnable under a stable RLM rule. We introduce a family of distributions over the domain space with some mild restrictions for which the sample complexity of uniform convergence for these problems must grow logarithmically with the dimension of the Hilbert space. If we take this dimension to infinity, we obtain a learnable problem for which the uniform convergence property fails for a vast family of distributions.

Keywords: Statistical learning theory, learnability, uniform convergence, stability, regularized loss minimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 257
1261 Lung Cancer Detection and Multi Level Classification Using Discrete Wavelet Transform Approach

Authors: V. Veeraprathap, G. S. Harish, G. Narendra Kumar

Abstract:

Uncontrolled growth of abnormal cells in the lung in the form of tumor can be either benign (non-cancerous) or malignant (cancerous). Patients with Lung Cancer (LC) have an average of five years life span expectancy provided diagnosis, detection and prediction, which reduces many treatment options to risk of invasive surgery increasing survival rate. Computed Tomography (CT), Positron Emission Tomography (PET), and Magnetic Resonance Imaging (MRI) for earlier detection of cancer are common. Gaussian filter along with median filter used for smoothing and noise removal, Histogram Equalization (HE) for image enhancement gives the best results without inviting further opinions. Lung cavities are extracted and the background portion other than two lung cavities is completely removed with right and left lungs segmented separately. Region properties measurements area, perimeter, diameter, centroid and eccentricity measured for the tumor segmented image, while texture is characterized by Gray-Level Co-occurrence Matrix (GLCM) functions, feature extraction provides Region of Interest (ROI) given as input to classifier. Two levels of classifications, K-Nearest Neighbor (KNN) is used for determining patient condition as normal or abnormal, while Artificial Neural Networks (ANN) is used for identifying the cancer stage is employed. Discrete Wavelet Transform (DWT) algorithm is used for the main feature extraction leading to best efficiency. The developed technology finds encouraging results for real time information and on line detection for future research.

Keywords: ANN, DWT, GLCM, KNN, ROI, artificial neural networks, discrete wavelet transform, gray-level co-occurrence matrix, k-nearest neighbor, region of interest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 904
1260 Image Analysis for Obturator Foramen Based on Marker-Controlled Watershed Segmentation and Zernike Moments

Authors: Seda Sahin, Emin Akata

Abstract:

Obturator Foramen is a specific structure in Pelvic bone images and recognition of it is a new concept in medical image processing. Moreover, segmentation of bone structures such as Obturator Foramen plays an essential role for clinical research in orthopedics. In this paper, we present a novel method to analyze the similarity between the substructures of the imaged region and a hand drawn template as a preprocessing step for computation of Pelvic bone rotation on hip radiographs. This method consists of integrated usage of Marker-controlled Watershed segmentation and Zernike moment feature descriptor and it is used to detect Obturator Foramen accurately. Marker-controlled Watershed segmentation is applied to separate Obturator Foramen from the background effectively. Then, Zernike moment feature descriptor is used to provide matching between binary template image and the segmented binary image for final extraction of Obturator Foramens. Finally, Pelvic bone rotation rate calculation for each hip radiograph is performed automatically to select and eliminate hip radiographs for further studies which depend on Pelvic bone angle measurements. The proposed method is tested on randomly selected 100 hip radiographs. The experimental results demonstrated that the proposed method is able to segment Obturator Foramen with 96% accuracy.

Keywords: Medical image analysis, marker-controlled watershed segmentation, segmentation of bone structures on hip radiographs, pelvic bone rotation rate, zernike moment feature descriptor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1944
1259 Multi-Agent Systems for Intelligent Clustering

Authors: Jung-Eun Park, Kyung-Whan Oh

Abstract:

Intelligent systems are required in order to quickly and accurately analyze enormous quantities of data in the Internet environment. In intelligent systems, information extracting processes can be divided into supervised learning and unsupervised learning. This paper investigates intelligent clustering by unsupervised learning. Intelligent clustering is the clustering system which determines the clustering model for data analysis and evaluates results by itself. This system can make a clustering model more rapidly, objectively and accurately than an analyzer. The methodology for the automatic clustering intelligent system is a multi-agent system that comprises a clustering agent and a cluster performance evaluation agent. An agent exchanges information about clusters with another agent and the system determines the optimal cluster number through this information. Experiments using data sets in the UCI Machine Repository are performed in order to prove the validity of the system.

Keywords: Intelligent Clustering, Multi-Agent System, PCA, SOM, VC(Variance Criterion)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1682
1258 Using Data Fusion for Biometric Verification

Authors: Richard A. Wasniowski

Abstract:

A wide spectrum of systems require reliable personal recognition schemes to either confirm or determine the identity of an individual person. This paper considers multimodal biometric system and their applicability to access control, authentication and security applications. Strategies for feature extraction and sensor fusion are considered and contrasted. Issues related to performance assessment, deployment and standardization are discussed. Finally future directions of biometric systems development are discussed.

Keywords: Multimodal, biometric, recognition, fusion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1716
1257 Efficient and Effective Gabor Feature Representation for Face Detection

Authors: Yasuomi D. Sato, Yasutaka Kuriya

Abstract:

We here propose improved version of elastic graph matching (EGM) as a face detector, called the multi-scale EGM (MS-EGM). In this improvement, Gabor wavelet-based pyramid reduces computational complexity for the feature representation often used in the conventional EGM, but preserving a critical amount of information about an image. The MS-EGM gives us higher detection performance than Viola-Jones object detection algorithm of the AdaBoost Haar-like feature cascade. We also show rapid detection speeds of the MS-EGM, comparable to the Viola-Jones method. We find fruitful benefits in the MS-EGM, in terms of topological feature representation for a face.

Keywords: Face detection, Gabor wavelet based pyramid, elastic graph matching, topological preservation, redundancy of computational complexity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1825
1256 Performance Improvement of Moving Object Recognition and Tracking Algorithm using Parallel Processing of SURF and Optical Flow

Authors: Jungho Choi, Youngwan Cho

Abstract:

The paper proposes a way of parallel processing of SURF and Optical Flow for moving object recognition and tracking. The object recognition and tracking is one of the most important task in computer vision, however disadvantage are many operations cause processing speed slower so that it can-t do real-time object recognition and tracking. The proposed method uses a typical way of feature extraction SURF and moving object Optical Flow for reduce disadvantage and real-time moving object recognition and tracking, and parallel processing techniques for speed improvement. First analyse that an image from DB and acquired through the camera using SURF for compared to the same object recognition then set ROI (Region of Interest) for tracking movement of feature points using Optical Flow. Secondly, using Multi-Thread is for improved processing speed and recognition by parallel processing. Finally, performance is evaluated and verified efficiency of algorithm throughout the experiment.

Keywords: moving object recognition, moving object tracking, SURF, Optical Flow, Multi-Thread.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2595
1255 n-Butanol as an Extractant for Lactic Acid Recovery

Authors: Kanungnit Chawong, Panarat Rattanaphanee

Abstract:

Extraction of lactic acid from aqueous solution using n-butanol as an extractant was studied. Effect of mixing time, pH of the aqueous solution, initial lactic acid concentration, and volume ratio between the organic and the aqueous phase were investigated. Distribution coefficient and degree of lactic acid extraction was found to increase when the pH of aqueous solution was decreased. The pH Effect was substantially pronounced at pH of the aqueous solution less than 1. Initial lactic acid concentration and organic-toaqueous volume ratio appeared to have positive effect on the distribution coefficient and the degree of extraction. Due to the nature of n-butanol that is partially miscible in water, incorporation of aqueous solution into organic phase was observed in the extraction with large organic-to-aqueous volume ratio.

Keywords: Lactic acid, liquid-liquid extraction, n-Butanol, Solvating extractant.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3116
1254 Optimization of Process Parameters using Response Surface Methodology for the Removal of Zinc(II) by Solvent Extraction

Authors: B. Guezzen, M.A. Didi, B. Medjahed

Abstract:

A factorial design of experiments and a response surface methodology were implemented to investigate the liquid-liquid extraction process of zinc (II) from acetate medium using the 1-Butyl-imidazolium di(2-ethylhexyl) phosphate [BIm+][D2EHP-]. The optimization process of extraction parameters such as the initial pH effect (2.5, 4.5, and 6.6), ionic liquid concentration (1, 5.5, and 10 mM) and salt effect (0.01, 5, and 10 mM) was carried out using a three-level full factorial design (33). The results of the factorial design demonstrate that all these factors are statistically significant, including the square effects of pH and ionic liquid concentration. The results showed that the order of significance: IL concentration > salt effect > initial pH. Analysis of variance (ANOVA) showing high coefficient of determination (R2 = 0.91) and low probability values (P < 0.05) signifies the validity of the predicted second-order quadratic model for Zn (II) extraction. The optimum conditions for the extraction of zinc (II) at the constant temperature (20 °C), initial Zn (II) concentration (1mM) and A/O ratio of unity were: initial pH (4.8), extractant concentration (9.9 mM), and NaCl concentration (8.2 mM). At the optimized condition, the metal ion could be quantitatively extracted.

Keywords: Ionic liquid, response surface methodology, solvent extraction, zinc acetate.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1108
1253 Pattern Recognition Using Feature Based Die-Map Clusteringin the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.

Keywords: Die-Map Clustering, Feature Extraction, Pattern Recognition, Semiconductor Manufacturing Process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3097
1252 A Universal Model for Content-Based Image Retrieval

Authors: S. Nandagopalan, Dr. B. S. Adiga, N. Deepak

Abstract:

In this paper a novel approach for generalized image retrieval based on semantic contents is presented. A combination of three feature extraction methods namely color, texture, and edge histogram descriptor. There is a provision to add new features in future for better retrieval efficiency. Any combination of these methods, which is more appropriate for the application, can be used for retrieval. This is provided through User Interface (UI) in the form of relevance feedback. The image properties analyzed in this work are by using computer vision and image processing algorithms. For color the histogram of images are computed, for texture cooccurrence matrix based entropy, energy, etc, are calculated and for edge density it is Edge Histogram Descriptor (EHD) that is found. For retrieval of images, a novel idea is developed based on greedy strategy to reduce the computational complexity. The entire system was developed using AForge.Imaging (an open source product), MATLAB .NET Builder, C#, and Oracle 10g. The system was tested with Coral Image database containing 1000 natural images and achieved better results.

Keywords: Content Based Image Retrieval (CBIR), Cooccurrencematrix, Feature vector, Edge Histogram Descriptor(EHD), Greedy strategy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2886
1251 sEMG Interface Design for Locomotion Identification

Authors: Rohit Gupta, Ravinder Agarwal

Abstract:

Surface electromyographic (sEMG) signal has the potential to identify the human activities and intention. This potential is further exploited to control the artificial limbs using the sEMG signal from residual limbs of amputees. The paper deals with the development of multichannel cost efficient sEMG signal interface for research application, along with evaluation of proposed class dependent statistical approach of the feature selection method. The sEMG signal acquisition interface was developed using ADS1298 of Texas Instruments, which is a front-end interface integrated circuit for ECG application. Further, the sEMG signal is recorded from two lower limb muscles for three locomotions namely: Plane Walk (PW), Stair Ascending (SA), Stair Descending (SD). A class dependent statistical approach is proposed for feature selection and also its performance is compared with 12 preexisting feature vectors. To make the study more extensive, performance of five different types of classifiers are compared. The outcome of the current piece of work proves the suitability of the proposed feature selection algorithm for locomotion recognition, as compared to other existing feature vectors. The SVM Classifier is found as the outperformed classifier among compared classifiers with an average recognition accuracy of 97.40%. Feature vector selection emerges as the most dominant factor affecting the classification performance as it holds 51.51% of the total variance in classification accuracy. The results demonstrate the potentials of the developed sEMG signal acquisition interface along with the proposed feature selection algorithm.

Keywords: Classifiers, feature selection, locomotion, sEMG.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1440
1250 A Survey: Clustering Ensembles Techniques

Authors: Reza Ghaemi , Md. Nasir Sulaiman , Hamidah Ibrahim , Norwati Mustapha

Abstract:

The clustering ensembles combine multiple partitions generated by different clustering algorithms into a single clustering solution. Clustering ensembles have emerged as a prominent method for improving robustness, stability and accuracy of unsupervised classification solutions. So far, many contributions have been done to find consensus clustering. One of the major problems in clustering ensembles is the consensus function. In this paper, firstly, we introduce clustering ensembles, representation of multiple partitions, its challenges and present taxonomy of combination algorithms. Secondly, we describe consensus functions in clustering ensembles including Hypergraph partitioning, Voting approach, Mutual information, Co-association based functions and Finite mixture model, and next explain their advantages, disadvantages and computational complexity. Finally, we compare the characteristics of clustering ensembles algorithms such as computational complexity, robustness, simplicity and accuracy on different datasets in previous techniques.

Keywords: Clustering Ensembles, Combinational Algorithm, Consensus Function, Unsupervised Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3375
1249 Study of Equilibrium and Mass Transfer of Co- Extraction of Different Mineral Acids with Iron(III) from Aqueous Solution by Tri-n-Butyl Phosphate Using Liquid Membrane

Authors: Diptendu Das, Vikas Kumar Rahi, V. A. Juvekar, R. Bhattacharya

Abstract:

Extraction of Fe(III) from aqueous solution using Trin- butyl Phosphate (TBP) as carrier needs a highly acidic medium (>6N) as it favours formation of chelating complex FeCl3.TBP. Similarly, stripping of Iron(III) from loaded organic solvents requires neutral pH or alkaline medium to dissociate the same complex. It is observed that TBP co-extracts acids along with metal, which causes reversal of driving force of extraction and iron(III) is re-extracted back from the strip phase into the feed phase during Liquid Emulsion Membrane (LEM) pertraction. Therefore, rate of extraction of different mineral acids (HCl, HNO3, H2SO4) using TBP with and without presence of metal Fe(III) was examined. It is revealed that in presence of metal acid extraction is enhanced. Determination of mass transfer coefficient of both acid and metal extraction was performed by using Bulk Liquid Membrane (BLM). The average mass transfer coefficient was obtained by fitting the derived model equation with experimentally obtained data. The mass transfer coefficient of the mineral acid extraction is in the order of kHNO3 = 3.3x10-6m/s > kHCl = 6.05x10-7m/s > kH2SO4 = 1.85x10-7m/s. The distribution equilibria of the above mentioned acids between aqueous feed solution and a solution of tri-n-butyl-phosphate (TBP) in organic solvents have been investigated. The stoichiometry of acid extraction reveals the formation of TBP.2HCl, HNO3.2TBP, and TBP.H2SO4 complexes. Moreover, extraction of Iron(III) by TBP in HCl aqueous solution forms complex FeCl3.TBP.2HCl while in HNO3 medium forms complex 3FeCl3.TBP.2HNO3

Keywords: Bulk Liquid Membrane (BLM) Transport, Iron(III) extraction, Tri-n-butyl Phosphate, Mass Transfer coefficient.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2532
1248 Accelerating Quantum Chemistry Calculations: Machine Learning for Efficient Evaluation of Electron-Repulsion Integrals

Authors: Nishant Rodrigues, Nicole Spanedda, Chilukuri K. Mohan, Arindam Chakraborty

Abstract:

A crucial objective in quantum chemistry is the computation of the energy levels of chemical systems. This task requires electron-repulsion integrals as inputs and the steep computational cost of evaluating these integrals poses a major numerical challenge in efficient implementation of quantum chemical software. This work presents a moment-based machine learning approach for the efficient evaluation of electron-repulsion integrals. These integrals were approximated using linear combinations of a small number of moments. Machine learning algorithms were applied to estimate the coefficients in the linear combination. A random forest approach was used to identify promising features using a recursive feature elimination approach, which performed best for learning the sign of each coefficient, but not the magnitude. A neural network with two hidden layers was then used to learn the coefficient magnitudes, along with an iterative feature masking approach to perform input vector compression, identifying a small subset of orbitals whose coefficients are sufficient for the quantum state energy computation. Finally, a small ensemble of neural networks (with a median rule for decision fusion) was shown to improve results when compared to a single network.

Keywords: Quantum energy calculations, atomic orbitals, electron-repulsion integrals, ensemble machine learning, random forests, neural networks, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 59
1247 Feature Point Reduction for Video Stabilization

Authors: Theerawat Songyot, Tham Manjing, Bunyarit Uyyanonvara, Chanjira Sinthanayothin

Abstract:

Corner detection and optical flow are common techniques for feature-based video stabilization. However, these algorithms are computationally expensive and should be performed at a reasonable rate. This paper presents an algorithm for discarding irrelevant feature points and maintaining them for future use so as to improve the computational cost. The algorithm starts by initializing a maintained set. The feature points in the maintained set are examined against its accuracy for modeling. Corner detection is required only when the feature points are insufficiently accurate for future modeling. Then, optical flows are computed from the maintained feature points toward the consecutive frame. After that, a motion model is estimated based on the simplified affine motion model and least square method, with outliers belonging to moving objects presented. Studentized residuals are used to eliminate such outliers. The model estimation and elimination processes repeat until no more outliers are identified. Finally, the entire algorithm repeats along the video sequence with the points remaining from the previous iteration used as the maintained set. As a practical application, an efficient video stabilization can be achieved by exploiting the computed motion models. Our study shows that the number of times corner detection needs to perform is greatly reduced, thus significantly improving the computational cost. Moreover, optical flow vectors are computed for only the maintained feature points, not for outliers, thus also reducing the computational cost. In addition, the feature points after reduction can sufficiently be used for background objects tracking as demonstrated in the simple video stabilizer based on our proposed algorithm.

Keywords: background object tracking, feature point reduction, low cost tracking, video stabilization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1721
1246 Javanese Character Recognition Using Hidden Markov Model

Authors: Anastasia Rita Widiarti, Phalita Nari Wastu

Abstract:

Hidden Markov Model (HMM) is a stochastic method which has been used in various signal processing and character recognition. This study proposes to use HMM to recognize Javanese characters from a number of different handwritings, whereby HMM is used to optimize the number of state and feature extraction. An 85.7 % accuracy is obtained as the best result in 16-stated vertical model using pure HMM. This initial result is satisfactory for prompting further research.

Keywords: Character recognition, off-line handwritingrecognition, Hidden Markov Model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1944
1245 A Speeded up Robust Scale-Invariant Feature Transform Currency Recognition Algorithm

Authors: Daliyah S. Aljutaili, Redna A. Almutlaq, Suha A. Alharbi, Dina M. Ibrahim

Abstract:

All currencies around the world look very different from each other. For instance, the size, color, and pattern of the paper are different. With the development of modern banking services, automatic methods for paper currency recognition become important in many applications like vending machines. One of the currency recognition architecture’s phases is Feature detection and description. There are many algorithms that are used for this phase, but they still have some disadvantages. This paper proposes a feature detection algorithm, which merges the advantages given in the current SIFT and SURF algorithms, which we call, Speeded up Robust Scale-Invariant Feature Transform (SR-SIFT) algorithm. Our proposed SR-SIFT algorithm overcomes the problems of both the SIFT and SURF algorithms. The proposed algorithm aims to speed up the SIFT feature detection algorithm and keep it robust. Simulation results demonstrate that the proposed SR-SIFT algorithm decreases the average response time, especially in small and minimum number of best key points, increases the distribution of the number of best key points on the surface of the currency. Furthermore, the proposed algorithm increases the accuracy of the true best point distribution inside the currency edge than the other two algorithms.

Keywords: Currency recognition, feature detection and description, SIFT algorithm, SURF algorithm, speeded up and robust features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 813
1244 Discovering Complex Regularities by Adaptive Self Organizing Classification

Authors: A. Faro, D. Giordano, F. Maiorana

Abstract:

Data mining uses a variety of techniques each of which is useful for some particular task. It is important to have a deep understanding of each technique and be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user to find out a strategy to optmize classification by adding, moving or delete a neuron in order to change the number of classes. The tool is also able to automatically suggest a strategy for number of classes optimization.The tool is used to classify macroeconomic data that report the most developed countries? import and export. It is possible to classify the countries based on their economic behaviour and use an ad hoc tool to characterize the commercial behaviour of a country in a selected class from the analysis of positive and negative features that contribute to classes formation.

Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, cluster interpretation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1518
1243 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of big data technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centres or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through VADER and RoBERTa model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and Term Frequency – Inverse Document Frequency (TFIDF) Vectorization and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide if the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: Counter vectorization, Convolutional Neural Network, Crawler, data technology, Long Short-Term Memory, LSTM, Web Scraping, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 109
1242 Response Surface Modeling of Lactic Acid Extraction by Emulsion Liquid Membrane: Box-Behnken Experimental Design

Authors: A. Thakur, P. S. Panesar, M. S. Saini

Abstract:

Extraction of lactic acid by emulsion liquid membrane technology (ELM) using n-trioctyl amine (TOA) in n-heptane as carrier within the organic membrane along with sodium carbonate as acceptor phase was optimized by using response surface methodology (RSM). A three level Box-Behnken design was employed for experimental design, analysis of the results and to depict the combined effect of five independent variables, vizlactic acid concentration in aqueous phase (cl), sodium carbonate concentration in stripping phase (cs), carrier concentration in membrane phase (ψ), treat ratio, and batch extraction time (τ)  with equal volume of organic and external aqueous phase on lactic acid extraction efficiency. The maximum lactic acid extraction efficiency (ηext) of 98.21%from aqueous phase in a batch reactor using ELM was found at the optimized values for test variables, cl, cs, ψ, and τ as 0.06 [M], 0.18 [M], 4.72 (%,v/v), 1.98 (v/v) and 13.36 min respectively. 

Keywords: Emulsion liquid membrane, extraction, lactic acid, n-trioctylamine, response surface methodology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2286
1241 Feature Selection with Kohonen Self Organizing Classification Algorithm

Authors: Francesco Maiorana

Abstract:

In this paper a one-dimension Self Organizing Map algorithm (SOM) to perform feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification for each class a set of positive and negative features is computed. This set of features is selected as result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery as well as other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extractions. The process becomes feasible for large and sparse datasets, as the ones obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed up techniques of the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible, by using the grid, the application of the methodology to massive datasets.

Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3004
1240 PIELG: A Protein Interaction Extraction Systemusing a Link Grammar Parser from Biomedical Abstracts

Authors: Rania A. Abul Seoud, Nahed H. Solouma, Abou-Baker M. Youssef, Yasser M. Kadah

Abstract:

Due to the ever growing amount of publications about protein-protein interactions, information extraction from text is increasingly recognized as one of crucial technologies in bioinformatics. This paper presents a Protein Interaction Extraction System using a Link Grammar Parser from biomedical abstracts (PIELG). PIELG uses linkage given by the Link Grammar Parser to start a case based analysis of contents of various syntactic roles as well as their linguistically significant and meaningful combinations. The system uses phrasal-prepositional verbs patterns to overcome preposition combinations problems. The recall and precision are 74.4% and 62.65%, respectively. Experimental evaluations with two other state-of-the-art extraction systems indicate that PIELG system achieves better performance. For further evaluation, the system is augmented with a graphical package (Cytoscape) for extracting protein interaction information from sequence databases. The result shows that the performance is remarkably promising.

Keywords: Link Grammar Parser, Interaction extraction, protein-protein interaction, Natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2199
1239 3D Model Retrieval based on Normal Vector Interpolation Method

Authors: Ami Kim, Oubong Gwun, Juwhan Song

Abstract:

In this paper, we proposed the distribution of mesh normal vector direction as a feature descriptor of a 3D model. A normal vector shows the entire shape of a model well. The distribution of normal vectors was sampled in proportion to each polygon's area so that the information on the surface with less surface area may be less reflected on composing a feature descriptor in order to enhance retrieval performance. At the analysis result of ANMRR, the enhancement of approx. 12.4%~34.7% compared to the existing method has also been indicated.

Keywords: Interpolated Normal Vector, Feature Descriptor, 3DModel Retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1440
1238 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset

Authors: N.Poolsawad, C.Kambhampati, J. G. F. Cleland

Abstract:

In this paper, we investigated the characteristic of a clinical dataseton the feature selection and classification measurements which deal with missing values problem.And also posed the appropriated techniques to achieve the aim of the activity; in this research aims to find features that have high effect to mortality and mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we proposed the data mining processto cope their complexity; missing values, high dimensionality, and the prediction problem by using the methods of missing value replacement, feature selection, and classification.The experimental results will extend to develop the prediction model for cardiology.

Keywords: feature selection, missing values, classification, clinical dataset, heart failure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3165
1237 Hierarchical PSO-Adaboost Based Classifiers for Fast and Robust Face Detection

Authors: Hong Pan, Yaping Zhu, Liang Zheng Xia

Abstract:

We propose a fast and robust hierarchical face detection system which finds and localizes face images with a cascade of classifiers. Three modules contribute to the efficiency of our detector. First, heterogeneous feature descriptors are exploited to enrich feature types and feature numbers for face representation. Second, a PSO-Adaboost algorithm is proposed to efficiently select discriminative features from a large pool of available features and reinforce them into the final ensemble classifier. Compared with the standard exhaustive Adaboost for feature selection, the new PSOAdaboost algorithm reduces the training time up to 20 times. Finally, a three-stage hierarchical classifier framework is developed for rapid background removal. In particular, candidate face regions are detected more quickly by using a large size window in the first stage. Nonlinear SVM classifiers are used instead of decision stump functions in the last stage to remove those remaining complex nonface patterns that can not be rejected in the previous two stages. Experimental results show our detector achieves superior performance on the CMU+MIT frontal face dataset.

Keywords: Adaboost, Face detection, Feature selection, PSO

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2144