Search results for: Multi class Classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3495

3465 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels and shows that there are four types of orders, which are characterized by whether the labels of expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are also discussed: that data is also expressed by the higher set of labels, and that different sets of labels express different data.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels

Downloads: 1261
3464 Unsupervised Classification of DNA Barcodes Species Using Multi-Library Wavelet Networks

Authors: Abdesselem Dakhli, Wajdi Bellil, Chokri Ben Amar

Abstract:

DNA barcodes provide a good source of the information needed to classify living species. The classification problem has to be supported with reliable methods and algorithms. To analyze species regions or entire genomes, it becomes necessary to use sequence similarity methods. A large set of sequences can be compared simultaneously using Multiple Sequence Alignment, which is known to be NP-complete. However, all the methods in use are still computationally very expensive and require significant computational infrastructure. Our goal is to build predictive models that are highly accurate and interpretable. Our method avoids the complex problem of form and structure in different classes of organisms. The empirical data and their classification performances are compared with other methods. In this study, we present our system, which consists of three phases. The first, called transformation, is composed of three sub-steps: Electron-Ion Interaction Pseudopotential (EIIP) for the codification of DNA barcodes, Fourier Transform, and Power Spectrum Signal Processing. The second phase is an approximation, which is powered by Multi Library Wavelet Neural Networks (MLWNN). Finally, the third, called classification of DNA barcodes, is realized by applying a hierarchical classification algorithm.
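
The transformation phase described above can be sketched roughly as follows. This is only an illustration: the EIIP values are the commonly published ones for the four nucleotides (the abstract does not list the values it uses), the `barcode` string is a made-up fragment, and a naive discrete Fourier transform stands in for whatever implementation the authors used.

```python
import cmath

# Commonly published EIIP values per nucleotide (assumption: the abstract
# does not state which values the authors used).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(sequence):
    """Map a DNA string to its numeric EIIP signal."""
    return [EIIP[base] for base in sequence.upper()]


def power_spectrum(signal):
    """Naive O(n^2) discrete Fourier transform, then |X[k]|^2 per frequency."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        x_k = sum(
            signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(x_k) ** 2)
    return spectrum


barcode = "ACGTACGTTGCA"  # hypothetical fragment, not real barcode data
signal = eiip_encode(barcode)
spectrum = power_spectrum(signal)
```

The resulting `spectrum` is the kind of fixed numeric representation that a later approximation or classification stage could consume; for long sequences a fast Fourier transform would replace the naive loop.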

Keywords: DNA Barcode, Electron-Ion Interaction Pseudopotential, Multi Library Wavelet Neural Networks.

Downloads: 1924
3463 Genetic Folding: Analyzing the Mercer's Kernels Effect in Support Vector Machine using Genetic Folding

Authors: Mohd A. Mezher, Maysam F. Abbod

Abstract:

A new class of evolutionary algorithm (EA), named Genetic Folding (GF), is introduced for the first time. It is based on chromosomes composed of floating genes structurally organized in a parent form and separated by dots. The genotype/phenotype system of GF generates a kernel expression, which is the objective function of a superior classifier. In this work, the question of satisfying the mapping's rules in evolving populations is addressed by analyzing populations undergoing either Mercer's or non-Mercer's rules. The results presented here show that populations undergoing Mercer's rules improve, in practice, the model selection of Support Vector Machines (SVM). The experiment is trained on a multi-classification problem and tested on the nonlinear Ionosphere dataset. The aim of this paper is to answer the question of evolving Mercer's rule in SVM, using genetic folding that either does or does not satisfy the kernel's rules, applied to complicated domains and problems.

Keywords: Genetic Folding, GF, Evolutionary Algorithms, Support Vector Machine, Genetic Algorithm, Genetic Programming, Multi-Classification, Mercer's Rules

Downloads: 1575
3462 An Improvement of Multi-Label Image Classification Method Based on Histogram of Oriented Gradient

Authors: Ziad Abdallah, Mohamad Oueidat, Ali El-Zaart

Abstract:

Image Multi-label Classification (IMC) assigns a label or a set of labels to an image. The high demand for image annotation and archiving on the web has led researchers to develop many algorithms for this application domain. The existing techniques for IMC have two drawbacks: the description of the elementary characteristics of the image and the correlation between labels are not taken into account. In this paper, we present an algorithm (MIML-HOGLPP) that handles both limitations simultaneously. The algorithm uses the histogram of gradients as its feature descriptor. It applies the Label Priority Power-set as a multi-label transformation to solve the problem of label correlation. The experiment shows that the results of MIML-HOGLPP are better on some of the evaluation metrics compared with two existing techniques.
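
As background for the transformation step, the sketch below shows the plain label power-set transformation, in which each distinct set of labels becomes one class of a single-label problem. It is not the authors' Label Priority Power-set variant, whose details the abstract does not give; the function name and the example labels are hypothetical.

```python
def label_powerset(label_sets):
    """Turn multi-label targets into single-label class ids.

    Each distinct set of labels is treated as one class, so correlations
    between labels are kept implicitly in the class definition.
    """
    class_of = {}   # frozenset of labels -> class id
    targets = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)
        targets.append(class_of[key])
    return targets, class_of


# Hypothetical image annotations, for illustration only.
annotations = [{"sky", "sea"}, {"sky"}, {"sea", "sky"}, {"tree"}]
targets, class_of = label_powerset(annotations)
```

Any ordinary multi-class classifier can then be trained on `targets`; a predicted class id is mapped back to its label set through `class_of`.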

Keywords: Data mining, information retrieval system, multi-label, problem transformation, histogram of gradients.

Downloads: 1268
3461 Mapping Paddy Rice Agriculture using Multi-temporal FORMOSAT-2 Images

Authors: Yi-Shiang Shiu, Meng-Lung Lin, Kang-Tsung Chang, Tzu-How Chu

Abstract:

Most paddy rice fields in East Asia are small parcels, and the weather conditions during the growing season are usually cloudy. FORMOSAT-2 multi-spectral images have an 8-meter resolution and one-day recurrence, ideal for mapping paddy rice fields in East Asia. To map rice fields, this study first determined the transplanting and the most active tillering stages of paddy rice and then used multi-temporal images to distinguish different growing characteristics between paddy rice and other ground covers. The unsupervised ISODATA (iterative self-organizing data analysis techniques) and supervised maximum likelihood were both used to discriminate paddy rice fields, with training areas automatically derived from ten-year cultivation parcels in Taiwan. Besides original bands in multi-spectral images, we also generated normalized difference vegetation index and experimented with object-based pre-classification and post-classification. This paper discusses results of different image classification methods in an attempt to find a precise and automatic solution to mapping paddy rice in Taiwan.

Keywords: paddy rice fields, multi-temporal, FORMOSAT-2 images, normalized difference vegetation index, object-based classification.

Downloads: 1758
3460 Lithofacies Classification from Well Log Data Using Neural Networks, Interval Neutrosophic Sets and Quantification of Uncertainty

Authors: Pawalai Kraipeerapun, Chun Che Fung, Kok Wai Wong

Abstract:

This paper proposes a novel approach to the question of lithofacies classification based on an assessment of the uncertainty in the classification results. The proposed approach has multiple neural networks (NN), and interval neutrosophic sets (INS) are used to classify the input well log data into outputs of multiple classes of lithofacies. A pair of n-class neural networks are used to predict n-degree of truth memberships and n-degree of false memberships. Indeterminacy memberships or uncertainties in the predictions are estimated using a multidimensional interpolation method. These three memberships form the INS used to support the confidence in results of multiclass classification. Based on the experimental data, our approach improves the classification performance as compared to an existing technique applied only to the truth membership. In addition, our approach has the capability to provide a measure of uncertainty in the problem of multiclass classification.

Keywords: Multiclass classification, feed-forward backpropagation neural network, interval neutrosophic sets, uncertainty.

Downloads: 1595
3459 Rock Textures Classification Based on Textural and Spectral Features

Authors: Tossaporn Kachanubal, Somkait Udomhunsakul

Abstract:

In this paper, we propose a method to classify each type of natural rock texture. Our goal is to classify 26 classes of rock textures. First, we extract five features of each class by using principal component analysis combined with spatial frequency measurement. Next, the effective number of nodes of the neural network was tested. We used the most effective neural network in the classification process. The system yields a quite high recognition rate. It is shown that a high recognition rate can be achieved in the separation of 26 stone classes.

Keywords: Texture classification, SFM, neural network, rock texture classification.

Downloads: 1958
3458 A Content Vector Model for Text Classification

Authors: Eric Jiang

Abstract:

As a popular rank-reduced vector space approach, Latent Semantic Indexing (LSI) has been used in information retrieval and other applications. In this paper, an LSI-based content vector model for text classification is presented, which constructs multiple augmented category LSI spaces and classifies text by their content. The model integrates the class discriminative information from the training data and is equipped with several pertinent feature selection and text classification algorithms. The proposed classifier has been applied to email classification and its experiments on a benchmark spam testing corpus (PU1) have shown that the approach represents a competitive alternative to other email classifiers based on the well-known SVM and naïve Bayes algorithms.

Keywords: Feature Selection, Latent Semantic Indexing, Text Classification, Vector Space Model.

Downloads: 1844
3457 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

The problems arising from unbalanced data sets generally appear in real-world applications. Due to unequal class distribution, many researchers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors nonparametric discriminant analysis is a method that was proposed for classifying unbalanced classes with good performance. In this study, the methods of discriminant analysis are of interest in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification of class-imbalanced data of diabetes risk groups. Data from a project maintaining healthy conditions for 599 employees of a government hospital in Bangkok were obtained for the classification problem. The employees were divided into three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, including the variables of diabetes risk group, age, gender, blood glucose, and BMI, were analyzed and bootstrapped for 50 and 100 samples, 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was explored for departure from multivariate normality and for the equality of covariance matrices of the three risk groups. Both the original data and the bootstrap samples showed nonnormality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors discriminant function were performed over 50 and 100 bootstrap samples and applied to the original data. In searching for the optimal classification rule, the choices of prior probabilities were set up for both equal proportions (0.33:0.33:0.33) and unequal proportions of (0.90:0.05:0.05), (0.80:0.10:0.10) and (0.70:0.15:0.15).
The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k=3 or k=4 and the defined prior probabilities of non-risk:risk:diabetic as 0.90:0.05:0.05 or 0.80:0.10:0.10 gave the smallest misclassification error rate. The k-nearest neighbors approach would be suggested for classifying three-class imbalanced data of diabetes risk groups.

Keywords: Bootstrap, diabetes risk groups, error rate, k-nearest neighbors.

Downloads: 1965
3456 Feature Extraction for Surface Classification – An Approach with Wavelets

Authors: Smriti H. Bhandari, S. M. Deshpande

Abstract:

Surface metrology with image processing is a challenging task with wide applications in industry. Surface roughness can be evaluated using a texture classification approach. An important aspect here is the appropriate selection of features that characterize the surface. We propose an effective combination of features for multi-scale and multi-directional analysis of engineering surfaces. The features include standard deviation, kurtosis and the Canny edge detector. We apply the method by analyzing the surfaces with the Discrete Wavelet Transform (DWT) and the Dual-Tree Complex Wavelet Transform (DT-CWT). We used the Canberra distance metric for similarity comparison between the surface classes. Our database includes surface textures manufactured by three machining processes, namely Milling, Casting and Shaping. The comparative study shows that DT-CWT outperforms DWT, giving a correct classification performance of 91.27% with the Canberra distance metric.
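
For reference, the Canberra distance mentioned above is the sum over features of |p_i - q_i| / (|p_i| + |q_i|). The sketch below computes it and assigns a query to the nearest class prototype; the feature vectors and the `nearest_class` helper are hypothetical and are not taken from the paper.

```python
def canberra_distance(p, q):
    """Canberra distance: sum of |p_i - q_i| / (|p_i| + |q_i|).

    Terms where both coordinates are zero contribute nothing (0/0 is
    treated as 0, the usual convention).
    """
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def nearest_class(query, prototypes):
    """Return the class whose prototype vector is closest to the query."""
    return min(prototypes, key=lambda name: canberra_distance(query, prototypes[name]))


# Hypothetical (standard deviation, kurtosis, edge density) prototypes.
prototypes = {
    "Milling": [0.12, 3.1, 0.40],
    "Casting": [0.30, 2.2, 0.15],
    "Shaping": [0.20, 4.0, 0.55],
}
label = nearest_class([0.13, 3.0, 0.42], prototypes)
```

Because every term is normalized by the magnitude of its own coordinates, features on very different scales contribute comparably, which is one reason the metric suits mixed texture features.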

Keywords: Dual-tree complex wavelet transform, surface metrology, surface roughness, texture classification.

Downloads: 2201
3455 Autonomously Determining the Parameters for SVDD with RBF Kernel from a One-Class Training Set

Authors: Andreas Theissler, Ian Dear

Abstract:

The one-class support vector machine “support vector data description” (SVDD) is an ideal approach for anomaly or outlier detection. However, for the applicability of SVDD in real-world applications, the ease of use is crucial. The results of SVDD are massively determined by the choice of the regularisation parameter C and the kernel parameter of the widely used RBF kernel. While for two-class SVMs the parameters can be tuned using cross-validation based on the confusion matrix, for a one-class SVM this is not possible, because only true positives and false negatives can occur during training. This paper proposes an approach to find the optimal set of parameters for SVDD solely based on a training set from one class and without any user parameterisation. Results on artificial and real data sets are presented, underpinning the usefulness of the approach.

Keywords: Support vector data description, anomaly detection, one-class classification, parameter tuning.

Downloads: 2887
3454 Performance Analysis of Artificial Neural Network Based Land Cover Classification

Authors: Najam Aziz, Nasru Minallah, Ahmad Junaid, Kashaf Gul

Abstract:

Landcover classification using automated classification techniques, while employing remotely sensed multi-spectral imagery, is one of the promising areas of research. Different land conditions at different times are captured by satellite and monitored by applying different classification algorithms in a specific environment. In this paper, a SPOT-5 image provided by SUPARCO has been studied and classified in Environment for Visual Interpretation (ENVI), a tool widely used in remote sensing. Then, the Artificial Neural Network (ANN) classification technique is used to detect land cover changes in Abbottabad district. The results obtained are compared with a pixel-based distance classifier. The results show that ANN gives a better overall accuracy of 99.20% and a Kappa coefficient of 0.98, compared with the Mahalanobis Distance Classifier.
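
Overall accuracy and the Kappa coefficient quoted above are both derived from a confusion matrix. The following is a minimal sketch of that computation; the two-class matrix is invented for illustration and has no connection to the SPOT-5 results.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (overall accuracy) and p_e is the
    agreement expected by chance from the row and column totals.
    """
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    p_o = overall_accuracy(confusion)
    p_e = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical matrix: rows are reference classes, columns are predictions.
matrix = [[45, 5],
          [5, 45]]
accuracy = overall_accuracy(matrix)   # 90 of 100 samples correct
kappa = cohen_kappa(matrix)           # about 0.8 once chance agreement is removed
```

Kappa is lower than accuracy here because half of the agreement would be expected by chance with these balanced totals.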

Keywords: Landcover classification, artificial neural network, remote sensing, SPOT-5.

Downloads: 1545
3453 Teaching Approach and Self-Confidence Effect Model Consistency between Taiwan and Singapore Multi-Group HLM

Authors: PeiWen Liao, Tsung Hau Jen

Abstract:

This study was conducted to compare the effects of two country models, for Taiwan and Singapore, using the TIMSS database. The researchers used Multi-Group Hierarchical Linear Modeling techniques to compare the effects of the two country models, and tested the hypotheses on 4,046 Taiwan students and 4,599 Singapore students in 2007 at two levels: the class level and the student (individual) level. Design quality is a class-level variable. Student-level variables are achievement and self-confidence. The results challenge the widely held view that retention has a positive impact on self-confidence. Suggestions for future research are discussed.

Keywords: Teaching approach, self-confidence, achievement, multi-group HLM.

Downloads: 1796
3452 Using Genetic Programming to Evolve a Team of Data Classifiers

Authors: Gregor A. Morrison, Dominic P. Searson, Mark J. Willis

Abstract:

The purpose of this paper is to demonstrate the ability of a genetic programming (GP) algorithm to evolve a team of data classification models. The GP algorithm used in this work is “multigene” in nature, i.e. there are multiple tree structures (genes) that are used to represent team members. Each team member assigns a data sample to one of a fixed set of output classes. A majority vote, determined using the mode (highest occurrence) of classes predicted by the individual genes, is used to determine the final class prediction. The algorithm is tested on a binary classification problem. For the case study investigated, compact classification models are obtained with comparable accuracy to alternative approaches.
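
The voting step described above can be illustrated with a short sketch. The three team members here are hand-written threshold rules standing in for evolved genes, and breaking ties in favour of the class seen first is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the mode of the predicted classes.

    Assumption: on a tie, the class that appears first among the tied
    classes wins (Counter keeps first-seen order for equal counts).
    """
    return Counter(predictions).most_common(1)[0][0]


def team_predict(members, sample):
    """Ask every team member for a class and combine them by majority vote."""
    return majority_vote([member(sample) for member in members])


# Hypothetical stand-ins for evolved genes on a binary problem.
team = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(x[0] + x[1] > 1.0),
]

prediction = team_predict(team, (0.9, 0.2))  # votes are 1, 0, 1
```

Using an odd number of members on a binary problem avoids ties altogether, which keeps the final prediction independent of the tie-breaking rule.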

Keywords: classification, genetic programming.

Downloads: 1741
3451 Using PFA in Feature Analysis and Selection for H.264 Adaptation

Authors: Nora A. Naguib, Ahmed E. Hussein, Hesham A. Keshk, Mohamed I. El-Adawy

Abstract:

Classification of video sequences based on their contents is a vital process for adaptation techniques. It helps decide which adaptation technique best fits the resource reduction requested by the client. In this paper we used the principal feature analysis algorithm to select a reduced subset of video features. The main idea is to select only one feature from each class based on the similarities between the features within that class. Our results showed that using this feature reduction technique the source video features can be completely omitted from future classification of video sequences.

Keywords: Adaptation, feature selection, H.264, Principal Feature Analysis (PFA)

Downloads: 1567
3450 Classification Influence Index and its Application for k-Nearest Neighbor Classifier

Authors: Sejong Oh

Abstract:

Classification is an important topic in machine learning and bioinformatics. Many datasets have been introduced for classification tasks. A dataset contains multiple features, and the quality of features influences the classification accuracy of the dataset. The power of classification for each feature differs. In this study, we suggest the Classification Influence Index (CII) as an indicator of classification power for each feature. CII enables evaluation of the features in a dataset and improved classification accuracy by transformation of the dataset. By conducting experiments using CII and the k-nearest neighbor classifier to analyze real datasets, we confirmed that the proposed index provided meaningful improvement of the classification accuracy.

Keywords: accuracy, classification, dataset, data preprocessing

Downloads: 1451
3449 Investigation of Wave Atom Sub-Bands via Breast Cancer Classification

Authors: Nebi Gedik, Ayten Atasoy

Abstract:

This paper investigates successful sub-bands of wave atom transform via classification of mammograms, when the coefficients of sub-bands are used as features. A computer-aided diagnosis system is constructed by using wave atom transform, support vector machine and k-nearest neighbor classifiers. Two-class classification is studied in detail using two data sets, separately. The successful sub-bands are determined according to the accuracy rates, coefficient numbers, and sensitivity rates.

Keywords: Breast cancer, wave atom transform, SVM, k-NN.

Downloads: 1019
3448 Suspended Matter Model on Alsat-1 Image by MLP Network and Mathematical Morphology: Prototypes by K-Means

Authors: S. Loumi, H. Merrad, F. Alilat, B. Sansal

Abstract:

In this article, we propose a methodology for the characterization of suspended matter along the bay of Algiers. An approach using a multilayer perceptron (MLP), trained by gradient backpropagation optimized with the Levenberg-Marquardt (LM) algorithm, is used. Emphasis was placed on the choice of the components of the training set, for which a comparative study was made of four methods: random selection and three variants of classification by K-Means. The samples are taken from a suspended matter image, obtained by an analytical model based on polynomial regression that takes account of in situ measurements. The mask that selects the zone of interest (water in our case) was produced using a multispectral classification by the ISODATA algorithm. To improve the classification result, this mask was cleaned using the tools of mathematical morphology. The results of this study, presented in the form of curves, tables and images, show the soundness of our methodology.

Keywords: Classification K-means, mathematical morphology, neural network MLP, remote sensing, suspended particulate matter

Downloads: 1483
3447 A Variable Structure MRAC for a Class of MIMO Systems

Authors: Ardeshir Karami Mohammadi

Abstract:

A Variable Structure Model Reference Adaptive Controller using state variables is proposed for a class of multi-input multi-output systems. The adaptation law is of variable structure type, and the switching functions are designed based on stability requirements. Global exponential stability is proved based on the Lyapunov criterion. Transient behavior is analyzed using sliding mode control and shows perfect model following in finite time.

Keywords: Adaptive control, Model reference, Variable structure, MIMO system.

Downloads: 1532
3446 Genetic Programming Approach for Multi-Category Pattern Classification Applied to Network Intrusions Detection

Authors: K.M. Faraoun, A. Boukelif

Abstract:

This paper describes a new approach to classification using genetic programming. The proposed technique consists of genetically coevolving a population of non-linear transformations on the input data to be classified, and mapping them to a new space with a reduced dimension, in order to obtain maximum inter-class discrimination. The classification of new samples is then performed on the transformed data, and so becomes much easier. Contrary to existing GP-classification techniques, the proposed one uses a dynamic repartition of the transformed data into separate intervals; the efficacy of a given interval repartition is handled by the fitness criterion, with maximum class discrimination. Experiments were first performed using Fisher's Iris dataset, and then the KDD-99 Cup dataset was used to study the intrusion detection and classification problem. The results obtained demonstrate that the proposed genetic approach outperforms the existing GP-classification methods [1],[2] and [3], and gives very acceptable results compared to other existing techniques proposed in [4],[5],[6],[7] and [8].

Keywords: Genetic programming, patterns classification, intrusion detection

Downloads: 1665
3445 An Attribute-Centre Based Decision Tree Classification Algorithm

Authors: Gökhan Silahtaroğlu

Abstract:

Decision tree algorithms have a very important place in the classification models of data mining. In the literature, algorithms use the entropy concept or the Gini index to form the tree. The shape of the classes and their closeness to each other are some of the factors that affect the performance of the algorithm. In this paper we introduce a new decision tree algorithm which employs a data (attribute) folding method and the variation of the class variables over the branches to be created. A comparative performance analysis has been carried out between the proposed algorithm and C4.5.
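
The two conventional split criteria named above, entropy and the Gini index, can be computed as in the sketch below. It illustrates only these standard measures, not the attribute-folding algorithm the paper proposes; the label lists are made up.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the class proportions."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index: 1 - sum(p^2) over the class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_impurity(branches, measure):
    """Impurity of a split: branch impurities weighted by branch size."""
    total = sum(len(branch) for branch in branches)
    return sum(len(branch) / total * measure(branch) for branch in branches)


# Hypothetical node with two equally frequent classes, and a perfect split.
node = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]
gain = entropy(node) - weighted_impurity(split, entropy)
```

A tree builder would evaluate `gain` (or the corresponding drop in Gini index) for every candidate split and keep the largest.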

Keywords: Classification, decision tree, split, pruning, entropy, gini.

Downloads: 1330
3444 Learning of Class Membership Values by Ellipsoidal Decision Regions

Authors: Leehter Yao, Chin-Chin Lin

Abstract:

A novel method of learning complex fuzzy decision regions in the n-dimensional feature space is proposed. Through the fuzzy decision regions, a given pattern's class membership value for every class is determined, instead of the conventional crisp class the pattern belongs to. The n-dimensional fuzzy decision region is approximated by a union of hyperellipsoids. By explicitly parameterizing these hyperellipsoids, the decision regions are determined by estimating the parameters of each hyperellipsoid. A Genetic Algorithm (GA) is applied to estimate the parameters of each region component. With the global optimization ability of the GA, the learned decision region can be arbitrarily complex.

Keywords: Ellipsoid, genetic algorithm, decision regions, classification.

Downloads: 1394
3443 Multi-Sensor Target Tracking Using Ensemble Learning

Authors: Bhekisipho Twala, Mantepu Masetshaba, Ramapulana Nkoana

Abstract:

Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier, and if so, what form of multiple classifier system yields the most significant benefit. Also, multi-target tracking detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifier systems are evaluated in terms of their ability to predict a system’s failure or success for multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfil the requirements of target tracking due to improved sensor classification performance, with multiple classifier systems constructed using boosting achieving higher accuracy rates.

Keywords: Single classifier, machine learning, ensemble learning, multi-sensor target tracking.

Downloads: 538
3442 A New Classification of Risk-Reduction Options to Improve the Risk-Reduction Readiness of the Railway Industry

Authors: Eberechi Weli, Michael Todinov

Abstract:

The gap between the selection of risk-reduction options in the railway industry and the task of their effective implementation results in compromised safety and substantial losses. An effective risk management must necessarily integrate the evaluation phases with the implementation phase. This paper proposes an essential categorisation of risk reduction measures that best addresses a standard railway industry portfolio. By categorising the risk reduction options into design, operational, procedural and technical options, it is guaranteed that the efforts of the implementation facilitators (people, processes and supporting systems) are systematically harmonised. The classification is based on an integration of fundamental principles of risk reduction in the railway industry with the systems engineering approach.

This paper argues that the use of a similar classification approach is an attribute of organisations possessing a superior level of risk-reduction readiness. The integration of the proposed rational classification structure provides a solid ground for effective risk reduction.

Keywords: Cost effectiveness, organisational readiness, risk reduction, railway, system engineering.

Downloads: 1760
3441 Analytical Authentication of Butter Using Fourier Transform Infrared Spectroscopy Coupled with Chemometrics

Authors: M. Bodner, M. Scampicchio

Abstract:

Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the content of margarine in adulterated butter samples was investigated. The fingerprint region (1400-800 cm⁻¹) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprint region shows a clustering of the two sample types. All samples were classified in their rightful class by the SIMCA approach; however, nine adulterated samples (between 1% and 30% w/w of margarine) were classified as belonging to both the butter class and the non-butter class. For the two-class PLS-DA model (R² = 0.73, Root Mean Square Error of Prediction, RMSEP = 0.26% w/w), sensitivity was 71.4% and Positive Predictive Value (PPV) was 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, a PLS-R model (R² = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. Results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to distinguish pure butter samples from adulterated ones and to determine the degree of adulteration with margarine in butter samples.
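
Sensitivity and Positive Predictive Value, as reported above for the PLS-DA model, are simple ratios of confusion-matrix counts. The sketch below shows the definitions; the counts are invented for illustration, because the abstract reports only the resulting percentages.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual positives that the model detects: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Share of positive calls that are correct: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, not taken from the paper.
tp, fn, fp = 8, 2, 1
sens = sensitivity(tp, fn)                 # 8 of 10 positives detected
ppv = positive_predictive_value(tp, fp)    # 8 of 9 positive calls correct
```

A PPV of 100% with a sensitivity below 100%, as in the abstract, means the model raised no false alarms but missed some of the positive samples.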

Keywords: Adulterated butter, margarine, PCA, PLS-DA, PLS-R, SIMCA.

Downloads: 719
3440 Classifying Biomedical Text Abstracts based on Hierarchical 'Concept' Structure

Authors: Rozilawati Binti Dollah, Masaki Aono

Abstract:

Classifying biomedical literature is a difficult and challenging task, especially when a large number of biomedical articles should be organized into a hierarchical structure. In this paper, we present an approach for classifying a collection of biomedical text abstracts downloaded from the Medline database with the help of ontology alignment. To accomplish our goal, we construct two types of hierarchies, the OHSUMED disease hierarchy and the Medline abstract disease hierarchies, from the OHSUMED dataset and the Medline abstracts, respectively. Then, we enrich the OHSUMED disease hierarchy before adapting it to the ontology alignment process for finding probable concepts or categories. Subsequently, we compute the cosine similarity between the vector in probable concepts (in the “enriched” OHSUMED disease hierarchy) and the vector in the Medline abstract disease hierarchies. Finally, we assign a category to the new Medline abstracts based on the similarity score. The results obtained from the experiments show that the performance of our proposed approach for hierarchical classification is slightly better than the performance of multi-class flat classification.
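
The final two steps above, scoring by cosine similarity and assigning the best-scoring category, can be sketched as follows. The term-weight vectors and category names are hypothetical, and picking the single highest score is an assumption about how the similarity score is used.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_category(abstract_vector, concept_vectors):
    """Return the concept whose vector is most similar to the abstract's vector."""
    return max(
        concept_vectors,
        key=lambda name: cosine_similarity(abstract_vector, concept_vectors[name]),
    )


# Hypothetical term-weight vectors over a shared three-term vocabulary.
concepts = {
    "Neoplasms": [0.9, 0.1, 0.0],
    "Virus Diseases": [0.0, 0.2, 0.9],
}
category = assign_category([0.8, 0.3, 0.1], concepts)
```

Cosine similarity ignores vector length, so a long abstract and a short one with the same term proportions receive the same score against a concept.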

Keywords: Biomedical literature, hierarchical text classification, ontology alignment, text mining.

Downloads 1974
3439 Analysis of Textual Data Based On Multiple 2-Class Classification Models

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described from various viewpoints. The method acquires 2-class classification models of the viewpoints by applying an inductive learning method to items with multiple viewpoints. Using the models, the method infers whether each viewpoint should be assigned to a new item. The method then extracts expressions from the new items classified into the viewpoints and identifies characteristic expressions for each viewpoint by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.
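
The frequency-comparison step described above can be illustrated with a short sketch. It assumes the items have already been assigned to viewpoints (for example by the 2-class models) and that each item is a list of extracted expressions. The scoring rule (relative frequency inside the viewpoint minus relative frequency in all other viewpoints) is an assumption made for illustration; the abstract does not specify the actual measure.

```python
from collections import Counter


def characteristic_expressions(items_by_viewpoint: dict, top_k: int = 3) -> dict:
    """Rank expressions per viewpoint by how much more frequent they are there."""
    counts = {
        vp: Counter(expr for item in items for expr in item)
        for vp, items in items_by_viewpoint.items()
    }
    result = {}
    for vp, counter in counts.items():
        others = [c for other, c in counts.items() if other != vp]
        total = sum(counter.values()) or 1
        other_total = sum(sum(c.values()) for c in others) or 1
        scores = {
            expr: n / total - sum(c[expr] for c in others) / other_total
            for expr, n in counter.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda e: (-scores[e], e))
        result[vp] = ranked[:top_k]
    return result


# Toy questionnaire items (hypothetical), already classified into viewpoints.
items_by_viewpoint = {
    "room": [["clean", "bed", "staff"], ["bed", "quiet"]],
    "service": [["staff", "friendly"], ["staff", "slow"], ["staff", "clean"]],
}
print(characteristic_expressions(items_by_viewpoint, top_k=2))
```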

Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data

Downloads 1255
3438 Empirical Exploration for the Correlation between Class Object-Oriented Connectivity-Based Cohesion and Coupling

Authors: Jehad Al Dallal

Abstract:

Attributes and methods are the basic contents of an object-oriented class. The connectivity among these class members and the relationship between the class and other classes play an important role in determining the quality of an object-oriented system. Class cohesion evaluates the degree of relatedness of class attributes and methods, whereas class coupling refers to the degree to which a class is related to other classes. Researchers have proposed several class cohesion and class coupling measures. However, the correlation between class coupling and class cohesion measures has not been thoroughly studied. In this paper, using classes of three open-source Java systems, we empirically investigate the correlation between several measures of connectivity-based class cohesion and coupling. Four connectivity-based cohesion measures and eight coupling measures are considered in the empirical study. The empirical study results show that class connectivity-based cohesion and coupling internal quality attributes are inversely correlated. The strength of the correlation depends highly on the cohesion and coupling measurement approaches.
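
To make the idea of an inverse correlation between a cohesion measure and a coupling measure concrete, the sketch below computes a Spearman rank correlation over per-class metric values. The choice of Spearman correlation and the metric values are assumptions for illustration only; the abstract does not state which correlation statistic or data were used.

```python
import math


def ranks(values):
    """Average ranks (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-class metric values: cohesion rises as coupling falls.
cohesion = [0.2, 0.4, 0.6, 0.9]
coupling = [8, 6, 4, 1]
print(spearman(cohesion, coupling))
```

A value close to -1 indicates that classes with higher cohesion tend to have lower coupling, which is the direction of the relationship reported in the abstract.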

Keywords: Object-oriented class, software quality, class cohesion measure, class coupling measure.

Downloads 2351
3437 Using Self Organizing Feature Maps for Classification in RGB Images

Authors: Hassan Masoumi, Ahad Salimi, Nazanin Barhemmat, Babak Gholami

Abstract:

Artificial neural networks have attracted considerable interest as empirical models because of their powerful representational capacity and their multi-input, multi-output mapping characteristics. In fact, most feedforward networks with nonlinear nodal functions have been proved to be universal approximators. In this paper, we propose a new supervised method for color image classification based on self-organizing feature maps (SOFM). This algorithm is based on competitive learning. The method partitions the input space using self-organizing feature maps to introduce the concept of local neighborhoods. Our image classification system takes RGB images as input. Experiments with simulated data showed that the separability of classes increased with training time. In addition, the results show that the proposed algorithm is effective for color image classification.
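
The competitive-learning step with local neighborhoods mentioned above can be sketched as a single update of a one-dimensional self-organizing map on an RGB input. This is a generic textbook-style SOFM update, not the authors' algorithm: the 1-D map layout, the Gaussian neighborhood, and the parameter values are assumptions, and the supervised part of the proposed method is not shown because the abstract does not describe it.

```python
import math


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to input x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def som_update(weights, x, lr, radius):
    """One competitive-learning step: pull the winner and its neighbors toward x."""
    bmu = best_matching_unit(weights, x)
    updated = []
    for i, w in enumerate(weights):
        # Gaussian neighborhood on the 1-D map: 1.0 at the winner, decaying with distance.
        h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
        updated.append([wi + lr * h * (xi - wi) for wi, xi in zip(w, x)])
    return updated


# Three map nodes with RGB weights in [0, 1], and one red input pixel.
weights = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]
pixel = [1.0, 0.0, 0.0]
weights = som_update(weights, pixel, lr=0.5, radius=1.0)
print(weights)
```

Repeating this step over many pixels while shrinking `lr` and `radius` is the usual way such a map is trained.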

Keywords: Classification, SOFM, neural network, RGB images.

Downloads 2275
3436 Night-Time Traffic Light Detection Based On SVM with Geometric Moment Features

Authors: Hyun-Koo Kim, Young-Nam Shin, Sa-gong Kuk, Ju H. Park, Ho-Youl Jung

Abstract:

This paper presents an effective method for detecting traffic lights at night-time. First, candidate blobs of traffic lights are extracted from the RGB color image. The input image is represented in the dominant color domain using the color transform proposed by Ruta, and red- and green-dominant regions are selected as candidates. After candidate blob selection, we apply a shape filter for noise reduction using blob information such as length, area, and bounding-box area. A multi-class classifier based on a Support Vector Machine (SVM) is then applied to the candidates. Three kinds of features are used. We use basic features such as blob width, height, center coordinate, area, and area of blob. Brightness-based stochastic features are also used. In particular, geometric moment values between the candidate region and the adjacent region are proposed and used to improve detection performance. The proposed system is implemented on an Intel Core CPU at 2.80 GHz with 4 GB RAM and tested with urban and rural road videos. The tests show that the proposed method using PF, BMF, and GMF reaches a detection rate of up to 93% with an average computation time of 15 ms/frame.
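
The shape-filter step described above can be sketched as a simple rule over blob geometry. This is a minimal illustration, not the authors' filter: the `Blob` type, the function name, and all thresholds (`min_area`, `max_aspect`, `min_fill`) are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Blob:
    width: int   # bounding-box width in pixels
    height: int  # bounding-box height in pixels
    area: int    # number of pixels belonging to the blob


def passes_shape_filter(blob: Blob, min_area: int = 9,
                        max_aspect: float = 1.5, min_fill: float = 0.5) -> bool:
    """Keep blobs that are large enough, roughly square, and fill their bounding box."""
    if blob.width <= 0 or blob.height <= 0:
        return False
    if blob.area < min_area:
        return False  # too small, likely noise
    aspect = max(blob.width, blob.height) / min(blob.width, blob.height)
    fill = blob.area / (blob.width * blob.height)  # blob area over bounding-box area
    return aspect <= max_aspect and fill >= min_fill


candidates = [Blob(10, 10, 80), Blob(30, 5, 100), Blob(2, 2, 4)]
kept = [b for b in candidates if passes_shape_filter(b)]
print(kept)
```

Only blobs that pass such a filter would then be handed to the SVM classifier.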

Keywords: Night-time traffic light detection, multi-class classification, driving assistance system.

Downloads 3826