Search results for: naïve Bayesian classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1251

681 Establishment of Air Quality Zones in Italy

Authors: M. G. Dirodi, G. Gugliotta, C. Leonardi

Abstract:

Member States shall establish zones and agglomerations throughout their territory to assess and manage air quality in order to comply with European directives. In Italy, Decree 155/2010, transposing Directive 2008/50/EC on ambient air quality and cleaner air for Europe, merged into a single act the previous provisions on ambient air quality assessment and management, including those resulting from the implementation of Directive 2004/107/EC relating to arsenic, cadmium, nickel, mercury and polycyclic aromatic hydrocarbons in ambient air. Decree 155/2010 introduced stricter rules for identifying zones on the basis of the characteristics of the territory instead of pollution levels, as was done in the past. The implementation of these new criteria has reduced the great variability of the previous zoning, leading to a significant reduction in the total number of zones and to complete and uniform ambient air quality assessment and management throughout the country. The present document concerns the new definition of zones in Italy according to Decree 155/2010. In particular, the paper describes and analyzes the outcome of zoning and classification.

Keywords: Zones, agglomerations, air quality assessment, classification.

Downloads: 2117
680 Speaker Identification by Atomic Decomposition of Learned Features Using Computational Auditory Scene Analysis Principals in Noisy Environments

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

Speaker recognition is performed in high Additive White Gaussian Noise (AWGN) environments using principles of Computational Auditory Scene Analysis (CASA). CASA methods often classify sounds from images in the time-frequency (T-F) plane using spectrograms or cochleagrams as the image. In this paper, atomic decomposition implemented by matching pursuit performs a transform from time series speech signals to the T-F plane. The atomic decomposition creates a sparsely populated T-F vector in “weight space” where each populated T-F position contains an amplitude weight. The weight space vector along with the atomic dictionary represents a denoised, compressed version of the original signal. The arrangement of the atomic indices in the T-F vector is used for classification. Unsupervised feature learning implemented by a sparse autoencoder learns a single dictionary of basis features from a collection of envelope samples from all speakers. The approach is demonstrated using pairs of speakers from the TIMIT data set. Pairs of speakers are selected randomly from a single district. Each speaker has 10 sentences. Two are used for training and 8 for testing. Atomic index probabilities are created for each training sentence and also for each test sentence. Classification is performed by finding the lowest Euclidean distance between the probabilities from the training sentences and the test sentences. Training is done at a 30 dB Signal-to-Noise Ratio (SNR). Testing is performed at SNRs of 0 dB, 5 dB, 10 dB and 30 dB. The algorithm has a baseline classification accuracy of ~93% averaged over 10 pairs of speakers from the TIMIT data set. The baseline accuracy is attributable to short sequences of training and test data as well as the overall simplicity of the classification algorithm. The accuracy is not affected by AWGN: the algorithm still achieves ~93% accuracy at 0 dB SNR.
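
The final classification step described in this abstract (lowest Euclidean distance between atomic index probabilities) can be illustrated with a minimal Python sketch. The function names, the toy index sequences and the dictionary size are assumptions made for illustration; the matching pursuit and sparse autoencoder stages are not reproduced here.

```python
import math
from collections import Counter

def index_probabilities(indices, dictionary_size):
    """Relative frequency of each atomic index in one sentence (hypothetical helper)."""
    if not indices:
        raise ValueError("need at least one atomic index")
    counts = Counter(indices)
    return [counts.get(i, 0) / len(indices) for i in range(dictionary_size)]

def euclidean(p, q):
    """Euclidean distance between two equal-length probability vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def classify(test_probs, training):
    """Return the speaker label of the closest training sentence.

    `training` is a list of (speaker_label, probability_vector) pairs.
    """
    label, _ = min(training, key=lambda item: euclidean(test_probs, item[1]))
    return label

# Toy data: two speakers, a dictionary of four atoms (illustrative only).
training = [
    ("speaker_a", index_probabilities([0, 0, 1, 2], 4)),
    ("speaker_b", index_probabilities([3, 3, 2, 3], 4)),
]
test_sentence = index_probabilities([0, 1, 0, 0], 4)
predicted = classify(test_sentence, training)
```

In this toy setup the test sentence shares most of its atoms with the first speaker, so the nearest training vector belongs to `speaker_a`.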

Keywords: Time-frequency plane, atomic decomposition, envelope sampling, Gabor atoms, matching pursuit, sparse dictionary learning, sparse autoencoder.

Downloads: 1547
679 An Intelligent Human-Computer Interaction System for Decision Support

Authors: Chee Siong Teh, Chee Peng Lim

Abstract:

This paper proposes a novel architecture for developing decision support systems. Unlike conventional decision support systems, the proposed architecture endeavors to reveal the decision-making process such that human subjectivity can be incorporated into a computerized system while, at the same time, preserving the capability of the computerized system to process information objectively. A number of techniques used in developing the decision support system are elaborated to make the decision-making process transparent. These include procedures for high-dimensional data visualization, pattern classification, prediction, and evolutionary computational search. An artificial data set is first employed to compare the proposed approach with other methods. A simulated handwritten data set and a real data set on liver disease diagnosis are then employed to evaluate the efficacy of the proposed approach. The results are analyzed and discussed. The potential of the proposed architecture as a useful decision support system is demonstrated.

Keywords: Interactive evolutionary computation, multivariate data projection, pattern classification, topographic map.

Downloads: 1439
678 Multiple Targets Classification and Fuzzy Logic Decision Fusion in Wireless Sensor Networks

Authors: Ahmad Aljaafreh

Abstract:

This paper proposes a hierarchical hidden Markov model (HHMM) to model the detection of M vehicles in a wireless sensor network (WSN). The HHMM contains an extra level of hidden Markov model to model the temporal transitions of each state of the first HMM. By modeling the temporal transitions, only those hypotheses with nonzero transition probabilities need to be tested. Thus, this method efficiently reduces the computation load, which is preferable in WSN applications. This paper integrates several techniques to optimize the detection performance. The output of the states of the first HMM is modeled as a Gaussian Mixture Model (GMM), where the number of states and the number of Gaussians are experimentally determined, while the other parameters are estimated using Expectation Maximization (EM). The HHMM is used to model the sequence of the local decisions, which are based on multiple hypothesis testing with a maximum likelihood approach. The states in the HHMM represent various combinations of vehicles of different types. Due to the statistical advantages of multisensor data fusion, we propose a heuristic based on fuzzy weighted majority voting to enhance cooperative classification of moving vehicles within a region that is monitored by a wireless sensor network. A fuzzy inference system weighs each local decision based on the signal-to-noise ratio of the acoustic signal for target detection and the signal-to-noise ratio of the radio signal for sensor communication. The spatial correlation among the observations of neighboring sensor nodes is efficiently utilized, as well as the temporal correlation. Simulation results demonstrate the efficiency of this scheme.
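
As a rough illustration of the weighted majority voting idea in this abstract, the sketch below weighs each local decision by a simple fuzzy-style weight derived from the two signal-to-noise ratios. The piecewise-linear membership function, its 0 dB and 30 dB breakpoints, and the use of `min` as the fuzzy AND are assumptions standing in for the paper's fuzzy inference system, which the abstract does not specify.

```python
from collections import defaultdict

def high_snr_membership(snr_db, low=0.0, high=30.0):
    """Degree (0..1) to which an SNR counts as 'high'; breakpoints are assumed."""
    if snr_db <= low:
        return 0.0
    if snr_db >= high:
        return 1.0
    return (snr_db - low) / (high - low)

def fuse_decisions(decisions):
    """Weighted majority vote over local decisions.

    `decisions` is a non-empty list of (label, acoustic_snr_db, radio_snr_db).
    Each vote is weighted by the minimum of the two memberships (fuzzy AND).
    """
    scores = defaultdict(float)
    for label, acoustic_snr, radio_snr in decisions:
        weight = min(high_snr_membership(acoustic_snr),
                     high_snr_membership(radio_snr))
        scores[label] += weight
    return max(scores, key=scores.get)

# Three sensor nodes report; two noisy nodes say "car", one clean node says "truck".
local_decisions = [
    ("truck", 25.0, 20.0),
    ("car", 5.0, 28.0),
    ("car", 8.0, 10.0),
]
fused = fuse_decisions(local_decisions)
```

With these toy numbers the single high-SNR vote outweighs the two low-SNR votes, which is the behavior an unweighted majority vote would not give.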

Keywords: Classification, decision fusion, fuzzy logic, hidden Markov model

Downloads: 6231
677 Justification and Classification of Issues for the Selection and Implementation of Advanced Manufacturing Technologies

Authors: Zahra Banakar, Farzad Tahriri

Abstract:

It has often been said that the strength of any country resides in the strength of its industrial sector, and progress in industrial society has been accomplished by the creation of new technologies. Developments have been facilitated by the increasing availability of advanced manufacturing technology (AMT). In addition, the implementation of AMT requires careful planning at all levels of the organization to ensure that the implementation will achieve the intended goals. Justification and implementation of AMT involve decisions that are crucial for practitioners regarding the survival of business in today's uncertain manufacturing world. This paper helps industrial managers consider all the important criteria for successful AMT implementation when purchasing new technology. In addition, this paper classifies the tangible benefits of a technology, which are evaluated by addressing both cost and time dimensions, and the intangible benefits, which are evaluated by addressing technological, strategic, social and human issues, in order to identify and create awareness of the essential elements in the AMT implementation process and to identify the necessary actions before implementing AMT.

Keywords: Advanced Manufacturing Technology (AMT), Justification and Classification.

Downloads: 2513
676 The Integrated Management of Health Care Strategies and Differential Diagnosis by Expert System Technology: A Single-Dimensional Approach

Authors: A. B. Adehor, P. R. Burrell

Abstract:

The Integrated Management of Child Illnesses (IMCI) and the surveillance Health Information Systems (HIS) are related strategies that are designed to manage child illnesses and community practices of diseases. However, the two strategies do not function well together because of classification incompatibilities and, as such, are difficult for health care personnel to use in rural areas, where a majority of people lack the basic knowledge needed to interpret disease classifications from these methods. This paper discusses a single approach in which a stand-alone expert system can be used as a prompt diagnostic tool for all cases of illness presented. The system combines the action-oriented IMCI and the disease-oriented HIS approaches to diagnose malaria and typhoid fever in the rural areas of the Niger Delta region.

Keywords: Differential diagnosis, Health Information System (HIS), Integrated Management of Child Illnesses (IMCI), Malaria and Typhoid fever.

Downloads: 1850
675 Multi-Sensor Target Tracking Using Ensemble Learning

Authors: Bhekisipho Twala, Mantepu Masetshaba, Ramapulana Nkoana

Abstract:

Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier and, if so, what form of multiple classifier system yields the most significant benefit. Also, multi-target tracking detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifier systems are evaluated in terms of their ability to predict a system’s failure or success for multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfil the requirements of target tracking due to improved sensor classification performance, with multiple classifier systems constructed using boosting achieving higher accuracy rates.

Keywords: Single classifier, machine learning, ensemble learning, multi-sensor target tracking.

Downloads: 575
674 Unit Selection Algorithm Using Bi-grams Model For Corpus-Based Speech Synthesis

Authors: Mohamed Ali KAMMOUN, Ahmed Ben HAMIDA

Abstract:

In this paper, we present a novel statistical approach to corpus-based speech synthesis. Classically, phonetic information is defined and considered as an acoustic reference to be respected. Accordingly, many studies have addressed acoustical unit classification. This type of classification allows units to be separated according to their symbolic characteristics. Indeed, target cost and concatenation cost were classically defined for unit selection. In corpus-based speech synthesis systems using large text corpora, cost functions were limited to a juxtaposition of symbolic criteria, and the acoustic information of units was not exploited in the definition of the target cost. In this manuscript, we take into consideration the phonetic information of units corresponding to their acoustic information. This is realized by defining a probabilistic linguistic bi-gram model used for unit selection. The selected units are extracted from the English TIMIT corpus.

Keywords: Unit selection, Corpus-based Speech Synthesis, Bigram model

Downloads: 1421
673 Early-Warning Lights Classification Management System for Industrial Parks in Taiwan

Authors: Yu-Min Chang, Kuo-Sheng Tsai, Hung-Te Tsai, Chia-Hsin Li

Abstract:

This paper presents the early-warning lights classification management system for industrial parks promoted by the Taiwan Environmental Protection Administration (EPA) since 2011, including the definition of each early-warning light, objectives, action program and accomplishments. All of the 151 industrial parks in Taiwan were classified into four early-warning lights (red, orange, yellow and green) for carrying out respective pollution management according to the monitoring data of soil and groundwater quality, regulatory compliance, and regulatory listing as a control site or remediation site. The Taiwan EPA set up a priority list of industrial parks with high pollution potential and investigated their soil and groundwater quality based on the results of the light classification and pollution potential assessment. In 2011-2013, 44 industrial parks were selected for different investigations, such as the establishment of early-warning groundwater well networks and pollution investigation/verification for the red and orange-light industrial parks, and the environmental background survey for the yellow-light industrial parks. Among them, 22 industrial parks were newly or continuously confirmed to have pollutant concentrations exceeding soil or groundwater pollution control standards. Thus, further investigation, groundwater use restriction, listing as a pollution control site or remediation site, and pollutant isolation measures were implemented by the local environmental protection and industry competent authorities; the early-warning lights of those industrial parks were proposed to be adjusted up to orange or red.
To date, preliminary positive effects of the soil and groundwater quality management system for industrial parks have been observed in several aspects, such as environmental background information collection, early warning of pollution risk, pollution investigation and control, information integration and application, and inter-agency collaboration. Finally, the work and goal of self-initiated quality management of industrial parks will be carried out on the basis of inter-agency collaboration, through the classified early-warning lights management system as well as the regular announcement of the status of each industrial park.

Keywords: Industrial park, soil and groundwater quality management, early-warning lights classification, SOP for reporting and treatment of monitored abnormal events.

Downloads: 1973
672 A Hybrid Scheme for on-Line Diagnostic Decision Making Using Optimal Data Representation and Filtering Technique

Authors: Hyun-Woo Cho

Abstract:

Early diagnostic decision making in industrial processes is necessary to produce high-quality final products. It helps to provide early warning of a special event in a process and to find its assignable cause. This work presents a hybrid diagnostic scheme for batch processes. Nonlinear representation of raw process data is combined with classification tree techniques. Nonlinear kernel-based dimension reduction is executed to obtain nonlinear classification decision boundaries for fault classes. In order to enhance diagnosis performance for batch processes, the data are filtered to remove irrelevant information from the process data. To compare the diagnosis performance of several representation, filtering, and future observation estimation methods, four diagnostic schemes are evaluated. In this work, the performance of the presented diagnosis schemes is demonstrated using batch process data.

Keywords: Diagnostics, batch process, nonlinear representation, data filtering, multivariate statistical approach

Downloads: 1304
671 Real-time Laser Monitoring based on Pipe Detective Operation

Authors: Mongkorn Klingajay, Tawatchai Jitson

Abstract:

Pipe inspection is a difficult detection task. Almost all applications rely mainly on manual recognition of defective areas by an engineer. Therefore, automating the process becomes necessary in order to avoid the cost incurred in such a manual process. An automated monitoring method to obtain a complete picture of the sewer condition is proposed in this work. The focus of the research is the automated identification and classification of discontinuities in the internal surface of the pipe. The methodology consists of several processing stages, including image segmentation into potential defect regions and extraction of geometrical characteristic features. Automatic recognition and classification of pipe defects are carried out by means of an artificial neural network (ANN) technique based on the Radial Basis Function (RBF). Experiments in a realistic environment have been conducted and results are presented.

Keywords: Artificial neural network, Radial basic function, Curve fitting, CCTV, Image segmentation, Data acquisition.

Downloads: 1797
670 Functional Near Infrared Spectroscope for Cognition Brain Tasks by Wavelets Analysis and Neural Networks

Authors: Truong Quang Dang Khoa, Masahiro Nakagawa

Abstract:

Brain Computer Interface (BCI) has recently attracted increasing research interest. Functional Near Infrared Spectroscope (fNIRs) is one of the latest technologies that utilize light in the near-infrared range to determine brain activities. Because near-infrared technology allows the design of safe, portable, wearable, non-invasive and wireless monitoring systems, fNIRs monitoring of brain hemodynamics can be of value in helping to understand brain tasks. In this paper, we present results of fNIRs signal analysis indicating that there exist distinct patterns of hemodynamic responses that can be used to recognize brain tasks, toward developing a BCI. We applied two different mathematical tools separately: wavelet analysis for preprocessing, as signal filters and feature extraction, and neural networks for recognizing cognition brain tasks, as a classification module. We also discuss and compare with other methods; our proposals perform better, with an average classification accuracy of 99.9%.

Keywords: functional near infrared spectroscope (fNIRs), brain-computer interface (BCI), wavelets, neural networks, brain activity, neuroimaging.

Downloads: 2015
669 Image Spam Detection Using Color Features and K-Nearest Neighbor Classification

Authors: T. Kumaresan, S. Sanjushree, C. Palanisamy

Abstract:

Image spam is a kind of email spam where the spam text is embedded in an image. It is a new spamming technique used by spammers to send their messages to large numbers of internet users. Spam email has become a big problem in the lives of internet users, causing lost time and economic losses. The main objective of this paper is to detect image spam by using histogram properties of an image. Though there are many techniques to automatically detect and avoid this problem, spammers employ new tricks to bypass those techniques; as a result, those techniques are inefficient at detecting spam mails. In this paper we propose a new method to detect image spam. Here the image features are extracted by using the RGB histogram, the HSV histogram and the combination of both RGB and HSV histograms. Based on the optimized image feature set, classification is done by using the k-Nearest Neighbor (k-NN) algorithm. Experimental results show that our method achieves better accuracy. The results show that the combination of RGB and HSV histograms with the k-NN algorithm gives the best accuracy in spam detection.
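
The pipeline in this abstract (combined RGB and HSV histograms fed to k-NN) can be sketched in plain Python as follows. The bin count, the Euclidean distance, `k = 3`, and the tiny synthetic "images" are assumptions for illustration; the abstract does not state the paper's feature optimization or parameters.

```python
import colorsys
import math
from collections import Counter

def histogram(values, bins):
    """Normalized histogram of values that lie in [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]

def color_features(pixels, bins=4):
    """Concatenate per-channel RGB and HSV histograms for a list of (r, g, b) pixels."""
    rgb = [(r / 255, g / 255, b / 255) for r, g, b in pixels]
    hsv = [colorsys.rgb_to_hsv(*p) for p in rgb]
    features = []
    for space in (rgb, hsv):
        for channel in range(3):
            features += histogram([p[channel] for p in space], bins)
    return features

def knn_predict(query, training, k=3):
    """Majority label among the k training vectors closest to `query`."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[1]))[:k]
    return Counter(label for label, _ in nearest).most_common(1)[0][0]

# Synthetic four-pixel "images": reddish ones labeled spam, bluish ones ham (toy data).
training = [
    ("spam", color_features([(250, 10, 10)] * 4)),
    ("spam", color_features([(240, 20, 20)] * 4)),
    ("ham", color_features([(10, 10, 250)] * 4)),
    ("ham", color_features([(20, 20, 240)] * 4)),
]
query_features = color_features([(245, 15, 15)] * 4)
prediction = knn_predict(query_features, training, k=3)
```

Each feature vector has 24 entries here (3 RGB channels plus 3 HSV channels, 4 bins each); dropping either half of the loop in `color_features` gives the RGB-only or HSV-only variants the abstract compares.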

Keywords: File Type, HSV Histogram, k-NN, RGB Histogram, Spam Detection.

Downloads: 2121
668 Earthquake Classification in Molluca Collision Zone Using Conventional Statistical Methods

Authors: H. J. Wattimanela, U. S. Passaribu, N. T. Puspito, S. W. Indratno

Abstract:

The Molluca Collision Zone is located at the junction of the Eurasian, Australian, Pacific and Philippine plates. The Sangihe arc, west of the collision zone, and the Halmahera arc, to the east, are in active collision and are convex toward the Molluca Sea. This research analyzes the behavior of earthquake occurrence in the Molluca Collision Zone: the distribution of earthquakes in each partition region, the type of distribution of earthquake occurrence in the partition regions, the mean occurrence of earthquakes in each partition region, and the correlation between the partition regions. We calculate the number of earthquakes using a partition method and analyze its behavior using conventional statistical methods. In this research, we used data on shallow earthquakes with magnitudes ≥4 SR (period 1964-2013). From the results, we can classify the partitioned regions, based on the correlation, into two classes: strong and very strong. This classification can be used for early warning systems in disaster management.

Keywords: Molluca Collision Zone, partition regions, conventional statistical methods, Earthquakes, classifications, disaster management.

Downloads: 1958
667 Towards Real-Time Classification of Finger Movement Direction Using Encephalography Independent Components

Authors: Mohamed Mounir Tellache, Hiroyuki Kambara, Yasuharu Koike, Makoto Miyakoshi, Natsue Yoshimura

Abstract:

This study explores the practicality of using electroencephalographic (EEG) independent components to predict eight-direction finger movements in pseudo-real-time. Six healthy participants with individual-head MRI images performed finger movements in eight directions with two different arm configurations. The analysis was performed in two stages. The first stage consisted of using independent component analysis (ICA) to separate the signals representing brain activity from non-brain activity signals and to obtain the unmixing matrix. The resulting independent components (ICs) were checked, and those reflecting brain activity were selected. Finally, the time series of the selected ICs were used to predict eight finger-movement directions using Sparse Logistic Regression (SLR). The second stage consisted of using the previously obtained unmixing matrix, the selected ICs, and the model obtained by applying SLR to classify a different EEG dataset. This method was applied to two different settings, namely the single-participant level and the group level. For the single-participant level, the EEG dataset used in the first stage and the EEG dataset used in the second stage originated from the same participant. For the group level, the EEG datasets used in the first stage were constructed by temporally concatenating each combination without repetition of the EEG datasets of five participants out of six, whereas the EEG dataset used in the second stage originated from the remaining participant. The average test classification results across datasets (mean ± S.D.) were 38.62 ± 8.36% for the single-participant level, which was significantly higher than the chance level (12.50 ± 0.01%), and 27.26 ± 4.39% for the group level, which was also significantly higher than the chance level (12.49 ± 0.01%).
The classification accuracy within [–45°, 45°] of the true direction is 70.03 ± 8.14% for single-participant and 62.63 ± 6.07% for group-level which may be promising for some real-life applications. Clustering and contribution analyses further revealed the brain regions involved in finger movement and the temporal aspect of their contribution to the classification. These results showed the possibility of using the ICA-based method in combination with other methods to build a real-time system to control prostheses.

Keywords: Brain-computer interface, BCI, electroencephalography, EEG, finger motion decoding, independent component analysis, pseudo-real-time motion decoding.

Downloads: 576
666 Machine Learning Methods for Flood Hazard Mapping

Authors: S. Zappacosta, C. Bove, M. Carmela Marinelli, P. di Lauro, K. Spasenovic, L. Ostano, G. Aiello, M. Pietrosanto

Abstract:

This paper proposes a neural network approach for flood hazard mapping. The core of the model is a machine learning component fed by frequency ratios, namely statistical correlations between flood event occurrences and a selected number of topographic properties. The classification capability was compared with the flood hazard mapping River Basin Plans (Piani Assetto Idrogeologico, abbreviated as PAI) designed by the Italian Institute for Environmental Research and Defence, ISPRA (Istituto Superiore per la Protezione e la Ricerca Ambientale), encoding four increasing flood hazard levels. The study area of Piemonte, an Italian region, has been considered without loss of generality. The frequency ratios may be used as a standalone block to model the flood hazard mapping. Nevertheless, combining them with a neural network improves the classification power by several percentage points, and the combination may be proposed as a basic tool to model the flood hazard map in a wider scope.
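
The abstract does not give the formula used for the frequency ratios. A commonly used definition, assumed here, is the share of flooded cells that fall in a class divided by the share of all cells that fall in that class. The sketch below computes it for toy data; the class labels and cell counts are invented for illustration.

```python
from collections import Counter

def frequency_ratios(cells):
    """Frequency ratio per class under the assumed definition:

        FR(class) = (flooded cells in class / all flooded cells)
                    / (cells in class / all cells)

    `cells` is a list of (class_label, flooded) pairs, one per map cell.
    """
    total = len(cells)
    flooded_total = sum(1 for _, flooded in cells if flooded)
    if total == 0 or flooded_total == 0:
        raise ValueError("need at least one cell and one flooded cell")
    per_class = Counter(label for label, _ in cells)
    flooded_per_class = Counter(label for label, flooded in cells if flooded)
    return {
        label: (flooded_per_class[label] / flooded_total) / (count / total)
        for label, count in per_class.items()
    }

# Toy map: ten cells in two elevation classes (illustrative only).
cells = (
    [("low_elevation", True)] * 3
    + [("low_elevation", False)] * 1
    + [("high_elevation", True)] * 1
    + [("high_elevation", False)] * 5
)
ratios = frequency_ratios(cells)
```

Under this definition a ratio above 1 marks a class that is over-represented among flooded cells, and a ratio below 1 marks one that is under-represented; such per-class values could then serve as inputs to a classifier.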

Keywords: flood modeling, hazard map, neural networks, hydrogeological risk, flood risk assessment

Downloads: 689
665 Spatial Data Mining by Decision Trees

Authors: S. Oujdi, H. Belbachir

Abstract:

Existing data mining methods cannot be applied directly to spatial data, because such data require consideration of spatial specificities such as spatial relationships. This paper focuses on classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches: join materialization and querying the different tables on the fly. Similar works have been done on these two main approaches. The first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, favors memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial join index that contains the possible spatial relationships among the objects in the target table and those in the neighbor table. The proposed algorithms are applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works on classification by spatial decision trees will be detailed.

Keywords: C4.5 Algorithm, Decision trees, S-CART, Spatial data mining.

Downloads: 2965
664 Integrating Security Indifference Curve to Formal Decision Evaluation

Authors: Anon Yantarasri, Yachai Limpiyakorn

Abstract:

Decisions are regularly made during a project or in daily life. Some decisions are critical and have a direct impact on project or human success. Formal evaluation is thus required, especially for crucial decisions, to arrive at the optimal solution among alternatives to address issues. According to microeconomic theory, all people's decisions can be modeled as indifference curves. The proposed approach supports formal analysis and decision by constructing an indifference curve model from previous experts' decision criteria. This knowledge embedded in the system can be reused or can help naïve users select an alternative solution to a similar problem. Moreover, the method is flexible enough to cope with an unlimited number of factors influencing the decision-making. The preliminary experimental results of the alternative selection accurately match the experts' decisions.

Keywords: Decision Analysis and Resolution, Indifference Curve, Multi-criteria Decision Making.

Downloads: 1607
663 Optimizing Mobile Agents Migration Based on Decision Tree Learning

Authors: Yasser k. Ali, Hesham N. Elmahdy, Sanaa El Olla Hanfy Ahmed

Abstract:

Mobile agents are a powerful approach to developing distributed systems, since they migrate to hosts that have the resources to execute individual tasks. In a dynamic environment such as a peer-to-peer network, agents have to be generated frequently and dispatched to the network, so they consume a certain amount of bandwidth on each link. If too many agents migrate through one or several links at the same time, they introduce too much transfer overhead; eventually these links become busy and indirectly block network traffic. There is therefore a need for routing algorithms that take traffic load into account. In this paper we seek to combine a probabilistic approach, based on a quality measure of the network traffic situation, with the agent's decision on migration to the next hop, based on decision tree learning algorithms.

Keywords: Agent Migration, Decision Tree learning, ID3 algorithm, Naive Bayes Classifier

Downloads: 1978
662 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The aim of this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is necessary to test how to properly preprocess the data by preparing the pipelines that best fit each algorithm. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.

Keywords: Artificial neural network, competitive dynamics, logistic regression, text classification, text mining.

Downloads: 505
661 Improved Rare Species Identification Using Focal Loss Based Deep Learning Models

Authors: Chad Goldsworthy, B. Rajeswari Matam

Abstract:

The use of deep learning for species identification in camera trap images has revolutionised our ability to study, conserve and monitor species in a highly efficient and unobtrusive manner, with state-of-the-art models achieving accuracies surpassing the accuracy of manual human classification. The high imbalance of camera trap datasets, however, results in poor accuracies for minority (rare or endangered) species due to their relative insignificance to the overall model accuracy. This paper investigates the use of Focal Loss, in comparison to the traditional Cross Entropy Loss function, to improve the identification of minority species in the “255 Bird Species” dataset from Kaggle. The results show that, although Focal Loss slightly decreased the accuracy of the majority species, it increased the F1-score by 0.06 and improved the identification of the bottom two, five and ten (minority) species by 37.5%, 15.7% and 10.8%, respectively, while also improving the overall accuracy by 2.96%.
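As a rough illustration of why Focal Loss helps with imbalanced data, the sketch below compares it with cross-entropy for a single sample, using the standard form FL(p) = -(1 - p)^γ log(p), where p is the predicted probability of the true class. The focusing parameter γ = 2 and the two example probabilities are assumptions chosen for illustration; the abstract does not state the settings the authors used.

```python
import math

def cross_entropy(p_true: float) -> float:
    """Cross-entropy loss for one sample, given the predicted probability of its true class."""
    return -math.log(p_true)

def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal loss: cross-entropy scaled by (1 - p_true) ** gamma.

    gamma = 2.0 is an assumed default; gamma = 0 recovers plain cross-entropy.
    """
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

# An easy, confidently classified sample versus a hard, misclassified one.
for name, p in (("easy", 0.95), ("hard", 0.10)):
    print(name, round(cross_entropy(p), 4), round(focal_loss(p), 4))
```

The scaling factor shrinks the loss of well-classified samples far more than that of hard ones, so training is driven more by the difficult (often minority-class) examples.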

Keywords: Convolutional neural networks, data imbalance, deep learning, focal loss, species classification, wildlife conservation.

Downloads 1384
660 Dynamic Time Warping in Gait Classification of Motion Capture Data

Authors: Adam Świtoński, Agnieszka Michalczuk, Henryk Josiński, Andrzej Polański, Konrad Wojciechowski

Abstract:

A method of gait identification based on the nearest neighbor classification technique, with motion similarity assessed by dynamic time warping, is proposed. Model-based kinematic motion data, represented by joint rotations coded as Euler angles and unit quaternions, are used. Different pose distance functions in the Euler angle and quaternion spaces are considered. To evaluate the individual features of the movements of the subsequent joints during the gait cycle, joint selection is carried out. To examine the proposed approach, a database containing 353 gaits of 25 humans, collected in a motion capture laboratory, is used. The obtained results are promising. Classification that takes all joints into consideration has an accuracy of over 91%. Analysis of hip joint movements alone allows gaits to be identified correctly with almost 80% precision.
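The combination of dynamic time warping and nearest neighbor classification can be sketched as follows. This is a minimal illustration on one-dimensional sequences with an absolute-difference pose distance; the paper works with Euler angle and quaternion distance functions, which are not reproduced here, and the gallery values and labels below are made up.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] and b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # stand-in for a pose distance function
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def nearest_neighbor_label(query, gallery):
    """Return the label of the gallery sequence closest to `query` under DTW."""
    return min(gallery, key=lambda item: dtw_distance(query, item[0]))[1]

# Hypothetical gallery: (joint-angle sequence, person label).
gallery = [([0, 1, 2, 1, 0], "A"), ([0, 3, 6, 3, 0], "B")]
print(nearest_neighbor_label([0, 1, 1, 2, 1, 0], gallery))
```

Because DTW lets one sample align with several samples of the other sequence, a gait performed slightly slower or faster still matches its template closely.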

Keywords: Biometrics, dynamic time warping, gait identification, motion capture, time series classification, quaternion distance functions, attribute ranking.

Downloads 2594
659 Color Image Segmentation Using SVM Pixel Classification Image

Authors: K. Sakthivel, R. Nallusamy, C. Kavitha

Abstract:

The goal of image segmentation is to cluster pixels into salient image regions. Segmentation can be used for object recognition, occlusion boundary estimation within motion or stereo systems, image compression, image editing, or image database lookup. In this paper, we present a color image segmentation method using support vector machine (SVM) pixel classification. First, the pixel-level color and texture features of the image are extracted using the homogeneity model and a Gabor filter, and they are used as input to the SVM classifier. With the extracted pixel-level features, the SVM classifier is trained using FCM (Fuzzy C-Means). The image segmentation takes advantage of both the pixel-level information of the image and the ability of the SVM classifier. The experiments show that the proposed method gives a very good segmentation result with better efficiency, and increases the quality of the image segmentation compared with other segmentation methods proposed in the literature.

Keywords: Image Segmentation, Support Vector Machine, Fuzzy C–Means, Pixel Feature, Texture Feature, Homogeneity model, Gabor Filter.

Downloads 6723
658 A New Hybrid K-Mean-Quick Reduct Algorithm for Gene Selection

Authors: E. N. Sathishkumar, K. Thangavel, T. Chandrasekhar

Abstract:

Feature selection is the process of selecting the most informative features. It is one of the important steps in knowledge discovery. The problem is that not all genes in gene expression data are important. Some of the genes may be redundant, and others may be irrelevant and noisy. Here a novel approach, the hybrid K-Mean-Quick Reduct (KMQR) algorithm, is proposed for gene selection from gene expression data. In this study, the entire dataset is divided into clusters by applying the K-Means algorithm. Each cluster contains similar genes. The highly class-discriminating genes are selected based on their degree of dependence by applying the Quick Reduct algorithm to all the clusters. The Average Correlation Value (ACV) is calculated for the highly class-discriminating genes. Clusters with an ACV of 1 are determined to be significant clusters, whose classification accuracy is equal to or higher than that of the entire dataset. The proposed algorithm is evaluated and compared using WEKA classifiers. The results show that the proposed work achieves high classification accuracy.

Keywords: Clustering, Gene Selection, K-Mean-Quick Reduct, Rough Sets.

Downloads 2286
657 Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks

Authors: L. Salhi, M. Talbi, A. Cherif

Abstract:

This paper presents a new strategy for the identification and classification of pathological voices using a hybrid method based on the wavelet transform and neural networks. After speech acquisition from a patient, the speech signal is analysed in order to extract acoustic parameters such as the pitch, the formants, jitter, and shimmer. The obtained results are compared to normal and standard values using a programmable database. Sounds are collected from normal people and patients, and then classified into two different categories. The speech database consists of several pathological and normal voices collected from the national hospital “Rabta-Tunis”. The speech processing algorithm is run in a supervised mode to discriminate between normal and pathological voices and then to classify neural and vocal pathologies (Parkinson, Alzheimer, laryngeal, dyslexia...). Several simulation results are presented as a function of the disease and compared with the clinical diagnosis in order to obtain an objective evaluation of the developed tool.

Keywords: Formants, Neural Networks, Pathological Voices, Pitch, Wavelet Transform.

Downloads 2827
656 Improvement in Power Transformer Intelligent Dissolved Gas Analysis Method

Authors: S. Qaedi, S. Seyedtabaii

Abstract:

Non-destructive evaluation of the condition of in-service power transformers is necessary for avoiding catastrophic failures. Dissolved Gas Analysis (DGA) is one of the important methods. Traditional, statistical and intelligent DGA approaches have been adopted for accurate classification of incipient fault sources. Unfortunately, there are often not enough faulty patterns for sufficient training of intelligent systems. Bootstrapping is expected to alleviate this shortcoming and to yield algorithms with better classification success rates. In this paper the performance of artificial neural network, K-Nearest Neighbour and support vector machine methods using bootstrapped data is detailed, and it is shown that while the success rate of the ANN algorithms improves remarkably, the outcomes of the others do not benefit as much from the enlarged data space. For assessment, two databases are employed: IEC TC10 and a dataset collected from data reported in papers. The high average test success rate demonstrates the remarkable outcome.
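Bootstrapping in its basic form enlarges a small set of patterns by resampling it with replacement. The sketch below shows that basic form only; the abstract does not describe the exact resampling scheme the authors used, and the records, labels and sample size here are hypothetical.

```python
import random

def bootstrap_sample(records, size, seed=0):
    """Draw `size` records from `records` uniformly at random, with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.choice(records) for _ in range(size)]

# Made-up (gas readings, fault label) pairs, for illustration only.
faulty = [((120, 35), "arcing"), ((40, 210), "overheating"), ((95, 60), "arcing")]
enlarged = faulty + bootstrap_sample(faulty, size=7)
print(len(enlarged))  # 10
```

Resampling adds no new information about the fault classes; it only changes how often each known pattern is presented to the learner, which may be why some classifiers benefit more than others.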

Keywords: Dissolved gas analysis, Transformer incipient fault, Artificial Neural Network, Support Vector Machine (SVM), K-Nearest Neighbor (KNN)

Downloads 2720
655 Machine Learning Techniques in Bank Credit Analysis

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner

Abstract:

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up to 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results, so that the best of each technique could be compared. Initially the data were coded in thermometer code (for numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.
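The two encodings mentioned in the abstract can be sketched as follows. Conventions for both vary, so this is one common variant under stated assumptions: the thermometer code sets one bit per threshold that the value reaches, and the dummy code uses k-1 indicator bits with the first category as the all-zero reference. The thresholds and category names are hypothetical and do not come from the paper.

```python
def thermometer_code(value, thresholds):
    """One bit per threshold, set to 1 when `value` reaches that threshold."""
    return [1 if value >= t else 0 for t in sorted(thresholds)]

def dummy_code(category, categories):
    """k-1 indicator bits; the first category is the all-zero reference level."""
    return [1 if category == c else 0 for c in categories[1:]]

# Hypothetical numerical attribute with three cut points.
print(thermometer_code(25, [10, 20, 30]))

# Hypothetical nominal attribute with three levels.
sectors = ["industry", "retail", "services"]
print(dummy_code("services", sectors))
```

The thermometer code preserves the ordering of a numerical attribute (larger values switch on more bits), while the dummy code imposes no order on nominal categories.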

Keywords: Artificial Neural Networks, ANNs, classifier algorithms, credit risk assessment, logistic regression, machine learning, support vector machines.

Downloads 1244
654 Application of Artificial Neural Network to Classification Surface Water Quality

Authors: S. Wechmongkhonkon, N. Poomtong, S. Areerachakul

Abstract:

Water quality is a subject of ongoing concern. Deterioration of water quality has initiated serious management efforts in many countries. This study endeavors to automatically classify water quality. The water quality classes are evaluated using six factor indices: pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (TColiform). The methodology involves applying data mining techniques using multilayer perceptron (MLP) neural network models. The data cover 11 canal sites in Dusit district in Bangkok, Thailand, and were obtained from the Department of Drainage and Sewerage, Bangkok Metropolitan Administration, for 2007-2011. The multilayer perceptron neural network achieves a high classification accuracy of 96.52% in classifying the water quality of the Dusit district canals in Bangkok. This encouraging result could subsequently be applied to the planning and management of water quality.

Keywords: artificial neural network, classification, surface water quality

Downloads 3189
653 Modeling Engagement with Multimodal Multisensor Data: The Continuous Performance Test as an Objective Tool to Track Flow

Authors: Mohammad H. Taheri, David J. Brown, Nasser Sherkat

Abstract:

Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detect student engagement involve periodic human observations that are subject to inter-rater reliability. Our solution uses real-time multimodal multisensor data labeled by objective performance outcomes to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. In order to achieve this, a type of continuous performance test is introduced, the Seek-X type. Nine features were extracted, including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches was evaluated. Overall, the random forest classification approach achieved the best classification results. Using random forest, 93.3% classification accuracy for engagement and 42.9% accuracy for disengagement were achieved. We compared these results to outcomes from different models: AdaBoost, decision tree, k-Nearest Neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors. We found that using high-level handpicked features can improve the classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature for the classification of engagement and distraction was shown to be eye gaze.
It has been shown that we can accurately predict the level of engagement of students with learning disabilities in a real-time approach that is not subject to inter-rater reliability or human observation and does not rely on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students, where teachers cannot possibly attend to each of their individual needs. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.

Keywords: Affective computing in education, affect detection, continuous performance test, engagement, flow, HCI, interaction, learning disabilities, machine learning, multimodal, multisensor, physiological sensors, Signal Detection Theory, student engagement.

Downloads 1234
652 An Effective Islanding Detection and Classification Method Using Neuro-Phase Space Technique

Authors: Aziah Khamis, H. Shareef

Abstract:

Planned islanding constructs a power island during system disturbances; such islands are commonly formed for maintenance purposes. In most cases, however, island mode operation is not allowed, so distributed generators (DGs) must sense an unplanned disconnection from the main grid. The passive technique is the most commonly used method for this purpose, but it needs improvement in order to identify the islanding condition. In this paper an effective method for identifying the islanding condition based on phase space and neural network techniques has been developed. The voltage waveforms captured at the coupling points of the DGs are processed to extract the required features. For this purpose a method known as the phase space technique is used. Based on the extracted features, two neural network configurations, namely radial basis function and probabilistic neural networks, are trained to recognize the waveform class. According to the test results, the investigated technique can provide satisfactory identification of the islanding condition in the distribution system.

Keywords: Classification, Islanding detection, Neural network, Phase space.

Downloads 2116