Search results for: Classification rule mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1820

Search results for: Classification rule mining

980 Multiple Targets Classification and Fuzzy Logic Decision Fusion in Wireless Sensor Networks

Authors: Ahmad Aljaafreh

Abstract:

This paper proposes a hierarchical hidden Markov model (HHMM) to model the detection of M vehicles in a wireless sensor network (WSN). The HHMM model contains an extra level of hidden Markov model to model the temporal transitions of each state of the first HMM. By modeling the temporal transitions, only those hypothesis with nonzero transition probabilities needs to be tested. Thus, this method efficiently reduces the computation load, which is preferable in WSN applications.This paper integrates several techniques to optimize the detection performance. The output of the states of the first HMM is modeled as Gaussian Mixture Model (GMM), where the number of states and the number of Gaussians are experimentally determined, while the other parameters are estimated using Expectation Maximization (EM). HHMM is used to model the sequence of the local decisions which are based on multiple hypothesis testing with maximum likelihood approach. The states in the HHMM represent various combinations of vehicles of different types. Due to the statistical advantages of multisensor data fusion, we propose a heuristic based on fuzzy weighted majority voting to enhance cooperative classification of moving vehicles within a region that is monitored by a wireless sensor network. A fuzzy inference system weighs each local decision based on the signal to noise ratio of the acoustic signal for target detection and the signal to noise ratio of the radio signal for sensor communication. The spatial correlation among the observations of neighboring sensor nodes is efficiently utilized as well as the temporal correlation. Simulation results demonstrate the efficiency of this scheme.

Keywords: Classification, decision fusion, fuzzy logic, hidden Markov model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6226
979 Justification and Classification of Issues for the Selection and Implementation of Advanced Manufacturing Technologies

Authors: Zahra Banakar, Farzad Tahriri

Abstract:

It has often been said that the strength of any country resides in the strength of its industrial sector, and Progress in industrial society has been accomplished by the creation of new technologies. Developments have been facilitated by the increasing availability of advanced manufacturing technology (AMT), in addition the implementation of advanced manufacturing technology (AMT) requires careful planning at all levels of the organization to ensure that the implementation will achieve the intended goals. Justification and implementation of advanced manufacturing technology (AMT) involves decisions that are crucial for the practitioners regarding the survival of business in the present days of uncertain manufacturing world. This paper assists the industrial managers to consider all the important criteria for success AMT implementation, when purchasing new technology. Concurrently, this paper classifies the tangible benefits of a technology that are evaluated by addressing both cost and time dimensions, and the intangible benefits are evaluated by addressing technological, strategic, social and human issues to identify and create awareness of the essential elements in the AMT implementation process and identify the necessary actions before implementing AMT.

Keywords: Advanced Manufacturing Technology (AMT), Justification and Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2509
978 The Integrated Management of Health Care Strategies and Differential Diagnosis by Expert System Technology: A Single-Dimensional Approach

Authors: A. B. Adehor, P. R. Burrell

Abstract:

The Integrated Management of Child illnesses (IMCI) and the surveillance Health Information Systems (HIS) are related strategies that are designed to manage child illnesses and community practices of diseases. However, both strategies do not function well together because of classification incompatibilities and, as such, are difficult to use by health care personnel in rural areas where a majority of people lack the basic knowledge of interpreting disease classification from these methods. This paper discusses a single approach on how a stand-alone expert system can be used as a prompt diagnostic tool for all cases of illnesses presented. The system combines the action-oriented IMCI and the disease-oriented HIS approaches to diagnose malaria and typhoid fever in the rural areas of the Niger-delta region.

Keywords: Differential diagnosis, Health Information System(HIS), Integrated Management of Child Illnesses (IMCI), Malaria andTyphoid fever.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1846
977 Multi-Sensor Target Tracking Using Ensemble Learning

Authors: Bhekisipho Twala, Mantepu Masetshaba, Ramapulana Nkoana

Abstract:

Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier, and if so, what form of multiple classifiers system yields the most significant benefit. Also, multi-target tracking detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifiers systems are evaluated in terms of their ability to predict a system’s failure or success for multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfil requirements of target tracking due to improved sensors classification performances with multiple classifier systems constructed using boosting achieving higher accuracy rates.

Keywords: Single classifier, machine learning, ensemble learning, multi-sensor target tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 568
976 Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. PSO algorithm utilizes a so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms namely k-means and hierarchical based approach which are normally used to interpret the output of SOM.

Keywords: cluster boundaries, clustering, code vectors, data mining, particle swarm optimization, self-organizing maps, U-matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1888
975 An Intelligent Combined Method Based on Power Spectral Density, Decision Trees and Fuzzy Logic for Hydraulic Pumps Fault Diagnosis

Authors: Kaveh Mollazade, Hojat Ahmadi, Mahmoud Omid, Reza Alimardani

Abstract:

Recently, the issue of machine condition monitoring and fault diagnosis as a part of maintenance system became global due to the potential advantages to be gained from reduced maintenance costs, improved productivity and increased machine availability. The aim of this work is to investigate the effectiveness of a new fault diagnosis method based on power spectral density (PSD) of vibration signals in combination with decision trees and fuzzy inference system (FIS). To this end, a series of studies was conducted on an external gear hydraulic pump. After a test under normal condition, a number of different machine defect conditions were introduced for three working levels of pump speed (1000, 1500, and 2000 rpm), corresponding to (i) Journal-bearing with inner face wear (BIFW), (ii) Gear with tooth face wear (GTFW), and (iii) Journal-bearing with inner face wear plus Gear with tooth face wear (B&GW). The features of PSD values of vibration signal were extracted using descriptive statistical parameters. J48 algorithm is used as a feature selection procedure to select pertinent features from data set. The output of J48 algorithm was employed to produce the crisp if-then rule and membership function sets. The structure of FIS classifier was then defined based on the crisp sets. In order to evaluate the proposed PSD-J48-FIS model, the data sets obtained from vibration signals of the pump were used. Results showed that the total classification accuracy for 1000, 1500, and 2000 rpm conditions were 96.42%, 100%, and 96.42% respectively. The results indicate that the combined PSD-J48-FIS model has the potential for fault diagnosis of hydraulic pumps.

Keywords: Power Spectral Density, Machine ConditionMonitoring, Hydraulic Pump, Fuzzy Logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2667
974 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory

Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan

Abstract:

Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.

Keywords: Data fusion, Dempster-Shafer theory, data mining, event detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1783
973 Unit Selection Algorithm Using Bi-grams Model For Corpus-Based Speech Synthesis

Authors: Mohamed Ali KAMMOUN, Ahmed Ben HAMIDA

Abstract:

In this paper, we present a novel statistical approach to corpus-based speech synthesis. Classically, phonetic information is defined and considered as acoustic reference to be respected. In this way, many studies were elaborated for acoustical unit classification. This type of classification allows separating units according to their symbolic characteristics. Indeed, target cost and concatenation cost were classically defined for unit selection. In Corpus-Based Speech Synthesis System, when using large text corpora, cost functions were limited to a juxtaposition of symbolic criteria and the acoustic information of units is not exploited in the definition of the target cost. In this manuscript, we token in our consideration the unit phonetic information corresponding to acoustic information. This would be realized by defining a probabilistic linguistic Bi-grams model basically used for unit selection. The selected units would be extracted from the English TIMIT corpora.

Keywords: Unit selection, Corpus-based Speech Synthesis, Bigram model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1416
972 Reduced Rule Based Fuzzy Logic Controlled Isolated Bidirectional Converter Operating in Extended Phase Shift Control for Bidirectional Energy Transfer

Authors: Anupam Kumar, Abdul Hamid Bhat, Pramod Agarwal

Abstract:

Bidirectional energy transfer capability with high efficiency and reduced cost is fast gaining prominence in the central part of a lot of power conversion systems in Direct Current (DC) microgrid. Preferably, under the economics constraints, these systems utilise a single high efficiency power electronics conversion system and a dual active bridge converter. In this paper, modeling and performance of Dual Active Bridge (DAB) converter with Extended Phase Shift (EPS) is evaluated with two batteries on both sides of DC bus and bidirectional energy transfer is facilitated and this is further compared with the Single Phase Shift (SPS) mode of operation. Optimum operating zone is identified through exhaustive simulations using MATLAB/Simulink and SimPowerSystem software. Reduced rules based fuzzy logic controller is implemented for closed loop control of DAB converter. The control logic enables the bidirectional energy transfer within the batteries even at lower duty ratios. Charging and discharging of batteries is supervised by the fuzzy logic controller. State of charge, current and voltage for both the batteries are plotted in the battery characteristics. Power characteristics of batteries are also obtained using MATLAB simulations.

Keywords: Fuzzy logic controller, rule base, membership functions, dual active bridge converter, bidirectional power flow, duty ratio, extended phase shift, state of charge.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 836
971 Early-Warning Lights Classification Management System for Industrial Parks in Taiwan

Authors: Yu-Min Chang, Kuo-Sheng Tsai, Hung-Te Tsai, Chia-Hsin Li

Abstract:

This paper presents the early-warning lights classification management system for industrial parks promoted by the Taiwan Environmental Protection Administration (EPA) since 2011, including the definition of each early-warning light, objectives, action program and accomplishments. All of the 151 industrial parks in Taiwan were classified into four early-warning lights, including red, orange, yellow and green, for carrying out respective pollution management according to the monitoring data of soil and groundwater quality, regulatory compliance, and regulatory listing of control site or remediation site. The Taiwan EPA set up a priority list for high potential polluted industrial parks and investigated their soil and groundwater qualities based on the results of the light classification and pollution potential assessment. In 2011-2013, there were 44 industrial parks selected and carried out different investigation, such as the early warning groundwater well networks establishment and pollution investigation/verification for the red and orange-light industrial parks and the environmental background survey for the yellow-light industrial parks. Among them, 22 industrial parks were newly or continuously confirmed that the concentrations of pollutants exceeded those in soil or groundwater pollution control standards. Thus, the further investigation, groundwater use restriction, listing of pollution control site or remediation site, and pollutant isolation measures were implemented by the local environmental protection and industry competent authorities; the early warning lights of those industrial parks were proposed to adjust up to orange or red-light. Up to the present, the preliminary positive effect of the soil and groundwater quality management system for industrial parks has been noticed in several aspects, such as environmental background information collection, early warning of pollution risk, pollution investigation and control, information integration and application, and inter-agency collaboration. Finally, the work and goal of self-initiated quality management of industrial parks will be carried out on the basis of the inter-agency collaboration by the classified lights system of early warning and management as well as the regular announcement of the status of each industrial park.

Keywords: Industrial park, soil and groundwater quality management, early-warning lights classification, SOP for reporting and treatment of monitored abnormal events.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1963
970 A Hybrid Scheme for on-Line Diagnostic Decision Making Using Optimal Data Representation and Filtering Technique

Authors: Hyun-Woo Cho

Abstract:

The early diagnostic decision making in industrial processes is absolutely necessary to produce high quality final products. It helps to provide early warning for a special event in a process, and finding its assignable cause can be obtained. This work presents a hybrid diagnostic schmes for batch processes. Nonlinear representation of raw process data is combined with classification tree techniques. The nonlinear kernel-based dimension reduction is executed for nonlinear classification decision boundaries for fault classes. In order to enhance diagnosis performance for batch processes, filtering of the data is performed to get rid of the irrelevant information of the process data. For the diagnosis performance of several representation, filtering, and future observation estimation methods, four diagnostic schemes are evaluated. In this work, the performance of the presented diagnosis schemes is demonstrated using batch process data.

Keywords: Diagnostics, batch process, nonlinear representation, data filtering, multivariate statistical approach

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1297
969 Real-time Laser Monitoring based on Pipe Detective Operation

Authors: Mongkorn Klingajay, Tawatchai Jitson

Abstract:

The pipe inspection operation is the difficult detective performance. Almost applications are mainly relies on a manual recognition of defective areas that have carried out detection by an engineer. Therefore, an automation process task becomes a necessary in order to avoid the cost incurred in such a manual process. An automated monitoring method to obtain a complete picture of the sewer condition is proposed in this work. The focus of the research is the automated identification and classification of discontinuities in the internal surface of the pipe. The methodology consists of several processing stages including image segmentation into the potential defect regions and geometrical characteristic features. Automatic recognition and classification of pipe defects are carried out by means of using an artificial neural network technique (ANN) based on Radial Basic Function (RBF). Experiments in a realistic environment have been conducted and results are presented.

Keywords: Artificial neural network, Radial basic function, Curve fitting, CCTV, Image segmentation, Data acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1791
968 Functional Near Infrared Spectroscope for Cognition Brain Tasks by Wavelets Analysis and Neural Networks

Authors: Truong Quang Dang Khoa, Masahiro Nakagawa

Abstract:

Brain Computer Interface (BCI) has been recently increased in research. Functional Near Infrared Spectroscope (fNIRs) is one the latest technologies which utilize light in the near-infrared range to determine brain activities. Because near infrared technology allows design of safe, portable, wearable, non-invasive and wireless qualities monitoring systems, fNIRs monitoring of brain hemodynamics can be value in helping to understand brain tasks. In this paper, we present results of fNIRs signal analysis indicating that there exist distinct patterns of hemodynamic responses which recognize brain tasks toward developing a BCI. We applied two different mathematics tools separately, Wavelets analysis for preprocessing as signal filters and feature extractions and Neural networks for cognition brain tasks as a classification module. We also discuss and compare with other methods while our proposals perform better with an average accuracy of 99.9% for classification.

Keywords: functional near infrared spectroscope (fNIRs), braincomputer interface (BCI), wavelets, neural networks, brain activity, neuroimaging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2009
967 Image Spam Detection Using Color Features and K-Nearest Neighbor Classification

Authors: T. Kumaresan, S. Sanjushree, C. Palanisamy

Abstract:

Image spam is a kind of email spam where the spam text is embedded with an image. It is a new spamming technique being used by spammers to send their messages to bulk of internet users. Spam email has become a big problem in the lives of internet users, causing time consumption and economic losses. The main objective of this paper is to detect the image spam by using histogram properties of an image. Though there are many techniques to automatically detect and avoid this problem, spammers employing new tricks to bypass those techniques, as a result those techniques are inefficient to detect the spam mails. In this paper we have proposed a new method to detect the image spam. Here the image features are extracted by using RGB histogram, HSV histogram and combination of both RGB and HSV histogram. Based on the optimized image feature set classification is done by using k- Nearest Neighbor(k-NN) algorithm. Experimental result shows that our method has achieved better accuracy. From the result it is known that combination of RGB and HSV histogram with k-NN algorithm gives the best accuracy in spam detection.

Keywords: File Type, HSV Histogram, k-NN, RGB Histogram, Spam Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2114
966 Earthquake Classification in Molluca Collision Zone Using Conventional Statistical Methods

Authors: H. J. Wattimanela, U. S. Passaribu, N. T. Puspito, S. W. Indratno

Abstract:

Molluca Collision Zone is located at the junction of the Eurasian, Australian, Pacific and the Philippines plates. Between the Sangihe arc, west of the collision zone, and to the east of Halmahera arc is active collision and convex toward the Molluca Sea. This research will analyze the behavior of earthquake occurrence in Molluca Collision Zone related to the distributions of an earthquake in each partition regions, determining the type of distribution of a occurrence earthquake of partition regions, and the mean occurence of earthquakes each partition regions, and the correlation between the partitions region. We calculate number of earthquakes using partition method and its behavioral using conventional statistical methods. In this research, we used data of shallow earthquakes type and its magnitudes ≥4 SR (period 1964-2013). From the results, we can classify partitioned regions based on the correlation into two classes: strong and very strong. This classification can be used for early warning system in disaster management.

Keywords: Molluca Collision Zone, partition regions, conventional statistical methods, Earthquakes, classifications, disaster management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1956
965 Towards Real-Time Classification of Finger Movement Direction Using Encephalography Independent Components

Authors: Mohamed Mounir Tellache, Hiroyuki Kambara, Yasuharu Koike, Makoto Miyakoshi, Natsue Yoshimura

Abstract:

This study explores the practicality of using electroencephalographic (EEG) independent components to predict eight-direction finger movements in pseudo-real-time. Six healthy participants with individual-head MRI images performed finger movements in eight directions with two different arm configurations. The analysis was performed in two stages. The first stage consisted of using independent component analysis (ICA) to separate the signals representing brain activity from non-brain activity signals and to obtain the unmixing matrix. The resulting independent components (ICs) were checked, and those reflecting brain-activity were selected. Finally, the time series of the selected ICs were used to predict eight finger-movement directions using Sparse Logistic Regression (SLR). The second stage consisted of using the previously obtained unmixing matrix, the selected ICs, and the model obtained by applying SLR to classify a different EEG dataset. This method was applied to two different settings, namely the single-participant level and the group-level. For the single-participant level, the EEG dataset used in the first stage and the EEG dataset used in the second stage originated from the same participant. For the group-level, the EEG datasets used in the first stage were constructed by temporally concatenating each combination without repetition of the EEG datasets of five participants out of six, whereas the EEG dataset used in the second stage originated from the remaining participants. The average test classification results across datasets (mean ± S.D.) were 38.62 ± 8.36% for the single-participant, which was significantly higher than the chance level (12.50 ± 0.01%), and 27.26 ± 4.39% for the group-level which was also significantly higher than the chance level (12.49% ± 0.01%). The classification accuracy within [–45°, 45°] of the true direction is 70.03 ± 8.14% for single-participant and 62.63 ± 6.07% for group-level which may be promising for some real-life applications. Clustering and contribution analyses further revealed the brain regions involved in finger movement and the temporal aspect of their contribution to the classification. These results showed the possibility of using the ICA-based method in combination with other methods to build a real-time system to control prostheses.

Keywords: Brain-computer interface, BCI, electroencephalography, EEG, finger motion decoding, independent component analysis, pseudo-real-time motion decoding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 568
964 Machine Learning Methods for Flood Hazard Mapping

Authors: S. Zappacosta, C. Bove, M. Carmela Marinelli, P. di Lauro, K. Spasenovic, L. Ostano, G. Aiello, M. Pietrosanto

Abstract:

This paper proposes a neural network approach for assessing flood hazard mapping. The core of the model is a machine learning component fed by frequency ratios, namely statistical correlations between flood event occurrences and a selected number of topographic properties. The classification capability was compared with the flood hazard mapping River Basin Plans (Piani Assetto Idrogeologico, acronimed as PAI) designed by the Italian Institute for Environmental Research and Defence, ISPRA (Istituto Superiore per la Protezione e la Ricerca Ambientale), encoding four different increasing flood hazard levels. The study area of Piemonte, an Italian region, has been considered without loss of generality. The frequency ratios may be used as a standalone block to model the flood hazard mapping. Nevertheless, the mixture with a neural network improves the classification power of several percentage points, and may be proposed as a basic tool to model the flood hazard map in a wider scope.

Keywords: flood modeling, hazard map, neural networks, hydrogeological risk, flood risk assessment

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 679
963 Hybrid Intelligent Intrusion Detection System

Authors: Norbik Bashah, Idris Bharanidharan Shanmugam, Abdul Manan Ahmed

Abstract:

Intrusion Detection Systems are increasingly a key part of systems defense. Various approaches to Intrusion Detection are currently being used, but they are relatively ineffective. Artificial Intelligence plays a driving role in security services. This paper proposes a dynamic model Intelligent Intrusion Detection System, based on specific AI approach for intrusion detection. The techniques that are being investigated includes neural networks and fuzzy logic with network profiling, that uses simple data mining techniques to process the network data. The proposed system is a hybrid system that combines anomaly, misuse and host based detection. Simple Fuzzy rules allow us to construct if-then rules that reflect common ways of describing security attacks. For host based intrusion detection we use neural-networks along with self organizing maps. Suspicious intrusions can be traced back to its original source path and any traffic from that particular source will be redirected back to them in future. Both network traffic and system audit data are used as inputs for both.

Keywords: Intrusion Detection, Network Security, Data mining, Fuzzy Logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2104
962 Granularity Analysis for Spatio-Temporal Web Sensors

Authors: Shun Hattori

Abstract:

In recent years, many researches to mine the exploding Web world, especially User Generated Content (UGC) such as weblogs, for knowledge about various phenomena and events in the physical world have been done actively, and also Web services with the Web-mined knowledge have begun to be developed for the public. However, there are few detailed investigations on how accurately Web-mined data reflect physical-world data. It must be problematic to idolatrously utilize the Web-mined data in public Web services without ensuring their accuracy sufficiently. Therefore, this paper introduces the simplest Web Sensor and spatiotemporallynormalized Web Sensor to extract spatiotemporal data about a target phenomenon from weblogs searched by keyword(s) representing the target phenomenon, and tries to validate the potential and reliability of the Web-sensed spatiotemporal data by four kinds of granularity analyses of coefficient correlation with temperature, rainfall, snowfall, and earthquake statistics per day by region of Japan Meteorological Agency as physical-world data: spatial granularity (region-s population density), temporal granularity (time period, e.g., per day vs. per week), representation granularity (e.g., “rain" vs. “heavy rain"), and media granularity (weblogs vs. microblogs such as Tweets).

Keywords: Granularity analysis, knowledge extraction, spatiotemporal data mining, Web credibility, Web mining, Web sensor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1864
961 A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

An extensive amount of work has been done in data clustering research under the unsupervised learning technique in Data Mining during the past two decades. Moreover, several approaches and methods have been emerged focusing on clustering diverse data types, features of cluster models and similarity rates of clusters. However, none of the single clustering algorithm exemplifies its best nature in extracting efficient clusters. Consequently, in order to rectify this issue, a new challenging technique called Cluster Ensemble method was bloomed. This new approach tends to be the alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate the diverse clustering solutions in such a way to attain accuracy and also to improve the eminence the individual clustering algorithms. Due to the massive and rapid development of new methods in the globe of data mining, it is highly mandatory to scrutinize a vital analysis of existing techniques and the future novelty. This paper shows the comparative analysis of different cluster ensemble methods along with their methodologies and salient features. Henceforth this unambiguous analysis will be very useful for the society of clustering experts and also helps in deciding the most appropriate one to resolve the problem in hand.

Keywords: Clustering, Cluster Ensemble Methods, Coassociation matrix, Consensus Function, Median Partition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2087
960 Improved Rare Species Identification Using Focal Loss Based Deep Learning Models

Authors: Chad Goldsworthy, B. Rajeswari Matam

Abstract:

The use of deep learning for species identification in camera trap images has revolutionised our ability to study, conserve and monitor species in a highly efficient and unobtrusive manner, with state-of-the-art models achieving accuracies surpassing the accuracy of manual human classification. The high imbalance of camera trap datasets, however, results in poor accuracies for minority (rare or endangered) species due to their relative insignificance to the overall model accuracy. This paper investigates the use of Focal Loss, in comparison to the traditional Cross Entropy Loss function, to improve the identification of minority species in the “255 Bird Species” dataset from Kaggle. The results show that, although Focal Loss slightly decreased the accuracy of the majority species, it was able to increase the F1-score by 0.06 and improve the identification of the bottom two, five and ten (minority) species by 37.5%, 15.7% and 10.8%, respectively, as well as resulting in an improved overall accuracy of 2.96%.

Keywords: Convolutional neural networks, data imbalance, deep learning, focal loss, species classification, wildlife conservation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1376
959 Detecting and Tracking Vehicles in Airborne Videos

Authors: Hsu-Yung Cheng, Chih-Chang Yu

Abstract:

In this work, we present an automatic vehicle detection system for airborne videos using combined features. We propose a pixel-wise classification method for vehicle detection using Dynamic Bayesian Networks. In spite of performing pixel-wise classification, relations among neighboring pixels in a region are preserved in the feature extraction process. The main novelty of the detection scheme is that the extracted combined features comprise not only pixel-level information but also region-level information. Afterwards, tracking is performed on the detected vehicles. Tracking is performed using efficient Kalman filter with dynamic particle sampling. Experiments were conducted on a wide variety of airborne videos. We do not assume prior information of camera heights, orientation, and target object sizes in the proposed framework. The results demonstrate flexibility and good generalization abilities of the proposed method on a challenging dataset.

Keywords: Vehicle Detection, Airborne Video, Tracking, Dynamic Bayesian Networks

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1567
958 Dynamic Time Warping in Gait Classificationof Motion Capture Data

Authors: Adam Świtoński, Agnieszka Michalczuk, Henryk Josiński, Andrzej Polański, KonradWojciechowski

Abstract:

The method of gait identification based on the nearest neighbor classification technique with motion similarity assessment by the dynamic time warping is proposed. The model based kinematic motion data, represented by the joints rotations coded by Euler angles and unit quaternions is used. The different pose distance functions in Euler angles and quaternion spaces are considered. To evaluate individual features of the subsequent joints movements during gait cycle, joint selection is carried out. To examine proposed approach database containing 353 gaits of 25 humans collected in motion capture laboratory is used. The obtained results are promising. The classifications, which takes into consideration all joints has accuracy over 91%. Only analysis of movements of hip joints allows to correctly identify gaits with almost 80% precision.

Keywords: Biometrics, dynamic time warping, gait identification, motion capture, time series classification, quaternion distance functions, attribute ranking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2590
957 Color Image Segmentation Using SVM Pixel Classification Image

Authors: K. Sakthivel, R. Nallusamy, C. Kavitha

Abstract:

The goal of image segmentation is to cluster pixels into salient image regions. Segmentation could be used for object recognition, occlusion boundary estimation within motion or stereo systems, image compression, image editing, or image database lookup. In this paper, we present a color image segmentation using support vector machine (SVM) pixel classification. Firstly, the pixel level color and texture features of the image are extracted and they are used as input to the SVM classifier. These features are extracted using the homogeneity model and Gabor Filter. With the extracted pixel level features, the SVM Classifier is trained by using FCM (Fuzzy C-Means).The image segmentation takes the advantage of both the pixel level information of the image and also the ability of the SVM Classifier. The Experiments show that the proposed method has a very good segmentation result and a better efficiency, increases the quality of the image segmentation compared with the other segmentation methods proposed in the literature.

Keywords: Image Segmentation, Support Vector Machine, Fuzzy C–Means, Pixel Feature, Texture Feature, Homogeneity model, Gabor Filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6717
956 Modified Naïve Bayes Based Prediction Modeling for Crop Yield Prediction

Authors: Kefaya Qaddoum

Abstract:

Most of greenhouse growers desire a determined amount of yields in order to accurately meet market requirements. The purpose of this paper is to model a simple but often satisfactory supervised classification method. The original naive Bayes have a serious weakness, which is producing redundant predictors. In this paper, utilized regularization technique was used to obtain a computationally efficient classifier based on naive Bayes. The suggested construction, utilized L1-penalty, is capable of clearing redundant predictors, where a modification of the LARS algorithm is devised to solve this problem, making this method applicable to a wide range of data. In the experimental section, a study conducted to examine the effect of redundant and irrelevant predictors, and test the method on WSG data set for tomato yields, where there are many more predictors than data, and the urge need to predict weekly yield is the goal of this approach. Finally, the modified approach is compared with several naive Bayes variants and other classification algorithms (SVM and kNN), and is shown to be fairly good.

Keywords: Tomato yields prediction, naive Bayes, redundancy

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5072
955 A New Hybrid K-Mean-Quick Reduct Algorithm for Gene Selection

Authors: E. N. Sathishkumar, K. Thangavel, T. Chandrasekhar

Abstract:

Feature selection is a process to select features which are more informative. It is one of the important steps in knowledge discovery. The problem is that all genes are not important in gene expression data. Some of the genes may be redundant, and others may be irrelevant and noisy. Here a novel approach is proposed Hybrid K-Mean-Quick Reduct (KMQR) algorithm for gene selection from gene expression data. In this study, the entire dataset is divided into clusters by applying K-Means algorithm. Each cluster contains similar genes. The high class discriminated genes has been selected based on their degree of dependence by applying Quick Reduct algorithm to all the clusters. Average Correlation Value (ACV) is calculated for the high class discriminated genes. The clusters which have the ACV value as 1 is determined as significant clusters, whose classification accuracy will be equal or high when comparing to the accuracy of the entire dataset. The proposed algorithm is evaluated using WEKA classifiers and compared. The proposed work shows that the high classification accuracy.

Keywords: Clustering, Gene Selection, K-Mean-Quick Reduct, Rough Sets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2281
954 Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks

Authors: L. Salhi, M. Talbi, A. Cherif

Abstract:

This paper presents a new strategy of identification and classification of pathological voices using the hybrid method based on wavelet transform and neural networks. After speech acquisition from a patient, the speech signal is analysed in order to extract the acoustic parameters such as the pitch, the formants, Jitter, and shimmer. Obtained results will be compared to those normal and standard values thanks to a programmable database. Sounds are collected from normal people and patients, and then classified into two different categories. Speech data base is consists of several pathological and normal voices collected from the national hospital “Rabta-Tunis". Speech processing algorithm is conducted in a supervised mode for discrimination of normal and pathology voices and then for classification between neural and vocal pathologies (Parkinson, Alzheimer, laryngeal, dyslexia...). Several simulation results will be presented in function of the disease and will be compared with the clinical diagnosis in order to have an objective evaluation of the developed tool.

Keywords: Formants, Neural Networks, Pathological Voices, Pitch, Wavelet Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2819
953 Improvement in Power Transformer Intelligent Dissolved Gas Analysis Method

Authors: S. Qaedi, S. Seyedtabaii

Abstract:

Non-Destructive evaluation of in-service power transformer condition is necessary for avoiding catastrophic failures. Dissolved Gas Analysis (DGA) is one of the important methods. Traditional, statistical and intelligent DGA approaches have been adopted for accurate classification of incipient fault sources. Unfortunately, there are not often enough faulty patterns required for sufficient training of intelligent systems. By bootstrapping the shortcoming is expected to be alleviated and algorithms with better classification success rates to be obtained. In this paper the performance of an artificial neural network, K-Nearest Neighbour and support vector machine methods using bootstrapped data are detailed and shown that while the success rate of the ANN algorithms improves remarkably, the outcome of the others do not benefit so much from the provided enlarged data space. For assessment, two databases are employed: IEC TC10 and a dataset collected from reported data in papers. High average test success rate well exhibits the remarkable outcome.

Keywords: Dissolved gas analysis, Transformer incipient fault, Artificial Neural Network, Support Vector Machine (SVM), KNearest Neighbor (KNN)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2717
952 Machine Learning Techniques in Bank Credit Analysis

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner

Abstract:

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results when the best of each technique was compared. Initially the data were coded in thermometer code (numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.

Keywords: Artificial Neural Networks, ANNs, classifier algorithms, credit risk assessment, logistic regression, machine learning, support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1229
951 Modeling Engagement with Multimodal Multisensor Data: The Continuous Performance Test as an Objective Tool to Track Flow

Authors: Mohammad H. Taheri, David J. Brown, Nasser Sherkat

Abstract:

Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detect student engagement involve periodic human observations that are subject to inter-rater reliability. Our solution uses real-time multimodal multisensor data labeled by objective performance outcomes to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. In order to achieve this, a type of continuous performance test is introduced, the Seek-X type. Nine features were extracted including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches were evaluated. Overall, the random forest classification approach achieved the best classification results. Using random forest, 93.3% classification for engagement and 42.9% accuracy for disengagement were achieved. We compared these results to outcomes from different models: AdaBoost, decision tree, k-Nearest Neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors. We found that using high-level handpicked features can improve the classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature to the classification of engagement and distraction was shown to be eye gaze. It has been shown that we can accurately predict the level of engagement of students with learning disabilities in a real-time approach that is not subject to inter-rater reliability, human observation or reliant on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students, where teachers cannot possibly attend to each of their individual needs. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.

Keywords: Affective computing in education, affect detection, continuous performance test, engagement, flow, HCI, interaction, learning disabilities, machine learning, multimodal, multisensor, physiological sensors, Signal Detection Theory, student engagement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1226