Search results for: classification of big data actors
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8127

Search results for: classification of big data actors

7707 Differential Protection for Power Transformer Using Wavelet Transform and PNN

Authors: S. Sendilkumar, B. L. Mathur, Joseph Henry

Abstract:

A new approach for protection of power transformer is presented using a time-frequency transform known as Wavelet transform. Different operating conditions such as inrush, Normal, load, External fault and internal fault current are sampled and processed to obtain wavelet coefficients. Different Operating conditions provide variation in wavelet coefficients. Features like energy and Standard deviation are calculated using Parsevals theorem. These features are used as inputs to PNN (Probabilistic neural network) for fault classification. The proposed algorithm provides more accurate results even in the presence of noise inputs and accurately identifies inrush and fault currents. Overall classification accuracy of the proposed method is found to be 96.45%. Simulation of the fault (with and without noise) was done using MATLAB AND SIMULINK software taking 2 cycles of data window (40 m sec) containing 800 samples. The algorithm was evaluated by using 10 % Gaussian white noise.

Keywords: Power Transformer, differential Protection, internalfault, inrush current, Wavelet Energy, Db9.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3117
7706 Adaptive Naïve Bayesian Anti-Spam Engine

Authors: Wojciech P. Gajewski

Abstract:

The problem of spam has been seriously troubling the Internet community during the last few years and currently reached an alarming scale. Observations made at CERN (European Organization for Nuclear Research located in Geneva, Switzerland) show that spam mails can constitute up to 75% of daily SMTP traffic. A naïve Bayesian classifier based on a Bag Of Words representation of an email is widely used to stop this unwanted flood as it combines good performance with simplicity of the training and classification processes. However, facing the constantly changing patterns of spam, it is necessary to assure online adaptability of the classifier. This work proposes combining such a classifier with another NBC (naïve Bayesian classifier) based on pairs of adjacent words. Only the latter will be retrained with examples of spam reported by users. Tests are performed on considerable sets of mails both from public spam archives and CERN mailboxes. They suggest that this architecture can increase spam recall without affecting the classifier precision as it happens when only the NBC based on single words is retrained.

Keywords: Text classification, naïve Bayesian classification, spam, email.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4407
7705 A Trainable Neural Network Ensemble for ECG Beat Classification

Authors: Atena Sajedin, Shokoufeh Zakernejad, Soheil Faridi, Mehrdad Javadi, Reza Ebrahimpour

Abstract:

This paper illustrates the use of a combined neural network model for classification of electrocardiogram (ECG) beats. We present a trainable neural network ensemble approach to develop customized electrocardiogram beat classifier in an effort to further improve the performance of ECG processing and to offer individualized health care. We process a three stage technique for detection of premature ventricular contraction (PVC) from normal beats and other heart diseases. This method includes a denoising, a feature extraction and a classification. At first we investigate the application of stationary wavelet transform (SWT) for noise reduction of the electrocardiogram (ECG) signals. Then feature extraction module extracts 10 ECG morphological features and one timing interval feature. Then a number of multilayer perceptrons (MLPs) neural networks with different topologies are designed. The performance of the different combination methods as well as the efficiency of the whole system is presented. Among them, Stacked Generalization as a proposed trainable combined neural network model possesses the highest recognition rate of around 95%. Therefore, this network proves to be a suitable candidate in ECG signal diagnosis systems. ECG samples attributing to the different ECG beat types were extracted from the MIT-BIH arrhythmia database for the study.

Keywords: ECG beat Classification; Combining Classifiers;Premature Ventricular Contraction (PVC); Multi Layer Perceptrons;Wavelet Transform

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2205
7704 Network State Classification based on the Statistical properties of RTT for an Adaptive Multi-State Proactive Transport Protocol for Satellite based Networks

Authors: Mohanchur Sakar, K.K.Shukla, K.S.Dasgupta

Abstract:

This paper attempts to establish the fact that Multi State Network Classification is essential for performance enhancement of Transport protocols over Satellite based Networks. A model to classify Multi State network condition taking into consideration both congestion and channel error is evolved. In order to arrive at such a model an analysis of the impact of congestion and channel error on RTT values has been carried out using ns2. The analysis results are also reported in the paper. The inference drawn from this analysis is used to develop a novel statistical RTT based model for multi state network classification. An Adaptive Multi State Proactive Transport Protocol consisting of Proactive Slow Start, State based Error Recovery, Timeout Action and Proactive Reduction is proposed which uses the multi state network state classification model. This paper also confirms through detail simulation and analysis that a prior knowledge about the overall characteristics of the network helps in enhancing the performance of the protocol over satellite channel which is significantly affected due to channel noise and congestion. The necessary augmentation of ns2 simulator is done for simulating the multi state network classification logic. This simulation has been used in detail evaluation of the protocol under varied levels of congestion and channel noise. The performance enhancement of this protocol with reference to established protocols namely TCP SACK and Vegas has been discussed. The results as discussed in this paper clearly reveal that the proposed protocol always outperforms its peers and show a significant improvement in very high error conditions as envisaged in the design of the protocol.

Keywords: GEO, ns2, Proactive TCP, SACK, Vegas

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1423
7703 Tidal Data Analysis using ANN

Authors: Ritu Vijay, Rekha Govil

Abstract:

The design of a complete expansion that allows for compact representation of certain relevant classes of signals is a central problem in signal processing applications. Achieving such a representation means knowing the signal features for the purpose of denoising, classification, interpolation and forecasting. Multilayer Neural Networks are relatively a new class of techniques that are mathematically proven to approximate any continuous function arbitrarily well. Radial Basis Function Networks, which make use of Gaussian activation function, are also shown to be a universal approximator. In this age of ever-increasing digitization in the storage, processing, analysis and communication of information, there are numerous examples of applications where one needs to construct a continuously defined function or numerical algorithm to approximate, represent and reconstruct the given discrete data of a signal. Many a times one wishes to manipulate the data in a way that requires information not included explicitly in the data, which is done through interpolation and/or extrapolation. Tidal data are a very perfect example of time series and many statistical techniques have been applied for tidal data analysis and representation. ANN is recent addition to such techniques. In the present paper we describe the time series representation capabilities of a special type of ANN- Radial Basis Function networks and present the results of tidal data representation using RBF. Tidal data analysis & representation is one of the important requirements in marine science for forecasting.

Keywords: ANN, RBF, Tidal Data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1646
7702 Probabilistic Crash Prediction and Prevention of Vehicle Crash

Authors: Lavanya Annadi, Fahimeh Jafari

Abstract:

Transportation brings immense benefits to society, but it also has its costs. Costs include the cost of infrastructure, personnel, and equipment, but also the loss of life and property in traffic accidents on the road, delays in travel due to traffic congestion, and various indirect costs in terms of air transport. This research aims to predict the probabilistic crash prediction of vehicles using Machine Learning due to natural and structural reasons by excluding spontaneous reasons, like overspeeding, etc., in the United States. These factors range from meteorological elements such as weather conditions, precipitation, visibility, wind speed, wind direction, temperature, pressure, and humidity, to human-made structures, like road structure components such as Bumps, Roundabouts, No Exit, Turning Loops, Give Away, etc. The probabilities are categorized into ten distinct classes. All the predictions are based on multiclass classification techniques, which are supervised learning. This study considers all crashes in all states collected by the US government. The probability of the crash was determined by employing Multinomial Expected Value, and a classification label was assigned accordingly. We applied three classification models, including multiclass Logistic Regression, Random Forest and XGBoost. The numerical results show that XGBoost achieved a 75.2% accuracy rate which indicates the part that is being played by natural and structural reasons for the crash. The paper has provided in-depth insights through exploratory data analysis.

Keywords: Road safety, crash prediction, exploratory analysis, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 35
7701 Evaluation of the Impact of Dataset Characteristics for Classification Problems in Biological Applications

Authors: Kanthida Kusonmano, Michael Netzer, Bernhard Pfeifer, Christian Baumgartner, Klaus R. Liedl, Armin Graber

Abstract:

Availability of high dimensional biological datasets such as from gene expression, proteomic, and metabolic experiments can be leveraged for the diagnosis and prognosis of diseases. Many classification methods in this area have been studied to predict disease states and separate between predefined classes such as patients with a special disease versus healthy controls. However, most of the existing research only focuses on a specific dataset. There is a lack of generic comparison between classifiers, which might provide a guideline for biologists or bioinformaticians to select the proper algorithm for new datasets. In this study, we compare the performance of popular classifiers, which are Support Vector Machine (SVM), Logistic Regression, k-Nearest Neighbor (k-NN), Naive Bayes, Decision Tree, and Random Forest based on mock datasets. We mimic common biological scenarios simulating various proportions of real discriminating biomarkers and different effect sizes thereof. The result shows that SVM performs quite stable and reaches a higher AUC compared to other methods. This may be explained due to the ability of SVM to minimize the probability of error. Moreover, Decision Tree with its good applicability for diagnosis and prognosis shows good performance in our experimental setup. Logistic Regression and Random Forest, however, strongly depend on the ratio of discriminators and perform better when having a higher number of discriminators.

Keywords: Classification, High dimensional data, Machine learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2371
7700 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain subgroups of time series data with normal distribution from the inflow into wastewater treatment plant data, composed of several groups differing by mean value. Two simple algorithms, K-mean and EM, were chosen as a clustering method. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was performed for each subgroups. The final model was a sum of the subgroups models. The quality of the obtained model was compared with the regression model made using the same explanatory variables, but with no clustering of data. Results were compared using determination coefficient (R2), measure of prediction accuracy- mean absolute percentage error (MAPE) and comparison on a linear chart. Preliminary results allow us to foresee the potential of the presented technique.

Keywords: Clustering, Data analysis, Data mining, Predictive models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1941
7699 Categorical Missing Data Imputation Using Fuzzy Neural Networks with Numerical and Categorical Inputs

Authors: Pilar Rey-del-Castillo, Jesús Cardeñosa

Abstract:

There are many situations where input feature vectors are incomplete and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method to perform categorical missing data imputation from numerical and categorical variables. The imputations are based on Simpson-s fuzzy min-max neural networks where the input variables for learning and classification are just numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation and a new architecture. The procedure is tested and compared with others using opinion poll data.

Keywords: Classifier, imputation techniques, fuzzy systems, fuzzy min-max neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1760
7698 Real-time Laser Monitoring based on Pipe Detective Operation

Authors: Mongkorn Klingajay, Tawatchai Jitson

Abstract:

The pipe inspection operation is the difficult detective performance. Almost applications are mainly relies on a manual recognition of defective areas that have carried out detection by an engineer. Therefore, an automation process task becomes a necessary in order to avoid the cost incurred in such a manual process. An automated monitoring method to obtain a complete picture of the sewer condition is proposed in this work. The focus of the research is the automated identification and classification of discontinuities in the internal surface of the pipe. The methodology consists of several processing stages including image segmentation into the potential defect regions and geometrical characteristic features. Automatic recognition and classification of pipe defects are carried out by means of using an artificial neural network technique (ANN) based on Radial Basic Function (RBF). Experiments in a realistic environment have been conducted and results are presented.

Keywords: Artificial neural network, Radial basic function, Curve fitting, CCTV, Image segmentation, Data acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1804
7697 A Genetic Algorithm Based Classification Approach for Finding Fault Prone Classes

Authors: Parvinder S. Sandhu, Satish Kumar Dhiman, Anmol Goyal

Abstract:

Fault-proneness of a software module is the probability that the module contains faults. A correlation exists between the fault-proneness of the software and the measurable attributes of the code (i.e. the static metrics) and of the testing (i.e. the dynamic metrics). Early detection of fault-prone software components enables verification experts to concentrate their time and resources on the problem areas of the software system under development. This paper introduces Genetic Algorithm based software fault prediction models with Object-Oriented metrics. The contribution of this paper is that it has used Metric values of JEdit open source software for generation of the rules for the classification of software modules in the categories of Faulty and non faulty modules and thereafter empirically validation is performed. The results shows that Genetic algorithm approach can be used for finding the fault proneness in object oriented software components.

Keywords: Genetic Algorithms, Software Fault, Classification, Object Oriented Metrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2285
7696 Earthquake Classification in Molluca Collision Zone Using Conventional Statistical Methods

Authors: H. J. Wattimanela, U. S. Passaribu, N. T. Puspito, S. W. Indratno

Abstract:

Molluca Collision Zone is located at the junction of the Eurasian, Australian, Pacific and the Philippines plates. Between the Sangihe arc, west of the collision zone, and to the east of Halmahera arc is active collision and convex toward the Molluca Sea. This research will analyze the behavior of earthquake occurrence in Molluca Collision Zone related to the distributions of an earthquake in each partition regions, determining the type of distribution of a occurrence earthquake of partition regions, and the mean occurence of earthquakes each partition regions, and the correlation between the partitions region. We calculate number of earthquakes using partition method and its behavioral using conventional statistical methods. In this research, we used data of shallow earthquakes type and its magnitudes ≥4 SR (period 1964-2013). From the results, we can classify partitioned regions based on the correlation into two classes: strong and very strong. This classification can be used for early warning system in disaster management.

Keywords: Molluca Collision Zone, partition regions, conventional statistical methods, Earthquakes, classifications, disaster management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1969
7695 Optimized Facial Features-based Age Classification

Authors: Md. Zahangir Alom, Mei-Lan Piao, Md. Shariful Islam, Nam Kim, Jae-Hyeung Park

Abstract:

The evaluation and measurement of human body dimensions are achieved by physical anthropometry. This research was conducted in view of the importance of anthropometric indices of the face in forensic medicine, surgery, and medical imaging. The main goal of this research is to optimization of facial feature point by establishing a mathematical relationship among facial features and used optimize feature points for age classification. Since selected facial feature points are located to the area of mouth, nose, eyes and eyebrow on facial images, all desire facial feature points are extracted accurately. According this proposes method; sixteen Euclidean distances are calculated from the eighteen selected facial feature points vertically as well as horizontally. The mathematical relationships among horizontal and vertical distances are established. Moreover, it is also discovered that distances of the facial feature follows a constant ratio due to age progression. The distances between the specified features points increase with respect the age progression of a human from his or her childhood but the ratio of the distances does not change (d = 1 .618 ) . Finally, according to the proposed mathematical relationship four independent feature distances related to eight feature points are selected from sixteen distances and eighteen feature point-s respectively. These four feature distances are used for classification of age using Support Vector Machine (SVM)-Sequential Minimal Optimization (SMO) algorithm and shown around 96 % accuracy. Experiment result shows the proposed system is effective and accurate for age classification.

Keywords: 3D Face Model, Face Anthropometrics, Facial Features Extraction, Feature distances, SVM-SMO

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2036
7694 Nonparametric Control Chart Using Density Weighted Support Vector Data Description

Authors: Myungraee Cha, Jun Seok Kim, Seung Hwan Park, Jun-Geol Baek

Abstract:

In manufacturing industries, development of measurement leads to increase the number of monitoring variables and eventually the importance of multivariate control comes to the fore. Statistical process control (SPC) is one of the most widely used as multivariate control chart. Nevertheless, SPC is restricted to apply in processes because its assumption of data as following specific distribution. Unfortunately, process data are composed by the mixture of several processes and it is hard to estimate as one certain distribution. To alternative conventional SPC, therefore, nonparametric control chart come into the picture because of the strength of nonparametric control chart, the absence of parameter estimation. SVDD based control chart is one of the nonparametric control charts having the advantage of flexible control boundary. However,basic concept of SVDD has been an oversight to the important of data characteristic, density distribution. Therefore, we proposed DW-SVDD (Density Weighted SVDD) to cover up the weakness of conventional SVDD. DW-SVDD makes a new attempt to consider dense of data as introducing the notion of density Weight. We extend as control chart using new proposed SVDD and a simulation study of various distributional data is conducted to demonstrate the improvement of performance.

Keywords: Density estimation, Multivariate control chart, Oneclass classification, Support vector data description (SVDD)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2110
7693 Multiple Mental Thought Parametric Classification: A New Approach for Individual Identification

Authors: Ramaswamy Palaniappan

Abstract:

This paper reports a new approach on identifying the individuality of persons by using parametric classification of multiple mental thoughts. In the approach, electroencephalogram (EEG) signals were recorded when the subjects were thinking of one or more (up to five) mental thoughts. Autoregressive features were computed from these EEG signals and classified by Linear Discriminant classifier. The results here indicate that near perfect identification of 400 test EEG patterns from four subjects was possible, thereby opening up a new avenue in biometrics.

Keywords: Autoregressive, Biometrics, Electroencephalogram, Linear discrimination, Mental thoughts.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1386
7692 WebAppShield: An Approach Exploiting Machine Learning to Detect SQLi Attacks in an Application Layer in Run-Time

Authors: Ahmed Abdulla Ashlam, Atta Badii, Frederic Stahl

Abstract:

In recent years, SQL injection attacks have been identified as being prevalent against web applications. They affect network security and user data, which leads to a considerable loss of money and data every year. This paper presents the use of classification algorithms in machine learning using a method to classify the login data filtering inputs into "SQLi" or "Non-SQLi,” thus increasing the reliability and accuracy of results in terms of deciding whether an operation is an attack or a valid operation. A method as a Web-App is developed for auto-generated data replication to provide a twin of the targeted data structure. Shielding against SQLi attacks (WebAppShield) that verifies all users and prevents attackers (SQLi attacks) from entering and or accessing the database, which the machine learning module predicts as "Non-SQLi", has been developed. A special login form has been developed with a special instance of the data validation; this verification process secures the web application from its early stages. The system has been tested and validated, and up to 99% of SQLi attacks have been prevented.

Keywords: SQL injection, attacks, web application, accuracy, database, WebAppShield.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 423
7691 EEG Spikes Detection, Sorting, and Localization

Authors: Mazin Z. Othman, Maan M. Shaker, Mohammed F. Abdullah

Abstract:

This study introduces a new method for detecting, sorting, and localizing spikes from multiunit EEG recordings. The method combines the wavelet transform, which localizes distinctive spike features, with Super-Paramagnetic Clustering (SPC) algorithm, which allows automatic classification of the data without assumptions such as low variance or Gaussian distributions. Moreover, the method is capable of setting amplitude thresholds for spike detection. The method makes use of several real EEG data sets, and accordingly the spikes are detected, clustered and their times were detected.

Keywords: EEG time localizations, EEG spike detection, superparamagnetic algorithm, wavelet transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2542
7690 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.

Keywords: Artificial neural network, competitive dynamics, logistic regression, text classification, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 517
7689 Self-Supervised Pretraining on Paired Sequences of fMRI Data for Transfer Learning to Brain Decoding Tasks

Authors: Sean Paulsen, Michael Casey

Abstract:

In this work, we present a self-supervised pretraining framework for transformers on functional Magnetic Resonance Imaging (fMRI) data. First, we pretrain our architecture on two self-supervised tasks simultaneously to teach the model a general understanding of the temporal and spatial dynamics of human auditory cortex during music listening. Our pretraining results are the first to suggest a synergistic effect of multitask training on fMRI data. Second, we finetune the pretrained models and train additional fresh models on a supervised fMRI classification task. We observe significantly improved accuracy on held-out runs with the finetuned models, which demonstrates the ability of our pretraining tasks to facilitate transfer learning. This work contributes to the growing body of literature on transformer architectures for pretraining and transfer learning with fMRI data, and serves as a proof of concept for our pretraining tasks and multitask pretraining on fMRI data.

Keywords: Transfer learning, fMRI, self-supervised, brain decoding, transformer, multitask training.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 119
7688 On Speeding Up Support Vector Machines: Proximity Graphs Versus Random Sampling for Pre-Selection Condensation

Authors: Xiaohua Liu, Juan F. Beltran, Nishant Mohanchandra, Godfried T. Toussaint

Abstract:

Support vector machines (SVMs) are considered to be the best machine learning algorithms for minimizing the predictive probability of misclassification. However, their drawback is that for large data sets the computation of the optimal decision boundary is a time consuming function of the size of the training set. Hence several methods have been proposed to speed up the SVM algorithm. Here three methods used to speed up the computation of the SVM classifiers are compared experimentally using a musical genre classification problem. The simplest method pre-selects a random sample of the data before the application of the SVM algorithm. Two additional methods use proximity graphs to pre-select data that are near the decision boundary. One uses k-Nearest Neighbor graphs and the other Relative Neighborhood Graphs to accomplish the task.

Keywords: Machine learning, data mining, support vector machines, proximity graphs, relative-neighborhood graphs, k-nearestneighbor graphs, random sampling, training data condensation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1910
7687 Improved Rare Species Identification Using Focal Loss Based Deep Learning Models

Authors: Chad Goldsworthy, B. Rajeswari Matam

Abstract:

The use of deep learning for species identification in camera trap images has revolutionised our ability to study, conserve and monitor species in a highly efficient and unobtrusive manner, with state-of-the-art models achieving accuracies surpassing the accuracy of manual human classification. The high imbalance of camera trap datasets, however, results in poor accuracies for minority (rare or endangered) species due to their relative insignificance to the overall model accuracy. This paper investigates the use of Focal Loss, in comparison to the traditional Cross Entropy Loss function, to improve the identification of minority species in the “255 Bird Species” dataset from Kaggle. The results show that, although Focal Loss slightly decreased the accuracy of the majority species, it was able to increase the F1-score by 0.06 and improve the identification of the bottom two, five and ten (minority) species by 37.5%, 15.7% and 10.8%, respectively, as well as resulting in an improved overall accuracy of 2.96%.

Keywords: Convolutional neural networks, data imbalance, deep learning, focal loss, species classification, wildlife conservation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1398
7686 Weed Classification using Histogram Maxima with Threshold for Selective Herbicide Applications

Authors: Irshad Ahmad, Abdul Muhamin Naeem, Muhammad Islam, Shahid Nawaz

Abstract:

Information on weed distribution within the field is necessary to implement spatially variable herbicide application. Since hand labor is costly, an automated weed control system could be feasible. This paper deals with the development of an algorithm for real time specific weed recognition system based on Histogram Maxima with threshold of an image that is used for the weed classification. This algorithm is specifically developed to classify images into broad and narrow class for real-time selective herbicide application. The developed system has been tested on weeds in the lab, which have shown that the system to be very effectiveness in weed identification. Further the results show a very reliable performance on images of weeds taken under varying field conditions. The analysis of the results shows over 95 percent classification accuracy over 140 sample images (broad and narrow) with 70 samples from each category of weeds.

Keywords: Image processing, real-time recognition, weeddetection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2149
7685 Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks

Authors: L. Salhi, M. Talbi, A. Cherif

Abstract:

This paper presents a new strategy of identification and classification of pathological voices using the hybrid method based on wavelet transform and neural networks. After speech acquisition from a patient, the speech signal is analysed in order to extract the acoustic parameters such as the pitch, the formants, Jitter, and shimmer. Obtained results will be compared to those normal and standard values thanks to a programmable database. Sounds are collected from normal people and patients, and then classified into two different categories. Speech data base is consists of several pathological and normal voices collected from the national hospital “Rabta-Tunis". Speech processing algorithm is conducted in a supervised mode for discrimination of normal and pathology voices and then for classification between neural and vocal pathologies (Parkinson, Alzheimer, laryngeal, dyslexia...). Several simulation results will be presented in function of the disease and will be compared with the clinical diagnosis in order to have an objective evaluation of the developed tool.

Keywords: Formants, Neural Networks, Pathological Voices, Pitch, Wavelet Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2834
7684 An Evaluation of Algorithms for Single-Echo Biosonar Target Classification

Authors: Turgay Temel, John Hallam

Abstract:

A recent neurospiking coding scheme for feature extraction from biosonar echoes of various plants is examined with avariety of stochastic classifiers. Feature vectors derived are employedin well-known stochastic classifiers, including nearest-neighborhood,single Gaussian and a Gaussian mixture with EM optimization.Classifiers' performances are evaluated by using cross-validation and bootstrapping techniques. It is shown that the various classifers perform equivalently and that the modified preprocessing configuration yields considerably improved results.

Keywords: Classification, neuro-spike coding, non-parametricmodel, parametric model, Gaussian mixture, EM algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1656
7683 An Overview of the Porosity Classification in Carbonate Reservoirs and Their Challenges: An Example of Macro-Microporosity Classification from Offshore Miocene Carbonate in Central Luconia, Malaysia

Authors: Hammad T. Janjuhah, Josep Sanjuan, Mohamed K. Salah

Abstract:

Biological and chemical activities in carbonates are responsible for the complexity of the pore system. Primary porosity is generally of natural origin while secondary porosity is subject to chemical reactivity through diagenetic processes. To understand the integrated part of hydrocarbon exploration, it is necessary to understand the carbonate pore system. However, the current porosity classification scheme is limited to adequately predict the petrophysical properties of different reservoirs having various origins and depositional environments. Rock classification provides a descriptive method for explaining the lithofacies but makes no significant contribution to the application of porosity and permeability (poro-perm) correlation. The Central Luconia carbonate system (Malaysia) represents a good example of pore complexity (in terms of nature and origin) mainly related to diagenetic processes which have altered the original reservoir. For quantitative analysis, 32 high-resolution images of each thin section were taken using transmitted light microscopy. The quantification of grains, matrix, cement, and macroporosity (pore types) was achieved using a petrographic analysis of thin sections and FESEM images. The point counting technique was used to estimate the amount of macroporosity from thin section, which was then subtracted from the total porosity to derive the microporosity. The quantitative observation of thin sections revealed that the mouldic porosity (macroporosity) is the dominant porosity type present, whereas the microporosity seems to correspond to a sum of 40 to 50% of the total porosity. It has been proven that these Miocene carbonates contain a significant amount of microporosity, which significantly complicates the estimation and production of hydrocarbons. Neglecting its impact can increase uncertainty about estimating hydrocarbon reserves. Due to the diversity of geological parameters, the application of existing porosity classifications does not allow a better understanding of the poro-perm relationship. However, the classification can be improved by including the pore types and pore structures where they can be divided into macro- and microporosity. Such studies of microporosity identification/classification represent now a major concern in limestone reservoirs around the world.

Keywords: Carbonate reservoirs, microporosity, overview of porosity classification, reservoir characterization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 978
7682 Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification

Authors: Dewan Md. Farid, Jerome Darmont, Nouria Harbi, Nguyen Huu Hoa, Mohammad Zahidur Rahman

Abstract:

In this paper, a new learning approach for network intrusion detection using naïve Bayesian classifier and ID3 algorithm is presented, which identifies effective attributes from the training dataset, calculates the conditional probabilities for the best attribute values, and then correctly classifies all the examples of training and testing dataset. Most of the current intrusion detection datasets are dynamic, complex and contain large number of attributes. Some of the attributes may be redundant or contribute little for detection making. It has been successfully tested that significant attribute selection is important to design a real world intrusion detection systems (IDS). The purpose of this study is to identify effective attributes from the training dataset to build a classifier for network intrusion detection using data mining algorithms. The experimental results on KDD99 benchmark intrusion detection dataset demonstrate that this new approach achieves high classification rates and reduce false positives using limited computational resources.

Keywords: Attributes selection, Conditional probabilities, information gain, network intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2682
7681 Combination of Different Classifiers for Cardiac Arrhythmia Recognition

Authors: M. R. Homaeinezhad, E. Tavakkoli, M. Habibi, S. A. Atyabi, A. Ghaffari

Abstract:

This paper describes a new supervised fusion (hybrid) electrocardiogram (ECG) classification solution consisting of a new QRS complex geometrical feature extraction as well as a new version of the learning vector quantization (LVQ) classification algorithm aimed for overcoming the stability-plasticity dilemma. Toward this objective, after detection and delineation of the major events of ECG signal via an appropriate algorithm, each QRS region and also its corresponding discrete wavelet transform (DWT) are supposed as virtual images and each of them is divided into eight polar sectors. Then, the curve length of each excerpted segment is calculated and is used as the element of the feature space. To increase the robustness of the proposed classification algorithm versus noise, artifacts and arrhythmic outliers, a fusion structure consisting of five different classifiers namely as Support Vector Machine (SVM), Modified Learning Vector Quantization (MLVQ) and three Multi Layer Perceptron-Back Propagation (MLP–BP) neural networks with different topologies were designed and implemented. The new proposed algorithm was applied to all 48 MIT–BIH Arrhythmia Database records (within–record analysis) and the discrimination power of the classifier in isolation of different beat types of each record was assessed and as the result, the average accuracy value Acc=98.51% was obtained. Also, the proposed method was applied to 6 number of arrhythmias (Normal, LBBB, RBBB, PVC, APB, PB) belonging to 20 different records of the aforementioned database (between– record analysis) and the average value of Acc=95.6% was achieved. To evaluate performance quality of the new proposed hybrid learning machine, the obtained results were compared with similar peer– reviewed studies in this area.

Keywords: Feature Extraction, Curve Length Method, SupportVector Machine, Learning Vector Quantization, Multi Layer Perceptron, Fusion (Hybrid) Classification, Arrhythmia Classification, Supervised Learning Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2216
7680 Service-Oriented Architecture for Object- Centric Information Fusion

Authors: Jeffrey A. Dunne, Kevin Ligozio

Abstract:

In many applications there is a broad variety of information relevant to a focal “object" of interest, and the fusion of such heterogeneous data types is desirable for classification and categorization. While these various data types can sometimes be treated as orthogonal (such as the hull number, superstructure color, and speed of an oil tanker), there are instances where the inference and the correlation between quantities can provide improved fusion capabilities (such as the height, weight, and gender of a person). A service-oriented architecture has been designed and prototyped to support the fusion of information for such “object-centric" situations. It is modular, scalable, and flexible, and designed to support new data sources, fusion algorithms, and computational resources without affecting existing services. The architecture is designed to simplify the incorporation of legacy systems, support exact and probabilistic entity disambiguation, recognize and utilize multiple types of uncertainties, and minimize network bandwidth requirements.

Keywords: Data fusion, distributed computing, service-oriented architecture, SOA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1454
7679 Classification of Radio Communication Signals using Fuzzy Logic

Authors: Zuzana Dideková, Beata Mikovičová

Abstract:

Characterization of radio communication signals aims at automatic recognition of different characteristics of radio signals in order to detect their modulation type, the central frequency, and the level. Our purpose is to apply techniques used in image processing in order to extract pertinent characteristics. To the single analysis, we add several rules for checking the consistency of hypotheses using fuzzy logic. This allows taking into account ambiguity and uncertainty that may remain after the extraction of individual characteristics. The aim is to improve the process of radio communications characterization.

Keywords: fuzzy classification, fuzzy inference system, radio communication signals, telecommunications

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1963
7678 Classification of Health Risk Factors to Predict the Risk of Falling in Older Adults

Authors: L. Lindsay, S. A. Coleman, D. Kerr, B. J. Taylor, A. Moorhead

Abstract:

Cognitive decline and frailty is apparent in older adults leading to an increased likelihood of the risk of falling. Currently health care professionals have to make professional decisions regarding such risks, and hence make difficult decisions regarding the future welfare of the ageing population. This study uses health data from The Irish Longitudinal Study on Ageing (TILDA), focusing on adults over the age of 50 years, in order to analyse health risk factors and predict the likelihood of falls. This prediction is based on the use of machine learning algorithms whereby health risk factors are used as inputs to predict the likelihood of falling. Initial results show that health risk factors such as long-term health issues contribute to the number of falls. The identification of such health risk factors has the potential to inform health and social care professionals, older people and their family members in order to mitigate daily living risks.

Keywords: Classification, falls, health risk factors, machine learning, older adults.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1035