Search results for: binary classification tree
921 Radar Hydrology: New Z/R Relationships for Klang River Basin Malaysia based on Rainfall Classification
Authors: R. Suzana, T. Wardah, A.B. Sahol Hamid
Abstract:
The use of radar in Quantitative Precipitation Estimation (QPE) for radar-rainfall measurement is significantly beneficial. Radar has advantages in terms of high spatial and temporal condition in rainfall measurement and also forecasting. In Malaysia, radar application in QPE is still new and needs to be explored. This paper focuses on the Z/R derivation works of radarrainfall estimation based on rainfall classification. The works developed new Z/R relationships for Klang River Basin in Selangor area for three different general classes of rain events, namely low (<10mm/hr), moderate (>10mm/hr, <30mm/hr) and heavy (>30mm/hr) and also on more specific rain types during monsoon seasons. Looking at the high potential of Doppler radar in QPE, the newly formulated Z/R equations will be useful in improving the measurement of rainfall for any hydrological application, especially for flood forecasting.
Keywords: Radar, Quantitative Precipitation Estimation, Z/R development, flood forecasting
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2151920 Classification of Acoustic Emission Based Partial Discharge in Oil Pressboard Insulation System Using Wavelet Analysis
Authors: Prasanta Kundu, N.K. Kishore, A.K. Sinha
Abstract:
Insulation used in transformer is mostly oil pressboard insulation. Insulation failure is one of the major causes of catastrophic failure of transformers. It is established that partial discharges (PD) cause insulation degradation and premature failure of insulation. Online monitoring of PDs can reduce the risk of catastrophic failure of transformers. There are different techniques of partial discharge measurement like, electrical, optical, acoustic, opto-acoustic and ultra high frequency (UHF). Being non invasive and non interference prone, acoustic emission technique is advantageous for online PD measurement. Acoustic detection of p.d. is based on the retrieval and analysis of mechanical or pressure signals produced by partial discharges. Partial discharges are classified according to the origin of discharges. Their effects on insulation deterioration are different for different types. This paper reports experimental results and analysis for classification of partial discharges using acoustic emission signal of laboratory simulated partial discharges in oil pressboard insulation system using three different electrode systems. Acoustic emission signal produced by PD are detected by sensors mounted on the experimental tank surface, stored on an oscilloscope and fed to computer for further analysis. The measured AE signals are analyzed using discrete wavelet transform analysis and wavelet packet analysis. Energy distribution in different frequency bands of discrete wavelet decomposed signal and wavelet packet decomposed signal is calculated. These analyses show a distinct feature useful for PD classification. Wavelet packet analysis can sort out any misclassification arising out of DWT in most cases.
Keywords: Acoustic emission, discrete wavelet transform, partial discharge, wavelet packet analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2988919 A Survey: Clustering Ensembles Techniques
Authors: Reza Ghaemi , Md. Nasir Sulaiman , Hamidah Ibrahim , Norwati Mustapha
Abstract:
The clustering ensembles combine multiple partitions generated by different clustering algorithms into a single clustering solution. Clustering ensembles have emerged as a prominent method for improving robustness, stability and accuracy of unsupervised classification solutions. So far, many contributions have been done to find consensus clustering. One of the major problems in clustering ensembles is the consensus function. In this paper, firstly, we introduce clustering ensembles, representation of multiple partitions, its challenges and present taxonomy of combination algorithms. Secondly, we describe consensus functions in clustering ensembles including Hypergraph partitioning, Voting approach, Mutual information, Co-association based functions and Finite mixture model, and next explain their advantages, disadvantages and computational complexity. Finally, we compare the characteristics of clustering ensembles algorithms such as computational complexity, robustness, simplicity and accuracy on different datasets in previous techniques.Keywords: Clustering Ensembles, Combinational Algorithm, Consensus Function, Unsupervised Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3452918 Classification of Earthquake Distribution in the Banda Sea Collision Zone with Point Process Approach
Authors: Henry J. Wattimanela, Udjianna S. Pasaribu, Nanang T. Puspito, Sapto W. Indratno
Abstract:
Banda Sea Collision Zone (BSCZ) is the result of the interaction and convergence of Indo-Australian plate, Eurasian plate and Pacific plate. This location is located in eastern Indonesia. This zone has a very high seismic activity. In this research, we will calculate the rate (λ) and Mean Square Error (MSE). By this result, we will classification earthquakes distribution in the BSCZ with the point process approach. Chi-square is used to determine the type of earthquakes distribution in the sub region of BSCZ. The data used in this research is data of earthquakes with a magnitude ≥ 6 SR for the period 1964-2013 and sourced from BMKG Jakarta. This research is expected to contribute to the Moluccas Province and surrounding local governments in performing spatial plan document related to disaster management.Keywords: Banda sea collision zone, earthquakes, mean square error, Poisson distribution, chi-square test.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2117917 Lamb Wave Wireless Communication in Healthy Plates Using Coherent Demodulation
Authors: Rudy Bahouth, Farouk Benmeddour, Emmanuel Moulin, Jamal Assaad
Abstract:
Guided ultrasonic waves are used in Non-Destructive Testing and Structural Health Monitoring for inspection and damage detection. Recently, wireless data transmission using ultrasonic waves in solid metallic channels has gained popularity in some industrial applications such as nuclear, aerospace and smart vehicles. The idea is to find a good substitute for electromagnetic waves since they are highly attenuated near metallic components due to Faraday shielding. The proposed solution is to use ultrasonic guided waves such as Lamb waves as an information carrier due to their capability of propagation for long distances. In addition to this, valuable information about the health of the structure could be extracted simultaneously. In this work, the reliable frequency bandwidth for communication is extracted experimentally from dispersion curves at first. Then, an experimental platform for wireless communication using Lamb waves is described and built. After this, coherent demodulation algorithm used in telecommunications is tested for Amplitude Shift Keying, On-Off Keying and Binary Phase Shift Keying modulation techniques. Signal processing parameters such as threshold choice, number of cycles per bit and Bit Rate are optimized. Experimental results are compared based on the average bit error percentage. Results has shown high sensitivity to threshold selection for Amplitude Shift Keying and On-Off Keying techniques resulting a Bit Rate decrease. Binary Phase Shift Keying technique shows the highest stability and data rate between all tested modulation techniques.
Keywords: Lamb Wave Communication, wireless communication, coherent demodulation, bit error percentage.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 561916 Adapting Tools for Text Monitoring and for Scenario Analysis Related to the Field of Social Disasters
Authors: Svetlana Cojocaru, Mircea Petic, Inga Titchiev
Abstract:
Humanity faces more and more often with different social disasters, which in turn can generate new accidents and catastrophes. To mitigate their consequences, it is important to obtain early possible signals about the events which are or can occur and to prepare the corresponding scenarios that could be applied. Our research is focused on solving two problems in this domain: identifying signals related that an accident occurred or may occur and mitigation of some consequences of disasters. To solve the first problem, methods of selecting and processing texts from global network Internet are developed. Information in Romanian is of special interest for us. In order to obtain the mentioned tools, we should follow several steps, divided into preparatory stage and processing stage. Throughout the first stage, we manually collected over 724 news articles and classified them into 10 categories of social disasters. It constitutes more than 150 thousand words. Using this information, a controlled vocabulary of more than 300 keywords was elaborated, that will help in the process of classification and identification of the texts related to the field of social disasters. To solve the second problem, the formalism of Petri net has been used. We deal with the problem of inhabitants’ evacuation in useful time. The analysis methods such as reachability or coverability tree and invariants technique to determine dynamic properties of the modeled systems will be used. To perform a case study of properties of extended evacuation system by adding time, the analysis modules of PIPE such as Generalized Stochastic Petri Nets (GSPN) Analysis, Simulation, State Space Analysis, and Invariant Analysis have been used. These modules helped us to obtain the average number of persons situated in the rooms and the other quantitative properties and characteristics related to its dynamics.Keywords: Lexicon of disasters, modelling, Petri nets, text annotation, social disasters.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1157915 Knowledge Discovery and Data Mining Techniques in Textile Industry
Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler
Abstract:
This paper addresses the issues and technique for textile industry using data mining techniques. Data mining has been applied to the stitching of garments products that were obtained from a textile company. Data mining techniques were applied to the data obtained from the CHAID algorithm, CART algorithm, Regression Analysis and, Artificial Neural Networks. Classification technique based analyses were used while data mining and decision model about the production per person and variables affecting about production were found by this method. In the study, the results show that as the daily working time increases, the production per person also decreases. In addition, the relationship between total daily working and production per person shows a negative result and the production per person show the highest and negative relationship.Keywords: Data mining, textile production, decision trees, classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1539914 Fault Classification of Double Circuit Transmission Line Using Artificial Neural Network
Authors: Anamika Jain, A. S. Thoke, R. N. Patel
Abstract:
This paper addresses the problems encountered by conventional distance relays when protecting double-circuit transmission lines. The problems arise principally as a result of the mutual coupling between the two circuits under different fault conditions; this mutual coupling is highly nonlinear in nature. An adaptive protection scheme is proposed for such lines based on application of artificial neural network (ANN). ANN has the ability to classify the nonlinear relationship between measured signals by identifying different patterns of the associated signals. One of the key points of the present work is that only current signals measured at local end have been used to detect and classify the faults in the double circuit transmission line with double end infeed. The adaptive protection scheme is tested under a specific fault type, but varying fault location, fault resistance, fault inception angle and with remote end infeed. An improved performance is experienced once the neural network is trained adequately, which performs precisely when faced with different system parameters and conditions. The entire test results clearly show that the fault is detected and classified within a quarter cycle; thus the proposed adaptive protection technique is well suited for double circuit transmission line fault detection & classification. Results of performance studies show that the proposed neural network-based module can improve the performance of conventional fault selection algorithms.
Keywords: Double circuit transmission line, Fault detection and classification, High impedance fault and Artificial Neural Network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3187913 DIFFER: A Propositionalization approach for Learning from Structured Data
Authors: Thashmee Karunaratne, Henrik Böstrom
Abstract:
Logic based methods for learning from structured data is limited w.r.t. handling large search spaces, preventing large-sized substructures from being considered by the resulting classifiers. A novel approach to learning from structured data is introduced that employs a structure transformation method, called finger printing, for addressing these limitations. The method, which generates features corresponding to arbitrarily complex substructures, is implemented in a system, called DIFFER. The method is demonstrated to perform comparably to an existing state-of-art method on some benchmark data sets without requiring restrictions on the search space. Furthermore, learning from the union of features generated by finger printing and the previous method outperforms learning from each individual set of features on all benchmark data sets, demonstrating the benefit of developing complementary, rather than competing, methods for structure classification.Keywords: Machine learning, Structure classification, Propositionalization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1223912 Classification System for a Collaborative Urban Retail Logistics
Authors: Volker Lange, Stephanie Moede, Christiane Auffermann
Abstract:
From an economic standpoint the current and future road traffic situation in urban areas is a cost factor. Traffic jams and congestion prolong journey times and tie up resources in trucks and personnel. Many discussions about imposing charges or tolls for cities in Europe in order to reduce traffic congestion are currently in progress. Both of these effects lead – directly or indirectly - to additional costs for the urban distribution systems in retail companies. One approach towards improving the efficiency of retail distribution systems, and thus towards avoiding negative environmental factors in urban areas, is horizontal collaboration for deliveries to retail outlets – Urban Retail Logistics. This paper presents a classification system to help reveal where cooperation between retail companies is possible and makes sense for deliveries to retail outlets in urban areas.
Keywords: City Logistics, Horizontal Collaboration, Urban Freight Transport, Urban Retail Logistics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2358911 The Imaging Methods for Classifying Crispiness of Freeze-Dried Durian using Fuzzy Logic
Authors: Sitthichon Kanitthakun, Pinit Kumhom, Kosin Chamnongthai
Abstract:
In quality control of freeze-dried durian, crispiness is a key quality index of the product. Generally, crispy testing has to be done by a destructive method. A nondestructive testing of the crispiness is required because the samples can be reused for other kinds of testing. This paper proposed a crispiness classification method of freeze-dried durians using fuzzy logic for decision making. The physical changes of a freeze-dried durian include the pores appearing in the images. Three physical features including (1) the diameters of pores, (2) the ratio of the pore area and the remaining area, and (3) the distribution of the pores are considered to contribute to the crispiness. The fuzzy logic is applied for making the decision. The experimental results comparing with food expert opinion showed that the accuracy of the proposed classification method is 83.33 percent.Keywords: Durian, crispiness, freeze drying, pore, fuzzy logic.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1970910 Genetic Algorithms and Kernel Matrix-based Criteria Combined Approach to Perform Feature and Model Selection for Support Vector Machines
Authors: A. Perolini
Abstract:
Feature and model selection are in the center of attention of many researches because of their impact on classifiers- performance. Both selections are usually performed separately but recent developments suggest using a combined GA-SVM approach to perform them simultaneously. This approach improves the performance of the classifier identifying the best subset of variables and the optimal parameters- values. Although GA-SVM is an effective method it is computationally expensive, thus a rough method can be considered. The paper investigates a joined approach of Genetic Algorithm and kernel matrix criteria to perform simultaneously feature and model selection for SVM classification problem. The purpose of this research is to improve the classification performance of SVM through an efficient approach, the Kernel Matrix Genetic Algorithm method (KMGA).Keywords: Feature and model selection, Genetic Algorithms, Support Vector Machines, kernel matrix.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1597909 Improvement of Data Transfer over Simple Object Access Protocol (SOAP)
Authors: Khaled Ahmed Kadouh, Kamal Ali Albashiri
Abstract:
This paper presents a designed algorithm involves improvement of transferring data over Simple Object Access Protocol (SOAP). The aim of this work is to establish whether using SOAP in exchanging XML messages has any added advantages or not. The results showed that XML messages without SOAP take longer time and consume more memory, especially with binary data.
Keywords: JAX-WS, SMTP, SOAP, Web service, XML.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2123908 Determining the Gender of Korean Names for Pronoun Generation
Authors: Seong-Bae Park, Hee-Geun Yoon
Abstract:
It is an important task in Korean-English machine translation to classify the gender of names correctly. When a sentence is composed of two or more clauses and only one subject is given as a proper noun, it is important to find the gender of the proper noun for correct translation of the sentence. This is because a singular pronoun has a gender in English while it does not in Korean. Thus, in Korean-English machine translation, the gender of a proper noun should be determined. More generally, this task can be expanded into the classification of the general Korean names. This paper proposes a statistical method for this problem. By considering a name as just a sequence of syllables, it is possible to get a statistics for each name from a collection of names. An evaluation of the proposed method yields the improvement in accuracy over the simple looking-up of the collection. While the accuracy of the looking-up method is 64.11%, that of the proposed method is 81.49%. This implies that the proposed method is more plausible for the gender classification of the Korean names.Keywords: machine translation, natural language processing, gender of proper nouns, statistical method
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2368907 Development of Better Quality Low-Cost Activated Carbon from South African Pine Tree (Pinus patula) Sawdust: Characterization and Comparative Phenol Adsorption
Authors: L. Mukosha, M. S. Onyango, A. Ochieng, H. Kasaini
Abstract:
The remediation of water resources pollution in developing countries requires the application of alternative sustainable cheaper and efficient end-of-pipe wastewater treatment technologies. The feasibility of use of South African cheap and abundant pine tree (Pinus patula) sawdust for development of lowcost AC of comparable quality to expensive commercial ACs in the abatement of water pollution was investigated. AC was developed at optimized two-stage N2-superheated steam activation conditions in a fixed bed reactor, and characterized for proximate and ultimate properties, N2-BET surface area, pore size distribution, SEM, pHPZC and FTIR. The sawdust pyrolysis activation energy was evaluated by TGA. Results indicated that the chars prepared at 800oC and 2hrs were suitable for development of better quality AC at 800oC and 47% burn-off having BET surface area (1086m2/g), micropore volume (0.26cm3/g), and mesopore volume (0.43cm3/g) comparable to expensive commercial ACs, and suitable for water contaminants removal. The developed AC showed basic surface functionality at pHPZC at 10.3, and a phenol adsorption capacity that was higher than that of commercial Norit (RO 0.8) AC. Thus, it is feasible to develop better quality low-cost AC from (Pinus patula) sawdust using twostage N2-steam activation in fixed-bed reactor.
Keywords: Activated carbon, phenol adsorption, sawdust integrated utilization, economical wastewater treatment.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3470906 Integrated ACOR/IACOMV-R-SVM Algorithm
Authors: Hiba Basim Alwan, Ku Ruhana Ku-Mahamud
Abstract:
A direction for ACO is to optimize continuous and mixed (discrete and continuous) variables in solving problems with various types of data. Support Vector Machine (SVM), which originates from the statistical approach, is a present day classification technique. The main problems of SVM are selecting feature subset and tuning the parameters. Discretizing the continuous value of the parameters is the most common approach in tuning SVM parameters. This process will result in loss of information which affects the classification accuracy. This paper presents two algorithms that can simultaneously tune SVM parameters and select the feature subset. The first algorithm, ACOR-SVM, will tune SVM parameters, while the second IACOMV-R-SVM algorithm will simultaneously tune SVM parameters and select the feature subset. Three benchmark UCI datasets were used in the experiments to validate the performance of the proposed algorithms. The results show that the proposed algorithms have good performances as compared to other approaches.Keywords: Continuous ant colony optimization, incremental continuous ant colony, simultaneous optimization, support vector machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 880905 Level of Acceptability of Moringa oleifera Diversified Products among Rural and Urban Dwellers in Nigeria
Authors: Mojisola F. Oyewole, Franscisca T. Adetoro, Nkiru T. Meludu
Abstract:
Moringa oleifera is a nutritious vegetable tree with varieties of potential uses, as almost every part of the Moringa oleifera tree can be used for food. This study was conducted in Oyo State, Nigeria, to find out the level of acceptability of Moringa oleifera diversified products among rural and urban dwellers. Purposive sampling was used to select two local governments’ areas. Stratified sampling technique was also used to select one community each from rural and urban areas while snowball sampling technique was used to select ten respondents each from the two communities, making a total number of forty respondents. Data were analyzed using frequencies, percentages, Chi-square, Pearson Product Moment Correlation and regression analysis. Result from the study revealed that majority of the respondents (80%) fell within the age range of 20-49 years and 55% of them were male, 55% were married, 70% of them were Christians, 80% of them had tertiary education. The result also showed that 85% were aware of the Moringa plant and (65%) of them have consumed Moringa oleifera and the perception statements on the benefits of Moringa oleifera indicated that (52.5%) of the respondents rated Moringa oleifera to be favorable, most of them had high acceptability for Moringa egusi soup, Moringa tea, Moringa pap and yam pottage with Moringa. The result of the hypotheses testing showed that there is a significant relationship between sex of the respondents and acceptability of the diversified Moringa oleifera products (x2=6.465, p = 0.011). There is also a significant relationship between family size of the respondents level of acceptability of the Moringa oleifera products (r = 0.327, p = 0.040). Based on the level of acceptability of Moringa oleifera diversified products; the plant is of great economic importance to the populace. Therefore, there should be more public awareness through the media to enlighten people on the beneficial effects of Moringa oleifera.Keywords: Acceptability, Moringa oleifera, Diversified, Product, Dwellers.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2612904 Analysis of the EEG Signal for a Practical Biometric System
Authors: Muhammad Kamil Abdullah, Khazaimatol S Subari, Justin Leo Cheang Loong, Nurul Nadia Ahmad
Abstract:
This paper discusses the effectiveness of the EEG signal for human identification using four or less of channels of two different types of EEG recordings. Studies have shown that the EEG signal has biometric potential because signal varies from person to person and impossible to replicate and steal. Data were collected from 10 male subjects while resting with eyes open and eyes closed in 5 separate sessions conducted over a course of two weeks. Features were extracted using the wavelet packet decomposition and analyzed to obtain the feature vectors. Subsequently, the neural networks algorithm was used to classify the feature vectors. Results show that, whether or not the subjects- eyes were open are insignificant for a 4– channel biometrics system with a classification rate of 81%. However, for a 2–channel system, the P4 channel should not be included if data is acquired with the subjects- eyes open. It was observed that for 2– channel system using only the C3 and C4 channels, a classification rate of 71% was achieved.Keywords: Biometric, EEG, Wavelet Packet Decomposition, NeuralNetworks
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3028903 Optimal Multilayer Perceptron Structure For Classification of HIV Sub-Type Viruses
Authors: Zeyneb Kurt, Oguzhan Yavuz
Abstract:
The feature of HIV genome is in a wide range because of it is highly heterogeneous. Hence, the infection ability of the virus changes related with different chemokine receptors. From this point, R5 and X4 HIV viruses use CCR5 and CXCR5 coreceptors respectively while R5X4 viruses can utilize both coreceptors. Recently, in Bioinformatics, R5X4 viruses have been studied to classify by using the coreceptors of HIV genome. The aim of this study is to develop the optimal Multilayer Perceptron (MLP) for high classification accuracy of HIV sub-type viruses. To accomplish this purpose, the unit number in hidden layer was incremented one by one, from one to a particular number. The statistical data of R5X4, R5 and X4 viruses was preprocessed by the signal processing methods. Accessible residues of these virus sequences were extracted and modeled by Auto-Regressive Model (AR) due to the dimension of residues is large and different from each other. Finally the pre-processed dataset was used to evolve MLP with various number of hidden units to determine R5X4 viruses. Furthermore, ROC analysis was used to figure out the optimal MLP structure.Keywords: Multilayer Perceptron, Auto-Regressive Model, HIV, ROC Analysis
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1440902 Fake Account Detection in Twitter Based on Minimum Weighted Feature set
Authors: Ahmed El Azab, Amira M. Idrees, Mahmoud A. Mahmoud, Hesham Hefny
Abstract:
Social networking sites such as Twitter and Facebook attracts over 500 million users across the world, for those users, their social life, even their practical life, has become interrelated. Their interaction with social networking has affected their life forever. Accordingly, social networking sites have become among the main channels that are responsible for vast dissemination of different kinds of information during real time events. This popularity in Social networking has led to different problems including the possibility of exposing incorrect information to their users through fake accounts which results to the spread of malicious content during life events. This situation can result to a huge damage in the real world to the society in general including citizens, business entities, and others. In this paper, we present a classification method for detecting the fake accounts on Twitter. The study determines the minimized set of the main factors that influence the detection of the fake accounts on Twitter, and then the determined factors are applied using different classification techniques. A comparison of the results of these techniques has been performed and the most accurate algorithm is selected according to the accuracy of the results. The study has been compared with different recent researches in the same area; this comparison has proved the accuracy of the proposed study. We claim that this study can be continuously applied on Twitter social network to automatically detect the fake accounts; moreover, the study can be applied on different social network sites such as Facebook with minor changes according to the nature of the social network which are discussed in this paper.Keywords: Fake accounts detection, classification algorithms, twitter accounts analysis, features based techniques.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5838901 Evaluating some Feature Selection Methods for an Improved SVM Classifier
Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of features selection methods to reduce the dimensionality of the document-representation vector. Four feature selection methods are evaluated: Random Selection, Information Gain (IG), Support Vector Machine (called SVM_FS) and Genetic Algorithm with SVM (GA_FS). We showed that the best results were obtained with SVM_FS and GA_FS methods for a relatively small dimension of the features vector comparative with the IG method that involves longer vectors, for quite similar classification accuracies. Also we present a novel method to better correlate SVM kernel-s parameters (Polynomial or Gaussian kernel).
Keywords: Features selection, learning with kernels, support vector machine, genetic algorithms and classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1539900 Real-Time Testing of Steel Strip Welds based on Bayesian Decision Theory
Authors: Julio Molleda, Daniel F. García, Juan C. Granda, Francisco J. Suárez
Abstract:
One of the main trouble in a steel strip manufacturing line is the breakage of whatever weld carried out between steel coils, that are used to produce the continuous strip to be processed. A weld breakage results in a several hours stop of the manufacturing line. In this process the damages caused by the breakage must be repaired. After the reparation and in order to go on with the production it will be necessary a restarting process of the line. For minimizing this problem, a human operator must inspect visually and manually each weld in order to avoid its breakage during the manufacturing process. The work presented in this paper is based on the Bayesian decision theory and it presents an approach to detect, on real-time, steel strip defective welds. This approach is based on quantifying the tradeoffs between various classification decisions using probability and the costs that accompany such decisions.Keywords: Classification, Pattern Recognition, ProbabilisticReasoning, Statistical Data Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1411899 A Study on Finding Similar Document with Multiple Categories
Authors: R. Saraçoğlu, N. Allahverdi
Abstract:
Searching similar documents and document management subjects have important place in text mining. One of the most important parts of similar document research studies is the process of classifying or clustering the documents. In this study, a similar document search approach that includes discussion of out the case of belonging to multiple categories (multiple categories problem) has been carried. The proposed method that based on Fuzzy Similarity Classification (FSC) has been compared with Rocchio algorithm and naive Bayes method which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, multiple categories vector method based on information of categories regarding to frequency of being seen together has been used. Empirical results show that achievement is increased almost two times, when proposed method is compared with classical approach.
Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1707898 Automatic Classification of the Stand-to-Sit Phase in the TUG Test Using Machine Learning
Authors: Y. A. Adla, R. Soubra, M. Kasab, M. O. Diab, A. Chkeir
Abstract:
Over the past several years, researchers have shown a great interest in assessing the mobility of elderly people to measure their functional status. Usually, such an assessment is done by conducting tests that require the subject to walk a certain distance, turn around, and finally sit back down. Consequently, this study aims to provide an at home monitoring system to assess the patient’s status continuously. Thus, we proposed a technique to automatically detect when a subject sits down while walking at home. In this study, we utilized a Doppler radar system to capture the motion of the subjects. More than 20 features were extracted from the radar signals out of which 11 were chosen based on their Intraclass Correlation Coefficient (ICC > 0.75). Accordingly, the sequential floating forward selection wrapper was applied to further narrow down the final feature vector. Finally, five features were introduced to the Linear Discriminant Analysis classifier and an accuracy of 93.75% was achieved as well as a precision and recall of 95% and 90% respectively.
Keywords: Doppler radar system, stand-to-sit phase, TUG test, machine learning, classification
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 454897 Differential Protection for Power Transformer Using Wavelet Transform and PNN
Authors: S. Sendilkumar, B. L. Mathur, Joseph Henry
Abstract:
A new approach for protection of power transformer is presented using a time-frequency transform known as Wavelet transform. Different operating conditions such as inrush, Normal, load, External fault and internal fault current are sampled and processed to obtain wavelet coefficients. Different Operating conditions provide variation in wavelet coefficients. Features like energy and Standard deviation are calculated using Parsevals theorem. These features are used as inputs to PNN (Probabilistic neural network) for fault classification. The proposed algorithm provides more accurate results even in the presence of noise inputs and accurately identifies inrush and fault currents. Overall classification accuracy of the proposed method is found to be 96.45%. Simulation of the fault (with and without noise) was done using MATLAB AND SIMULINK software taking 2 cycles of data window (40 m sec) containing 800 samples. The algorithm was evaluated by using 10 % Gaussian white noise.Keywords: Power Transformer, differential Protection, internalfault, inrush current, Wavelet Energy, Db9.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3131896 Classification of Initial Stripe Height Patterns using Radial Basis Function Neural Network for Proportional Gain Prediction
Authors: Prasit Wonglersak, Prakarnkiat Youngkong, Ittipon Cheowanish
Abstract:
This paper aims to improve a fine lapping process of hard disk drive (HDD) lapping machines by removing materials from each slider together with controlling the strip height (SH) variation to minimum value. The standard deviation is the key parameter to evaluate the strip height variation, hence it is minimized. In this paper, a design of experiment (DOE) with factorial analysis by twoway analysis of variance (ANOVA) is adopted to obtain a statistically information. The statistics results reveal that initial stripe height patterns affect the final SH variation. Therefore, initial SH classification using a radial basis function neural network is implemented to achieve the proportional gain prediction.Keywords: Stripe height variation, Two-way analysis ofvariance (ANOVA), Radial basis function neural network, Proportional gain prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1647895 Discrimination of Seismic Signals Using Artificial Neural Networks
Authors: Mohammed Benbrahim, Adil Daoudi, Khalid Benjelloun, Aomar Ibenbrahim
Abstract:
The automatic discrimination of seismic signals is an important practical goal for earth-science observatories due to the large amount of information that they receive continuously. An essential discrimination task is to allocate the incoming signal to a group associated with the kind of physical phenomena producing it. In this paper, two classes of seismic signals recorded routinely in geophysical laboratory of the National Center for Scientific and Technical Research in Morocco are considered. They correspond to signals associated to local earthquakes and chemical explosions. The approach adopted for the development of an automatic discrimination system is a modular system composed by three blocs: 1) Representation, 2) Dimensionality reduction and 3) Classification. The originality of our work consists in the use of a new wavelet called "modified Mexican hat wavelet" in the representation stage. For the dimensionality reduction, we propose a new algorithm based on the random projection and the principal component analysis.Keywords: Seismic signals, Wavelets, Dimensionality reduction, Artificial neural networks, Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1634894 Aliveness Detection of Fingerprints using Multiple Static Features
Authors: Heeseung Choi, Raechoong Kang, Kyungtaek Choi, Jaihie Kim
Abstract:
Fake finger submission attack is a major problem in fingerprint recognition systems. In this paper, we introduce an aliveness detection method based on multiple static features, which derived from a single fingerprint image. The static features are comprised of individual pore spacing, residual noise and several first order statistics. Specifically, correlation filter is adopted to address individual pore spacing. The multiple static features are useful to reflect the physiological and statistical characteristics of live and fake fingerprint. The classification can be made by calculating the liveness scores from each feature and fusing the scores through a classifier. In our dataset, we compare nine classifiers and the best classification rate at 85% is attained by using a Reduced Multivariate Polynomial classifier. Our approach is faster and more convenient for aliveness check for field applications.Keywords: Aliveness detection, Fingerprint recognition, individual pore spacing, multiple static features, residual noise.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1926893 Profit and Nonprofit Sports Clubs: Financial and Organizational Comparison in Poland
Authors: Wojciech B. Cieśliński, Igor Perechuda
Abstract:
The paper identifies the features of Polish sports clubs in the particular organizational forms: profit and nonprofit. Identification and description of these features is carried out in terms of financial efficiency of the given organizational form. Under the terms of the efficiency the research allows you to specify the advantages of particular organizational sports club form and the following limitations. Paper considers features of sports clubs in range of Polish conditions as legal regulations. The sources of the functioning efficiency of sports clubs may lie in the organizational forms in which they operate. Each of the available forms can be considered either a for-profit or nonprofit enterprise. Depending on this classification there are different capabilities of increasing organizational and financial efficiency of a given sports club. Authors start with general classification and difference between for-profit and non-profit sport clubs. Next identifies specific financial and organizational conditions of both organizational form and then show examples of mixed activity forms and their efficiency effect.Keywords: Financial efficiency, for-profit, non-profit, sports club.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2039892 A World Map of Seabed Sediment Based on 50 Years of Knowledge
Authors: T. Garlan, I. Gabelotaud, S. Lucas, E. Marchès
Abstract:
Production of a global sedimentological seabed map has been initiated in 1995 to provide the necessary tool for searches of aircraft and boats lost at sea, to give sedimentary information for nautical charts, and to provide input data for acoustic propagation modelling. This original approach had already been initiated one century ago when the French hydrographic service and the University of Nancy had produced maps of the distribution of marine sediments of the French coasts and then sediment maps of the continental shelves of Europe and North America. The current map of the sediment of oceans presented was initiated with a UNESCO's general map of the deep ocean floor. This map was adapted using a unique sediment classification to present all types of sediments: from beaches to the deep seabed and from glacial deposits to tropical sediments. In order to allow good visualization and to be adapted to the different applications, only the granularity of sediments is represented. The published seabed maps are studied, if they present an interest, the nature of the seabed is extracted from them, the sediment classification is transcribed and the resulted map is integrated in the world map. Data come also from interpretations of Multibeam Echo Sounder (MES) imagery of large hydrographic surveys of deep-ocean. These allow a very high-quality mapping of areas that until then were represented as homogeneous. The third and principal source of data comes from the integration of regional maps produced specifically for this project. These regional maps are carried out using all the bathymetric and sedimentary data of a region. This step makes it possible to produce a regional synthesis map, with the realization of generalizations in the case of over-precise data. 86 regional maps of the Atlantic Ocean, the Mediterranean Sea, and the Indian Ocean have been produced and integrated into the world sedimentary map. This work is permanent and permits a digital version every two years, with the integration of some new maps. This article describes the choices made in terms of sediment classification, the scale of source data and the zonation of the variability of the quality. This map is the final step in a system comprising the Shom Sedimentary Database, enriched by more than one million punctual and surface items of data, and four series of coastal seabed maps at 1:10,000, 1:50,000, 1:200,000 and 1:1,000,000. This step by step approach makes it possible to take into account the progresses in knowledge made in the field of seabed characterization during the last decades. Thus, the arrival of new classification systems for seafloor has improved the recent seabed maps, and the compilation of these new maps with those previously published allows a gradual enrichment of the world sedimentary map. But there is still a lot of work to enhance some regions, which are still based on data acquired more than half a century ago.
Keywords: Marine sedimentology, seabed map, sediment classification, World Ocean.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1039