Search results for: automatic classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1648

Search results for: automatic classification

1108 Product Features Extraction from Opinions According to Time

Authors: Kamal Amarouche, Houda Benbrahim, Ismail Kassou

Abstract:

Nowadays, e-commerce shopping websites have experienced noticeable growth. These websites have gained consumers’ trust. After purchasing a product, many consumers share comments where opinions are usually embedded about the given product. Research on the automatic management of opinions that gives suggestions to potential consumers and portrays an image of the product to manufactures has been growing recently. After launching the product in the market, the reviews generated around it do not usually contain helpful information or generic opinions about this product (e.g. telephone: great phone...); in the sense that the product is still in the launching phase in the market. Within time, the product becomes old. Therefore, consumers perceive the advantages/ disadvantages about each specific product feature. Therefore, they will generate comments that contain their sentiments about these features. In this paper, we present an unsupervised method to extract different product features hidden in the opinions which influence its purchase, and that combines Time Weighting (TW) which depends on the time opinions were expressed with Term Frequency-Inverse Document Frequency (TF-IDF). We conduct several experiments using two different datasets about cell phones and hotels. The results show the effectiveness of our automatic feature extraction, as well as its domain independent characteristic.

Keywords: Opinion mining, product feature extraction, sentiment analysis, SentiWordNet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1304
1107 A Methodology for Characterising the Tail Behaviour of a Distribution

Authors: Serge Provost, Yishan Zang

Abstract:

Following a review of various approaches that are utilized for classifying the tail behavior of a distribution, an easily implementable methodology that relies on an arctangent transformation is presented. The classification criterion is actually based on the difference between two specific quantiles of the transformed distribution. The resulting categories enable one to classify distributional tails as distinctly short, short, nearly medium, medium, extended medium and somewhat long, providing that at least two moments exist. Distributions possessing a single moment are said to be long tailed while those failing to have any finite moments are classified as having an extremely long tail. Several illustrative examples will be presented.

Keywords: Arctangent transformation, change of variables, heavy-tailed distributions, tail classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 689
1106 Automatic Thresholding for Data Gap Detection for a Set of Sensors in Instrumented Buildings

Authors: Houda Najeh, Stéphane Ploix, Mahendra Pratap Singh, Karim Chabir, Mohamed Naceur Abdelkrim

Abstract:

Building systems are highly vulnerable to different kinds of faults and failures. In fact, various faults, failures and human behaviors could affect the building performance. This paper tackles the detection of unreliable sensors in buildings. Different literature surveys on diagnosis techniques for sensor grids in buildings have been published but all of them treat only bias and outliers. Occurences of data gaps have also not been given an adequate span of attention in the academia. The proposed methodology comprises the automatic thresholding for data gap detection for a set of heterogeneous sensors in instrumented buildings. Sensor measurements are considered to be regular time series. However, in reality, sensor values are not uniformly sampled. So, the issue to solve is from which delay each sensor become faulty? The use of time series is required for detection of abnormalities on the delays. The efficiency of the method is evaluated on measurements obtained from a real power plant: an office at Grenoble Institute of technology equipped by 30 sensors.

Keywords: Building system, time series, diagnosis, outliers, delay, data gap.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 906
1105 Effective Sonar Target Classification via Parallel Structure of Minimal Resource Allocation Network

Authors: W.S. Lim, M.V.C. Rao

Abstract:

In this paper, the processing of sonar signals has been carried out using Minimal Resource Allocation Network (MRAN) and a Probabilistic Neural Network (PNN) in differentiation of commonly encountered features in indoor environments. The stability-plasticity behaviors of both networks have been investigated. The experimental result shows that MRAN possesses lower network complexity but experiences higher plasticity than PNN. An enhanced version called parallel MRAN (pMRAN) is proposed to solve this problem and is proven to be stable in prediction and also outperformed the original MRAN.

Keywords: Ultrasonic sensing, target classification, minimalresource allocation network (MRAN), probabilistic neural network(PNN), stability-plasticity dilemma.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1598
1104 A Model to Determine Atmospheric Stability and its Correlation with CO Concentration

Authors: Kh. Ashrafi, Gh. A. Hoshyaripour

Abstract:

Atmospheric stability plays the most important role in the transport and dispersion of air pollutants. Different methods are used for stability determination with varying degrees of complexity. Most of these methods are based on the relative magnitude of convective and mechanical turbulence in atmospheric motions. Richardson number, Monin-Obukhov length, Pasquill-Gifford stability classification and Pasquill–Turner stability classification, are the most common parameters and methods. The Pasquill–Turner Method (PTM), which is employed in this study, makes use of observations of wind speed, insolation and the time of day to classify atmospheric stability with distinguishable indices. In this study, a model is presented to determination of atmospheric stability conditions using PTM. As a case study, meteorological data of Mehrabad station in Tehran from 2000 to 2005 is applied to model. Here, three different categories are considered to deduce the pattern of stability conditions. First, the total pattern of stability classification is obtained and results show that atmosphere is 38.77%, 27.26%, 33.97%, at stable, neutral and unstable condition, respectively. It is also observed that days are mostly unstable (66.50%) while nights are mostly stable (72.55%). Second, monthly and seasonal patterns are derived and results indicate that relative frequency of stable conditions decrease during January to June and increase during June to December, while results for unstable conditions are exactly in opposite manner. Autumn is the most stable season with relative frequency of 50.69% for stable condition, whilst, it is 42.79%, 34.38% and 27.08% for winter, summer and spring, respectively. Hourly stability pattern is the third category that points out that unstable condition is dominant from approximately 03-15 GTM and 04-12 GTM for warm and cold seasons, respectively. Finally, correlation between atmospheric stability and CO concentration is achieved.

Keywords: Atmospheric stability, Pasquill-Turner classification, convective turbulence, mechanical turbulence, Tehran.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6457
1103 Automatic Sleep Stage Scoring with Wavelet Packets Based on Single EEG Recording

Authors: Luay A. Fraiwan, Natheer Y. Khaswaneh, Khaldon Y. Lweesy

Abstract:

Sleep stage scoring is the process of classifying the stage of the sleep in which the subject is in. Sleep is classified into two states based on the constellation of physiological parameters. The two states are the non-rapid eye movement (NREM) and the rapid eye movement (REM). The NREM sleep is also classified into four stages (1-4). These states and the state wakefulness are distinguished from each other based on the brain activity. In this work, a classification method for automated sleep stage scoring based on a single EEG recording using wavelet packet decomposition was implemented. Thirty two ploysomnographic recording from the MIT-BIH database were used for training and validation of the proposed method. A single EEG recording was extracted and smoothed using Savitzky-Golay filter. Wavelet packets decomposition up to the fourth level based on 20th order Daubechies filter was used to extract features from the EEG signal. A features vector of 54 features was formed. It was reduced to a size of 25 using the gain ratio method and fed into a classifier of regression trees. The regression trees were trained using 67% of the records available. The records for training were selected based on cross validation of the records. The remaining of the records was used for testing the classifier. The overall correct rate of the proposed method was found to be around 75%, which is acceptable compared to the techniques in the literature.

Keywords: Features selection, regression trees, sleep stagescoring, wavelet packets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2331
1102 A Human Activity Recognition System Based On Sensory Data Related to Object Usage

Authors: M. Abdullah-Al-Wadud

Abstract:

Sensor-based Activity Recognition systems usually accounts which sensors have been activated to perform an activity. The system then combines the conditional probabilities of those sensors to represent different activities and takes the decision based on that. However, the information about the sensors which are not activated may also be of great help in deciding which activity has been performed. This paper proposes an approach where the sensory data related to both usage and non-usage of objects are utilized to make the classification of activities. Experimental results also show the promising performance of the proposed method.

Keywords: Naïve Bayesian-based classification, Activity recognition, sensor data, object-usage model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1827
1101 Fuzzy Based Visual Texture Feature for Psoriasis Image Analysis

Authors: G. Murugeswari, A. Suruliandi

Abstract:

This paper proposes a rotational invariant texture feature based on the roughness property of the image for psoriasis image analysis. In this work, we have applied this feature for image classification and segmentation. The fuzzy concept is employed to overcome the imprecision of roughness. Since the psoriasis lesion is modeled by a rough surface, the feature is extended for calculating the Psoriasis Area Severity Index value. For classification and segmentation, the Nearest Neighbor algorithm is applied. We have obtained promising results for identifying affected lesions by using the roughness index and severity level estimation.

Keywords: Fuzzy texture feature, psoriasis, roughness feature, skin disease.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2118
1100 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers namely physical storage, object identification, knowledge discovery, user level. Techniques such as active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, universal model for image retrieval, Bayesian method for classification, parallel algorithms for image segmentation, etc., were employed. Using the feature vector database that have been efficiently constructed, one can perform various data mining tasks like clustering, classification, etc. with efficient algorithms along with image mining given a query image. All these facilities are included in the framework that is supported by state-of-the-art user interface (UI). The algorithms were tested with actual patient data and Coral image database and the results show that their performance is better than the results reported already.

Keywords: Active Contour, Bayesian, Echocardiographic image, Feature vector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1718
1099 Automatic Reusability Appraisal of Software Components using Neuro-fuzzy Approach

Authors: Parvinder S. Sandhu, Hardeep Singh

Abstract:

Automatic reusability appraisal could be helpful in evaluating the quality of developed or developing reusable software components and in identification of reusable components from existing legacy systems; that can save cost of developing the software from scratch. But the issue of how to identify reusable components from existing systems has remained relatively unexplored. In this paper, we have mentioned two-tier approach by studying the structural attributes as well as usability or relevancy of the component to a particular domain. Latent semantic analysis is used for the feature vector representation of various software domains. It exploits the fact that FeatureVector codes can be seen as documents containing terms -the idenifiers present in the components- and so text modeling methods that capture co-occurrence information in low-dimensional spaces can be used. Further, we devised Neuro- Fuzzy hybrid Inference System, which takes structural metric values as input and calculates the reusability of the software component. Decision tree algorithm is used to decide initial set of fuzzy rules for the Neuro-fuzzy system. The results obtained are convincing enough to propose the system for economical identification and retrieval of reusable software components.

Keywords: Clustering, ID3, LSA, Neuro-fuzzy System, SVD

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1664
1098 Opinion Mining Framework in the Education Domain

Authors: A. M. H. Elyasir, K. S. M. Anbananthen

Abstract:

The internet is growing larger and becoming the most popular platform for the people to share their opinion in different interests. We choose the education domain specifically comparing some Malaysian universities against each other. This comparison produces benchmark based on different criteria shared by the online users in various online resources including Twitter, Facebook and web pages. The comparison is accomplished using opinion mining framework to extract, process the unstructured text and classify the result to positive, negative or neutral (polarity). Hence, we divide our framework to three main stages; opinion collection (extraction), unstructured text processing and polarity classification. The extraction stage includes web crawling, HTML parsing, Sentence segmentation for punctuation classification, Part of Speech (POS) tagging, the second stage processes the unstructured text with stemming and stop words removal and finally prepare the raw text for classification using Named Entity Recognition (NER). Last phase is to classify the polarity and present overall result for the comparison among the Malaysian universities. The final result is useful for those who are interested to study in Malaysia, in which our final output declares clear winners based on the public opinions all over the web.

Keywords: Entity Recognition, Education Domain, Opinion Mining, Unstructured Text.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2968
1097 1/Sigma Term Weighting Scheme for Sentiment Analysis

Authors: Hanan Alshaher, Jinsheng Xu

Abstract:

Large amounts of data on the web can provide valuable information. For example, product reviews help business owners measure customer satisfaction. Sentiment analysis classifies texts into two polarities: positive and negative. This paper examines movie reviews and tweets using a new term weighting scheme, called one-over-sigma (1/sigma), on benchmark datasets for sentiment classification. The proposed method aims to improve the performance of sentiment classification. The results show that 1/sigma is more accurate than the popular term weighting schemes. In order to verify if the entropy reflects the discriminating power of terms, we report a comparison of entropy values for different term weighting schemes.

Keywords: Sentiment analysis, term weighting scheme, 1/sigma.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 538
1096 Development and Assessment of Measuring/Rehabilitation Device for Myelopathy Patients with Lower Extremity Function

Authors: Hironobu Murayama, Shohei Shimizu, Masakazu Ohnuki, Hisanori Mihara, Tohru Kanada

Abstract:

Disordered function of maniphalanx and difficulty with ambulation will occur insofar as a human has a failure in the spinal marrow. Cervical spondylotic myelopathy as one of the myelopathy emanates from not only external factors but also increased age. In addition, the diacrisis is difficult since cervical spondylotic myelopathy is evaluated by a doctor-s neurological remark and imaging findings. As a quantitative method for measuring the degree of disability, hand-operated triangle step test (for short, TST) has formulated. In this research, a full automatic triangle step counter apparatus is designed and developed to measure the degree of disability in an accurate fashion according to the principle of TST. The step counter apparatus whose shape is a low triangle pole displays the number of stepping upon each corner. Furthermore, the apparatus has two modes of operation. Namely, one is for measuring the degree of disability and the other for rehabilitation exercise. In terms of usefulness, clinical practice should be executed before too long.

Keywords: Cervical spondylotic myelopathy, disorder of lower limbs, measuringfunction, rehabilitation function, full automatic apparatus, triangle step test.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1523
1095 The Use of Symbolic Signs in Modern Ukrainian Monumental Church Painting: Classification and Hidden Semantics

Authors: Khlystun Yuliia Igorivna

Abstract:

Monumental church paintings are often perceived either as the interior decoration of the temple or as the "Gospel for the illiterate," as the temple painting often contains scenes from Holy Scripture. In science the painting of the Orthodox Church is mainly the subject of study of art critics, but from the point of view of culturology and semiotics, it is insufficiently studied. The symbolism of monumental church painting is insufficiently revealed. The aim of this paper is to give a description of symbolic signs, to classify them, to give examples for each type of sign from the paintings of modern temples of Eastern Ukraine, on the basis of semiotic analysis of iconographic plots used in monumental church painting. We offer own classification of symbols of monumental church painting, using examples from the murals of modern Orthodox churches in Eastern Ukraine, mainly from the Donetsk region. When analyzing the semantics of symbolic signs, the following methods of the culturological approach were used: semiotic, iconological, iconographic, hermeneutic, culturological, descriptive, comparative-historical, visual-analytical. When interpreting the meanings of symbolic signs, scientific, cultural and theological literature were used. Photos taken by the author have been added to the article.

Keywords: Iconography, painting of Orthodox Church, Orthodox Church, semiotic signs in modern iconography, classification of symbols in painting of Orthodox Church.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 404
1094 On Developing an Automatic Speech Recognition System for Standard Arabic Language

Authors: R. Walha, F. Drira, H. El-Abed, A. M. Alimi

Abstract:

The Automatic Speech Recognition (ASR) applied to Arabic language is a challenging task. This is mainly related to the language specificities which make the researchers facing multiple difficulties such as the insufficient linguistic resources and the very limited number of available transcribed Arabic speech corpora. In this paper, we are interested in the development of a HMM-based ASR system for Standard Arabic (SA) language. Our fundamental research goal is to select the most appropriate acoustic parameters describing each audio frame, acoustic models and speech recognition unit. To achieve this purpose, we analyze the effect of varying frame windowing (size and period), acoustic parameter number resulting from features extraction methods traditionally used in ASR, speech recognition unit, Gaussian number per HMM state and number of embedded re-estimations of the Baum-Welch Algorithm. To evaluate the proposed ASR system, a multi-speaker SA connected-digits corpus is collected, transcribed and used throughout all experiments. A further evaluation is conducted on a speaker-independent continue SA speech corpus. The phonemes recognition rate is 94.02% which is relatively high when comparing it with another ASR system evaluated on the same corpus.

Keywords: ASR, HMM, acoustical analysis, acoustic modeling, Standard Arabic language

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1782
1093 Adaptive Pulse Coupled Neural Network Parameters for Image Segmentation

Authors: Thejaswi H. Raya, Vineetha Bettaiah, Heggere S. Ranganath

Abstract:

For over a decade, the Pulse Coupled Neural Network (PCNN) based algorithms have been successfully used in image interpretation applications including image segmentation. There are several versions of the PCNN based image segmentation methods, and the segmentation accuracy of all of them is very sensitive to the values of the network parameters. Most methods treat PCNN parameters like linking coefficient and primary firing threshold as global parameters, and determine them by trial-and-error. The automatic determination of appropriate values for linking coefficient, and primary firing threshold is a challenging problem and deserves further research. This paper presents a method for obtaining global as well as local values for the linking coefficient and the primary firing threshold for neurons directly from the image statistics. Extensive simulation results show that the proposed approach achieves excellent segmentation accuracy comparable to the best accuracy obtainable by trial-and-error for a variety of images.

Keywords: Automatic Selection of PCNN Parameters, Image Segmentation, Neural Networks, Pulse Coupled Neural Network

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2290
1092 File Format of Flow Chart Simulation Software - CFlow

Authors: Syahanim Mohd Salleh, Zaihosnita Hood, Hairulliza Mohd Judi, Marini Abu Bakar

Abstract:

CFlow is a flow chart software, it contains facilities to draw and evaluate a flow chart. A flow chart evaluation applies a simulation method to enable presentation of work flow in a flow chart solution. Flow chart simulation of CFlow is executed by manipulating the CFlow data file which is saved in a graphical vector format. These text-based data are organised by using a data classification technic based on a Library classification-scheme. This paper describes the file format for flow chart simulation software of CFlow.

Keywords: CFlow, flow chart, file format.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2558
1091 Improved Body Mass Index Classification for Football Code Masters Athletes, A Comparison to the Australian National Population

Authors: Joe Walsh, Mike Climstein, Ian Timothy Heazlewood, Stephen Burke, Jyrki Kettunen, Kent Adams, Mark DeBeliso

Abstract:

Thousands of masters athletes participate quadrennially in the World Masters Games (WMG), yet this cohort of athletes remains proportionately under-investigated. Due to a growing global obesity pandemic in context of benefits of physical activity across the lifespan, the prevalence of obesity in this unique population was of particular interest. Data gathered on a sub-sample of 535 football code athletes, aged 31-72 yrs ( =47.4, s =±7.1), competing at the Sydney World Masters Games (2009) demonstrated a significantly (p<0.001), reduced classification of obesity using Body Mass Index (BMI) when compared to data on the Australian national population. This evidence of improved classification in one index of health (BMI<30) implies there are either improved levels of this index of health due to adherence to sport or possibly the reduced BMI is advantageous and contributes to this cohort adhering (or being attracted) to masters sport. Given the worldwide focus on the obesity epidemic and the need for a multi-faceted solution to this problem, demonstration of these middle to older aged adults having improved BMI over the general population is of particular interest.

Keywords: BMI, masters athlete, rugby union, soccer, touch football

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1680
1090 Automatic Translation of Ada-ECATNet Using Rewriting Logic

Authors: N. Boudiaf

Abstract:

One major difficulty that faces developers of concurrent and distributed software is analysis for concurrency based faults like deadlocks. Petri nets are used extensively in the verification of correctness of concurrent programs. ECATNets are a category of algebraic Petri nets based on a sound combination of algebraic abstract types and high-level Petri nets. ECATNets have 'sound' and 'complete' semantics because of their integration in rewriting logic and its programming language Maude. Rewriting logic is considered as one of very powerful logics in terms of description, verification and programming of concurrent systems We proposed previously a method for translating Ada-95 tasking programs to ECATNets formalism (Ada-ECATNet) and we showed that ECATNets formalism provides a more compact translation for Ada programs compared to the other approaches based on simple Petri nets or Colored Petri nets. We showed also previously how the ECATNet formalism offers to Ada many validation and verification tools like simulation, Model Checking, accessibility analysis and static analysis. In this paper, we describe the implementation of our translation of the Ada programs into ECATNets.

Keywords: Ada tasking, Analysis, Automatic Translation, ECATNets, Maude, Rewriting Logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1587
1089 Semi-Automatic Analyzer to Detect Authorial Intentions in Scientific Documents

Authors: Kanso Hassan, Elhore Ali, Soule-dupuy Chantal, Tazi Said

Abstract:

Information Retrieval has the objective of studying models and the realization of systems allowing a user to find the relevant documents adapted to his need of information. The information search is a problem which remains difficult because the difficulty in the representing and to treat the natural languages such as polysemia. Intentional Structures promise to be a new paradigm to extend the existing documents structures and to enhance the different phases of documents process such as creation, editing, search and retrieval. The intention recognition of the author-s of texts can reduce the largeness of this problem. In this article, we present intentions recognition system is based on a semi-automatic method of extraction the intentional information starting from a corpus of text. This system is also able to update the ontology of intentions for the enrichment of the knowledge base containing all possible intentions of a domain. This approach uses the construction of a semi-formal ontology which considered as the conceptualization of the intentional information contained in a text. An experiments on scientific publications in the field of computer science was considered to validate this approach.

Keywords: Information research, text analyzes, intentionalstructure, segmentation, ontology, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1641
1088 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class- Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

The problems arising from unbalanced data sets generally appear in real world applications. Due to unequal class distribution, many researchers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors’ nonparametric discriminant analysis is a method that was proposed for classifying unbalanced classes with good performance. In this study, the methods of discriminant analysis are of interest in investigating misclassification error rates for classimbalanced data of three diabetes risk groups. The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification of class-imbalanced data of diabetes risk groups. Data from a project maintaining healthy conditions for 599 employees of a government hospital in Bangkok were obtained for the classification problem. The employees were divided into three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data including the variables of diabetes risk group, age, gender, blood glucose, and BMI were analyzed and bootstrapped for 50 and 100 samples, 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was explored for the departure of multivariate normality and the equality of covariance matrices of the three risk groups. Both the original data and the bootstrap samples showed nonnormality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors’ discriminant function were performed over 50 and 100 bootstrap samples and applied to the original data. Searching the optimal classification rule, the choices of prior probabilities were set up for both equal proportions (0.33: 0.33: 0.33) and unequal proportions of (0.90:0.05:0.05), (0.80: 0.10: 0.10) and (0.70, 0.15, 0.15). The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach when k=3 or k=4 and the defined prior probabilities of non-risk: risk: diabetic as 0.90: 0.05:0.05 or 0.80:0.10:0.10 gave the smallest error rate of misclassification. The k-nearest neighbors approach would be suggested for classifying a three-class-imbalanced data of diabetes risk groups.

Keywords: Bootstrap, diabetes risk groups, error rate, k-nearest neighbors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2012
1087 Two Class Motor Imagery Classification via Wave Atom Sub-Bants

Authors: Nebi Gedik

Abstract:

The goal of motor image brain computer interface research is to create a link between the central nervous system and a computer or device. The most important signal for brain-computer interface is the electroencephalogram. The aim of this research is to explore a set of effective features from EEG signals, separated into frequency bands, using wave atom sub-bands to discriminate right and left-hand motor imagery signals. Over the transform coefficients, feature vectors are constructed for each frequency range and each transform sub-band, and their classification performances are tested. The method is validated using EEG signals from the BCI competition III dataset IIIa and classifiers such as support vector machine and k-nearest neighbors.

Keywords: motor imagery, EEG, Wave atom transform sub-bands, SVM, k-NN

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 605
1086 Traffic Density Measurement by Automatic Detection of Vehicles Using Gradient Vectors from Aerial Images

Authors: Saman Ghaffarian, Ilgın Gökasar

Abstract:

This paper presents a new automatic vehicle detection method from very high resolution aerial images to measure traffic density. The proposed method starts by extracting road regions from image using road vector data. Then, the road image is divided into equal sections considering resolution of the images. Gradient vectors of the road image are computed from edge map of the corresponding image. Gradient vectors on the each boundary of the sections are divided where the gradient vectors significantly change their directions. Finally, number of vehicles in each section is carried out by calculating the standard deviation of the gradient vectors in each group and accepting the group as vehicle that has standard deviation above predefined threshold value. The proposed method was tested in four very high resolution aerial images acquired from Istanbul, Turkey which illustrate roads and vehicles with diverse characteristics. The results show the reliability of the proposed method in detecting vehicles by producing 86% overall F1 accuracy value.

Keywords: Aerial images, intelligent transportation systems, traffic density measurement, vehicle detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2937
1085 Simulation Data Management Approach for Developing Adaptronic Systems – The W-Model Methodology

Authors: Roland S. Nattermann, Reiner Anderl

Abstract:

Existing proceeding-models for the development of mechatronic systems provide a largely parallel action in the detailed development. This parallel approach is to take place also largely independent of one another in the various disciplines involved. An approach for a new proceeding-model provides a further development of existing models to use for the development of Adaptronic Systems. This approach is based on an intermediate integration and an abstract modeling of the adaptronic system. Based on this system-model a simulation of the global system behavior, due to external and internal factors or Forces is developed. For the intermediate integration a special data management system is used. According to the presented approach this data management system has a number of functions that are not part of the "normal" PDM functionality. Therefore a concept for a new data management system for the development of Adaptive system is presented in this paper. This concept divides the functions into six layers. In the first layer a system model is created, which divides the adaptronic system based on its components and the various technical disciplines. Moreover, the parameters and properties of the system are modeled and linked together with the requirements and the system model. The modeled parameters and properties result in a network which is analyzed in the second layer. From this analysis necessary adjustments to individual components for specific manipulation of the system behavior can be determined. The third layer contains an automatic abstract simulation of the system behavior. This simulation is a precursor for network analysis and serves as a filter. By the network analysis and simulation changes to system components are examined and necessary adjustments to other components are calculated. The other layers of the concept treat the automatic calculation of system reliability, the "normal" PDM-functionality and the integration of discipline-specific data into the system model. A prototypical implementation of an appropriate data management with the addition of an automatic system development is being implemented using the data management system ENOVIA SmarTeam V5 and the simulation system MATLAB.

Keywords: Adaptronic, Data-Management, LOEWE-CentreAdRIA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2371
1084 Prediction of Writer Using Tamil Handwritten Document Image Based on Pooled Features

Authors: T. Thendral, M. S. Vijaya, S. Karpagavalli

Abstract:

Tamil handwritten document is taken as a key source of data to identify the writer. Tamil is a classical language which has 247 characters include compound characters, consonants, vowels and special character. Most characters of Tamil are multifaceted in nature. Handwriting is a unique feature of an individual. Writer may change their handwritings according to their frame of mind and this place a risky challenge in identifying the writer. A new discriminative model with pooled features of handwriting is proposed and implemented using support vector machine. It has been reported on 100% of prediction accuracy by RBF and polynomial kernel based classification model.

Keywords: Classification, Feature extraction, Support vector machine, Training, Writer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2315
1083 Prediction of Writer Using Tamil Handwritten Document Image Based on Pooled Features

Authors: T. Thendral, M. S. Vijaya, S. Karpagavalli

Abstract:

Tamil handwritten document is taken as a key source of data to identify the writer. Tamil is a classical language which has 247 characters include compound characters, consonants, vowels and special character. Most characters of Tamil are multifaceted in nature. Handwriting is a unique feature of an individual. Writer may change their handwritings according to their frame of mind and this place a risky challenge in identifying the writer. A new discriminative model with pooled features of handwriting is proposed and implemented using support vector machine. It has been reported on 100% of prediction accuracy by RBF and polynomial kernel based classification model.

Keywords: Classification, Feature extraction, Support vector machine, Training, Writer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1702
1082 A New Composition Method of Admissible Support Vector Kernel Based on Reproducing Kernel

Authors: Wei Zhang, Xin Zhao, Yi-Fan Zhu, Xin-Jian Zhang

Abstract:

Kernel function, which allows the formulation of nonlinear variants of any algorithm that can be cast in terms of dot products, makes the Support Vector Machines (SVM) have been successfully applied in many fields, e.g. classification and regression. The importance of kernel has motivated many studies on its composition. It-s well-known that reproducing kernel (R.K) is a useful kernel function which possesses many properties, e.g. positive definiteness, reproducing property and composing complex R.K by simple operation. There are two popular ways to compute the R.K with explicit form. One is to construct and solve a specific differential equation with boundary value whose handicap is incapable of obtaining a unified form of R.K. The other is using a piecewise integral of the Green function associated with a differential operator L. The latter benefits the computation of a R.K with a unified explicit form and theoretical analysis, whereas there are relatively later studies and fewer practical computations. In this paper, a new algorithm for computing a R.K is presented. It can obtain the unified explicit form of R.K in general reproducing kernel Hilbert space. It avoids constructing and solving the complex differential equations manually and benefits an automatic, flexible and rigorous computation for more general RKHS. In order to validate that the R.K computed by the algorithm can be used in SVM well, some illustrative examples and a comparison between R.K and Gaussian kernel (RBF) in support vector regression are presented. The result shows that the performance of R.K is close or slightly superior to that of RBF.

Keywords: admissible support vector kernel, reproducing kernel, reproducing kernel Hilbert space, Green function, support vectorregression

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1548
1081 Determining Senses for Word Sense Disambiguation in Turkish

Authors: Zeynep Orhan, Zeynep Altan

Abstract:

Word sense disambiguation is an important intermediate stage for many natural language processing applications. The senses of an ambiguous word are the classification of usages for that specific word. This paper deals with the methodologies of determining the senses for a given word if they can not be obtained from an already available resource like WordNet. We offer a method that helps us to determine the sense boundaries gradually. In this method, first we decide on some features that are thought to be effective on the senses and divide the instances first into two, then according to the results of evaluations we continue dividing instances gradually. In a second method we use the pseudo words. We devise artificial words depending on some criteria and evaluate classification algorithms on these previously classified words.

Keywords: Word sense disambiguation, sense determination, pseudo words, sense granularity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1411
1080 Hybrid Structure Learning Approach for Assessing the Phosphate Laundries Impact

Authors: Emna Benmohamed, Hela Ltifi, Mounir Ben Ayed

Abstract:

Bayesian Network (BN) is one of the most efficient classification methods. It is widely used in several fields (i.e., medical diagnostics, risk analysis, bioinformatics research). The BN is defined as a probabilistic graphical model that represents a formalism for reasoning under uncertainty. This classification method has a high-performance rate in the extraction of new knowledge from data. The construction of this model consists of two phases for structure learning and parameter learning. For solving this problem, the K2 algorithm is one of the representative data-driven algorithms, which is based on score and search approach. In addition, the integration of the expert's knowledge in the structure learning process allows the obtainment of the highest accuracy. In this paper, we propose a hybrid approach combining the improvement of the K2 algorithm called K2 algorithm for Parents and Children search (K2PC) and the expert-driven method for learning the structure of BN. The evaluation of the experimental results, using the well-known benchmarks, proves that our K2PC algorithm has better performance in terms of correct structure detection. The real application of our model shows its efficiency in the analysis of the phosphate laundry effluents' impact on the watershed in the Gafsa area (southwestern Tunisia).

Keywords: Classification, Bayesian network; structure learning, K2 algorithm, expert knowledge, surface water analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 517
1079 Data Transformation Services (DTS): Creating Data Mart by Consolidating Multi-Source Enterprise Operational Data

Authors: J. D. D. Daniel, K. N. Goh, S. M. Yusop

Abstract:

Trends in business intelligence, e-commerce and remote access make it necessary and practical to store data in different ways on multiple systems with different operating systems. As business evolve and grow, they require efficient computerized solution to perform data update and to access data from diverse enterprise business applications. The objective of this paper is to demonstrate the capability of DTS [1] as a database solution for automatic data transfer and update in solving business problem. This DTS package is developed for the sales of variety of plants and eventually expanded into commercial supply and landscaping business. Dimension data modeling is used in DTS package to extract, transform and load data from heterogeneous database systems such as MySQL, Microsoft Access and Oracle that consolidates into a Data Mart residing in SQL Server. Hence, the data transfer from various databases is scheduled to run automatically every quarter of the year to review the efficient sales analysis. Therefore, DTS is absolutely an attractive solution for automatic data transfer and update which meeting today-s business needs.

Keywords: Data Transformation Services (DTS), ObjectLinking and Embedding Database (OLEDB), Data Mart, OnlineAnalytical Processing (OLAP), Online Transactional Processing(OLTP).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2041