Search results for: statistical classifiers.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1353

Search results for: statistical classifiers.

1113 A Method for Modeling Multiple Antenna Channels

Authors: S. Rajabi, M. ArdebiliPoor, M. Shahabadi

Abstract:

In this paper we propose a method for modeling the correlation between the received signals by two or more antennas operating in a multipath environment. Considering the maximum excess delay in the channel being modeled, an elliptical region surrounding both transmitter and receiver antennas is produced. A number of scatterers are randomly distributed in this region and scatter the incoming waves. The amplitude and phase of incoming waves are computed and used to obtain statistical properties of the received signals. This model has the distinguishable advantage of being applicable for any configuration of antennas. Furthermore the common PDF (Probability Distribution Function) of received wave amplitudes for any pair of antennas can be calculated and used to produce statistical parameters of received signals.

Keywords: MIMO (Multiple Input Multiple Output), SIMO (Single Input Multiple Output), GBSBEM (Geometrically Based Single Bounce Elliptical Model).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1422
1112 Analytical and Statistical Study of the Parameters of Expansive Soil

Authors: A. Medjnoun, R. Bahar

Abstract:

The disorders caused by the shrinking-swelling phenomenon are prevalent in arid and semi-arid in the presence of swelling clay. This soil has the characteristic of changing state under the effect of water solicitation (wetting and drying). A set of geotechnical parameters is necessary for the characterization of this soil type, such as state parameters, physical and chemical parameters and mechanical parameters. Some of these tests are very long and some are very expensive, hence the use or methods of predictions. The complexity of this phenomenon and the difficulty of its characterization have prompted researchers to use several identification parameters in the prediction of swelling potential. This document is an analytical and statistical study of geotechnical parameters affecting the potential of swelling clays. This work is performing on a database obtained from investigations swelling Algerian soil. The obtained observations have helped us to understand the soil swelling structure and its behavior.

Keywords: Analysis, estimated model, parameter identification, Swelling of clay.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1289
1111 Effect of Heat Treatment on the Portevin-Le Chatelier Effect of Al-2.5%Mg Alloy

Authors: A. Chatterjee, A. Sarkar, N. Gayathri, P. Mukherjee, P. Barat

Abstract:

An experimental study is presented on the effect of microstructural change on the Portevin-Le Chatelier effect behaviour of Al-2.5%Mg alloy. Tensile tests are performed on the as received and heat treated (at 400 ºC for 16 hours) samples for a wide range of strain rates. The serrations observed in the stress-time curve are investigated from statistical analysis point of view. Microstructures of the samples are characterized by optical metallography and X-ray diffraction. It is found that the excess vacancy generated due to heat treatment leads to decrease in the strain rate sensitivity and the increase in the number of stress drop occurrences per unit time during the PLC effect. The microstructural parameters like domain size, dislocation density have no appreciable effect on the PLC effect as far as the statistical behavior of the serrations is considered.

Keywords: Dynamic strain ageing, Heat treatment, Portevin-LeChatelier effect

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2233
1110 Hybrid Machine Learning Approach for Text Categorization

Authors: Nerijus Remeikis, Ignas Skucas, Vida Melninkaite

Abstract:

Text categorization - the assignment of natural language documents to one or more predefined categories based on their semantic content - is an important component in many information organization and management tasks. Performance of neural networks learning is known to be sensitive to the initial weights and architecture. This paper discusses the use multilayer neural network initialization with decision tree classifier for improving text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree from root node until a final leave is used for initialization of multilayer neural network. The experimental evaluation demonstrates this approach provides better classification accuracy with Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with multilayer neural network initialized with traditional random method and decision tree classifiers.

Keywords: Text categorization, decision trees, neural networks, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1806
1109 A Hybrid GMM/SVM System for Text Independent Speaker Identification

Authors: Rafik Djemili, Mouldi Bedda, Hocine Bourouba

Abstract:

This paper proposes a novel approach that combines statistical models and support vector machines. A hybrid scheme which appropriately incorporates the advantages of both the generative and discriminant model paradigms is described and evaluated. Support vector machines (SVMs) are trained to divide the whole speakers' space into small subsets of speakers within a hierarchical tree structure. During testing a speech token is assigned to its corresponding group and evaluation using gaussian mixture models (GMMs) is then processed. Experimental results show that the proposed method can significantly improve the performance of text independent speaker identification task. We report improvements of up to 50% reduction in identification error rate compared to the baseline statistical model.

Keywords: Speaker identification, Gaussian mixture model (GMM), support vector machine (SVM), hybrid GMM/SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2237
1108 Genetic Algorithms and Kernel Matrix-based Criteria Combined Approach to Perform Feature and Model Selection for Support Vector Machines

Authors: A. Perolini

Abstract:

Feature and model selection are in the center of attention of many researches because of their impact on classifiers- performance. Both selections are usually performed separately but recent developments suggest using a combined GA-SVM approach to perform them simultaneously. This approach improves the performance of the classifier identifying the best subset of variables and the optimal parameters- values. Although GA-SVM is an effective method it is computationally expensive, thus a rough method can be considered. The paper investigates a joined approach of Genetic Algorithm and kernel matrix criteria to perform simultaneously feature and model selection for SVM classification problem. The purpose of this research is to improve the classification performance of SVM through an efficient approach, the Kernel Matrix Genetic Algorithm method (KMGA).

Keywords: Feature and model selection, Genetic Algorithms, Support Vector Machines, kernel matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1597
1107 One-Class Support Vector Machines for Protein-Protein Interactions Prediction

Authors: Hany Alashwal, Safaai Deris, Razib M. Othman

Abstract:

Predicting protein-protein interactions represent a key step in understanding proteins functions. This is due to the fact that proteins usually work in context of other proteins and rarely function alone. Machine learning techniques have been applied to predict protein-protein interactions. However, most of these techniques address this problem as a binary classification problem. Although it is easy to get a dataset of interacting proteins as positive examples, there are no experimentally confirmed non-interacting proteins to be considered as negative examples. Therefore, in this paper we solve this problem as a one-class classification problem using one-class support vector machines (SVM). Using only positive examples (interacting protein pairs) in training phase, the one-class SVM achieves accuracy of about 80%. These results imply that protein-protein interaction can be predicted using one-class classifier with comparable accuracy to the binary classifiers that use artificially constructed negative examples.

Keywords: Bioinformatics, Protein-protein interactions, One-Class Support Vector Machines

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1989
1106 Experimental Investigation of On-Body Channel Modelling at 2.45 GHz

Authors: Hasliza A. Rahim, Fareq Malek, Nur A. M. Affendi, Azuwa Ali, Norshafinash Saudin, Latifah Mohamed

Abstract:

This paper presents the experimental investigation of on-body channel fading at 2.45 GHz considering two effects of the user body movement; stationary and mobile. A pair of body-worn antennas was utilized in this measurement campaign. A statistical analysis was performed by comparing the measured on-body path loss to five well-known distributions; lognormal, normal, Nakagami, Weibull and Rayleigh. The results showed that the average path loss of moving arm varied higher than the path loss in sitting position for upper-arm-to-left-chest link, up to 3.5 dB. The analysis also concluded that the Nakagami distribution provided the best fit for most of on-body static link path loss in standing still and sitting position, while the arm movement can be best described by log-normal distribution.

Keywords: On-Body channel communications, fading characteristics, statistical model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1540
1105 Comparing Machine Learning Estimation of Fuel Consumption of Heavy-Duty Vehicles

Authors: Victor Bodell, Lukas Ekstrom, Somayeh Aghanavesi

Abstract:

Fuel consumption (FC) is one of the key factors in determining expenses of operating a heavy-duty vehicle. A customer may therefore request an estimate of the FC of a desired vehicle. The modular design of heavy-duty vehicles allows their construction by specifying the building blocks, such as gear box, engine and chassis type. If the combination of building blocks is unprecedented, it is unfeasible to measure the FC, since this would first r equire the construction of the vehicle. This paper proposes a machine learning approach to predict FC. This study uses around 40,000 vehicles specific and o perational e nvironmental c onditions i nformation, such as road slopes and driver profiles. A ll v ehicles h ave d iesel engines and a mileage of more than 20,000 km. The data is used to investigate the accuracy of machine learning algorithms Linear regression (LR), K-nearest neighbor (KNN) and Artificial n eural n etworks (ANN) in predicting fuel consumption for heavy-duty vehicles. Performance of the algorithms is evaluated by reporting the prediction error on both simulated data and operational measurements. The performance of the algorithms is compared using nested cross-validation and statistical hypothesis testing. The statistical evaluation procedure finds that ANNs have the lowest prediction error compared to LR and KNN in estimating fuel consumption on both simulated and operational data. The models have a mean relative prediction error of 0.3% on simulated data, and 4.2% on operational data.

Keywords: Artificial neural networks, fuel consumption, machine learning, regression, statistical tests.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 829
1104 Evaluating some Feature Selection Methods for an Improved SVM Classifier

Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of features selection methods to reduce the dimensionality of the document-representation vector. Four feature selection methods are evaluated: Random Selection, Information Gain (IG), Support Vector Machine (called SVM_FS) and Genetic Algorithm with SVM (GA_FS). We showed that the best results were obtained with SVM_FS and GA_FS methods for a relatively small dimension of the features vector comparative with the IG method that involves longer vectors, for quite similar classification accuracies. Also we present a novel method to better correlate SVM kernel-s parameters (Polynomial or Gaussian kernel).

Keywords: Features selection, learning with kernels, support vector machine, genetic algorithms and classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1538
1103 What is the Key Element for the Territory's State of Development?

Authors: J. Lonska, V. Boronenko

Abstract:

The result of process of territory-s development is the territory-s state of development (TSoD), which is pointed towards the provision and improvement of people-s life conditions. The authors offer to measure the TSoD according to their own developed model. Using the available statistical data regarding the values of model-s elements, the authors empirically show which element mainly determines the TSoD. The findings of the research showed that the key elements of the TSoD are the “Material welfare of people" and “People-s health". Performing a deeper statistical analysis of correlation between these elements, it turned out that it is not so necessary for a country to be bent on trying to increase the material growth of a territory, because a relatively high index of life expectancy at birth could be ensured also by much more modest material resources. On the other hand, the economical feedback of longer lifespan within countries with lower material performance is also relatively low.

Keywords: Development indices, health, territory's state of development, wealth.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1193
1102 An AK-Chart for the Non-Normal Data

Authors: Chia-Hau Liu, Tai-Yue Wang

Abstract:

Traditional multivariate control charts assume that measurement from manufacturing processes follows a multivariate normal distribution. However, this assumption may not hold or may be difficult to verify because not all the measurement from manufacturing processes are normal distributed in practice. This study develops a new multivariate control chart for monitoring the processes with non-normal data. We propose a mechanism based on integrating the one-class classification method and the adaptive technique. The adaptive technique is used to improve the sensitivity to small shift on one-class classification in statistical process control. In addition, this design provides an easy way to allocate the value of type I error so it is easier to be implemented. Finally, the simulation study and the real data from industry are used to demonstrate the effectiveness of the propose control charts.

Keywords: Multivariate control chart, statistical process control, one-class classification method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2269
1101 Road Accidents Bigdata Mining and Visualization Using Support Vector Machines

Authors: Usha Lokala, Srinivas Nowduri, Prabhakar K. Sharma

Abstract:

Useful information has been extracted from the road accident data in United Kingdom (UK), using data analytics method, for avoiding possible accidents in rural and urban areas. This analysis make use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The entire datasets have been imported from the traffic department of UK with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since data is expected to grow continuously over a period of time, this work primarily proposes a new framework model which can be trained and adapt itself to new data and make accurate predictions. This work also throws some light on use of SVM’s methodology for text classifiers from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of SVMs methodology appropriate for this kind of research work.

Keywords: Road accident, machine learning, support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1129
1100 Apoptosis Inspired Intrusion Detection System

Authors: R. Sridevi, G. Jagajothi

Abstract:

Artificial Immune Systems (AIS), inspired by the human immune system, are algorithms and mechanisms which are self-adaptive and self-learning classifiers capable of recognizing and classifying by learning, long-term memory and association. Unlike other human system inspired techniques like genetic algorithms and neural networks, AIS includes a range of algorithms modeling on different immune mechanism of the body. In this paper, a mechanism of a human immune system based on apoptosis is adopted to build an Intrusion Detection System (IDS) to protect computer networks. Features are selected from network traffic using Fisher Score. Based on the selected features, the record/connection is classified as either an attack or normal traffic by the proposed methodology. Simulation results demonstrates that the proposed AIS based on apoptosis performs better than existing AIS for intrusion detection.

Keywords: Apoptosis, Artificial Immune System (AIS), Fisher Score, KDD dataset, Network intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2191
1099 Predictive Maintenance of Industrial Shredders: Efficient Operation through Real-Time Monitoring Using Statistical Machine Learning

Authors: Federico Pittino, Dominik Holzmann, Krithika Sayar-Chand, Stefan Moser, Sebastian Pliessnig, Thomas Arnold

Abstract:

The shredding of waste materials is a key step in the recycling process towards circular economy. Industrial shredders for waste processing operate in very harsh operating conditions, leading to the need of frequent maintenance of critical components. The maintenance optimization is particularly important also to increase the machine’s efficiency, thereby reducing the operational costs. In this work, a monitoring system has been developed and deployed on an industrial shredder located at a waste recycling plant in Austria. The machine has been monitored for several months and methods for predictive maintenance have been developed for two key components: the cutting knives and the drive belt. The large amount of collected data is leveraged by statistical machine learning techniques, thereby not requiring a very detailed knowledge of the machine or its live operating conditions. The results show that, despite the wide range of operating conditions, a reliable estimate of the optimal time for maintenance can be derived. Moreover, the trade-off between the cost of maintenance and the increase in power consumption due to the wear state of the monitored components of the machine is investigated. This work proves the benefits of real-time monitoring system for efficient operation of industrial shredders.

Keywords: predictive maintenance, circular economy, industrial shredder, cost optimization, statistical machine learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 640
1098 Time-Domain Analysis of Pulse Parameters Effects on Crosstalk (In High Speed Circuits)

Authors: L. Tani, N. El Ouzzani

Abstract:

Crosstalk among interconnects and printed-circuit board (PCB) traces is a major limiting factor of signal quality in highspeed digital and communication equipments especially when fast data buses are involved. Such a bus is considered as a planar multiconductor transmission line. This paper will demonstrate how the finite difference time domain (FDTD) method provides an exact solution of the transmission-line equations to analyze the near end and the far end crosstalk. In addition, this study makes it possible to analyze the rise time effect on the near and far end voltages of the victim conductor. The paper also discusses a statistical analysis, based upon a set of several simulations. Such analysis leads to a better understanding of the phenomenon and yields useful information.

Keywords: Multiconductor transmission line, Crosstalk, Finite difference time domain (FDTD), printed-circuit board (PCB), Rise time, Statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1773
1097 Words Reordering based on Statistical Language Model

Authors: Theologos Athanaselis, Stelios Bakamidis, Ioannis Dologlou

Abstract:

There are multiple reasons to expect that detecting the word order errors in a text will be a difficult problem, and detection rates reported in the literature are in fact low. Although grammatical rules constructed by computer linguists improve the performance of grammar checker in word order diagnosis, the repairing task is still very difficult. This paper presents an approach for repairing word order errors in English text by reordering words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of this method concerns the use of an efficient confusion matrix technique for reordering the words. The comparative advantage of this method is that works with a large set of words, and avoids the laborious and costly process of collecting word order errors for creating error patterns.

Keywords: Permutations filtering, Statistical languagemodel N-grams, Word order errors

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1586
1096 Statistical Process Optimization Through Multi-Response Surface Methodology

Authors: S. Raissi, R- Eslami Farsani

Abstract:

In recent years, response surface methodology (RSM) has brought many attentions of many quality engineers in different industries. Most of the published literature on robust design methodology is basically concerned with optimization of a single response or quality characteristic which is often most critical to consumers. For most products, however, quality is multidimensional, so it is common to observe multiple responses in an experimental situation. Through this paper interested person will be familiarize with this methodology via surveying of the most cited technical papers. It is believed that the proposed procedure in this study can resolve a complex parameter design problem with more than two responses. It can be applied to those areas where there are large data sets and a number of responses are to be optimized simultaneously. In addition, the proposed procedure is relatively simple and can be implemented easily by using ready-made standard statistical packages.

Keywords: Multi-Response Surface Methodology (MRSM), Design of Experiments (DOE), Process modeling, Quality improvement; Robust Design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4458
1095 On Speeding Up Support Vector Machines: Proximity Graphs Versus Random Sampling for Pre-Selection Condensation

Authors: Xiaohua Liu, Juan F. Beltran, Nishant Mohanchandra, Godfried T. Toussaint

Abstract:

Support vector machines (SVMs) are considered to be the best machine learning algorithms for minimizing the predictive probability of misclassification. However, their drawback is that for large data sets the computation of the optimal decision boundary is a time consuming function of the size of the training set. Hence several methods have been proposed to speed up the SVM algorithm. Here three methods used to speed up the computation of the SVM classifiers are compared experimentally using a musical genre classification problem. The simplest method pre-selects a random sample of the data before the application of the SVM algorithm. Two additional methods use proximity graphs to pre-select data that are near the decision boundary. One uses k-Nearest Neighbor graphs and the other Relative Neighborhood Graphs to accomplish the task.

Keywords: Machine learning, data mining, support vector machines, proximity graphs, relative-neighborhood graphs, k-nearestneighbor graphs, random sampling, training data condensation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1919
1094 Pattern Recognition of Partial Discharge by Using Simplified Fuzzy ARTMAP

Authors: S. Boonpoke, B. Marungsri

Abstract:

This paper presents the effectiveness of artificial intelligent technique to apply for pattern recognition and classification of Partial Discharge (PD). Characteristics of PD signal for pattern recognition and classification are computed from the relation of the voltage phase angle, the discharge magnitude and the repeated existing of partial discharges by using statistical and fractal methods. The simplified fuzzy ARTMAP (SFAM) is used for pattern recognition and classification as artificial intelligent technique. PDs quantities, 13 parameters from statistical method and fractal method results, are inputted to Simplified Fuzzy ARTMAP to train system for pattern recognition and classification. The results confirm the effectiveness of purpose technique.

Keywords: Partial discharges, PD Pattern recognition, PDClassification, Artificial intelligent, Simplified Fuzzy ARTMAP

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3084
1093 Various Advanced Statistical Analyses of Index Values Extracted from Outdoor Agricultural Workers Motion Data

Authors: Shinji Kawakura, Ryosuke Shibasaki

Abstract:

We have been grouping and developing various kinds of practical, promising sensing applied systems concerning agricultural advancement and technical tradition (guidance). These include advanced devices to secure real-time data related to worker motion, and we analyze by methods of various advanced statistics and human dynamics (e.g. primary component analysis, Ward system based cluster analysis, and mapping). What is more, we have been considering worker daily health and safety issues. Targeted fields are mainly common farms, meadows, and gardens. After then, we observed and discussed time-line style, changing data. And, we made some suggestions. The entire plan makes it possible to improve both the aforementioned applied systems and farms.

Keywords: Advanced statistical analysis, wearable sensing system, tradition of skill, supporting for workers, detecting crisis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1634
1092 Machine Learning-Enabled Classification of Climbing Using Small Data

Authors: Nicholas Milburn, Yu Liang, Dalei Wu

Abstract:

Athlete performance scoring within the climbing domain presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and it was temporally heterogeneous. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross validation strategies. The investigated solutions to the classification problem included light weight machine classifiers KNN and SVM as well as the deep learning with CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.

Keywords: Classification, climbing, data imbalance, data scarcity, machine learning, time sequence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 567
1091 A Statistical Model for the Dynamics of Single Cathode Spot in Vacuum Cylindrical Cathode

Authors: Po-Wen Chen, Jin-Yu Wu, Md. Manirul Ali, Yang Peng, Chen-Te Chang, Der-Jun Jan

Abstract:

Dynamics of cathode spot has become a major part of vacuum arc discharge with its high academic interest and wide application potential. In this article, using a three-dimensional statistical model, we simulate the distribution of the ignition probability of a new cathode spot occurring in different magnetic pressure on old cathode spot surface and at different arcing time. This model for the ignition probability of a new cathode spot was proposed in two typical situations, one by the pure isotropic random walk in the absence of an external magnetic field, other by the retrograde motion in external magnetic field, in parallel with the cathode surface. We mainly focus on developed relationship between the ignition probability density distribution of a new cathode spot and the external magnetic field.

Keywords: Cathode spot, vacuum arc discharge, transverse magnetic field, random walk.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1398
1090 Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering

Authors: Thanh Nguyen, Andrei Doncescu, Pierre Siegel

Abstract:

Classification is an important data mining technique and could be used as data filtering in artificial intelligence. The broad application of classification for all kind of data leads to be used in nearly every field of our modern life. Classification helps us to put together different items according to the feature items decided as interesting and useful. In this paper, we compare two classification methods Naïve Bayes and ADTree use to detect spam e-mail. This choice is motivated by the fact that Naive Bayes algorithm is based on probability calculus while ADTree algorithm is based on decision tree. The parameter settings of the above classifiers use the maximization of true positive rate and minimization of false positive rate. The experiment results present classification accuracy and cost analysis in view of optimal classifier choice for Spam Detection. It is point out the number of attributes to obtain a tradeoff between number of them and the classification accuracy.

Keywords: Classification, data mining, spam filtering, naive Bayes, decision tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1497
1089 A Statistical Study on Young UAE Driver’s Behavior towards Road Safety

Authors: Sadia Afroza, Rakiba Rouf

Abstract:

Road safety and associated behaviors have received significant attention in recent years, reflecting general public concern. This paper portrays a statistical scenario of the young drivers in UAE with emphasis on various concern points of young driver’s behavior and license issuance. Although there are many factors contributing to road accidents, statistically it is evident that age plays a major role in road accidents. Despite ensuring strict road safety laws enforced by the UAE government, there is a staggering correlation among road accidents and young driver’s at UAE. However, private organizations like BMW and RoadSafetyUAE have extended its support on conducting surveys on driver’s behavior with an aim to ensure road safety. Various strategies such as road safety law enforcement, license issuance, adapting new technologies like safety cameras and raising awareness can be implemented to improve the road safety concerns among young drivers.

Keywords: Driving behavior, GLDS, road safety, UAE drivers, young drivers.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1378
1088 The Effectiveness of Ultrasound Treatment on the Germination Stimulation of Barley Seed and its Alpha-Amylase Activity

Authors: M. Yaldagard, S.A. Mortazavi, F. Tabatabaie

Abstract:

In the present study, the effects of ultrasound as emerging technology were investigated on germination stimulation, amount of alpha-amylase activity on dry barley seeds before steeping stage of malting process. All experiments were carried out at 20 KHz on the ultrasonic generator in 3 different ultrasonic intensities (20, 60 and 100% setting from total power of device) and time (5, 10 and 15 min) at constant temperature (30C). For determining the effects of these parameters on enzyme the Fuwa method assay based on the decreased staining value of blue starch–iodine complexes employed for measurement an activity. The results of these assays were analyzed by Qualitek4 software using the Taguchi statistical method to evaluate the factor-s effects on enzyme activity. It has been found that when malting barley is irradiated with an ultrasonic power, a stimulating effect occurs as to the enzyme activity.

Keywords: ultrasound, alpha-amylase activity, stimulationand Taguchi statistical method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4618
1087 Review of Downscaling Methods in Climate Change and Their Role in Hydrological Studies

Authors: Nishi Bhuvandas, P. V. Timbadiya, P. L. Patel, P. D. Porey

Abstract:

Recent perceived climate variability raises concerns with unprecedented hydrological phenomena and extremes. Distribution and circulation of the waters of the Earth become increasingly difficult to determine because of additional uncertainty related to anthropogenic emissions. The world wide observed changes in the large-scale hydrological cycle have been related to an increase in the observed temperature over several decades. Although the effect of change in climate on hydrology provides a general picture of possible hydrological global change, new tools and frameworks for modelling hydrological series with nonstationary characteristics at finer scales, are required for assessing climate change impacts. Of the downscaling techniques, dynamic downscaling is usually based on the use of Regional Climate Models (RCMs), which generate finer resolution output based on atmospheric physics over a region using General Circulation Model (GCM) fields as boundary conditions. However, RCMs are not expected to capture the observed spatial precipitation extremes at a fine cell scale or at a basin scale. Statistical downscaling derives a statistical or empirical relationship between the variables simulated by the GCMs, called predictors, and station-scale hydrologic variables, called predictands. The main focus of the paper is on the need for using statistical downscaling techniques for projection of local hydrometeorological variables under climate change scenarios. The projections can be then served as a means of input source to various hydrologic models to obtain streamflow, evapotranspiration, soil moisture and other hydrological variables of interest.

Keywords: Climate Change, Downscaling, GCM, RCM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3375
1086 The Wijma Delivery Expectancy/Experience Questionnaire (W-DEQ) with Turkish Sample: Confirmatory and Exploratory Factor Analysis

Authors: Oznur Korukcu, Kamile Kukulu, Mehmet Z. Firat

Abstract:

The propose of this study is to investigate the factor structures of the W-DEQ, originally developed on UK and Swedish women, were confirmed in Turkish samples, and to obtain a new modified factor structure appropriate to Turkish culture. Statistical analyses of the data obtained were performed using SPSS© for Windows version 13.0 and the SAS statistical software Version 9.1. Both confirmatory and exploratory factor analysis of W-DEQ were performed in the study. Factor analysis yielded four factors related to hope, fear, lack of positive anticipation and riskiness. The alpha estimates of the total W-DEQ score were somewhat higher, being 0.92 for the parous and 0.90 for the nulliparous sample. These are well above the accepted limit of 0.70 and indicate excellent levels of internal reliability, thus showing that the questions were appropriate to the Turkish culture and useful scale for the evaluation of fear of childbirth in Turkish pregnants.

Keywords: Confirmatory factor analysis, cross-cultural research, exploratory factor analysis, fear of childbirth.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5127
1085 A Hybrid Ontology Based Approach for Ranking Documents

Authors: Sarah Motiee, Azadeh Nematzadeh, Mehrnoush Shamsfard

Abstract:

Increasing growth of information volume in the internet causes an increasing need to develop new (semi)automatic methods for retrieval of documents and ranking them according to their relevance to the user query. In this paper, after a brief review on ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination reserves the precision of ranking without loosing the speed. Our approach exploits natural language processing techniques to extract phrases from documents and the query and doing stemming on words. Then an ontology based conceptual method will be used to annotate documents and expand the query. To expand a query the spread activation algorithm is improved so that the expansion can be done flexible and in various aspects. The annotated documents and the expanded query will be processed to compute the relevance degree exploiting statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing to documents, (3) extracting and using both words and phrases to compute relevance degree, (4) improving the spread activation algorithm to do the expansion based on weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1630
1084 A New Heuristic Statistical Methodology for Optimizing Queuing Networks Using Discreet Event Simulation

Authors: Mohamad Mahdavi

Abstract:

Most of the real queuing systems include special properties and constraints, which can not be analyzed directly by using the results of solved classical queuing models. Lack of Markov chains features, unexponential patterns and service constraints, are the mentioned conditions. This paper represents an applied general algorithm for analysis and optimizing the queuing systems. The algorithm stages are described through a real case study. It is consisted of an almost completed non-Markov system with limited number of customers and capacities as well as lots of common exception of real queuing networks. Simulation is used for optimizing this system. So introduced stages over the following article include primary modeling, determining queuing system kinds, index defining, statistical analysis and goodness of fit test, validation of model and optimizing methods of system with simulation.

Keywords: Estimation, queuing system, simulation model, probability distribution, non-Markov chain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1620