Search results for: ensemble averaging effects.
3021 Boosting Method for Automated Feature Space Discovery in Supervised Quantum Machine Learning Models
Authors: Vladimir Rastunkov, Jae-Eun Park, Abhijit Mitra, Brian Quanz, Steve Wood, Christopher Codella, Heather Higgins, Joseph Broz
Abstract:
Quantum Support Vector Machines (QSVM) have become an important tool in research and applications of quantum kernel methods. In this work we propose a boosting approach for building ensembles of QSVM models and assess performance improvement across multiple datasets. This approach is derived from the best ensemble building practices that worked well in traditional machine learning and thus should push the limits of quantum model performance even further. We find that in some cases a single QSVM model with tuned hyperparameters is sufficient to simulate the data, while in others an ensemble of QSVMs that are forced to explore the feature space via the proposed method is beneficial.
Keywords: QSVM, Quantum Support Vector Machines, quantum kernel, boosting, ensemble.
Downloads 439
3020 Application of Machine Learning Methods to Online Test Error Detection in Semiconductor Test
Authors: Matthias Kirmse, Uwe Petersohn, Elief Paffrath
Abstract:
Since test costs in today's semiconductor industry can make up as much as 50 percent of total production costs, efficient test error detection is becoming more and more important. In this paper, we present a new machine learning approach to test error detection that should provide faster recognition of test system faults as well as improved test error recall. The key idea is to learn a classifier ensemble that detects typical test error patterns in wafer test results immediately after these tests finish. Since test error detection has not yet been discussed in the machine learning community, we define central problem-relevant terms and provide an analysis of important domain properties. Finally, we present comparative studies reflecting the failure detection performance of three individual classifiers and three ensemble methods based upon them. As base classifiers we chose a decision tree learner, a support vector machine and a Bayesian network, while the compared ensemble methods were simple and weighted majority vote as well as stacking. For the evaluation, we used cross validation and a specially designed practical simulation. By implementing our approach in a semiconductor test department for the observation of two products, we proved its practical applicability.
Keywords: Ensemble methods, fault detection, machine learning, semiconductor test.
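The simple and weighted majority votes named in the abstract above can be sketched as follows. This is only an illustration of the general voting technique, not the authors' implementation: the function name `weighted_majority_vote`, the labels, and the weights are all hypothetical.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights=None):
    """Combine one label per classifier into a single label.

    predictions: list of labels, one per base classifier.
    weights: optional list of non-negative weights (same length);
             None means a simple (unweighted) majority vote.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Ties are broken in favour of the label that appears first.
    return max(predictions, key=lambda label: totals[label])


# Hypothetical outputs of three base classifiers for one wafer test.
votes = ["pass", "test_error", "test_error"]
simple = weighted_majority_vote(votes)
weighted = weighted_majority_vote(votes, weights=[0.9, 0.3, 0.4])
```

With equal weights the two `test_error` votes win; with the illustrative weights the single, more trusted classifier outweighs them.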
Downloads 2274
3019 A 3rd order 3bit Sigma-Delta Modulator with Reduced Delay Time of Data Weighted Averaging
Authors: Soon Jai Yi, Sun-Hong Kim, Hang-Geun Jeong, Seong-Ik Cho
Abstract:
This paper presents a method of reducing the feedback delay time of DWA (Data Weighted Averaging) used in sigma-delta modulators. The delay time reduction results from the elimination of the latch at the quantizer output and also from the falling edge operation. The designed sigma-delta modulator improves the timing margin by about 16%. The sub-circuits of the sigma-delta modulator, such as the SC (Switched Capacitor) integrator, 9-level quantizer, comparator, and DWA, are designed with the non-ideal characteristics taken into account. The sigma-delta modulator has a maximum SNR (Signal to Noise Ratio) of 84 dB or 13 bit resolution.
Keywords: Sigma-delta modulator, multibit, DWA
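The abstract does not describe the element-selection rule itself. As background, the sketch below shows the rotating-pointer selection that DWA is generally understood to perform; it is a behavioral model only and says nothing about the paper's latch-free circuit. The name `dwa_select` and the choice of 8 unit elements (matching a 9-level quantizer) are assumptions made for illustration.

```python
def dwa_select(codes, n_elements=8):
    """Yield the unit-element indices used for each quantizer code.

    codes: iterable of integers in 0..n_elements (a 9-level quantizer
           driving 8 unit elements is an assumption of this sketch).
    """
    pointer = 0
    for code in codes:
        if not 0 <= code <= n_elements:
            raise ValueError("code out of range")
        # Use the next `code` elements, wrapping around the array.
        selected = [(pointer + i) % n_elements for i in range(code)]
        # Advance the pointer so the next sample starts where this one ended.
        pointer = (pointer + code) % n_elements
        yield selected


usage = list(dwa_select([3, 4, 2]))
```

Because each sample starts where the previous one stopped, every element is used equally often over time, which is the averaging the name refers to.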
Downloads 2406
3018 Ensemble Approach for Predicting Student's Academic Performance
Authors: L. A. Muhammad, M. S. Argungu
Abstract:
Educational data mining (EDM) has received substantial attention. Data mining techniques have been proposed to uncover hidden knowledge in educational data. The results of such studies help academic institutions further enhance their learning processes and their methods of passing knowledge to students. Consequently, student performance is boosted and educational outcomes are enhanced. This study adopted a student performance prediction model based on data mining techniques with Students' Essential Features (SEF). SEF are linked to the learner's interactivity with the e-learning management system. The performance of the student predictive model is assessed with a set of classifiers, viz. Bayes Network, Logistic Regression, and Reduced Error Pruning (REP) Tree. Ensemble methods (Bagging, Boosting, and Random Forest (RF)) are then applied to improve the performance of these single classifiers. The study reveals a strong relationship between learners' behaviors and their academic attainment. The REP Tree and its ensemble record the highest accuracy, 83.33%, using SEF. In terms of the Receiver Operating Characteristic (ROC) curve, the boosting method of REP Tree records 0.903, which is the best. This result further demonstrates the dependability of the proposed model.
Keywords: Ensemble, bagging, Random Forest, boosting, data mining, classifiers, machine learning.
Downloads 760
3017 Recommender Systems Using Ensemble Techniques
Authors: Yeonjeong Lee, Kyoung-jae Kim, Youngtae Kim
Abstract:
This study proposes a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting the user’s preferences precisely. The proposed model consists of two steps. In the first step, this study uses logistic regression, decision trees, and artificial neural networks to predict customers who have a high likelihood of purchasing products in each product group. Then, this study combines the results of each predictor using multi-model ensemble techniques such as bagging and bumping. In the second step, this study uses market basket analysis to extract association rules for co-purchased products. Finally, through these two steps, the system selects customers who have a high likelihood of purchasing products in each product group and recommends suitable products from the same or different product groups to them. We test the usability of the proposed system by using a prototype and real-world transaction and profile data. In addition, we survey user satisfaction with the recommended product list from the proposed system and with randomly selected product lists. The results also show that the proposed system may be useful in a real-world online shopping store.
Keywords: Product recommender system, Ensemble technique, Association rules, Decision tree, Artificial neural networks.
Downloads 4222
3016 Typical Day Prediction Model for Output Power and Energy Efficiency of a Grid-Connected Solar Photovoltaic System
Authors: Yan Su, L. C. Chan
Abstract:
A novel typical day prediction model has been built and validated with the measured data of a grid-connected solar photovoltaic (PV) system in Macau. Unlike the conventional statistical method used in previous studies on PV systems, which obtains results by averaging nearby continuous points, the present typical day statistical method obtains the value at every minute in a typical day by averaging discontinuous points at the same minute in different days. This typical day statistical method based on discontinuous point averaging makes it possible to obtain the Gaussian shape dynamical distributions for solar irradiance and output power in a yearly or monthly typical day. Based on the yearly typical day statistical analysis results, the maximum possible accumulated output energy in a year with on-site climate conditions and the corresponding optimal PV system running time are obtained. Periodic Gaussian shape prediction models for solar irradiance, output energy and system energy efficiency have been built and their coefficients have been determined based on the yearly, maximum and minimum monthly typical day Gaussian distribution parameters, which are obtained from iterations for minimum Root Mean Squared Deviation (RMSD). With the present model, the dynamical effects due to time differences within a day are kept, and the day-to-day uncertainty due to changing weather is smoothed but still included. The periodic Gaussian shape correlations for solar irradiance, output power and system energy efficiency have been compared favorably with data of the PV system in Macau and proved to be an improvement over previous models.
Keywords: Grid Connected, RMSD, Solar PV System, Typical Day.
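The "discontinuous point averaging" described in the abstract, where samples taken at the same minute on different days are averaged together, can be sketched as below. The function name `typical_day` and the three-sample toy data are hypothetical; the Gaussian fitting and RMSD minimization steps of the paper are not attempted here.

```python
from statistics import mean


def typical_day(days):
    """Average the samples that share the same minute across days.

    days: list of equal-length lists; days[d][m] is the reading at
          minute m of day d.
    Returns one averaged value per minute.
    """
    if not days:
        raise ValueError("need at least one day")
    n = len(days[0])
    if any(len(day) != n for day in days):
        raise ValueError("all days must have the same number of samples")
    # Average down the columns (same minute, different days), not along a day.
    return [mean(day[m] for day in days) for m in range(n)]


# Toy output-power readings: three days, three time slots each.
power = [
    [0.0, 2.0, 4.0],
    [0.0, 4.0, 2.0],
    [0.0, 3.0, 3.0],
]
profile = typical_day(power)
```

Averaging across days rather than across neighbouring minutes keeps the within-day shape of the curve while smoothing day-to-day weather variation.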
Downloads 1679
3015 Adaptive Weighted Averaging Filter Using the Appropriate Number of Consecutive Frames
Authors: Mahmoud Saeidi, Ali Nazemipour
Abstract:
In this paper, we propose a novel adaptive spatiotemporal filter that utilizes image sequences in order to remove noise. The consecutive frames include the current, previous and next noisy frames. The filter proposed in this paper is based upon weighted averaging of pixel intensity and on the noise variance in image sequences. It utilizes the Appropriate Number of Consecutive Frames (ANCF) based on the intensity of the noisy pixels among the frames. The number of consecutive frames is adaptively calculated for each region in the image, and its value may change from one region to another depending on the pixel intensity within the region. The weights are determined by a well-defined mathematical criterion, which is adaptive to the features of the spatiotemporal pixels of the consecutive frames. It is experimentally shown that the proposed filter can preserve image structures and edges under motion while suppressing noise, and thus can be effectively used in image sequence filtering. In addition, the AWA filter using ANCF is particularly well suited for filtering sequences that contain segments with abruptly changing scene content due to, for example, rapid zooming and changes in the view of the camera.
Keywords: Appropriate Number of Consecutive Frames, Adaptive Weighted Averaging, Motion Estimation, Noise Variance, Motion Compensation
Downloads 1819
3014 Glass Bottle Inspector Based on Machine Vision
Authors: Huanjun Liu, Yaonan Wang, Feng Duan
Abstract:
This text studies an intelligent glass bottle inspector based on machine vision as a replacement for manual inspection. The system structure is illustrated in detail in this paper. The text presents a method based on the watershed transform to segment the possible defective regions and extract features of the bottle wall by rules. Then the wavelet transform is used to extract features of the bottle finish from images. After feature extraction, a fuzzy support vector machine ensemble is put forward as the classifier. To ensure that the fuzzy support vector machines have good classification ability, a GA based ensemble method is used to combine the several fuzzy support vector machines. The experiments demonstrate that, using this inspector to inspect glass bottles, the accuracy rate may reach above 97.5%.
Keywords: Intelligent Inspection, Support Vector Machines, Ensemble Methods, watershed transform, Wavelet Transform
Downloads 3895
3013 Combining Bagging and Boosting
Authors: S. B. Kotsiantis, P. E. Pintelas
Abstract:
Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base classifiers. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using a voting methodology of bagging and boosting ensembles with 10 sub-classifiers in each one. We performed a comparison with simple bagging and boosting ensembles with 25 sub-classifiers, as well as other well-known combining methods, on standard benchmark datasets, and the proposed technique was the most accurate.
Keywords: data mining, machine learning, pattern recognition.
Downloads 2562
3012 Design Modelling Control and Simulation of DC/DC Power Buck Converter
Authors: H. Abaali
Abstract:
The power buck converter is the most widely used DC/DC converter topology. It has a very large application area, including DC motor drives and photovoltaic power systems, which require fast transient responses and high efficiency over a wide range of load current. This work proposes the modelling of a DC/DC power buck converter using the state-space averaging method and current-mode control using a proportional-integral controller. The efficiency of the proposed model and control loop is evaluated with operating point changes. The simulation results proved the effectiveness of the linear model of the DC/DC power buck converter.
Keywords: DC/DC power buck converter, Linear current control, State-space averaging method.
Downloads 3479
3011 Long Wavelength Coherent Pulse of Sound Propagating in Granular Media
Authors: Rohit Kumar Shrivastava, Amalia Thomas, Nathalie Vriend, Stefan Luding
Abstract:
A mechanical wave or vibration propagating through granular media exhibits a specific signature in time. A coherent pulse or wavefront arrives first, with multiply scattered waves (coda) arriving later. The coherent pulse is micro-structure independent, i.e. it depends only on the bulk properties of the disordered granular sample: the sound wave velocity of the granular sample and hence the bulk and shear moduli. The coherent wavefront attenuates (decreases in amplitude) and broadens with distance from its source. The pulse attenuation and broadening effects are affected by disorder (polydispersity; contrast in size of the granules) and have often been attributed to dispersion and scattering. To study the effect of disorder and initial amplitude (non-linearity) of the pulse imparted to the system on the coherent wavefront, numerical simulations have been carried out on one-dimensional sets of particles (granular chains). The interaction force between the particles is given by a Hertzian contact model. The sizes of particles have been selected randomly from a Gaussian distribution, where the standard deviation of this distribution is the relevant parameter that quantifies the effect of disorder on the coherent wavefront. Since the coherent wavefront is system configuration independent, ensemble averaging has been used for improving the signal quality of the coherent pulse and removing the multiply scattered waves. The results concerning the width of the coherent wavefront have been formulated in terms of scaling laws. An experimental set-up of photoelastic particles constituting a granular chain is proposed to validate the numerical results.
Keywords: Discrete elements, Hertzian Contact, polydispersity, weakly nonlinear, wave propagation.
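The ensemble-averaging idea in the abstract above (the coherent pulse is common to every configuration, while the scattered part differs from one configuration to the next and averages out) can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration: the Gaussian pulse shape, the zero-mean random term standing in for the coda, and all function names. It is not a granular-chain simulation.

```python
import math
import random


def coherent_pulse(t):
    """Toy deterministic pulse shared by every realization (assumed shape)."""
    return math.exp(-((t - 20) / 4.0) ** 2)


def realization(n_samples, rng, noise_level=0.5):
    """One configuration: shared pulse plus a configuration-specific fluctuation."""
    return [coherent_pulse(t) + rng.gauss(0.0, noise_level)
            for t in range(n_samples)]


def ensemble_average(signals):
    """Average sample by sample over all realizations."""
    n = len(signals)
    return [sum(column) / n for column in zip(*signals)]


def rms_error(signal):
    """RMS deviation of a signal from the fluctuation-free pulse."""
    squared = sum((s - coherent_pulse(t)) ** 2 for t, s in enumerate(signal))
    return math.sqrt(squared / len(signal))


rng = random.Random(0)  # fixed seed so the run is repeatable
signals = [realization(64, rng) for _ in range(200)]
averaged = ensemble_average(signals)
```

Comparing `rms_error(averaged)` with `rms_error(signals[0])` shows the averaged trace lying much closer to the shared pulse than any single realization does.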
Downloads 922
3010 The Intuitionistic Fuzzy Ordered Weighted Averaging-Weighted Average Operator and its Application in Financial Decision Making
Authors: Shouzhen Zeng
Abstract:
We present a new intuitionistic fuzzy aggregation operator called the intuitionistic fuzzy ordered weighted averaging-weighted average (IFOWAWA) operator. The main advantage of the IFOWAWA operator is that it unifies the OWA operator with the WA in the same formulation considering the degree of importance that each concept has in the aggregation. Moreover, it is able to deal with an uncertain environment that can be assessed with intuitionistic fuzzy numbers. We study some of its main properties and we see that it has a lot of particular cases such as the intuitionistic fuzzy weighted average (IFWA) and the intuitionistic fuzzy OWA (IFOWA) operator. Finally, we study the applicability of the new approach on a financial decision making problem concerning the selection of financial strategies.
Keywords: Intuitionistic fuzzy numbers, Weighted average, OWA operator, Financial decision making
Downloads 2439
3009 Activity Recognition by Smartphone Accelerometer Data Using Ensemble Learning Methods
Authors: Eu Tteum Ha, Kwang Ryel Ryu
Abstract:
As smartphones are equipped with various sensors, there have been many studies focused on using these sensors to create valuable applications. Human activity recognition is one such application motivated by various welfare applications, such as the support for the elderly, measurement of calorie consumption, lifestyle and exercise patterns analyses, and so on. One of the challenges one faces when using smartphone sensors for activity recognition is that the number of sensors should be minimized to save battery power. In this paper, we show that a fairly accurate classifier can be built that can distinguish ten different activities by using data from only a single sensor, i.e., the smartphone accelerometer. The approach that we adopt to deal with this twelve-class problem uses various methods. The features used for classifying these activities include not only the magnitude of the acceleration vector at each time point, but also the maximum, the minimum, and the standard deviation of the vector magnitude within a time window. The experiments compared the performance of four kinds of basic multi-class classifiers and the performance of four kinds of ensemble learning methods based on three kinds of basic multi-class classifiers. The results show that the method with the highest accuracy is ECOC based on Random Forest.
Keywords: Ensemble learning, activity recognition, smartphone accelerometer.
Downloads 2173
3008 Development of an Ensemble Classification Model Based on Hybrid Filter-Wrapper Feature Selection for Email Phishing Detection
Authors: R. B. Ibrahim, M. S. Argungu, I. M. Mungadi
Abstract:
The internet has become an indispensable part of human life. It has provided diverse opportunities to make life easier through the adoption of various channels, among them email, internet banking, video conferencing, and the like. Email is one of the easiest means of communication and is widely accepted among individuals and organizations globally. But over the decades the security and integrity of this platform have been challenged by malicious activities like phishing. Email phishing is designed by phishers to fool the recipient into handing over sensitive personal information such as passwords, credit card numbers, account credentials, social security numbers, etc. This activity has caused a lot of financial damage to email users globally, which has resulted in bankruptcy, sudden death of victims, and other health-related sicknesses. Although many methods have been proposed to detect email phishing, in this research the results of multiple machine-learning methods for predicting email phishing have been compared with the use of filter-wrapper feature selection. All three models performed well, but one outperformed the others. The dataset used for these models is obtained from the Kaggle online data repository, and three classifiers (decision tree, Naïve Bayes, and logistic regression) are each combined into a Bagging ensemble. Results from the study show that the Decision Tree (CART) bagging ensemble recorded the highest accuracy of 98.13% using PEF (Phishing Essential Features). This result further demonstrates the dependability of the proposed model.
Keywords: Ensemble, hybrid, filter-wrapper, phishing.
Downloads 178
3007 Noise Estimation for Speech Enhancement in Non-Stationary Environments-A New Method
Authors: Ch.V.Rama Rao, Gowthami., Harsha., Rajkumar., M.B.Rama Murthy, K.Srinivasa Rao, K.AnithaSheela
Abstract:
This paper presents a new method for estimating the non-stationary noise power spectral density given a noisy signal. The method is based on averaging the noisy speech power spectrum using time and frequency dependent smoothing factors. These factors are adjusted based on signal-presence probability in individual frequency bins. Signal presence is determined by computing the ratio of the noisy speech power spectrum to its local minimum, which is updated continuously by averaging past values of the noisy speech power spectra with a look-ahead factor. This method adapts very quickly to highly non-stationary noise environments. The proposed method achieves significant improvements over a system that uses a voice activity detector (VAD) in noise estimation.
Keywords: Noise estimation, Non-stationary noise, Speech enhancement.
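One common way to realize "time and frequency dependent smoothing factors" driven by a per-bin signal-presence probability is the recursive update sketched below. Whether the paper uses exactly this form is an assumption, as are the function name `update_noise_psd`, the base constant `alpha_d`, and the example values. The local-minimum tracking that produces the presence probability is not shown.

```python
def update_noise_psd(noise_psd, noisy_psd, presence_prob, alpha_d=0.85):
    """One frame of recursive noise-PSD averaging.

    noise_psd: previous noise estimate per frequency bin.
    noisy_psd: current noisy-speech power spectrum per bin.
    presence_prob: signal-presence probability per bin, in [0, 1].
    alpha_d: base smoothing constant (illustrative value).
    """
    updated = []
    for prev, current, p in zip(noise_psd, noisy_psd, presence_prob):
        # The smoothing factor rises toward 1 when speech is likely present,
        # which freezes the noise estimate in that bin.
        alpha = alpha_d + (1.0 - alpha_d) * p
        updated.append(alpha * prev + (1.0 - alpha) * current)
    return updated


# Two bins: speech certainly present in the first, absent in the second.
estimate = update_noise_psd([1.0, 1.0], [5.0, 5.0], [1.0, 0.0], alpha_d=0.5)
```

In the bin where speech is present the estimate is left untouched, while in the speech-free bin it moves toward the newly observed power.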
Downloads 2341
3006 Dependent Weighted Aggregation Operators of Hesitant Fuzzy Numbers
Authors: Jing Liu
Abstract:
In this paper, motivated by the ideas of dependent weighted aggregation operators, we develop some new hesitant fuzzy dependent weighted aggregation operators to aggregate input arguments taking the form of hesitant fuzzy numbers rather than exact numbers or intervals. In particular, we propose three hesitant fuzzy dependent weighted averaging (HFDWA) operators and three hesitant fuzzy dependent weighted geometric (HFDWG) operators based on different weight vectors. The most prominent characteristic of these operators is that the associated weights depend only on the aggregated hesitant fuzzy numbers and can relieve the influence of unfair hesitant fuzzy numbers on the aggregated results by assigning low weights to those “false” and “biased” ones. Some examples are given to illustrate the efficiency of the proposed operators.
Keywords: Hesitant fuzzy numbers, hesitant fuzzy dependent weighted averaging (HFDWA) operators, hesitant fuzzy dependent weighted geometric (HFDWG) operators.
Downloads 1775
3005 Study of Functional Relevant Conformational Mobility of β-2 Adrenoreceptor by Means of Molecular Dynamics Simulation
Authors: G. V. Novikov, V. S. Sivozhelezov, S. S. Kolesnikov, K. V. Shaitan
Abstract:
The study reports on the influence of the binding of orthosteric ligands, as well as of point mutations, on the conformational dynamics of the β-2-adrenoreceptor. Using molecular dynamics simulation we found that there was a small fraction of active states of the receptor in its apo (ligand-free) ensemble, corresponding to its constitutive activity. Analysis of MD trajectories indicated that such spontaneous activation of the receptor is accompanied by motion in the intracellular part of its alpha-helices. Thus the receptor’s constitutive activity directly results from its conformational dynamics. On the other hand, the binding of a full agonist resulted in a significant shift of the initial equilibrium towards its active state. Finally, the binding of the inverse agonist stabilized the receptor in its inactive state. It is likely that the binding of inverse agonists might be a universal way of inhibiting constitutive activity in vivo. Our results indicate that ligand binding redistributes pre-existing conformational degrees of freedom of the receptor (in accordance with the Monod-Wyman-Changeux model) rather than causing induced fit in it. Therefore, the ensemble of biologically relevant receptor conformations is encoded in its spatial structure, and individual conformations from that ensemble might be used by the cell in conformity with the physiological behavior.
Keywords: Seven-transmembrane receptors, constitutive activity, activation, x-ray crystallography, principal component analysis, molecular dynamics simulation.
Downloads 3957
3004 Meta Random Forests
Authors: Praveen Boinee, Alessandro De Angelis, Gian Luca Foresti
Abstract:
Leo Breiman's Random Forests (RF) is a recent development in tree based classifiers and has quickly proven to be one of the most important algorithms in the machine learning literature. It has shown robust and improved classification results on standard data sets. Ensemble learning algorithms such as AdaBoost and Bagging have been under active research and have shown improvements in classification results for several benchmarking data sets, mainly with decision trees as their base classifiers. In this paper we experiment with applying these meta learning techniques to random forests. We examine how ensembles of random forests work on the standard data sets available from UCI. We compare the original random forest algorithm with its ensemble counterparts and discuss the results.
Keywords: Random Forests [RF], ensembles, UCI.
Downloads 2710
3003 On The Comparison of Fuzzy Logic and State Space Averaging based Sliding Control Methods Applied on an Arc Welding Machine
Authors: İres İskender, Ahmet Karaarslan
Abstract:
In this study, the performance of a high-frequency arc welding machine including a two-switch inverter is analyzed. The control of the system is achieved using two different control techniques: (i) fuzzy logic control (FLC) and (ii) state space averaging based sliding control. Fuzzy logic control does not need an accurate mathematical model of the plant and can be used in nonlinear applications. The second method needs the mathematical model of the system. In this method the state space equations of the system are derived for the two different “on” and “off” states of the switches. The derived state equations are combined with the sliding control rule considering the duty cycle of the converter. The performance of the system is analyzed by simulating the system using the SIMULINK toolbox of MATLAB. The simulation results show that the fuzzy logic controller is more robust and less sensitive to parameter variations.
Keywords: Fuzzy logic, arc welding, sliding state space control, PWM, current control.
Downloads 2052
3002 Design of an Ensemble Learning Behavior Anomaly Detection Framework
Authors: Abdoulaye Diop, Nahid Emad, Thierry Winter, Mohamed Hilia
Abstract:
Data asset protection is a crucial issue in the cybersecurity field. Companies use logical access control tools to vault their information assets and protect them against external threats, but they lack solutions to counter insider threats. Nowadays, insider threats are the most significant concern of security analysts. They are mainly individuals with legitimate access to companies' information systems who use their rights with malicious intent. In several fields, behavior anomaly detection is the method used by cyber specialists to counter the threats of malicious user activities effectively. In this paper, we present a step toward the construction of a user and entity behavior analysis framework by proposing a behavior anomaly detection model. This model combines machine learning classification techniques and graph-based methods, relying on linear algebra and parallel computing techniques. We show the utility of an ensemble learning approach in this context. We present test results for some detection methods on a representative access control dataset. Some of the explored classifiers give results of up to 99% accuracy.
Keywords: Cybersecurity, data protection, access control, insider threat, user behavior analysis, ensemble learning, high performance computing.
Downloads 1151
3001 Performance Assessment of Multi-Level Ensemble for Multi-Class Problems
Authors: Rodolfo Lorbieski, Silvia Modesto Nassar
Abstract:
Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adapted method of Stacking in multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training similar to Stacking was used, but with three levels, where the final decision-maker (level 2) performs its training by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the meta-classifier level 2. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time for three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, a three-way ANOVA test was executed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only in time. The accuracy and area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.
Keywords: Stacking, multi-layers, ensemble, multi-class.
Downloads 1093
3000 Impovement of a Label Extraction Method for a Risk Search System
Authors: Shigeaki Sakurai, Ryohei Orihara
Abstract:
This paper proposes a method for improving the classification efficiency of a classification model. The model is used in a risk search system and extracts specific labels from articles posted at bulletin board sites. The system can analyze the important discussions composed of the articles. The improvement method introduces ensemble learning methods that use multiple classification models. It also introduces expressions related to the specific labels into the generation of word vectors. The paper applies the improvement method to articles collected from three bulletin board sites selected by users and verifies the effectiveness of the improvement method.
Keywords: Text mining, Risk search system, Corporate reputation, Bulletin board site, Ensemble learning
Downloads 1325
2999 Parallelization of Ensemble Kalman Filter (EnKF) for Oil Reservoirs with Time-lapse Seismic Data
Authors: Md Khairullah, Hai-Xiang Lin, Remus G. Hanea, Arnold W. Heemink
Abstract:
In this paper we describe the design and implementation of a parallel algorithm for data assimilation with the ensemble Kalman filter (EnKF) for the oil reservoir history matching problem. The use of a large number of observations from time-lapse seismic leads to a large turnaround time for the analysis step, in addition to the time-consuming simulations of the realizations. For efficient parallelization it is important to consider parallel computation at the analysis step. Our experiments show that parallelization of the analysis step in addition to the forecast step has good scalability, exploiting the same set of resources with some additional effort.
Keywords: EnKF, Data assimilation, Parallel computing, Parallel efficiency.
Downloads 2281
2998 A Forward Automatic Censored Cell-Averaging Detector for Multiple Target Situations in Log-Normal Clutter
Authors: Musa'ed N. Almarshad, Saleh A. Alshebeili, Mourad Barkat
Abstract:
A challenging problem in radar signal processing is to achieve reliable target detection in the presence of interference. In this paper, we propose a novel algorithm for automatic censoring of interfering radar targets in log-normal clutter. The proposed algorithm, termed the forward automatic censored cell-averaging detector (F-ACCAD), consists of two steps: removing the corrupted reference cells (censoring) and the actual detection. Both steps are performed dynamically by using a suitable set of ranked cells to estimate the unknown background level and set the adaptive thresholds accordingly. The F-ACCAD algorithm requires neither prior information about the clutter parameters nor knowledge of the number of interfering targets. The effectiveness of the F-ACCAD algorithm is assessed by computing, using Monte Carlo simulations, the probability of censoring and the probability of detection in different background environments.
Keywords: CFAR, Log-normal clutter, Censoring, Probability of detection, Probability of false alarm, Probability of false censoring.
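For context, the sketch below shows a plain cell-averaging CFAR detector, the baseline that censoring schemes build on. It is not the F-ACCAD algorithm itself: the function name `ca_cfar`, its parameters, and the fixed `scale` factor are assumptions for illustration.

```python
def ca_cfar(power, num_ref, num_guard, scale):
    """Return indices of cells whose power exceeds scale * mean(reference cells).

    num_ref reference cells are taken on each side of the cell under test,
    separated from it by num_guard guard cells. Cells too close to the edges
    to have a full reference window are skipped.
    """
    detections = []
    half = num_ref + num_guard
    for i in range(half, len(power) - half):
        left = power[i - half : i - num_guard]
        right = power[i + num_guard + 1 : i + half + 1]
        background = sum(left + right) / (2 * num_ref)  # estimated background level
        if power[i] > scale * background:
            detections.append(i)
    return detections
```

If an interfering target falls inside the reference window, the background estimate is inflated and the target in the cell under test can be masked. Removing such corrupted cells before averaging is the purpose of the censoring step described in the abstract.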
Downloads 1916
2997 Outlier Pulse Detection and Feature Extraction for Wrist Pulse Analysis
Authors: Bhaskar Thakker, Anoop Lal Vyas
Abstract:
Wrist pulse analysis for identifying health status is found in ancient Indian as well as Chinese literature. Preprocessing of the wrist pulse is necessary to remove outlier pulses and fluctuations prior to analysis of the pulse pressure signal. This paper discusses the identification of irregular pulses present in the pulse series and the intricacies associated with extracting time-domain pulse features. Dynamic Time Warping (DTW) is used to identify outlier pulses in the wrist pulse series. The ambiguity in identifying pulse features is resolved with the help of the first derivative of the ensemble average of the wrist pulse series. An algorithm for detecting the tidal and dicrotic notches in an individual wrist pulse segment is proposed.
Keywords: Wrist Pulse Segment, Ensemble Average, Dynamic Time Warping (DTW), Pulse Similarity Vector.
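The ensemble average and its first derivative referred to in this abstract can be sketched as follows. The sketch assumes the pulse segments have already been aligned and resampled to equal length, which the abstract does not specify; the function names and the sampling interval `dt` are hypothetical.

```python
def ensemble_average(segments):
    """Average equal-length pulse segments sample by sample."""
    if len({len(s) for s in segments}) != 1:
        raise ValueError("segments must be non-empty and of equal length")
    return [sum(column) / len(segments) for column in zip(*segments)]


def first_difference(signal, dt=1.0):
    """Approximate the first derivative by forward differences."""
    return [(b - a) / dt for a, b in zip(signal, signal[1:])]
```

Averaging across segments suppresses beat-to-beat fluctuations, so sign changes in the derivative of the averaged pulse mark feature locations more consistently than they would in any single noisy segment.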
Downloads 2093
2996 Ensembling Classifiers – An Application to Image Data Classification from Cherenkov Telescope Experiment
Authors: Praveen Boinee, Alessandro De Angelis, Gian Luca Foresti
Abstract:
Ensemble learning algorithms such as AdaBoost and Bagging have been actively researched and have shown improved classification results on several benchmark data sets, mainly with decision trees as their base classifiers. In this paper we apply these meta-learning techniques to classifiers such as random forests, neural networks and support vector machines. The data sets are from MAGIC, a Cherenkov telescope experiment. The task is to separate gamma signals from an overwhelming background of hadron and muon signals, which makes it a rare-class classification problem. We compare the individual classifiers with their ensemble counterparts and discuss the results. WEKA, a wonderful tool for machine learning, was used for the experiments.
Keywords: Ensembles, WEKA, Neural networks [NN], Support Vector Machines [SVM], Random Forests [RF].
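A minimal sketch of Bagging, one of the two meta-learning techniques named in this abstract, is given below. The nearest-class-mean base learner on one-dimensional data is a stand-in chosen for brevity, not one of the classifiers used in the paper, and all function names are hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def fit_nearest_mean(xs, ys):
    """Base learner (illustrative stand-in): classify a 1-D point by the nearest class mean."""
    means = {label: mean(x for x, y in zip(xs, ys) if y == label) for label in set(ys)}
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train n_models base learners, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_nearest_mean([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Combine the base learners by unweighted majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]
```

Bagging mainly reduces variance, so it tends to help most with unstable base learners. Comparing each individual classifier with its bagged counterpart, as the abstract describes, is one way to see whether that benefit appears for a given model.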
Downloads 1765
2995 Optimizing Approach for Sifting Process to Solve a Common Type of Empirical Mode Decomposition Mode Mixing
Authors: Saad Al-Baddai, Karema Al-Subari, Elmar Lang, Bernd Ludwig
Abstract:
Empirical mode decomposition (EMD), a new data-driven method of time-series decomposition, has the advantage of not assuming that a time series is linear or stationary, as Fourier decomposition implicitly does. However, EMD suffers from the mode mixing problem in some cases. The aim of this paper is to present a solution for a common type of signal that causes EMD mode mixing, namely a signal that exhibits intermittency. On an artificial example, the solution shows superior performance in coping with EMD mode mixing compared with conventional EMD and ensemble empirical mode decomposition (EEMD). Furthermore, the over-sifting problem is completely avoided, and the computational load is reduced roughly sixfold compared with EEMD with an ensemble number of 50.
Keywords: Empirical mode decomposition, mode mixing, sifting process, over-sifting.
Downloads 992
2994 The Design of a Vehicle Traffic Flow Prediction Model for a Gauteng Freeway Based on an Ensemble of Multi-Layer Perceptron
Authors: Tebogo Emma Makaba, Barnabas Ndlovu Gatsheni
Abstract:
The cities of Johannesburg and Pretoria, both located in Gauteng province, are separated by a distance of 58 km. The traffic queues on the Ben Schoeman freeway, which connects these two cities, can stretch for almost 1.5 km. Vehicle traffic congestion negatively affects business and commuters’ quality of life. The goal of this paper is to identify variables that influence the flow of traffic and to design a vehicle traffic prediction model that predicts the traffic flow pattern in advance. The model will enable motorists to make appropriate travel decisions ahead of time. The data used was collected by Mikro’s Traffic Monitoring (MTM). A multi-layer perceptron (MLP) was used individually to construct the model, and the MLP was also combined with the Bagging ensemble method to train on the data. Cross-validation was used to evaluate the models. The results obtained from the techniques were compared using predictive and prediction costs. The cost was computed using a combination of the loss matrix and the confusion matrix. The prediction models designed show that the status of the traffic flow on the freeway can be predicted using the following parameters: travel time, average speed, traffic volume and day of month. The implication of this work is that commuters will be able to spend less time travelling on the route and more time with their families. The logistics industry will save more than twice what they are currently spending.
Keywords: Bagging ensemble methods, confusion matrix, multi-layer perceptron, vehicle traffic flow.
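The abstract does not state how the loss matrix and the confusion matrix are combined. One common reading, assumed here, is the sum of element-wise products: each confusion-matrix count is weighted by the loss assigned to that (actual, predicted) pair. The function name and the numbers in the usage note are hypothetical.

```python
def total_cost(confusion, loss):
    """Sum of count * loss over all (actual, predicted) cells.

    Assumption: both matrices are lists of rows with identical shape,
    indexed as [actual class][predicted class].
    """
    return sum(
        count * penalty
        for counts_row, loss_row in zip(confusion, loss)
        for count, penalty in zip(counts_row, loss_row)
    )
```

For example, with a confusion matrix `[[50, 10], [5, 35]]` and a loss matrix `[[0, 1], [4, 0]]`, the 10 errors of the first kind cost 1 each and the 5 errors of the second kind cost 4 each, so `total_cost` returns 30. Under this measure a model with lower accuracy can still be preferable if its errors are the cheaper ones.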
Downloads 1777
2993 Efficient Tuning Parameter Selection by Cross-Validated Score in High Dimensional Models
Authors: Yoonsuh Jung
Abstract:
As DNA microarray data contain relatively few samples compared to the number of genes, high dimensional models are often employed. In high dimensional models, the selection of the tuning parameter (or penalty parameter) is often one of the crucial parts of the modeling. Cross-validation is one of the most common methods for tuning parameter selection; it selects the parameter value with the smallest cross-validated score. However, selecting a single value as the ‘optimal’ value for the parameter can be very unstable due to sampling variation, since the sample sizes of microarray data are often small. Our approach is to first choose multiple candidates for the tuning parameter, then average the candidates with different weights depending on their performance. The additional step of estimating the weights and averaging the candidates rarely increases the computational cost, while it can considerably improve on traditional cross-validation. Using real and simulated data sets, we show that the values selected by the suggested methods often lead to stable parameter selection as well as improved detection of significant genetic variables compared to traditional cross-validation.
Keywords: Cross Validation, Parameter Averaging, Parameter Selection, Regularization Parameter Search.
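The idea of averaging several candidate tuning parameters with performance-dependent weights can be sketched as below. The abstract does not give the weighting scheme; weights inversely proportional to the cross-validated score are an assumption made purely for illustration, and the function name is hypothetical.

```python
def weighted_parameter(candidates, cv_scores):
    """Average candidate tuning parameters, weighting each by 1 / CV score.

    Assumption: scores are positive and lower is better, so a candidate
    with a smaller cross-validated score receives a larger weight.
    """
    weights = [1.0 / score for score in cv_scores]
    return sum(w * c for w, c in zip(weights, candidates)) / sum(weights)
```

Compared with picking the single minimizer of the cross-validated score, an average of this kind changes smoothly when the scores shift slightly between resamples, which is the stability argument the abstract makes.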
Downloads 1572
2992 Forecasting Fraudulent Financial Statements using Data Mining
Authors: S. Kotsiantis, E. Koumanakos, D. Tzelepis, V. Tampakas
Abstract:
This paper explores the effectiveness of machine learning techniques in detecting firms that issue fraudulent financial statements (FFS) and deals with the identification of factors associated with FFS. To this end, a number of experiments were conducted using representative learning algorithms, which were trained on a data set of 164 fraud and non-fraud Greek firms from the period 2001-2002. Deciding which particular method to choose is a complicated problem. A good alternative to choosing only one method is to create a hybrid forecasting system that incorporates a number of possible solution methods as components (an ensemble of classifiers). For this purpose, we implemented a hybrid decision support system that combines the representative algorithms using a stacking variant methodology and achieves better performance than any of the simple and ensemble methods examined. To sum up, this study indicates that the investigation of financial information can be used in the identification of FFS and underlines the importance of financial ratios.
Keywords: Machine learning, stacking, classifier.
Downloads 3053