Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 15437

Search results for: Bayes method

15437 An Estimating Parameter of the Mean in Normal Distribution by Maximum Likelihood, Bayes, and Markov Chain Monte Carlo Methods

Authors: Autcha Araveeporn

Abstract:

This paper is to compare the parameter estimation of the mean in normal distribution by Maximum Likelihood (ML), Bayes, and Markov Chain Monte Carlo (MCMC) methods. The ML estimator is estimated by the average of data, the Bayes method is considered from the prior distribution to estimate Bayes estimator, and MCMC estimator is approximated by Gibbs sampling from posterior distribution. These methods are also to estimate a parameter then the hypothesis testing is used to check a robustness of the estimators. Data are simulated from normal distribution with the true parameter of mean 2, and variance 4, 9, and 16 when the sample sizes is set as 10, 20, 30, and 50. From the results, it can be seen that the estimation of MLE, and MCMC are perceivably different from the true parameter when the sample size is 10 and 20 with variance 16. Furthermore, the Bayes estimator is estimated from the prior distribution when mean is 1, and variance is 12 which showed the significant difference in mean with variance 9 at the sample size 10 and 20.

Keywords: Bayes method, Markov chain Monte Carlo method, maximum likelihood method, normal distribution

Procedia PDF Downloads 273
15436 Modified Naive Bayes-Based Prediction Modeling for Crop Yield Prediction

Authors: Kefaya Qaddoum

Abstract:

Most of greenhouse growers desire a determined amount of yields in order to accurately meet market requirements. The purpose of this paper is to model a simple but often satisfactory supervised classification method. The original naive Bayes have a serious weakness, which is producing redundant predictors. In this paper, utilized regularization technique was used to obtain a computationally efficient classifier based on naive Bayes. The suggested construction, utilized L1-penalty, is capable of clearing redundant predictors, where a modification of the LARS algorithm is devised to solve this problem, making this method applicable to a wide range of data. In the experimental section, a study conducted to examine the effect of redundant and irrelevant predictors, and test the method on WSG data set for tomato yields, where there are many more predictors than data, and the urge need to predict weekly yield is the goal of this approach. Finally, the modified approach is compared with several naive Bayes variants and other classification algorithms (SVM and kNN), and is shown to be fairly good.

Keywords: tomato yield prediction, naive Bayes, redundancy, WSG

Procedia PDF Downloads 163
15435 Comparing SVM and Naïve Bayes Classifier for Automatic Microaneurysm Detections

Authors: A. Sopharak, B. Uyyanonvara, S. Barman

Abstract:

Diabetic retinopathy is characterized by the development of retinal microaneurysms. The damage can be prevented if disease is treated in its early stages. In this paper, we are comparing Support Vector Machine (SVM) and Naïve Bayes (NB) classifiers for automatic microaneurysm detection in images acquired through non-dilated pupils. The Nearest Neighbor classifier is used as a baseline for comparison. Detected microaneurysms are validated with expert ophthalmologists’ hand-drawn ground-truths. The sensitivity, specificity, precision and accuracy of each method are also compared.

Keywords: diabetic retinopathy, microaneurysm, naive Bayes classifier, SVM classifier

Procedia PDF Downloads 253
15434 Reducing Crash Risk at Intersections with Safety Improvements

Authors: Upal Barua

Abstract:

Crash risk at intersections is a critical safety issue. This paper examines the effectiveness of removing an existing off-set at an intersection by realignment, in reducing crashes. Empirical Bayes method was applied to conduct a before-and-after study to assess the effect of this safety improvement. The Transportation Safety Improvement Program in Austin Transportation Department completed several safety improvement projects at high crash intersections with a view to reducing crashes. One of the common safety improvement techniques applied was the realignment of intersection approaches removing an existing off-set. This paper illustrates how this safety improvement technique is applied at a high crash intersection from inception to completion. This paper also highlights the significant crash reductions achieved from this safety improvement technique applying Empirical Bayes method in a before-and-after study. The result showed that realignment of intersection approaches removing an existing off-set can reduce crashes by 53%. This paper also features the state of the art techniques applied in planning, engineering, designing and construction of this safety improvement, key factors driving the success, and lessons learned in the process.

Keywords: crash risk, intersection, off-set, safety improvement technique, before-and-after study, empirical Bayes method

Procedia PDF Downloads 147
15433 Comparative Analysis of Classification Methods in Determining Non-Active Student Characteristics in Indonesia Open University

Authors: Dewi Juliah Ratnaningsih, Imas Sukaesih Sitanggang

Abstract:

Classification is one of data mining techniques that aims to discover a model from training data that distinguishes records into the appropriate category or class. Data mining classification methods can be applied in education, for example, to determine the classification of non-active students in Indonesia Open University. This paper presents a comparison of three methods of classification: Naïve Bayes, Bagging, and C.45. The criteria used to evaluate the performance of three methods of classification are stratified cross-validation, confusion matrix, the value of the area under the ROC Curve (AUC), Recall, Precision, and F-measure. The data used for this paper are from the non-active Indonesia Open University students in registration period of 2004.1 to 2012.2. Target analysis requires that non-active students were divided into 3 groups: C1, C2, and C3. Data analyzed are as many as 4173 students. Results of the study show: (1) Bagging method gave a high degree of classification accuracy than Naïve Bayes and C.45, (2) the Bagging classification accuracy rate is 82.99 %, while the Naïve Bayes and C.45 are 80.04 % and 82.74 % respectively, (3) the result of Bagging classification tree method has a large number of nodes, so it is quite difficult in decision making, (4) classification of non-active Indonesia Open University student characteristics uses algorithms C.45, (5) based on the algorithm C.45, there are 5 interesting rules which can describe the characteristics of non-active Indonesia Open University students.

Keywords: comparative analysis, data mining, clasiffication, Bagging, Naïve Bayes, C.45, non-active students, Indonesia Open University

Procedia PDF Downloads 239
15432 Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering

Authors: Thanh Nguyen, Andrei Doncescu, Pierre Siegel

Abstract:

Classification is an important data mining technique and could be used as data filtering in artificial intelligence. The broad application of classification for all kind of data leads to be used in nearly every field of our modern life. Classification helps us to put together different items according to the feature items decided as interesting and useful. In this paper, we compare two classification methods Naïve Bayes and ADTree use to detect spam e-mail. This choice is motivated by the fact that Naive Bayes algorithm is based on probability calculus while ADTree algorithm is based on decision tree. The parameter settings of the above classifiers use the maximization of true positive rate and minimization of false positive rate. The experiment results present classification accuracy and cost analysis in view of optimal classifier choice for Spam Detection. It is point out the number of attributes to obtain a tradeoff between number of them and the classification accuracy.

Keywords: classification, data mining, spam filtering, naive bayes, decision tree

Procedia PDF Downloads 322
15431 New Segmentation of Piecewise Linear Regression Models Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

Piecewise linear regression models are very flexible models for modeling the data. If the piecewise linear regression models are matched against the data, then the parameters are generally not known. This paper studies the problem of parameter estimation of piecewise linear regression models. The method used to estimate the parameters of picewise linear regression models is Bayesian method. But the Bayes estimator can not be found analytically. To overcome these problems, the reversible jump MCMC algorithm is proposed. Reversible jump MCMC algorithm generates the Markov chain converges to the limit distribution of the posterior distribution of the parameters of picewise linear regression models. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of picewise linear regression models.

Keywords: regression, piecewise, Bayesian, reversible Jump MCMC

Procedia PDF Downloads 416
15430 Segmentation of Piecewise Polynomial Regression Model by Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

Piecewise polynomial regression model is very flexible model for modeling the data. If the piecewise polynomial regression model is matched against the data, its parameters are not generally known. This paper studies the parameter estimation problem of piecewise polynomial regression model. The method which is used to estimate the parameters of the piecewise polynomial regression model is Bayesian method. Unfortunately, the Bayes estimator cannot be found analytically. Reversible jump MCMC algorithm is proposed to solve this problem. Reversible jump MCMC algorithm generates the Markov chain that converges to the limit distribution of the posterior distribution of piecewise polynomial regression model parameter. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of piecewise polynomial regression model.

Keywords: piecewise regression, bayesian, reversible jump MCMC, segmentation

Procedia PDF Downloads 291
15429 Features Dimensionality Reduction and Multi-Dimensional Voice-Processing Program to Parkinson Disease Discrimination

Authors: Djamila Meghraoui, Bachir Boudraa, Thouraya Meksen, M.Boudraa

Abstract:

Parkinson's disease is a pathology that involves characteristic perturbations in patients’ voices. This paper describes a proposed method that aims to diagnose persons with Parkinson (PWP) by analyzing on line their voices signals. First, Thresholds signals alterations are determined by the Multi-Dimensional Voice Program (MDVP). Principal Analysis (PCA) is exploited to select the main voice principal componentsthat are significantly affected in a patient. The decision phase is realized by a Mul-tinomial Bayes (MNB) Classifier that categorizes an analyzed voice in one of the two resulting classes: healthy or PWP. The prediction accuracy achieved reaching 98.8% is very promising.

Keywords: Parkinson’s disease recognition, PCA, MDVP, multinomial Naive Bayes

Procedia PDF Downloads 193
15428 Safety Effect of Smart Right-Turn Design at Intersections

Authors: Upal Barua

Abstract:

The risk of severe crashes at high-speed right-turns at intersections is a major safety concern these days. The application of a smart right-turn at an intersection is increasing day by day to address is an issue. The design, ‘Smart Right-turn’ consists of a narrow-angle of channelization at approximately 70°. This design increases the cone of vision of the right-tuning drivers towards the crossing pedestrians as well as traffic on the cross-road. As part of the Safety Improvement Program in Austin Transportation Department, several smart right-turns were constructed at high crash intersections where high-speed right-turns were found to be a contributing factor. This paper features the state of the art techniques applied in planning, engineering, designing and construction of this smart right-turn, key factors driving the success, and lessons learned in the process. This paper also presents the significant crash reductions achieved from the application of this smart right-turn design using Empirical Bayes method. The result showed that smart right-turns can reduce overall right-turn crashes by 43% and severe right-turn crashes by 70%.

Keywords: smart right-turn, intersection, cone of vision, empirical Bayes method

Procedia PDF Downloads 157
15427 Bayes Estimation of Parameters of Binomial Type Rayleigh Class Software Reliability Growth Model using Non-informative Priors

Authors: Rajesh Singh, Kailash Kale

Abstract:

In this paper, the Binomial process type occurrence of software failures is considered and failure intensity has been characterized by one parameter Rayleigh class Software Reliability Growth Model (SRGM). The proposed SRGM is mathematical function of parameters namely; total number of failures i.e. η-0 and scale parameter i.e. η-1. It is assumed that very little or no information is available about both these parameters and then considering non-informative priors for both these parameters, the Bayes estimators for the parameters η-0 and η-1 have been obtained under square error loss function. The proposed Bayes estimators are compared with their corresponding maximum likelihood estimators on the basis of risk efficiencies obtained by Monte Carlo simulation technique. It is concluded that both the proposed Bayes estimators of total number of failures and scale parameter perform well for proper choice of execution time.

Keywords: binomial process, non-informative prior, maximum likelihood estimator (MLE), rayleigh class, software reliability growth model (SRGM)

Procedia PDF Downloads 298
15426 Random Access in IoT Using Naïve Bayes Classification

Authors: Alhusein Almahjoub, Dongyu Qiu

Abstract:

This paper deals with the random access procedure in next-generation networks and presents the solution to reduce total service time (TST) which is one of the most important performance metrics in current and future internet of things (IoT) based networks. The proposed solution focuses on the calculation of optimal transmission probability which maximizes the success probability and reduces TST. It uses the information of several idle preambles in every time slot, and based on it, it estimates the number of backlogged IoT devices using Naïve Bayes estimation which is a type of supervised learning in the machine learning domain. The estimation of backlogged devices is necessary since optimal transmission probability depends on it and the eNodeB does not have information about it. The simulations are carried out in MATLAB which verify that the proposed solution gives excellent performance.

Keywords: random access, LTE/LTE-A, 5G, machine learning, Naïve Bayes estimation

Procedia PDF Downloads 66
15425 Machine Learning Predictive Models for Hydroponic Systems: A Case Study Nutrient Film Technique and Deep Flow Technique

Authors: Kritiyaporn Kunsook

Abstract:

Machine learning algorithms (MLAs) such us artificial neural networks (ANNs), decision tree, support vector machines (SVMs), Naïve Bayes, and ensemble classifier by voting are powerful data driven methods that are relatively less widely used in the mapping of technique of system, and thus have not been comparatively evaluated together thoroughly in this field. The performances of a series of MLAs, ANNs, decision tree, SVMs, Naïve Bayes, and ensemble classifier by voting in technique of hydroponic systems prospectively modeling are compared based on the accuracy of each model. Classification of hydroponic systems only covers the test samples from vegetables grown with Nutrient film technique (NFT) and Deep flow technique (DFT). The feature, which are the characteristics of vegetables compose harvesting height width, temperature, require light and color. The results indicate that the classification performance of the ANNs is 98%, decision tree is 98%, SVMs is 97.33%, Naïve Bayes is 96.67%, and ensemble classifier by voting is 98.96% algorithm respectively.

Keywords: artificial neural networks, decision tree, support vector machines, naïve Bayes, ensemble classifier by voting

Procedia PDF Downloads 223
15424 Design and Implementation of Generative Models for Odor Classification Using Electronic Nose

Authors: Kumar Shashvat, Amol P. Bhondekar

Abstract:

In the midst of the five senses, odor is the most reminiscent and least understood. Odor testing has been mysterious and odor data fabled to most practitioners. The delinquent of recognition and classification of odor is important to achieve. The facility to smell and predict whether the artifact is of further use or it has become undesirable for consumption; the imitation of this problem hooked on a model is of consideration. The general industrial standard for this classification is color based anyhow; odor can be improved classifier than color based classification and if incorporated in machine will be awfully constructive. For cataloging of odor for peas, trees and cashews various discriminative approaches have been used Discriminative approaches offer good prognostic performance and have been widely used in many applications but are incapable to make effectual use of the unlabeled information. In such scenarios, generative approaches have better applicability, as they are able to knob glitches, such as in set-ups where variability in the series of possible input vectors is enormous. Generative models are integrated in machine learning for either modeling data directly or as a transitional step to form an indeterminate probability density function. The algorithms or models Linear Discriminant Analysis and Naive Bayes Classifier have been used for classification of the odor of cashews. Linear Discriminant Analysis is a method used in data classification, pattern recognition, and machine learning to discover a linear combination of features that typifies or divides two or more classes of objects or procedures. The Naive Bayes algorithm is a classification approach base on Bayes rule and a set of qualified independence theory. Naive Bayes classifiers are highly scalable, requiring a number of restraints linear in the number of variables (features/predictors) in a learning predicament. The main recompenses of using the generative models are generally a Generative Models make stronger assumptions about the data, specifically, about the distribution of predictors given the response variables. The Electronic instrument which is used for artificial odor sensing and classification is an electronic nose. This device is designed to imitate the anthropological sense of odor by providing an analysis of individual chemicals or chemical mixtures. The experimental results have been evaluated in the form of the performance measures i.e. are accuracy, precision and recall. The investigational results have proven that the overall performance of the Linear Discriminant Analysis was better in assessment to the Naive Bayes Classifier on cashew dataset.

Keywords: odor classification, generative models, naive bayes, linear discriminant analysis

Procedia PDF Downloads 301
15423 Amharic Text News Classification Using Supervised Learning

Authors: Misrak Assefa

Abstract:

The Amharic language is the second most widely spoken Semitic language in the world. There are several new overloaded on the web. Searching some useful documents from the web on a specific topic, which is written in the Amharic language, is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in the domain of information that needs to be launch. This study attempts to design an automatic Amharic news classification using a supervised learning mechanism on four un-touch classes. To achieve this research, 4,182 news articles were used. Naive Bayes (NB) and Decision tree (j48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifier. As a result, it shows those algorithms can be applicable in Amharic news categorization. The best average accuracy result is achieved by j48 decision tree and naïve Bayes is 95.2345 %, and 94.6245 % respectively using three categories. This research indicated that a typical decision tree algorithm is more applicable to Amharic news categorization.

Keywords: text categorization, supervised machine learning, naive Bayes, decision tree

Procedia PDF Downloads 80
15422 A Supervised Approach for Word Sense Disambiguation Based on Arabic Diacritics

Authors: Alaa Alrakaf, Sk. Md. Mizanur Rahman

Abstract:

Since the last two decades’ Arabic natural language processing (ANLP) has become increasingly much more important. One of the key issues related to ANLP is ambiguity. In Arabic language different pronunciation of one word may have a different meaning. Furthermore, ambiguity also has an impact on the effectiveness and efficiency of Machine Translation (MT). The issue of ambiguity has limited the usefulness and accuracy of the translation from Arabic to English. The lack of Arabic resources makes ambiguity problem more complicated. Additionally, the orthographic level of representation cannot specify the exact meaning of the word. This paper looked at the diacritics of Arabic language and used them to disambiguate a word. The proposed approach of word sense disambiguation used Diacritizer application to Diacritize Arabic text then found the most accurate sense of an ambiguous word using Naïve Bayes Classifier. Our Experimental study proves that using Arabic Diacritics with Naïve Bayes Classifier enhances the accuracy of choosing the appropriate sense by 23% and also decreases the ambiguity in machine translation.

Keywords: Arabic natural language processing, machine learning, machine translation, Naive bayes classifier, word sense disambiguation

Procedia PDF Downloads 270
15421 Feature Extraction and Classification Based on the Bayes Test for Minimum Error

Authors: Nasar Aldian Ambark Shashoa

Abstract:

Classification with a dimension reduction based on Bayesian approach is proposed in this paper . The first step is to generate a sample (parameter) of fault-free mode class and faulty mode class. The second, in order to obtain good classification performance, a selection of important features is done with the discrete karhunen-loeve expansion. Next, the Bayes test for minimum error is used to classify the classes. Finally, the results for simulated data demonstrate the capabilities of the proposed procedure.

Keywords: analytical redundancy, fault detection, feature extraction, Bayesian approach

Procedia PDF Downloads 456
15420 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyze data which are used to predict helpful information. It is the field of research which solve various type of problem. In data mining, classification is an important technique to classify different kind of data. Diabetes is most common disease. This paper implements different classification technique using Waikato Environment for Knowledge Analysis (WEKA) on diabetes dataset and find which algorithm is suitable for working. The best classification algorithm based on diabetic data is Naïve Bayes. The accuracy of Naïve Bayes is 76.31% and take 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 60
15419 Bayesian Estimation under Different Loss Functions Using Gamma Prior for the Case of Exponential Distribution

Authors: Md. Rashidul Hasan, Atikur Rahman Baizid

Abstract:

The Bayesian estimation approach is a non-classical estimation technique in statistical inference and is very useful in real world situation. The aim of this paper is to study the Bayes estimators of the parameter of exponential distribution under different loss functions and then compared among them as well as with the classical estimator named maximum likelihood estimator (MLE). In our real life, we always try to minimize the loss and we also want to gather some prior information (distribution) about the problem to solve it accurately. Here the gamma prior is used as the prior distribution of exponential distribution for finding the Bayes estimator. In our study, we also used different symmetric and asymmetric loss functions such as squared error loss function, quadratic loss function, modified linear exponential (MLINEX) loss function and non-linear exponential (NLINEX) loss function. Finally, mean square error (MSE) of the estimators are obtained and then presented graphically.

Keywords: Bayes estimator, maximum likelihood estimator (MLE), modified linear exponential (MLINEX) loss function, Squared Error (SE) loss function, non-linear exponential (NLINEX) loss function

Procedia PDF Downloads 267
15418 Health Status Monitoring of COVID-19 Patient's through Blood Tests and Naïve-Bayes

Authors: Carlos Arias-Alcaide, Cristina Soguero-Ruiz, Paloma Santos-Álvarez, Adrián García-Romero, Inmaculada Mora-Jiménez

Abstract:

Analysing clinical data with computers in such a way that have an impact on the practitioners’ workflow is a challenge nowadays. This paper provides a first approach for monitoring the health status of COVID-19 patients through the use of some biomarkers (blood tests) and the simplest Naïve Bayes classifier. Data of two Spanish hospitals were considered, showing the potential of our approach to estimate reasonable posterior probabilities even some days before the event.

Keywords: Bayesian model, blood biomarkers, classification, health tracing, machine learning, posterior probability

Procedia PDF Downloads 62
15417 Early Gastric Cancer Prediction from Diet and Epidemiological Data Using Machine Learning in Mizoram Population

Authors: Brindha Senthil Kumar, Payel Chakraborty, Senthil Kumar Nachimuthu, Arindam Maitra, Prem Nath

Abstract:

Gastric cancer is predominantly caused by demographic and diet factors as compared to other cancer types. The aim of the study is to predict Early Gastric Cancer (ECG) from diet and lifestyle factors using supervised machine learning algorithms. For this study, 160 healthy individual and 80 cases were selected who had been followed for 3 years (2016-2019), at Civil Hospital, Aizawl, Mizoram. A dataset containing 11 features that are core risk factors for the gastric cancer were extracted. Supervised machine algorithms: Logistic Regression, Naive Bayes, Support Vector Machine (SVM), Multilayer perceptron, and Random Forest were used to analyze the dataset using Python Jupyter Notebook Version 3. The obtained classified results had been evaluated using metrics parameters: minimum_false_positives, brier_score, accuracy, precision, recall, F1_score, and Receiver Operating Characteristics (ROC) curve. Data analysis results showed Naive Bayes - 88, 0.11; Random Forest - 83, 0.16; SVM - 77, 0.22; Logistic Regression - 75, 0.25 and Multilayer perceptron - 72, 0.27 with respect to accuracy and brier_score in percent. Naive Bayes algorithm out performs with very low false positive rates as well as brier_score and good accuracy. Naive Bayes algorithm classification results in predicting ECG showed very satisfactory results using only diet cum lifestyle factors which will be very helpful for the physicians to educate the patients and public, thereby mortality of gastric cancer can be reduced/avoided with this knowledge mining work.

Keywords: Early Gastric cancer, Machine Learning, Diet, Lifestyle Characteristics

Procedia PDF Downloads 56
15416 A Supervised Approach for Detection of Singleton Spam Reviews

Authors: Atefeh Heydari, Mohammadali Tavakoli, Naomie Salim

Abstract:

In recent years, we have witnessed that online reviews are the most important source of customers’ opinion. They are progressively more used by individuals and organisations to make purchase and business decisions. Unfortunately, for the reason of profit or fame, frauds produce deceptive reviews to hoodwink potential customers. Their activities mislead not only potential customers to make appropriate purchasing decisions and organisations to reshape their business, but also opinion mining techniques by preventing them from reaching accurate results. Spam reviews could be divided into two main groups, i.e. multiple and singleton spam reviews. Detecting a singleton spam review that is the only review written by a user ID is extremely challenging due to lack of clue for detection purposes. Singleton spam reviews are very harmful and various features and proofs used in multiple spam reviews detection are not applicable in this case. Current research aims to propose a novel supervised technique to detect singleton spam reviews. To achieve this, various features are proposed in this study and are to be combined with the most appropriate features extracted from literature and employed in a classifier. In order to compare the performance of different classifiers, SVM and naive Bayes classification algorithms were used for model building. The results revealed that SVM was more accurate than naive Bayes and our proposed technique is capable to detect singleton spam reviews effectively.

Keywords: classification algorithms, Naïve Bayes, opinion review spam detection, singleton review spam detection, support vector machine

Procedia PDF Downloads 214
15415 Detection and Classification of Myocardial Infarction Using New Extracted Features from Standard 12-Lead ECG Signals

Authors: Naser Safdarian, Nader Jafarnia Dabanloo

Abstract:

In this paper we used four features i.e. Q-wave integral, QRS complex integral, T-wave integral and total integral as extracted feature from normal and patient ECG signals to detection and localization of myocardial infarction (MI) in left ventricle of heart. In our research we focused on detection and localization of MI in standard ECG. We use the Q-wave integral and T-wave integral because this feature is important impression in detection of MI. We used some pattern recognition method such as Artificial Neural Network (ANN) to detect and localize the MI. Because these methods have good accuracy for classification of normal and abnormal signals. We used one type of Radial Basis Function (RBF) that called Probabilistic Neural Network (PNN) because of its nonlinearity property, and used other classifier such as k-Nearest Neighbors (KNN), Multilayer Perceptron (MLP) and Naive Bayes Classification. We used PhysioNet database as our training and test data. We reached over 80% for accuracy in test data for localization and over 95% for detection of MI. Main advantages of our method are simplicity and its good accuracy. Also we can improve accuracy of classification by adding more features in this method. A simple method based on using only four features which extracted from standard ECG is presented which has good accuracy in MI localization.

Keywords: ECG signal processing, myocardial infarction, features extraction, pattern recognition

Procedia PDF Downloads 342
15414 An Integrated Lightweight Naïve Bayes Based Webpage Classification Service for Smartphone Browsers

Authors: Mayank Gupta, Siba Prasad Samal, Vasu Kakkirala

Abstract:

The internet world and its priorities have changed considerably in the last decade. Browsing on smart phones has increased manifold and is set to explode much more. Users spent considerable time browsing different websites, that gives a great deal of insight into user’s preferences. Instead of plain information classifying different aspects of browsing like Bookmarks, History, and Download Manager into useful categories would improve and enhance the user’s experience. Most of the classification solutions are server side that involves maintaining server and other heavy resources. It has security constraints and maybe misses on contextual data during classification. On device, classification solves many such problems, but the challenge is to achieve accuracy on classification with resource constraints. This on device classification can be much more useful in personalization, reducing dependency on cloud connectivity and better privacy/security. This approach provides more relevant results as compared to current standalone solutions because it uses content rendered by browser which is customized by the content provider based on user’s profile. This paper proposes a Naive Bayes based lightweight classification engine targeted for a resource constraint devices. Our solution integrates with Web Browser that in turn triggers classification algorithm. Whenever a user browses a webpage, this solution extracts DOM Tree data from the browser’s rendering engine. This DOM data is a dynamic, contextual and secure data that can’t be replicated. This proposal extracts different features of the webpage that runs on an algorithm to classify into multiple categories. Naive Bayes based engine is chosen in this solution for its inherent advantages in using limited resources compared to other classification algorithms like Support Vector Machine, Neural Networks, etc. Naive Bayes classification requires small memory footprint and less computation suitable for smartphone environment. This solution has a feature to partition the model into multiple chunks that in turn will facilitate less usage of memory instead of loading a complete model. Classification of the webpages done through integrated engine is faster, more relevant and energy efficient than other standalone on device solution. This classification engine has been tested on Samsung Z3 Tizen hardware. The Engine is integrated into Tizen Browser that uses Chromium Rendering Engine. For this solution, extensive dataset is sourced from dmoztools.net and cleaned. This cleaned dataset has 227.5K webpages which are divided into 8 generic categories ('education', 'games', 'health', 'entertainment', 'news', 'shopping', 'sports', 'travel'). Our browser integrated solution has resulted in 15% less memory usage (due to partition method) and 24% less power consumption in comparison with standalone solution. This solution considered 70% of the dataset for training the data model and the rest 30% dataset for testing. An average accuracy of ~96.3% is achieved across the above mentioned 8 categories. This engine can be further extended for suggesting Dynamic tags and using the classification for differential uses cases to enhance browsing experience.

Keywords: chromium, lightweight engine, mobile computing, Naive Bayes, Tizen, web browser, webpage classification

Procedia PDF Downloads 83
15413 Estimation of Stress-Strength Parameter for Burr Type XII Distribution Based on Progressive Type-II Censoring

Authors: A. M. Abd-Elfattah, M. H. Abu-Moussa

Abstract:

In this paper, the estimation of stress-strength parameter R = P(Y < X) is considered when X; Y the strength and stress respectively are two independent random variables of Burr Type XII distribution. The samples taken for X and Y are progressively censoring of type II. The maximum likelihood estimator (MLE) of R is obtained when the common parameter is unknown. But when the common parameter is known the MLE, uniformly minimum variance unbiased estimator (UMVUE) and the Bayes estimator of R = P(Y < X) are obtained. The exact con dence interval of R based on MLE is obtained. The performance of the proposed estimators is compared using the computer simulation.

Keywords: Burr Type XII distribution, progressive type-II censoring, stress-strength model, unbiased estimator, maximum-likelihood estimator, uniformly minimum variance unbiased estimator, confidence intervals, Bayes estimator

Procedia PDF Downloads 366
15412 A Study of Permission-Based Malware Detection Using Machine Learning

Authors: Ratun Rahman, Rafid Islam, Akin Ahmed, Kamrul Hasan, Hasan Mahmud

Abstract:

Malware is becoming more prevalent, and several threat categories have risen dramatically in recent years. This paper provides a bird's-eye view of the world of malware analysis. The efficiency of five different machine learning methods (Naive Bayes, K-Nearest Neighbor, Decision Tree, Random Forest, and TensorFlow Decision Forest) combined with features picked from the retrieval of Android permissions to categorize applications as harmful or benign is investigated in this study. The test set consists of 1,168 samples (among these android applications, 602 are malware and 566 are benign applications), each consisting of 948 features (permissions). Using the permission-based dataset, the machine learning algorithms then produce accuracy rates above 80%, except the Naive Bayes Algorithm with 65% accuracy. Of the considered algorithms TensorFlow Decision Forest performed the best with an accuracy of 90%.

Keywords: android malware detection, machine learning, malware, malware analysis

Procedia PDF Downloads 12
15411 Incorporating Information Gain in Regular Expressions Based Classifiers

Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler

Abstract:

A regular expression consists of sequence characters which allow describing a text path. Usually, in clinical research, regular expressions are manually created by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm named REX was designed, and then, implemented as a simplified method to create regexes to classify Spanish text automatically. In order to classify ambiguous cases, such as, when multiple labels are assigned to a testing example, REX includes an information gain method Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statically better regarding accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.

Keywords: information gain, regular expressions, smith-waterman algorithm, text classification

Procedia PDF Downloads 206
15410 Sentiment Analysis of Ensemble-Based Classifiers for E-Mail Data

Authors: Muthukumarasamy Govindarajan

Abstract:

Detection of unwanted, unsolicited mails called spam from email is an interesting area of research. It is necessary to evaluate the performance of any new spam classifier using standard data sets. Recently, ensemble-based classifiers have gained popularity in this domain. In this research work, an efficient email filtering approach based on ensemble methods is addressed for developing an accurate and sensitive spam classifier. The proposed approach employs Naive Bayes (NB), Support Vector Machine (SVM) and Genetic Algorithm (GA) as base classifiers along with different ensemble methods. The experimental results show that the ensemble classifier was performing with accuracy greater than individual classifiers, and also hybrid model results are found to be better than the combined models for the e-mail dataset. The proposed ensemble-based classifiers turn out to be good in terms of classification accuracy, which is considered to be an important criterion for building a robust spam classifier.

Keywords: accuracy, arcing, bagging, genetic algorithm, Naive Bayes, sentiment mining, support vector machine

Procedia PDF Downloads 57
15409 Statistical Analysis with Prediction Models of User Satisfaction in Software Project Factors

Authors: Katawut Kaewbanjong

Abstract:

We analyzed a volume of data and found significant user satisfaction in software project factors. A statistical significance analysis (logistic regression) and collinearity analysis determined the significance factors from a group of 71 pre-defined factors from 191 software projects in ISBSG Release 12. The eight prediction models used for testing the prediction potential of these factors were Neural network, k-NN, Naïve Bayes, Random forest, Decision tree, Gradient boosted tree, linear regression and logistic regression prediction model. Fifteen pre-defined factors were truly significant in predicting user satisfaction, and they provided 82.71% prediction accuracy when used with a neural network prediction model. These factors were client-server, personnel changes, total defects delivered, project inactive time, industry sector, application type, development type, how methodology was acquired, development techniques, decision making process, intended market, size estimate approach, size estimate method, cost recording method, and effort estimate method. These findings may benefit software development managers considerably.

Keywords: prediction model, statistical analysis, software project, user satisfaction factor

Procedia PDF Downloads 48
15408 Applied Complement of Probability and Information Entropy for Prediction in Student Learning

Authors: Kennedy Efosa Ehimwenma, Sujatha Krishnamoorthy, Safiya Al‑Sharji

Abstract:

The probability computation of events is in the interval of [0, 1], which are values that are determined by the number of outcomes of events in a sample space S. The probability Pr(A) that an event A will never occur is 0. The probability Pr(B) that event B will certainly occur is 1. This makes both events A and B a certainty. Furthermore, the sum of probabilities Pr(E₁) + Pr(E₂) + … + Pr(Eₙ) of a finite set of events in a given sample space S equals 1. Conversely, the difference of the sum of two probabilities that will certainly occur is 0. This paper first discusses Bayes, the complement of probability, and the difference of probability for occurrences of learning-events before applying them in the prediction of learning objects in student learning. Given the sum of 1; to make a recommendation for student learning, this paper proposes that the difference of argMaxPr(S) and the probability of student-performance quantifies the weight of learning objects for students. Using a dataset of skill-set, the computational procedure demonstrates i) the probability of skill-set events that have occurred that would lead to higher-level learning; ii) the probability of the events that have not occurred that requires subject-matter relearning; iii) accuracy of the decision tree in the prediction of student performance into class labels and iv) information entropy about skill-set data and its implication on student cognitive performance and recommendation of learning.

Keywords: complement of probability, Bayes’ rule, prediction, pre-assessments, computational education, information theory

Procedia PDF Downloads 64