Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 219

Search results for: Machine learning

219 Leveraging xAPI in a Corporate e-Learning Environment to Facilitate the Tracking, Modelling, and Predictive Analysis of Learner Behaviour

Authors: Libor Zachoval, Daire O Broin, Oisin Cawley

Abstract:

E-learning platforms, such as Blackboard have two major shortcomings: limited data capture as a result of the limitations of SCORM (Shareable Content Object Reference Model), and lack of incorporation of Artificial Intelligence (AI) and machine learning algorithms which could lead to better course adaptations. With the recent development of Experience Application Programming Interface (xAPI), a large amount of additional types of data can be captured and that opens a window of possibilities from which online education can benefit. In a corporate setting, where companies invest billions on the learning and development of their employees, some learner behaviours can be troublesome for they can hinder the knowledge development of a learner. Behaviours that hinder the knowledge development also raise ambiguity about learner’s knowledge mastery, specifically those related to gaming the system. Furthermore, a company receives little benefit from their investment if employees are passing courses without possessing the required knowledge and potential compliance risks may arise. Using xAPI and rules derived from a state-of-the-art review, we identified three learner behaviours, primarily related to guessing, in a corporate compliance course. The identified behaviours are: trying each option for a question, specifically for multiple-choice questions; selecting a single option for all the questions on the test; and continuously repeating tests upon failing as opposed to going over the learning material. These behaviours were detected on learners who repeated the test at least 4 times before passing the course. These findings suggest that gauging the mastery of a learner from multiple-choice questions test scores alone is a naive approach. Thus, next steps will consider the incorporation of additional data points, knowledge estimation models to model knowledge mastery of a learner more accurately, and analysis of the data for correlations between knowledge development and identified learner behaviours. Additional work could explore how learner behaviours could be utilised to make changes to a course. For example, course content may require modifications (certain sections of learning material may be shown to not be helpful to many learners to master the learning outcomes aimed at) or course design (such as the type and duration of feedback).

Keywords: Compliance Course, Corporate Training, Learner Behaviours, xAPI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 35
218 Automatic Adjustment of Thresholds via Closed-Loop Feedback Mechanism for Solder Paste Inspection

Authors: Chia-Chen Wei, Pack Hsieh, Jeffrey Chen

Abstract:

Surface Mount Technology (SMT) is widely used in the area of the electronic assembly in which the electronic components are mounted to the surface of the printed circuit board (PCB). Most of the defects in the SMT process are mainly related to the quality of solder paste printing. These defects lead to considerable manufacturing costs in the electronics assembly industry. Therefore, the solder paste inspection (SPI) machine for controlling and monitoring the amount of solder paste printing has become an important part of the production process. So far, the setting of the SPI threshold is based on statistical analysis and experts’ experiences to determine the appropriate threshold settings. Because the production data are not normal distribution and there are various variations in the production processes, defects related to solder paste printing still occur. In order to solve this problem, this paper proposes an online machine learning algorithm, called the automatic threshold adjustment (ATA) algorithm, and closed-loop architecture in the SMT process to determine the best threshold settings. Simulation experiments prove that our proposed threshold settings improve the accuracy from 99.85% to 100%.

Keywords: Big data analytics, Industry 4.0, SPI threshold setting, surface mount technology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28
217 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The objective of this paper is to firstly determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as, sensitivity (recall), specificity and overall accuracy for all models. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month the return for each stock portfolio was computed and compared with the stock market index returns. The returns for all three stock portfolios was 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio while the stock market performance was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.

Keywords: Machine learning, stock market trading, logistic principal component analysis, automated stock investment system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 67
216 Fast Adjustable Threshold for Uniform Neural Network Quantization

Authors: Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev

Abstract:

The neural network quantization is highly desired procedure to perform before running neural networks on mobile devices. Quantization without fine-tuning leads to accuracy drop of the model, whereas commonly used training with quantization is done on the full set of the labeled data and therefore is both time- and resource-consuming. Real life applications require simplification and acceleration of quantization procedure that will maintain accuracy of full-precision neural network, especially for modern mobile neural network architectures like Mobilenet-v1, MobileNet-v2 and MNAS. Here we present a method to significantly optimize training with quantization procedure by introducing the trained scale factors for discretization thresholds that are separate for each filter. Using the proposed technique, we quantize the modern mobile architectures of neural networks with the set of train data of only ∼ 10% of the total ImageNet 2012 sample. Such reduction of train dataset size and small number of trainable parameters allow to fine-tune the network for several hours while maintaining the high accuracy of quantized model (accuracy drop was less than 0.5%). Ready-for-use models and code are available in the GitHub repository.

Keywords: Distillation, machine learning, neural networks, quantization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 74
215 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients

Authors: Karina Zaccari, Ernesto Cordeiro Marujo

Abstract:

This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of a blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including: Adaptative Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy for training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.

Keywords: Machine learning, medical diagnosis, meningitis detection, gradient boosting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 97
214 Performance Evaluation of Distributed Deep Learning Frameworks in Cloud Environment

Authors: Shuen-Tai Wang, Fang-An Kuo, Chau-Yi Chou, Yu-Bin Fang

Abstract:

2016 has become the year of the Artificial Intelligence explosion. AI technologies are getting more and more matured that most world well-known tech giants are making large investment to increase the capabilities in AI. Machine learning is the science of getting computers to act without being explicitly programmed, and deep learning is a subset of machine learning that uses deep neural network to train a machine to learn  features directly from data. Deep learning realizes many machine learning applications which expand the field of AI. At the present time, deep learning frameworks have been widely deployed on servers for deep learning applications in both academia and industry. In training deep neural networks, there are many standard processes or algorithms, but the performance of different frameworks might be different. In this paper we evaluate the running performance of two state-of-the-art distributed deep learning frameworks that are running training calculation in parallel over multi GPU and multi nodes in our cloud environment. We evaluate the training performance of the frameworks with ResNet-50 convolutional neural network, and we analyze what factors that result in the performance among both distributed frameworks as well. Through the experimental analysis, we identify the overheads which could be further optimized. The main contribution is that the evaluation results provide further optimization directions in both performance tuning and algorithmic design.

Keywords: Artificial Intelligence, machine learning, deep learning, convolutional neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 193
213 Using Textual Pre-Processing and Text Mining to Create Semantic Links

Authors: Ricardo Avila, Gabriel Lopes, Vania Vidal, Jose Macedo

Abstract:

This article offers a approach to the automatic discovery of semantic concepts and links in the domain of Oil Exploration and Production (E&P). Machine learning methods combined with textual pre-processing techniques were used to detect local patterns in texts and, thus, generate new concepts and new semantic links. Even using more specific vocabularies within the oil domain, our approach has achieved satisfactory results, suggesting that the proposal can be applied in other domains and languages, requiring only minor adjustments.

Keywords: Semantic links, data mining, linked data, SKOS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 215
212 Classification Based on Deep Neural Cellular Automata Model

Authors: Yasser F. Hassan

Abstract:

Deep learning structure is a branch of machine learning science and greet achievement in research and applications. Cellular neural networks are regarded as array of nonlinear analog processors called cells connected in a way allowing parallel computations. The paper discusses how to use deep learning structure for representing neural cellular automata model. The proposed learning technique in cellular automata model will be examined from structure of deep learning. A deep automata neural cellular system modifies each neuron based on the behavior of the individual and its decision as a result of multi-level deep structure learning. The paper will present the architecture of the model and the results of simulation of approach are given. Results from the implementation enrich deep neural cellular automata system and shed a light on concept formulation of the model and the learning in it.

Keywords: Cellular automata, neural cellular automata, deep learning, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 119
211 EEG-Based Screening Tool for School Student’s Brain Disorders Using Machine Learning Algorithms

Authors: Abdelrahman A. Ramzy, Bassel S. Abdallah, Mohamed E. Bahgat, Sarah M. Abdelkader, Sherif H. ElGohary

Abstract:

Attention-Deficit/Hyperactivity Disorder (ADHD), epilepsy, and autism affect millions of children worldwide, many of which are undiagnosed despite the fact that all of these disorders are detectable in early childhood. Late diagnosis can cause severe problems due to the late treatment and to the misconceptions and lack of awareness as a whole towards these disorders. Moreover, electroencephalography (EEG) has played a vital role in the assessment of neural function in children. Therefore, quantitative EEG measurement will be utilized as a tool for use in the evaluation of patients who may have ADHD, epilepsy, and autism. We propose a screening tool that uses EEG signals and machine learning algorithms to detect these disorders at an early age in an automated manner. The proposed classifiers used with epilepsy as a step taken for the work done so far, provided an accuracy of approximately 97% using SVM, Naïve Bayes and Decision tree, while 98% using KNN, which gives hope for the work yet to be conducted.

Keywords: ADHD, autism, epilepsy, EEG, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 275
210 Classification of Health Risk Factors to Predict the Risk of Falling in Older Adults

Authors: L. Lindsay, S. A. Coleman, D. Kerr, B. J. Taylor, A. Moorhead

Abstract:

Cognitive decline and frailty is apparent in older adults leading to an increased likelihood of the risk of falling. Currently health care professionals have to make professional decisions regarding such risks, and hence make difficult decisions regarding the future welfare of the ageing population. This study uses health data from The Irish Longitudinal Study on Ageing (TILDA), focusing on adults over the age of 50 years, in order to analyse health risk factors and predict the likelihood of falls. This prediction is based on the use of machine learning algorithms whereby health risk factors are used as inputs to predict the likelihood of falling. Initial results show that health risk factors such as long-term health issues contribute to the number of falls. The identification of such health risk factors has the potential to inform health and social care professionals, older people and their family members in order to mitigate daily living risks.

Keywords: Classification, falls, health risk factors, machine learning, older adults.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 173
209 Deep Learning Based Fall Detection Using Simplified Human Posture

Authors: Kripesh Adhikari, Hamid Bouchachia, Hammadi Nait-Charif

Abstract:

Falls are one of the major causes of injury and death among elderly people aged 65 and above. A support system to identify such kind of abnormal activities have become extremely important with the increase in ageing population. Pose estimation is a challenging task and to add more to this, it is even more challenging when pose estimations are performed on challenging poses that may occur during fall. Location of the body provides a clue where the person is at the time of fall. This paper presents a vision-based tracking strategy where available joints are grouped into three different feature points depending upon the section they are located in the body. The three feature points derived from different joints combinations represents the upper region or head region, mid-region or torso and lower region or leg region. Tracking is always challenging when a motion is involved. Hence the idea is to locate the regions in the body in every frame and consider it as the tracking strategy. Grouping these joints can be beneficial to achieve a stable region for tracking. The location of the body parts provides a crucial information to distinguish normal activities from falls.

Keywords: Fall detection, machine learning, deep learning, pose estimation, tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 392
208 Predictive Semi-Empirical NOx Model for Diesel Engine

Authors: Saurabh Sharma, Yong Sun, Bruce Vernham

Abstract:

Accurate prediction of NOx emission is a continuous challenge in the field of diesel engine-out emission modeling. Performing experiments for each conditions and scenario cost significant amount of money and man hours, therefore model-based development strategy has been implemented in order to solve that issue. NOx formation is highly dependent on the burn gas temperature and the O2 concentration inside the cylinder. The current empirical models are developed by calibrating the parameters representing the engine operating conditions with respect to the measured NOx. This makes the prediction of purely empirical models limited to the region where it has been calibrated. An alternative solution to that is presented in this paper, which focus on the utilization of in-cylinder combustion parameters to form a predictive semi-empirical NOx model. The result of this work is shown by developing a fast and predictive NOx model by using the physical parameters and empirical correlation. The model is developed based on the steady state data collected at entire operating region of the engine and the predictive combustion model, which is developed in Gamma Technology (GT)-Power by using Direct Injected (DI)-Pulse combustion object. In this approach, temperature in both burned and unburnt zone is considered during the combustion period i.e. from Intake Valve Closing (IVC) to Exhaust Valve Opening (EVO). Also, the oxygen concentration consumed in burnt zone and trapped fuel mass is also considered while developing the reported model.  Several statistical methods are used to construct the model, including individual machine learning methods and ensemble machine learning methods. A detailed validation of the model on multiple diesel engines is reported in this work. Substantial numbers of cases are tested for different engine configurations over a large span of speed and load points. Different sweeps of operating conditions such as Exhaust Gas Recirculation (EGR), injection timing and Variable Valve Timing (VVT) are also considered for the validation. Model shows a very good predictability and robustness at both sea level and altitude condition with different ambient conditions. The various advantages such as high accuracy and robustness at different operating conditions, low computational time and lower number of data points requires for the calibration establishes the platform where the model-based approach can be used for the engine calibration and development process. Moreover, the focus of this work is towards establishing a framework for the future model development for other various targets such as soot, Combustion Noise Level (CNL), NO2/NOx ratio etc.

Keywords: Diesel engine, machine learning, NOx emission, semi-empirical.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 181
207 Consumer Load Profile Determination with Entropy-Based K-Means Algorithm

Authors: Ioannis P. Panapakidis, Marios N. Moschakis

Abstract:

With the continuous increment of smart meter installations across the globe, the need for processing of the load data is evident. Clustering-based load profiling is built upon the utilization of unsupervised machine learning tools for the purpose of formulating the typical load curves or load profiles. The most commonly used algorithm in the load profiling literature is the K-means. While the algorithm has been successfully tested in a variety of applications, its drawback is the strong dependence in the initialization phase. This paper proposes a novel modified form of the K-means that addresses the aforementioned problem. Simulation results indicate the superiority of the proposed algorithm compared to the K-means.

Keywords: Clustering, load profiling, load modeling, machine learning, energy efficiency and quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 374
206 Comparison of Machine Learning Models for the Prediction of System Marginal Price of Greek Energy Market

Authors: Ioannis P. Panapakidis, Marios N. Moschakis

Abstract:

The Greek Energy Market is structured as a mandatory pool where the producers make their bid offers in day-ahead basis. The System Operator solves an optimization routine aiming at the minimization of the cost of produced electricity. The solution of the optimization problem leads to the calculation of the System Marginal Price (SMP). Accurate forecasts of the SMP can lead to increased profits and more efficient portfolio management from the producer`s perspective. Aim of this study is to provide a comparative analysis of various machine learning models such as artificial neural networks and neuro-fuzzy models for the prediction of the SMP of the Greek market. Machine learning algorithms are favored in predictions problems since they can capture and simulate the volatilities of complex time series.

Keywords: Deregulated energy market, forecasting, machine learning, system marginal price, energy efficiency and quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 326
205 Prediction Modeling of Alzheimer’s Disease and Its Prodromal Stages from Multimodal Data with Missing Values

Authors: M. Aghili, S. Tabarestani, C. Freytes, M. Shojaie, M. Cabrerizo, A. Barreto, N. Rishe, R. E. Curiel, D. Loewenstein, R. Duara, M. Adjouadi

Abstract:

A major challenge in medical studies, especially those that are longitudinal, is the problem of missing measurements which hinders the effective application of many machine learning algorithms. Furthermore, recent Alzheimer's Disease studies have focused on the delineation of Early Mild Cognitive Impairment (EMCI) and Late Mild Cognitive Impairment (LMCI) from cognitively normal controls (CN) which is essential for developing effective and early treatment methods. To address the aforementioned challenges, this paper explores the potential of using the eXtreme Gradient Boosting (XGBoost) algorithm in handling missing values in multiclass classification. We seek a generalized classification scheme where all prodromal stages of the disease are considered simultaneously in the classification and decision-making processes. Given the large number of subjects (1631) included in this study and in the presence of almost 28% missing values, we investigated the performance of XGBoost on the classification of the four classes of AD, NC, EMCI, and LMCI. Using 10-fold cross validation technique, XGBoost is shown to outperform other state-of-the-art classification algorithms by 3% in terms of accuracy and F-score. Our model achieved an accuracy of 80.52%, a precision of 80.62% and recall of 80.51%, supporting the more natural and promising multiclass classification.

Keywords: eXtreme Gradient Boosting, missing data, Alzheimer disease, early mild cognitive impairment, late mild cognitive impairment, multiclass classification, ADNI, support vector machine, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 311
204 Comparison between XGBoost, LightGBM and CatBoost Using a Home Credit Dataset

Authors: Essam Al Daoud

Abstract:

Gradient boosting methods have been proven to be a very important strategy. Many successful machine learning solutions were developed using the XGBoost and its derivatives. The aim of this study is to investigate and compare the efficiency of three gradient methods. Home credit dataset is used in this work which contains 219 features and 356251 records. However, new features are generated and several techniques are used to rank and select the best features. The implementation indicates that the LightGBM is faster and more accurate than CatBoost and XGBoost using variant number of features and records.

Keywords: Gradient boosting, XGBoost, LightGBM, CatBoost, home credit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 685
203 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.

Keywords: Machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 345
202 Dynamic Measurement System Modeling with Machine Learning Algorithms

Authors: Changqiao Wu, Guoqing Ding, Xin Chen

Abstract:

In this paper, ways of modeling dynamic measurement systems are discussed. Specially, for linear system with single-input single-output, it could be modeled with shallow neural network. Then, gradient based optimization algorithms are used for searching the proper coefficients. Besides, method with normal equation and second order gradient descent are proposed to accelerate the modeling process, and ways of better gradient estimation are discussed. It shows that the mathematical essence of the learning objective is maximum likelihood with noises under Gaussian distribution. For conventional gradient descent, the mini-batch learning and gradient with momentum contribute to faster convergence and enhance model ability. Lastly, experimental results proved the effectiveness of second order gradient descent algorithm, and indicated that optimization with normal equation was the most suitable for linear dynamic models.

Keywords: Dynamic system modeling, neural network, normal equation, second order gradient descent.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 274
201 An Efficient Motion Recognition System Based on LMA Technique and a Discrete Hidden Markov Model

Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier

Abstract:

Human motion recognition has been extensively increased in recent years due to its importance in a wide range of applications, such as human-computer interaction, intelligent surveillance, augmented reality, content-based video compression and retrieval, etc. However, it is still regarded as a challenging task especially in realistic scenarios. It can be seen as a general machine learning problem which requires an effective human motion representation and an efficient learning method. In this work, we introduce a descriptor based on Laban Movement Analysis technique, a formal and universal language for human movement, to capture both quantitative and qualitative aspects of movement. We use Discrete Hidden Markov Model (DHMM) for training and classification motions. We improve the classification algorithm by proposing two DHMMs for each motion class to process the motion sequence in two different directions, forward and backward. Such modification allows avoiding the misclassification that can happen when recognizing similar motions. Two experiments are conducted. In the first one, we evaluate our method on a public dataset, the Microsoft Research Cambridge-12 Kinect gesture data set (MSRC-12) which is a widely used dataset for evaluating action/gesture recognition methods. In the second experiment, we build a dataset composed of 10 gestures(Introduce yourself, waving, Dance, move, turn left, turn right, stop, sit down, increase velocity, decrease velocity) performed by 20 persons. The evaluation of the system includes testing the efficiency of our descriptor vector based on LMA with basic DHMM method and comparing the recognition results of the modified DHMM with the original one. Experiment results demonstrate that our method outperforms most of existing methods that used the MSRC-12 dataset, and a near perfect classification rate in our dataset.

Keywords: Human Motion Recognition, Motion representation, Laban Movement Analysis, Discrete Hidden Markov Model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 229
200 Automated Heart Sound Classification from Unsegmented Phonocardiogram Signals Using Time Frequency Features

Authors: Nadia Masood Khan, Muhammad Salman Khan, Gul Muhammad Khan

Abstract:

Cardiologists perform cardiac auscultation to detect abnormalities in heart sounds. Since accurate auscultation is a crucial first step in screening patients with heart diseases, there is a need to develop computer-aided detection/diagnosis (CAD) systems to assist cardiologists in interpreting heart sounds and provide second opinions. In this paper different algorithms are implemented for automated heart sound classification using unsegmented phonocardiogram (PCG) signals. Support vector machine (SVM), artificial neural network (ANN) and cartesian genetic programming evolved artificial neural network (CGPANN) without the application of any segmentation algorithm has been explored in this study. The signals are first pre-processed to remove any unwanted frequencies. Both time and frequency domain features are then extracted for training the different models. The different algorithms are tested in multiple scenarios and their strengths and weaknesses are discussed. Results indicate that SVM outperforms the rest with an accuracy of 73.64%.

Keywords: Pattern recognition, machine learning, computer aided diagnosis, heart sound classification, and feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 454
199 User-Based Cannibalization Mitigation in an Online Marketplace

Authors: Vivian Guo, Yan Qu

Abstract:

Online marketplaces are not only digital places where consumers buy and sell merchandise, and they are also destinations for brands to connect with real consumers at the moment when customers are in the shopping mindset. For many marketplaces, brands have been important partners through advertising. There can be, however, a risk of advertising impacting a consumer’s shopping journey if it hurts the use experience or takes the user away from the site. Both could lead to the loss of transaction revenue for the marketplace. In this paper, we present user-based methods for cannibalization control by selectively turning off ads to users who are likely to be cannibalized by ads subject to business objectives. We present ways of measuring cannibalization of advertising in the context of an online marketplace and propose novel ways of measuring cannibalization through purchase propensity and uplift modeling. A/B testing has shown that our methods can significantly improve user purchase and engagement metrics while operating within business objectives. To our knowledge, this is the first paper that addresses cannibalization mitigation at the user-level in the context of advertising.

Keywords: Cannibalization, machine learning, online marketplace, revenue optimization, yield optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 365
198 Machine Learning Methods for Network Intrusion Detection

Authors: Mouhammad Alkasassbeh, Mohammad Almseidin

Abstract:

Network security engineers work to keep services available all the time by handling intruder attacks. Intrusion Detection System (IDS) is one of the obtainable mechanisms that is used to sense and classify any abnormal actions. Therefore, the IDS must be always up to date with the latest intruder attacks signatures to preserve confidentiality, integrity, and availability of the services. The speed of the IDS is a very important issue as well learning the new attacks. This research work illustrates how the Knowledge Discovery and Data Mining (or Knowledge Discovery in Databases) KDD dataset is very handy for testing and evaluating different Machine Learning Techniques. It mainly focuses on the KDD preprocess part in order to prepare a decent and fair experimental data set. The J48, MLP, and Bayes Network classifiers have been chosen for this study. It has been proven that the J48 classifier has achieved the highest accuracy rate for detecting and classifying all KDD dataset attacks, which are of type DOS, R2L, U2R, and PROBE.

Keywords: IDS, DDoS, MLP, KDD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 260
197 Lexical Based Method for Opinion Detection on Tripadvisor Collection

Authors: Faiza Belbachir, Thibault Schienhinski

Abstract:

The massive development of online social networks allows users to post and share their opinions on various topics. With this huge volume of opinion, it is interesting to extract and interpret these information for different domains, e.g., product and service benchmarking, politic, system of recommendation. This is why opinion detection is one of the most important research tasks. It consists on differentiating between opinion data and factual data. The difficulty of this task is to determine an approach which returns opinionated document. Generally, there are two approaches used for opinion detection i.e. Lexical based approaches and Machine Learning based approaches. In Lexical based approaches, a dictionary of sentimental words is used, words are associated with weights. The opinion score of document is derived by the occurrence of words from this dictionary. In Machine learning approaches, usually a classifier is trained using a set of annotated document containing sentiment, and features such as n-grams of words, part-of-speech tags, and logical forms. Majority of these works are based on documents text to determine opinion score but dont take into account if these texts are really correct. Thus, it is interesting to exploit other information to improve opinion detection. In our work, we will develop a new way to consider the opinion score. We introduce the notion of trust score. We determine opinionated documents but also if these opinions are really trustable information in relation with topics. For that we use lexical SentiWordNet to calculate opinion and trust scores, we compute different features about users like (numbers of their comments, numbers of their useful comments, Average useful review). After that, we combine opinion score and trust score to obtain a final score. We applied our method to detect trust opinions in TRIPADVISOR collection. Our experimental results report that the combination between opinion score and trust score improves opinion detection.

Keywords: Tripadvisor, Opinion detection, SentiWordNet, trust score.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 261
196 A Neuron Model of Facial Recognition and Detection of an Authorized Entity Using Machine Learning System

Authors: J. K. Adedeji, M. O. Oyekanmi

Abstract:

This paper has critically examined the use of Machine Learning procedures in curbing unauthorized access into valuable areas of an organization. The use of passwords, pin codes, user’s identification in recent times has been partially successful in curbing crimes involving identities, hence the need for the design of a system which incorporates biometric characteristics such as DNA and pattern recognition of variations in facial expressions. The facial model used is the OpenCV library which is based on the use of certain physiological features, the Raspberry Pi 3 module is used to compile the OpenCV library, which extracts and stores the detected faces into the datasets directory through the use of camera. The model is trained with 50 epoch run in the database and recognized by the Local Binary Pattern Histogram (LBPH) recognizer contained in the OpenCV. The training algorithm used by the neural network is back propagation coded using python algorithmic language with 200 epoch runs to identify specific resemblance in the exclusive OR (XOR) output neurons. The research however confirmed that physiological parameters are better effective measures to curb crimes relating to identities.

Keywords: Biometric characters, facial recognition, neural network, OpenCV.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 251
195 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 274
194 Improving Similarity Search Using Clustered Data

Authors: Deokho Kim, Wonwoo Lee, Jaewoong Lee, Teresa Ng, Gun-Ill Lee, Jiwon Jeong

Abstract:

This paper presents a method for improving object search accuracy using a deep learning model. A major limitation to provide accurate similarity with deep learning is the requirement of huge amount of data for training pairwise similarity scores (metrics), which is impractical to collect. Thus, similarity scores are usually trained with a relatively small dataset, which comes from a different domain, causing limited accuracy on measuring similarity. For this reason, this paper proposes a deep learning model that can be trained with a significantly small amount of data, a clustered data which of each cluster contains a set of visually similar images. In order to measure similarity distance with the proposed method, visual features of two images are extracted from intermediate layers of a convolutional neural network with various pooling methods, and the network is trained with pairwise similarity scores which is defined zero for images in identical cluster. The proposed method outperforms the state-of-the-art object similarity scoring techniques on evaluation for finding exact items. The proposed method achieves 86.5% of accuracy compared to the accuracy of the state-of-the-art technique, which is 59.9%. That is, an exact item can be found among four retrieved images with an accuracy of 86.5%, and the rest can possibly be similar products more than the accuracy. Therefore, the proposed method can greatly reduce the amount of training data with an order of magnitude as well as providing a reliable similarity metric.

Keywords: Visual search, deep learning, convolutional neural network, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 391
193 Hand Gesture Detection via EmguCV Canny Pruning

Authors: N. N. Mosola, S. J. Molete, L. S. Masoebe, M. Letsae

Abstract:

Hand gesture recognition is a technique used to locate, detect, and recognize a hand gesture. Detection and recognition are concepts of Artificial Intelligence (AI). AI concepts are applicable in Human Computer Interaction (HCI), Expert systems (ES), etc. Hand gesture recognition can be used in sign language interpretation. Sign language is a visual communication tool. This tool is used mostly by deaf societies and those with speech disorder. Communication barriers exist when societies with speech disorder interact with others. This research aims to build a hand recognition system for Lesotho’s Sesotho and English language interpretation. The system will help to bridge the communication problems encountered by the mentioned societies. The system has various processing modules. The modules consist of a hand detection engine, image processing engine, feature extraction, and sign recognition. Detection is a process of identifying an object. The proposed system uses Canny pruning Haar and Haarcascade detection algorithms. Canny pruning implements the Canny edge detection. This is an optimal image processing algorithm. It is used to detect edges of an object. The system employs a skin detection algorithm. The skin detection performs background subtraction, computes the convex hull, and the centroid to assist in the detection process. Recognition is a process of gesture classification. Template matching classifies each hand gesture in real-time. The system was tested using various experiments. The results obtained show that time, distance, and light are factors that affect the rate of detection and ultimately recognition. Detection rate is directly proportional to the distance of the hand from the camera. Different lighting conditions were considered. The more the light intensity, the faster the detection rate. Based on the results obtained from this research, the applied methodologies are efficient and provide a plausible solution towards a light-weight, inexpensive system which can be used for sign language interpretation.

Keywords: Canny pruning, hand recognition, machine learning, skin tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 538
192 Adaption Model for Building Agile Pronunciation Dictionaries Using Phonemic Distance Measurements

Authors: Akella Amarendra Babu, Rama Devi Yellasiri, Natukula Sainath

Abstract:

Where human beings can easily learn and adopt pronunciation variations, machines need training before put into use. Also humans keep minimum vocabulary and their pronunciation variations are stored in front-end of their memory for ready reference, while machines keep the entire pronunciation dictionary for ready reference. Supervised methods are used for preparation of pronunciation dictionaries which take large amounts of manual effort, cost, time and are not suitable for real time use. This paper presents an unsupervised adaptation model for building agile and dynamic pronunciation dictionaries online. These methods mimic human approach in learning the new pronunciations in real time. A new algorithm for measuring sound distances called Dynamic Phone Warping is presented and tested. Performance of the system is measured using an adaptation model and the precision metrics is found to be better than 86 percent.

Keywords: Pronunciation variations, dynamic programming, machine learning, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 364
191 A Comprehensive Evaluation of Supervised Machine Learning for the Phase Identification Problem

Authors: Brandon Foggo, Nanpeng Yu

Abstract:

Power distribution circuits undergo frequent network topology changes that are often left undocumented. As a result, the documentation of a circuit’s connectivity becomes inaccurate with time. The lack of reliable circuit connectivity information is one of the biggest obstacles to model, monitor, and control modern distribution systems. To enhance the reliability and efficiency of electric power distribution systems, the circuit’s connectivity information must be updated periodically. This paper focuses on one critical component of a distribution circuit’s topology - the secondary transformer to phase association. This topology component describes the set of phase lines that feed power to a given secondary transformer (and therefore a given group of power consumers). Finding the documentation of this component is call Phase Identification, and is typically performed with physical measurements. These measurements can take time lengths on the order of several months, but with supervised learning, the time length can be reduced significantly. This paper compares several such methods applied to Phase Identification for a large range of real distribution circuits, describes a method of training data selection, describes preprocessing steps unique to the Phase Identification problem, and ultimately describes a method which obtains high accuracy (> 96% in most cases, > 92% in the worst case) using only 5% of the measurements typically used for Phase Identification.

Keywords: Distribution network, machine learning, network topology, phase identification, smart grid.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 401
190 Tools for Analysis and Optimization of Standalone Green Microgrids

Authors: William Anderson, Kyle Kobold, Oleg Yakimenko

Abstract:

Green microgrids using mostly renewable energy (RE) for generation, are complex systems with inherent nonlinear dynamics. Among a variety of different optimization tools there are only a few ones that adequately consider this complexity. This paper evaluates applicability of two somewhat similar optimization tools tailored for standalone RE microgrids and also assesses a machine learning tool for performance prediction that can enhance the reliability of any chosen optimization tool. It shows that one of these microgrid optimization tools has certain advantages over another and presents a detailed routine of preparing input data to simulate RE microgrid behavior. The paper also shows how neural-network-based predictive modeling can be used to validate and forecast solar power generation based on weather time series data, which improves the overall quality of standalone RE microgrid analysis.

Keywords: Microgrid, renewable energy, complex systems, optimization, predictive modeling, neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 435