Search results for: machine learning techniques
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13520

Search results for: machine learning techniques

13520 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in all engineering and science and many other domains. The potential of large or massive data is undoubtedly significant, make sense to require new ways of thinking and learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. In this paper, the latest advances and advancements in the researches on machine learning for big data processing. First, the machine learning techniques methods in recent studies, such as deep learning, representation learning, transfer learning, active learning and distributed and parallel learning. Then focus on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 402
13519 Evaluating Machine Learning Techniques for Activity Classification in Smart Home Environments

Authors: Talal Alshammari, Nasser Alshammari, Mohamed Sedky, Chris Howard

Abstract:

With the widespread adoption of the Internet-connected devices, and with the prevalence of the Internet of Things (IoT) applications, there is an increased interest in machine learning techniques that can provide useful and interesting services in the smart home domain. The areas that machine learning techniques can help advance are varied and ever-evolving. Classifying smart home inhabitants’ Activities of Daily Living (ADLs), is one prominent example. The ability of machine learning technique to find meaningful spatio-temporal relations of high-dimensional data is an important requirement as well. This paper presents a comparative evaluation of state-of-the-art machine learning techniques to classify ADLs in the smart home domain. Forty-two synthetic datasets and two real-world datasets with multiple inhabitants are used to evaluate and compare the performance of the identified machine learning techniques. Our results show significant performance differences between the evaluated techniques. Such as AdaBoost, Cortical Learning Algorithm (CLA), Decision Trees, Hidden Markov Model (HMM), Multi-layer Perceptron (MLP), Structured Perceptron and Support Vector Machines (SVM). Overall, neural network based techniques have shown superiority over the other tested techniques.

Keywords: activities of daily living, classification, internet of things, machine learning, prediction, smart home

Procedia PDF Downloads 307
13518 Optimize Data Evaluation Metrics for Fraud Detection Using Machine Learning

Authors: Jennifer Leach, Umashanger Thayasivam

Abstract:

The use of technology has benefited society in more ways than one ever thought possible. Unfortunately, though, as society’s knowledge of technology has advanced, so has its knowledge of ways to use technology to manipulate people. This has led to a simultaneous advancement in the world of fraud. Machine learning techniques can offer a possible solution to help decrease this advancement. This research explores how the use of various machine learning techniques can aid in detecting fraudulent activity across two different types of fraudulent data, and the accuracy, precision, recall, and F1 were recorded for each method. Each machine learning model was also tested across five different training and testing splits in order to discover which testing split and technique would lead to the most optimal results.

Keywords: data science, fraud detection, machine learning, supervised learning

Procedia PDF Downloads 156
13517 Machine Learning Techniques to Develop Traffic Accident Frequency Prediction Models

Authors: Rodrigo Aguiar, Adelino Ferreira

Abstract:

Road traffic accidents are the leading cause of unnatural death and injuries worldwide, representing a significant problem of road safety. In this context, the use of artificial intelligence with advanced machine learning techniques has gained prominence as a promising approach to predict traffic accidents. This article investigates the application of machine learning algorithms to develop traffic accident frequency prediction models. Models are evaluated based on performance metrics, making it possible to do a comparative analysis with traditional prediction approaches. The results suggest that machine learning can provide a powerful tool for accident prediction, which will contribute to making more informed decisions regarding road safety.

Keywords: machine learning, artificial intelligence, frequency of accidents, road safety

Procedia PDF Downloads 43
13516 The Accuracy of Parkinson's Disease Diagnosis Using [123I]-FP-CIT Brain SPECT Data with Machine Learning Techniques: A Survey

Authors: Lavanya Madhuri Bollipo, K. V. Kadambari

Abstract:

Objective: To discuss key issues in the diagnosis of Parkinson disease (PD), To discuss features influencing PD progression, To discuss importance of brain SPECT data in PD diagnosis, and To discuss the essentiality of machine learning techniques in early diagnosis of PD. An accurate and early diagnosis of PD is nowadays a challenge as clinical symptoms in PD arise only when there is more than 60% loss of dopaminergic neurons. So far there are no laboratory tests for the diagnosis of PD, causing a high rate of misdiagnosis especially when the disease is in the early stages. Recent neuroimaging studies with brain SPECT using 123I-Ioflupane (DaTSCAN) as radiotracer shown to be widely used to assist the diagnosis of PD even in its early stages. Machine learning techniques can be used in combination with image analysis procedures to develop computer-aided diagnosis (CAD) systems for PD. This paper addressed recent studies involving diagnosis of PD in its early stages using brain SPECT data with Machine Learning Techniques.

Keywords: Parkinson disease (PD), dopamine transporter, single-photon emission computed tomography (SPECT), support vector machine (SVM)

Procedia PDF Downloads 356
13515 Stock Movement Prediction Using Price Factor and Deep Learning

Authors: Hy Dang, Bo Mei

Abstract:

The development of machine learning methods and techniques has opened doors for investigation in many areas such as medicines, economics, finance, etc. One active research area involving machine learning is stock market prediction. This research paper tries to consider multiple techniques and methods for stock movement prediction using historical price or price factors. The paper explores the effectiveness of some deep learning frameworks for forecasting stock. Moreover, an architecture (TimeStock) is proposed which takes the representation of time into account apart from the price information itself. Our model achieves a promising result that shows a potential approach for the stock movement prediction problem.

Keywords: classification, machine learning, time representation, stock prediction

Procedia PDF Downloads 106
13514 A Survey of Feature Selection and Feature Extraction Techniques in Machine Learning

Authors: Samina Khalid, Shamila Nasreen

Abstract:

Dimensionality reduction as a preprocessing step to machine learning is effective in removing irrelevant and redundant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase of dimensionality of data poses a severe challenge to many existing feature selection and feature extraction methods with respect to efficiency and effectiveness. In the field of machine learning and pattern recognition, dimensionality reduction is important area, where many approaches have been proposed. In this paper, some widely used feature selection and feature extraction techniques have analyzed with the purpose of how effectively these techniques can be used to achieve high performance of learning algorithms that ultimately improves predictive accuracy of classifier. An endeavor to analyze dimensionality reduction techniques briefly with the purpose to investigate strengths and weaknesses of some widely used dimensionality reduction methods is presented.

Keywords: age related macular degeneration, feature selection feature subset selection feature extraction/transformation, FSA’s, relief, correlation based method, PCA, ICA

Procedia PDF Downloads 445
13513 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification applies mostly natural language processing (NLP) and other AI-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.

Keywords: machine learning, text classification, NLP techniques, semantic representation

Procedia PDF Downloads 55
13512 Evaluating the Implementation of Machine Learning Techniques in the South African Built Environment

Authors: Peter Adekunle, Clinton Aigbavboa, Matthew Ikuabe, Opeoluwa Akinradewo

Abstract:

The future of machine learning (ML) in building may seem like a distant idea that will take decades to materialize, but it is actually far closer than previously believed. In reality, the built environment has been progressively increasing interest in machine learning. Although it could appear to be a very technical, impersonal approach, it can really make things more personable. Instead of eliminating humans out of the equation, machine learning allows people do their real work more efficiently. It is therefore vital to evaluate the factors influencing the implementation and challenges of implementing machine learning techniques in the South African built environment. The study's design was one of a survey. In South Africa, construction workers and professionals were given a total of one hundred fifty (150) questionnaires, of which one hundred and twenty-four (124) were returned and deemed eligible for study. Utilizing percentage, mean item scores, standard deviation, and Kruskal-Wallis, the collected data was analyzed. The results demonstrate that the top factors influencing the adoption of machine learning are knowledge level and a lack of understanding of its potential benefits. While lack of collaboration among stakeholders and lack of tools and services are the key hurdles to the deployment of machine learning within the South African built environment. The study came to the conclusion that ML adoption should be promoted in order to increase safety, productivity, and service quality within the built environment.

Keywords: machine learning, implementation, built environment, construction stakeholders

Procedia PDF Downloads 98
13511 Musical Instruments Classification Using Machine Learning Techniques

Authors: Bhalke D. G., Bormane D. S., Kharate G. K.

Abstract:

This paper presents classification of musical instrument using machine learning techniques. The classification has been carried out using temporal, spectral, cepstral and wavelet features. Detail feature analysis is carried out using separate and combined features. Further, instrument model has been developed using K-Nearest Neighbor and Support Vector Machine (SVM). Benchmarked McGill university database has been used to test the performance of the system. Experimental result shows that SVM performs better as compared to KNN classifier.

Keywords: feature extraction, SVM, KNN, musical instruments

Procedia PDF Downloads 451
13510 Detect QOS Attacks Using Machine Learning Algorithm

Authors: Christodoulou Christos, Politis Anastasios

Abstract:

A large majority of users favoured to wireless LAN connection since it was so simple to use. A wireless network can be the target of numerous attacks. Class hijacking is a well-known attack that is fairly simple to execute and has significant repercussions on users. The statistical flow analysis based on machine learning (ML) techniques is a promising categorization methodology. In a given dataset, which in the context of this paper is a collection of components representing frames belonging to various flows, machine learning (ML) can offer a technique for identifying and characterizing structural patterns. It is possible to classify individual packets using these patterns. It is possible to identify fraudulent conduct, such as class hijacking, and take necessary action as a result. In this study, we explore a way to use machine learning approaches to thwart this attack.

Keywords: wireless lan, quality of service, machine learning, class hijacking, EDCA remapping

Procedia PDF Downloads 14
13509 A Deep Learning Approach to Subsection Identification in Electronic Health Records

Authors: Nitin Shravan, Sudarsun Santhiappan, B. Sivaselvan

Abstract:

Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. In this work, we classify the text present in EHRs according to their information, using machine learning and deep learning techniques. We initially describe briefly about the problem and formulate it as a text classification problem. Then, we discuss upon the methods from the literature. We try two approaches - traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based Machine Learning Models.

Keywords: deep learning, machine learning, semantic clinical classification, subsection identification, text classification

Procedia PDF Downloads 180
13508 Gradient Boosted Trees on Spark Platform for Supervised Learning in Health Care Big Data

Authors: Gayathri Nagarajan, L. D. Dhinesh Babu

Abstract:

Health care is one of the prominent industries that generate voluminous data thereby finding the need of machine learning techniques with big data solutions for efficient processing and prediction. Missing data, incomplete data, real time streaming data, sensitive data, privacy, heterogeneity are few of the common challenges to be addressed for efficient processing and mining of health care data. In comparison with other applications, accuracy and fast processing are of higher importance for health care applications as they are related to the human life directly. Though there are many machine learning techniques and big data solutions used for efficient processing and prediction in health care data, different techniques and different frameworks are proved to be effective for different applications largely depending on the characteristics of the datasets. In this paper, we present a framework that uses ensemble machine learning technique gradient boosted trees for data classification in health care big data. The framework is built on Spark platform which is fast in comparison with other traditional frameworks. Unlike other works that focus on a single technique, our work presents a comparison of six different machine learning techniques along with gradient boosted trees on datasets of different characteristics. Five benchmark health care datasets are considered for experimentation, and the results of different machine learning techniques are discussed in comparison with gradient boosted trees. The metric chosen for comparison is misclassification error rate and the run time of the algorithms. The goal of this paper is to i) Compare the performance of gradient boosted trees with other machine learning techniques in Spark platform specifically for health care big data and ii) Discuss the results from the experiments conducted on datasets of different characteristics thereby drawing inference and conclusion. The experimental results show that the accuracy is largely dependent on the characteristics of the datasets for other machine learning techniques whereas gradient boosting trees yields reasonably stable results in terms of accuracy without largely depending on the dataset characteristics.

Keywords: big data analytics, ensemble machine learning, gradient boosted trees, Spark platform

Procedia PDF Downloads 208
13507 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 422
13506 Fake News Detection for Korean News Using Machine Learning Techniques

Authors: Tae-Uk Yun, Pullip Chung, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Fake news is defined as the news articles that are intentionally and verifiably false, and could mislead readers. Spread of fake news may provoke anxiety, chaos, fear, or irrational decisions of the public. Thus, detecting fake news and preventing its spread has become very important issue in our society. However, due to the huge amount of fake news produced every day, it is almost impossible to identify it by a human. Under this context, researchers have tried to develop automated fake news detection using machine learning techniques over the past years. But, there have been no prior studies proposed an automated fake news detection method for Korean news to our best knowledge. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news contents to be analyzed is convert to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). After that, in step 2, classifiers are trained using the values produced in step 1. As the classifiers, machine learning techniques such as logistic regression, backpropagation network, support vector machine, and deep neural network can be applied. To validate the effectiveness of the proposed method, we collected about 200 short Korean news from Seoul National University’s FactCheck. which provides with detailed analysis reports from 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.

Keywords: fake news detection, Korean news, machine learning, text mining

Procedia PDF Downloads 240
13505 Flood-prone Urban Area Mapping Using Machine Learning, a Case Sudy of M'sila City (Algeria)

Authors: Medjadj Tarek, Ghribi Hayet

Abstract:

This study aims to develop a flood sensitivity assessment tool using machine learning (ML) techniques and geographic information system (GIS). The importance of this study is integrating the geographic information systems (GIS) and machine learning (ML) techniques for mapping flood risks, which help decision-makers to identify the most vulnerable areas and take the necessary precautions to face this type of natural disaster. To reach this goal, we will study the case of the city of M'sila, which is among the areas most vulnerable to floods. This study drew a map of flood-prone areas based on the methodology where we have made a comparison between 3 machine learning algorithms: the xGboost model, the Random Forest algorithm and the K Nearest Neighbour algorithm. Each of them gave an accuracy respectively of 97.92 - 95 - 93.75. In the process of mapping flood-prone areas, the first model was relied upon, which gave the greatest accuracy (xGboost).

Keywords: Geographic information systems (GIS), machine learning (ML), emergency mapping, flood disaster management

Procedia PDF Downloads 53
13504 Hybrid Model: An Integration of Machine Learning with Traditional Scorecards

Authors: Golnush Masghati-Amoli, Paul Chin

Abstract:

Over the past recent years, with the rapid increases in data availability and computing power, Machine Learning (ML) techniques have been called on in a range of different industries for their strong predictive capability. However, the use of Machine Learning in commercial banking has been limited due to a special challenge imposed by numerous regulations that require lenders to be able to explain their analytic models, not only to regulators but often to consumers. In other words, although Machine Leaning techniques enable better prediction with a higher level of accuracy, in comparison with other industries, they are adopted less frequently in commercial banking especially for scoring purposes. This is due to the fact that Machine Learning techniques are often considered as a black box and fail to provide information on why a certain risk score is given to a customer. In order to bridge this gap between the explain-ability and performance of Machine Learning techniques, a Hybrid Model is developed at Dun and Bradstreet that is focused on blending Machine Learning algorithms with traditional approaches such as scorecards. The Hybrid Model maximizes efficiency of traditional scorecards by merging its practical benefits, such as explain-ability and the ability to input domain knowledge, with the deep insights of Machine Learning techniques which can uncover patterns scorecard approaches cannot. First, through development of Machine Learning models, engineered features and latent variables and feature interactions that demonstrate high information value in the prediction of customer risk are identified. Then, these features are employed to introduce observed non-linear relationships between the explanatory and dependent variables into traditional scorecards. Moreover, instead of directly computing the Weight of Evidence (WoE) from good and bad data points, the Hybrid Model tries to match the score distribution generated by a Machine Learning algorithm, which ends up providing an estimate of the WoE for each bin. This capability helps to build powerful scorecards with sparse cases that cannot be achieved with traditional approaches. The proposed Hybrid Model is tested on different portfolios where a significant gap is observed between the performance of traditional scorecards and Machine Learning models. The result of analysis shows that Hybrid Model can improve the performance of traditional scorecards by introducing non-linear relationships between explanatory and target variables from Machine Learning models into traditional scorecards. Also, it is observed that in some scenarios the Hybrid Model can be almost as predictive as the Machine Learning techniques while being as transparent as traditional scorecards. Therefore, it is concluded that, with the use of Hybrid Model, Machine Learning algorithms can be used in the commercial banking industry without being concerned with difficulties in explaining the models for regulatory purposes.

Keywords: machine learning algorithms, scorecard, commercial banking, consumer risk, feature engineering

Procedia PDF Downloads 93
13503 Assessing the Effectiveness of Machine Learning Algorithms for Cyber Threat Intelligence Discovery from the Darknet

Authors: Azene Zenebe

Abstract:

Deep learning is a subset of machine learning which incorporates techniques for the construction of artificial neural networks and found to be useful for modeling complex problems with large dataset. Deep learning requires a very high power computational and longer time for training. By aggregating computing power, high performance computer (HPC) has emerged as an approach to resolving advanced problems and performing data-driven research activities. Cyber threat intelligence (CIT) is actionable information or insight an organization or individual uses to understand the threats that have, will, or are currently targeting the organization. Results of review of literature will be presented along with results of experimental study that compares the performance of tree-based and function-base machine learning including deep learning algorithms using secondary dataset collected from darknet.

Keywords: deep-learning, cyber security, cyber threat modeling, tree-based machine learning, function-based machine learning, data science

Procedia PDF Downloads 116
13502 Using Machine Learning Techniques for Autism Spectrum Disorder Analysis and Detection in Children

Authors: Norah Mohammed Alshahrani, Abdulaziz Almaleh

Abstract:

Autism Spectrum Disorder (ASD) is a condition related to issues with brain development that affects how a person recognises and communicates with others which results in difficulties with interaction and communication socially and it is constantly growing. Early recognition of ASD allows children to lead safe and healthy lives and helps doctors with accurate diagnoses and management of conditions. Therefore, it is crucial to develop a method that will achieve good results and with high accuracy for the measurement of ASD in children. In this paper, ASD datasets of toddlers and children have been analyzed. We employed the following machine learning techniques to attempt to explore ASD and they are Random Forest (RF), Decision Tree (DT), Na¨ıve Bayes (NB) and Support Vector Machine (SVM). Then Feature selection was used to provide fewer attributes from ASD datasets while preserving model performance. As a result, we found that the best result has been provided by the Support Vector Machine (SVM), achieving 0.98% in the toddler dataset and 0.99% in the children dataset.

Keywords: autism spectrum disorder, machine learning, feature selection, support vector machine

Procedia PDF Downloads 108
13501 Modern Machine Learning Conniptions for Automatic Speech Recognition

Authors: S. Jagadeesh Kumar

Abstract:

This expose presents a luculent of recent machine learning practices as employed in the modern and as pertinent to prospective automatic speech recognition schemes. The aspiration is to promote additional traverse ablution among the machine learning and automatic speech recognition factions that have transpired in the precedent. The manuscript is structured according to the chief machine learning archetypes that are furthermore trendy by now or have latency for building momentous hand-outs to automatic speech recognition expertise. The standards offered and convoluted in this article embraces adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, supervised and unsupervised learning. These learning archetypes are aggravated and conferred in the perspective of automatic speech recognition tools and functions. This manuscript bequeaths and surveys topical advances of deep learning and learning with sparse depictions; further limelight is on their incessant significance in the evolution of automatic speech recognition.

Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning

Procedia PDF Downloads 411
13500 A Study on Big Data Analytics, Applications and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, Healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organization's decision-making strategy can be enhanced using big data analytics and applying different machine learning techniques and statistical tools on such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of Analysis using different machine-learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 44
13499 A Study on Big Data Analytics, Applications, and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organisation decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 58
13498 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still in the early stage as this is a relatively new phenomenon in the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays and especially in those cases where we have tacit knowledge or the knowledge that is not known. We used machine learning algorithms and for identification of fake news; we applied three classifiers; Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and after that incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake due to the unavailability of corpora. We applied three different machine learning classifiers on two publicly available datasets. Experimental analysis based on the existing dataset indicates a very encouraging and improved performance.

Keywords: fake news detection, natural language processing, machine learning, classification techniques.

Procedia PDF Downloads 129
13497 A Comparative Study of Malware Detection Techniques Using Machine Learning Methods

Authors: Cristina Vatamanu, Doina Cosovan, Dragos Gavrilut, Henri Luchian

Abstract:

In the past few years, the amount of malicious software increased exponentially and, therefore, machine learning algorithms became instrumental in identifying clean and malware files through semi-automated classification. When working with very large datasets, the major challenge is to reach both a very high malware detection rate and a very low false positive rate. Another challenge is to minimize the time needed for the machine learning algorithm to do so. This paper presents a comparative study between different machine learning techniques such as linear classifiers, ensembles, decision trees or various hybrids thereof. The training dataset consists of approximately 2 million clean files and 200.000 infected files, which is a realistic quantitative mixture. The paper investigates the above mentioned methods with respect to both their performance (detection rate and false positive rate) and their practicability.

Keywords: ensembles, false positives, feature selection, one side class algorithm

Procedia PDF Downloads 256
13496 Auto Classification of Multiple ECG Arrhythmic Detection via Machine Learning Techniques: A Review

Authors: Ng Liang Shen, Hau Yuan Wen

Abstract:

Arrhythmia analysis of ECG signal plays a major role in diagnosing most of the cardiac diseases. Therefore, a single arrhythmia detection of an electrocardiographic (ECG) record can determine multiple pattern of various algorithms and match accordingly each ECG beats based on Machine Learning supervised learning. These researchers used different features and classification methods to classify different arrhythmia types. A major problem in these studies is the fact that the symptoms of the disease do not show all the time in the ECG record. Hence, a successful diagnosis might require the manual investigation of several hours of ECG records. The point of this paper presents investigations cardiovascular ailment in Electrocardiogram (ECG) Signals for Cardiac Arrhythmia utilizing examination of ECG irregular wave frames via heart beat as correspond arrhythmia which with Machine Learning Pattern Recognition.

Keywords: electrocardiogram, ECG, classification, machine learning, pattern recognition, detection, QRS

Procedia PDF Downloads 333
13495 Tongue Image Retrieval Based Using Machine Learning

Authors: Ahmad FAROOQ, Xinfeng Zhang, Fahad Sabah, Raheem Sarwar

Abstract:

In Traditional Chinese Medicine, tongue diagnosis is a vital inspection tool (TCM). In this study, we explore the potential of machine learning in tongue diagnosis. It begins with the cataloguing of the various classifications and characteristics of the human tongue. We infer 24 kinds of tongues from the material and coating of the tongue, and we identify 21 attributes of the tongue. The next step is to apply machine learning methods to the tongue dataset. We use the Weka machine learning platform to conduct the experiment for performance analysis. The 457 instances of the tongue dataset are used to test the performance of five different machine learning methods, including SVM, Random Forests, Decision Trees, and Naive Bayes. Based on accuracy and Area under the ROC Curve, the Support Vector Machine algorithm was shown to be the most effective for tongue diagnosis (AUC).

Keywords: medical imaging, image retrieval, machine learning, tongue

Procedia PDF Downloads 38
13494 Modern Proteomics and the Application of Machine Learning Analyses in Proteomic Studies of Chronic Kidney Disease of Unknown Etiology

Authors: Dulanjali Ranasinghe, Isuru Supasan, Kaushalya Premachandra, Ranjan Dissanayake, Ajith Rajapaksha, Eustace Fernando

Abstract:

Proteomics studies of organisms are considered to be significantly information-rich compared to their genomic counterparts because proteomes of organisms represent the expressed state of all proteins of an organism at a given time. In modern top-down and bottom-up proteomics workflows, the primary analysis methods employed are gel–based methods such as two-dimensional (2D) electrophoresis and mass spectrometry based methods. Machine learning (ML) and artificial intelligence (AI) have been used increasingly in modern biological data analyses. In particular, the fields of genomics, DNA sequencing, and bioinformatics have seen an incremental trend in the usage of ML and AI techniques in recent years. The use of aforesaid techniques in the field of proteomics studies is only beginning to be materialised now. Although there is a wealth of information available in the scientific literature pertaining to proteomics workflows, no comprehensive review addresses various aspects of the combined use of proteomics and machine learning. The objective of this review is to provide a comprehensive outlook on the application of machine learning into the known proteomics workflows in order to extract more meaningful information that could be useful in a plethora of applications such as medicine, agriculture, and biotechnology.

Keywords: proteomics, machine learning, gel-based proteomics, mass spectrometry

Procedia PDF Downloads 121
13493 Predicting Machine-Down of Woodworking Industrial Machines

Authors: Matteo Calabrese, Martin Cimmino, Dimos Kapetis, Martina Manfrin, Donato Concilio, Giuseppe Toscano, Giovanni Ciandrini, Giancarlo Paccapeli, Gianluca Giarratana, Marco Siciliano, Andrea Forlani, Alberto Carrotta

Abstract:

In this paper we describe a machine learning methodology for Predictive Maintenance (PdM) applied on woodworking industrial machines. PdM is a prominent strategy consisting of all the operational techniques and actions required to ensure machine availability and to prevent a machine-down failure. One of the challenges with PdM approach is to design and develop of an embedded smart system to enable the health status of the machine. The proposed approach allows screening simultaneously multiple connected machines, thus providing real-time monitoring that can be adopted with maintenance management. This is achieved by applying temporal feature engineering techniques and training an ensemble of classification algorithms to predict Remaining Useful Lifetime of woodworking machines. The effectiveness of the methodology is demonstrated by testing an independent sample of additional woodworking machines without presenting machine down event.

Keywords: predictive maintenance, machine learning, connected machines, artificial intelligence

Procedia PDF Downloads 190
13492 Applications of AI, Machine Learning, and Deep Learning in Cyber Security

Authors: Hailyie Tekleselase

Abstract:

Deep learning is increasingly used as a building block of security systems. However, neural networks are hard to interpret and typically solid to the practitioner. This paper presents a detail survey of computing methods in cyber security, and analyzes the prospects of enhancing the cyber security capabilities by suggests that of accelerating the intelligence of the security systems. There are many AI-based applications used in industrial scenarios such as Internet of Things (IoT), smart grids, and edge computing. Machine learning technologies require a training process which introduces the protection problems in the training data and algorithms. We present machine learning techniques currently applied to the detection of intrusion, malware, and spam. Our conclusions are based on an extensive review of the literature as well as on experiments performed on real enterprise systems and network traffic. We conclude that problems can be solved successfully only when methods of artificial intelligence are being used besides human experts or operators.

Keywords: artificial intelligence, machine learning, deep learning, cyber security, big data

Procedia PDF Downloads 94
13491 System for the Detecting of Fake Profiles on Online Social Networks Using Machine Learning and the Bio-Inspired Algorithms

Authors: Sekkal Nawel, Mahammed Nadir

Abstract:

The proliferation of online activities on Online Social Networks (OSNs) has captured significant user attention. However, this growth has been hindered by the emergence of fraudulent accounts that do not represent real individuals and violate privacy regulations within social network communities. Consequently, it is imperative to identify and remove these profiles to enhance the security of OSN users. In recent years, researchers have turned to machine learning (ML) to develop strategies and methods to tackle this issue. Numerous studies have been conducted in this field to compare various ML-based techniques. However, the existing literature still lacks a comprehensive examination, especially considering different OSN platforms. Additionally, the utilization of bio-inspired algorithms has been largely overlooked. Our study conducts an extensive comparison analysis of various fake profile detection techniques in online social networks. The results of our study indicate that supervised models, along with other machine learning techniques, as well as unsupervised models, are effective for detecting false profiles in social media. To achieve optimal results, we have incorporated six bio-inspired algorithms to enhance the performance of fake profile identification results.

Keywords: machine learning, bio-inspired algorithm, detection, fake profile, system, social network

Procedia PDF Downloads 31