Search results for: automated machine learning
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8791

8611 Using Autoencoder as Feature Extractor for Malware Detection

Authors: Umm-E-Hani, Faiza Babar, Hanif Durad

Abstract:

Malware-detection approaches suffer from many limitations, which is why no anti-malware solution has yet proven reliable enough to detect zero-day malware. Signature-based solutions depend upon signatures that can be generated only after the malware has surfaced at least once in the cyber world. Another approach, which works by detecting the anomalies caused in the environment, can easily be defeated by diligently and intelligently written malware. Solutions trained to observe behavior in order to detect malicious files fail to handle malware that is capable of detecting a sandboxed or protected environment. Machine learning and deep learning-based approaches suffer greatly when their models are trained with either an imbalanced dataset or an inadequate number of samples. AI-based anti-malware solutions that have been trained with enough samples target a selected feature vector, ignoring the contribution of the remaining features to the maliciousness of malware simply to cope with limited hardware processing power. Our research focuses on producing an anti-malware solution for detecting malicious PE files that circumvents the shortcomings mentioned above. Our proposed framework, based on automated feature engineering through autoencoders, trains the model over a fairly large dataset and focuses on the visual patterns of malware samples, automatically extracting the meaningful part of each pattern. Our experiment produced a state-of-the-art accuracy of 99.54% on test data.
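
As an editorial illustration of the workflow this abstract describes, the sketch below trains an autoencoder on flattened byte-image representations of PE files and reuses the bottleneck as automatically engineered features for a downstream classifier. The image size, layer widths, and the Random Forest classifier are assumptions for illustration, not the authors' configuration.

```python
# Minimal sketch: autoencoder as automated feature extractor for PE "image" vectors.
# Assumes each sample is a PE file rendered as a flattened 64x64 grayscale byte image
# scaled to [0, 1]; sizes and the downstream classifier are illustrative.
import numpy as np
from tensorflow.keras import layers, models
from sklearn.ensemble import RandomForestClassifier

input_dim, latent_dim = 64 * 64, 128

inputs = layers.Input(shape=(input_dim,))
encoded = layers.Dense(512, activation="relu")(inputs)
encoded = layers.Dense(latent_dim, activation="relu")(encoded)   # bottleneck = learned features
decoded = layers.Dense(512, activation="relu")(encoded)
decoded = layers.Dense(input_dim, activation="sigmoid")(decoded)

autoencoder = models.Model(inputs, decoded)
encoder = models.Model(inputs, encoded)
autoencoder.compile(optimizer="adam", loss="mse")

# Toy stand-ins: X holds byte-image vectors, y holds labels (0 = benign, 1 = malware).
X = np.random.rand(256, input_dim).astype("float32")
y = np.random.randint(0, 2, size=256)

autoencoder.fit(X, X, epochs=5, batch_size=32, verbose=0)   # unsupervised reconstruction
features = encoder.predict(X, verbose=0)                    # automated feature engineering
clf = RandomForestClassifier(n_estimators=200).fit(features, y)
```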

Keywords: malware, autoencoders, automated feature engineering, classification

Procedia PDF Downloads 49
8610 Spontaneous and Posed Smile Detection: Deep Learning, Traditional Machine Learning, and Human Performance

Authors: Liang Wang, Beste F. Yuksel, David Guy Brizan

Abstract:

This paper presents a computational model of affect that distinguishes between spontaneous and posed smiles with no errors on a large, popular data set using deep learning techniques. A Long Short-Term Memory (LSTM) classifier, a type of Recurrent Neural Network, is utilized and compared to human classification. Results showed that while human classification (mean of 0.7133) was above chance, the LSTM model was more accurate than both human classification and other comparable state-of-the-art systems. Additionally, a high accuracy rate was maintained with a small number of training videos (70 instances). Important features were derived and analyzed to further understand the success of the computational model, and it was inferred that thousands of pairs of points within the eyes and mouth are important throughout all time segments of a smile. This suggests that distinguishing between a posed and a spontaneous smile is a complex task, which may account for the difficulty and lower accuracy of human classification compared to machine learning models.
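
The following is a minimal, hedged sketch of an LSTM smile classifier of the kind described above, assuming each video is represented as a fixed-length sequence of per-frame facial-landmark features; the frame count, feature dimension, and network size are illustrative assumptions rather than the authors' architecture.

```python
# Minimal sketch: LSTM classifier for posed vs. spontaneous smiles.
# Assumes each video is a sequence of per-frame facial-landmark features
# (here 60 frames x 136 values, i.e. 68 (x, y) landmark pairs); sizes are illustrative.
import numpy as np
from tensorflow.keras import layers, models

timesteps, n_features = 60, 136

model = models.Sequential([
    layers.Input(shape=(timesteps, n_features)),
    layers.Masking(mask_value=0.0),          # allow shorter videos padded with zeros
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),   # 1 = spontaneous, 0 = posed
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Toy stand-in for the 70 training videos mentioned in the abstract.
X = np.random.rand(70, timesteps, n_features).astype("float32")
y = np.random.randint(0, 2, size=70)
model.fit(X, y, epochs=3, batch_size=8, verbose=0)
```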

Keywords: affective computing, affect detection, computer vision, deep learning, human-computer interaction, machine learning, posed smile detection, spontaneous smile detection

Procedia PDF Downloads 105
8609 Analysis of Splicing Methods for High Speed Automated Fibre Placement Applications

Authors: Phillip Kearney, Constantina Lekakou, Stephen Belcher, Alessandro Sordon

Abstract:

The focus in the automotive industry is to reduce interaction between human operators and machines so that manufacturing becomes more automated and safer. The aim is to lower part cost and construction time as well as defects in the parts, which sometimes occur due to the physical limitations of human operators. A move to automate the layup of reinforcement material in composites manufacturing has resulted in the use of tapes that are placed in position by a robotic deposition head, a process described as Automated Fibre Placement (AFP). AFP is limited by the finite amount of material that can be loaded into the machine at any one time. Joining two batches of tape material together involves a splice to secure the end of the finishing tape to the starting edge of the new tape. The splicing method of choice for the majority of prepreg applications is a hand-stitch method, which, as the name suggests, requires human input. This investigation explores three methods for automated splicing, namely adhesive, binding and stitching. The adhesive technique uses an additional adhesive placed on the tape ends to be joined. Binding uses the binding agent that is already impregnated onto the tape, activated through the application of heat. The stitching method is used as a baseline, representing the traditional technique currently in use, against which the new splicing methods are compared. As the methods will be used within a High Speed Automated Fibre Placement (HSAFP) process, the splices have to meet certain specifications: (a) the splice must be able to endure a load of 50 N in tension applied at a rate of 1 mm/s; (b) the splice must be created in less than 6 seconds, dictated by the capacity of the tape accumulator within the system. The samples for experimentation were manufactured with controlled overlap, alignment and splicing parameters, and were then tested in tension using a tensile testing machine. Initial analysis explored the use of the impregnated binding agent present on the tape, as in the binding splicing technique, and examined the effect of temperature and overlap on the strength of the splice. The optimum splicing temperature was found to be at the higher end of the activation range of the binding agent, 100 °C. The optimum overlap was found to be 25 mm, with no improvement in bond strength from 25 mm to 30 mm overlap. The final analysis compared the different splicing methods to the baseline of a stitched bond. The addition of an adhesive was found to be the best splicing method, achieving a maximum load of over 500 N, compared to the 26 N achieved by a stitched splice and 94 N by the binding method.

Keywords: analysis, automated fibre placement, high speed, splicing

Procedia PDF Downloads 120
8608 Meta Mask Correction for Nuclei Segmentation in Histopathological Image

Authors: Jiangbo Shi, Zeyu Gao, Chen Li

Abstract:

Nuclei segmentation is a fundamental task in digital pathology analysis and can be automated by deep learning-based methods. However, developing such an automated method requires a large amount of data with precisely annotated masks, which are hard to obtain. Training with weakly labeled data is a popular solution for reducing the annotation workload. In this paper, we propose a novel meta-learning-based nuclei segmentation method that follows the label correction paradigm to leverage data with noisy masks. Specifically, we design a fully convolutional meta-model that corrects noisy masks by using a small amount of clean meta-data. The corrected masks are then used to supervise the training of the segmentation model. Meanwhile, a bi-level optimization method is adopted to alternately update the parameters of the main segmentation model and the meta-model. Extensive experimental results on two nuclei segmentation datasets show that our method achieves state-of-the-art results. In particular, in some noise scenarios it even exceeds the performance of training on fully supervised data.

Keywords: deep learning, histopathological image, meta-learning, nuclei segmentation, weak annotations

Procedia PDF Downloads 118
8607 Machine Learning Approach for Predicting Students’ Academic Performance and Study Strategies Based on Their Motivation

Authors: Fidelia A. Orji, Julita Vassileva

Abstract:

This research aims to develop machine learning models for predicting students' academic performance and study strategies, which could be generalized to all courses in higher education. The key learning attributes (intrinsic, extrinsic, autonomy, relatedness, competence, and self-esteem) used in building the models were chosen based on prior studies, which revealed that these attributes are essential in students' learning process. Previous studies revealed the individual effect of each of these attributes on students' learning progress, but few have investigated their combined effect in predicting student study strategy and academic performance with a view to reducing the dropout rate. To bridge this gap, we used Scikit-learn in Python to build five machine learning models (Decision Tree, K-Nearest Neighbour, Random Forest, Linear/Logistic Regression, and Support Vector Machine) for both regression and classification tasks. The models were trained, evaluated, and tested for accuracy using data on 924 university dentistry students collected by Chilean authors through a quantitative research design. A comparative analysis revealed that the tree-based models, such as the random forest (with a prediction accuracy of 94.9%) and the decision tree, show the best results compared to the linear, support vector, and k-nearest neighbour models. The models built in this research can be used to predict student performance and study strategy so that appropriate interventions can be implemented to improve student learning progress. Thus, incorporating strategies that improve diverse student learning attributes into the design of online educational systems may increase the likelihood of students continuing with their learning tasks as required. Moreover, the results show that the attributes can be modelled together and used to adapt/personalize the learning process.
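
As an editorial sketch of the model comparison described in this abstract, the snippet below cross-validates the five scikit-learn classifiers on the six motivation attributes. The file name, column names, and target are assumptions standing in for the study's actual dataset.

```python
# Minimal sketch of the five-model comparison, using scikit-learn.
# Column names for the six motivation attributes and the target are assumed,
# not taken from the study's actual dataset.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

features = ["intrinsic", "extrinsic", "autonomy", "relatedness", "competence", "self_esteem"]
df = pd.read_csv("students.csv")           # hypothetical file with the 924 student records
X, y = df[features], df["study_strategy"]  # classification target; swap for grades + regressors

models = {
    "decision_tree": DecisionTreeClassifier(),
    "knn": KNeighborsClassifier(),
    "random_forest": RandomForestClassifier(n_estimators=200),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```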

Keywords: classification models, learning strategy, predictive modeling, regression models, student academic performance, student motivation, supervised machine learning

Procedia PDF Downloads 98
8606 A Machine Learning Approach for Intelligent Transportation System Management on Urban Roads

Authors: Ashish Dhamaniya, Vineet Jain, Rajesh Chouhan

Abstract:

Traffic management is one of the biggest issues on urban roads in almost all metropolitan cities in India. Speed is one of the critical traffic parameters for effective Intelligent Transportation System (ITS) implementation, as it decides the arrival rate of vehicles at an intersection, which is typically the point of congestion. The study aimed to leverage Machine Learning (ML) models to produce precise predictions of speed on urban roadway links. The research objective was to assess how categorized traffic volume and road width, serving as variables, influence speed prediction. Four tree-based regression models, namely Decision Tree (DT), Random Forest (RF), Extra Tree (ET), and Extreme Gradient Boost (XGB), are employed for this purpose. The models' performance was validated using test data, and the results demonstrate that Random Forest surpasses the other machine learning techniques as well as a conventional utility theory-based model in speed prediction. The study is useful for managing urban roadway network performance under mixed traffic conditions and for effective implementation of ITS.
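
A minimal sketch of the four tree-based regressors named above is given below; the CSV file and the feature/target column names (categorized traffic volume, road width, stream speed) are assumptions for illustration.

```python
# Minimal sketch of the four tree-based regressors compared above; feature names
# and the CSV file are assumptions, not the study's actual data layout.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from xgboost import XGBRegressor

df = pd.read_csv("urban_links.csv")                 # hypothetical field data
X = df[["traffic_volume_class", "road_width_m"]]
y = df["stream_speed_kmph"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

regressors = {
    "DT": DecisionTreeRegressor(),
    "RF": RandomForestRegressor(n_estimators=300),
    "ET": ExtraTreesRegressor(n_estimators=300),
    "XGB": XGBRegressor(n_estimators=300),
}
for name, reg in regressors.items():
    reg.fit(X_train, y_train)
    print(name, "R^2 on test data:", round(r2_score(y_test, reg.predict(X_test)), 3))
```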

Keywords: stream speed, urban roads, machine learning, traffic flow

Procedia PDF Downloads 33
8605 Computing Machinery and Legal Intelligence: Towards a Reflexive Model for Computer Automated Decision Support in Public Administration

Authors: Jacob Livingston Slosser, Naja Holten Moller, Thomas Troels Hildebrandt, Henrik Palmer Olsen

Abstract:

In this paper, we propose a model for human-AI interaction in public administration that involves legal decision-making. Inspired by Alan Turing's test for machine intelligence, we propose a way of institutionalizing a continuous working relationship between man and machine that aims at ensuring both good legal quality and higher efficiency in decision-making processes in public administration. We also suggest that our model enhances the legitimacy of using AI in public legal decision-making. We suggest that caseloads in public administration could be divided between a manual and an automated decision track. The automated decision track will be an algorithmic recommender system trained on former cases. To avoid unwanted feedback loops and biases, part of the caseload will be dealt with by both a human case worker and the automated recommender system. In those cases, an experienced human case worker will have the role of an evaluator, choosing between the two decisions. This model ensures that the algorithmic recommender system does not compromise the quality of legal decision-making in the institution. It also enhances the legitimacy of using algorithmic decision support because it provides justification for its use: the system is seen as superior to human decisions when its recommendations are preferred by experienced case workers. The paper outlines in some detail the process through which such a model could be implemented. It also addresses the important issue that legal decision-making is subject to legislative and judicial changes and that legal interpretation is context-sensitive. Both of these issues require continuous supervision of, and adjustments to, algorithmic recommender systems when they are used for legal decision-making purposes.

Keywords: administrative law, algorithmic decision-making, decision support, public law

Procedia PDF Downloads 188
8604 Predicting Potential Protein Therapeutic Candidates from the Gut Microbiome

Authors: Prasanna Ramachandran, Kareem Graham, Helena Kiefel, Sunit Jain, Todd DeSantis

Abstract:

Microbes that reside inside the mammalian GI tract, commonly referred to as the gut microbiome, have been shown to have therapeutic effects in animal models of disease. We hypothesize that specific proteins produced by these microbes are responsible for this activity and may be used directly as therapeutics. To speed up the discovery of these key proteins from big-data metagenomics, we have applied machine learning techniques. Using amino acid sequences of known epitopes and their corresponding binding partners, protein interaction descriptors (PIDs) were calculated, forming a positive interaction set. A negative interaction dataset was calculated using sequences of proteins known not to interact with these same binding partners. Using Random Forest with the positive and negative PIDs, a machine learning model was trained and used to predict interacting versus non-interacting proteins. Furthermore, a continuous variable, the cosine similarity between interaction descriptors, was used to rank bacterial therapeutic candidates. Laboratory binding assays were conducted to test the candidates for their potential as therapeutics. Results from the binding assays reveal the accuracy of the machine learning predictions and are subsequently used to further improve the model.
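
The sketch below illustrates the two steps named in this abstract: a Random Forest trained on positive/negative interaction descriptors, and a cosine-similarity ranking of new candidates. The descriptor computation itself is domain-specific, so pre-computed feature matrices with made-up shapes stand in for it; all names are illustrative.

```python
# Minimal sketch: train a Random Forest on positive/negative protein-interaction
# descriptors (PIDs) and rank candidates by cosine similarity to known positives.
# The PID computation is represented here by random placeholder matrices.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics.pairwise import cosine_similarity

pid_positive = np.random.rand(200, 64)   # descriptors of known interacting pairs
pid_negative = np.random.rand(200, 64)   # descriptors of known non-interacting pairs
X = np.vstack([pid_positive, pid_negative])
y = np.array([1] * len(pid_positive) + [0] * len(pid_negative))

clf = RandomForestClassifier(n_estimators=500).fit(X, y)

candidates = np.random.rand(50, 64)                      # descriptors of gut-microbiome proteins
interaction_prob = clf.predict_proba(candidates)[:, 1]   # predicted interacting vs. not
similarity = cosine_similarity(candidates, pid_positive).mean(axis=1)
ranking = np.argsort(-similarity)                        # best therapeutic candidates first
print(ranking[:10], interaction_prob[ranking[:10]])
```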

Keywords: protein-interactions, machine-learning, metagenomics, microbiome

Procedia PDF Downloads 343
8603 Harnessing Artificial Intelligence and Machine Learning for Advanced Fraud Detection and Prevention

Authors: Avinash Malladhi

Abstract:

Forensic accounting is a specialized field that involves the application of accounting principles, investigative skills, and legal knowledge to detect and prevent fraud. With the rise of big data and technological advancements, artificial intelligence (AI) and machine learning (ML) algorithms have emerged as powerful tools for forensic accountants to enhance their fraud detection capabilities. In this paper, we review and analyze various AI/ML algorithms that are commonly used in forensic accounting, including supervised and unsupervised learning, deep learning, natural language processing, Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Support Vector Machines (SVMs), Decision Trees, and Random Forests. We discuss their underlying principles, strengths, and limitations and provide empirical evidence from existing research studies demonstrating their effectiveness in detecting financial fraud. We also highlight potential ethical considerations and challenges associated with using AI/ML in forensic accounting, as well as the benefits of these technologies in improving fraud detection and prevention.

Keywords: AI, machine learning, forensic accounting & fraud detection, anti money laundering, Benford's law, fraud triangle theory

Procedia PDF Downloads 60
8602 An Analysis of Classification of Imbalanced Datasets by Using Synthetic Minority Over-Sampling Technique

Authors: Ghada A. Alfattni

Abstract:

Analysing unbalanced datasets is one of the challenges that practitioners in the machine learning field face. Many studies have been carried out to determine the effectiveness of the synthetic minority over-sampling technique (SMOTE) in addressing this issue. The aim of this study was therefore to compare the effectiveness of SMOTE across different models on unbalanced datasets. Three classification models (Logistic Regression, Support Vector Machine and Nearest Neighbour) were tested with multiple datasets; the same datasets were then oversampled using SMOTE and applied again to the three models to compare the differences in performance. Results of the experiments show that the highest number of nearest neighbours gives lower error rates.
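
Below is a minimal sketch of this experimental setup: the same three scikit-learn models are evaluated on an imbalanced dataset before and after SMOTE oversampling, using the imbalanced-learn package. The synthetic dataset and the F1 metric are illustrative choices, not the study's actual data or metric.

```python
# Minimal sketch: compare the three models on imbalanced data with and without SMOTE.
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {"LR": LogisticRegression(max_iter=1000), "SVM": SVC(), "kNN": KNeighborsClassifier()}
X_res, y_res = SMOTE(k_neighbors=5, random_state=0).fit_resample(X_tr, y_tr)
print("class counts before/after SMOTE:", Counter(y_tr), Counter(y_res))

for name, model in models.items():
    base = f1_score(y_te, model.fit(X_tr, y_tr).predict(X_te))
    smoted = f1_score(y_te, model.fit(X_res, y_res).predict(X_te))
    print(f"{name}: F1 imbalanced = {base:.3f}, F1 with SMOTE = {smoted:.3f}")
```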

Keywords: imbalanced datasets, SMOTE, machine learning, logistic regression, support vector machine, nearest neighbour

Procedia PDF Downloads 318
8601 Churn Prediction for Savings Bank Customers: A Machine Learning Approach

Authors: Prashant Verma

Abstract:

Commercial banks are facing immense pressure, including financial disintermediation, interest rate volatility and digital modes of finance. Retaining an existing customer is 5 to 25 times less expensive than acquiring a new one. This paper explores customer churn prediction based on various statistical and machine learning models and uses under-sampling to improve the predictive power of these models. The results show that, of the various machine learning models, Random Forest, which predicts churn with 78% accuracy, is the most powerful model for this scenario. Customer vintage, customer age, average balance, occupation code, population code, average withdrawal amount, and average number of transactions were found to be the variables with high predictive power for the churn prediction model. The model can be deployed by commercial banks to reduce customer churn so that they may retain the funds kept by savings bank (SB) customers. The article suggests a customized campaign to be initiated by commercial banks to avoid SB customer churn. By delivering better customer satisfaction and experience, commercial banks can thus limit customer churn and maintain their deposits.
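
A minimal sketch of under-sampling plus a Random Forest churn model follows; the file and column names are assumptions standing in for the variables listed in the abstract (vintage, age, average balance, occupation code, and so on).

```python
# Minimal sketch of under-sampling plus Random Forest for churn prediction.
import pandas as pd
from imblearn.under_sampling import RandomUnderSampler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("sb_customers.csv")               # hypothetical savings-bank extract
X = df[["vintage_months", "age", "avg_balance", "occupation_code",
        "population_code", "avg_withdrawal", "avg_txn_count"]]
y = df["churned"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)
X_bal, y_bal = RandomUnderSampler(random_state=1).fit_resample(X_tr, y_tr)

rf = RandomForestClassifier(n_estimators=400).fit(X_bal, y_bal)
print("accuracy:", accuracy_score(y_te, rf.predict(X_te)))
print(dict(zip(X.columns, rf.feature_importances_.round(3))))   # predictive power per variable
```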

Keywords: savings bank, customer churn, customer retention, random forests, machine learning, under-sampling

Procedia PDF Downloads 110
8600 Supervised Machine Learning Approach for Studying the Effect of Different Joint Sets on Stability of Mine Pit Slopes Under the Presence of Different External Factors

Authors: Sudhir Kumar Singh, Debashish Chakravarty

Abstract:

Slope stability analysis is an important aspect of geotechnical engineering. It is also important from a safety and economic point of view, as any slope failure leads to loss of valuable lives and damage to property worth millions. This paper aims at mitigating the risk of slope failure by studying the effect of different joint sets on the stability of mine pit slopes under the influence of various external factors, namely degree of saturation, rainfall intensity, and seismic coefficients. A supervised machine learning approach has been utilized for making accurate and reliable predictions regarding the stability of slopes based on the value of the Factor of Safety. Numerous cases have been studied using the popular Finite Element Method, and the data thus obtained have been used as training data for the supervised machine learning models. The input data have been trained on different supervised machine learning models, namely Random Forest, Decision Tree, Support Vector Machine, and XGBoost. Distinct test data not present in the training data have been used for measuring the performance and accuracy of the different models. Although all models performed well on the test dataset, Random Forest stands out due to its accuracy of greater than 95%, providing a valuable tool that is neither computationally expensive nor time-consuming and is in good accordance with the numerical analysis results.

Keywords: finite element method, geotechnical engineering, machine learning, slope stability

Procedia PDF Downloads 75
8599 Cirrhosis Mortality Prediction as Classification using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and novel data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis were collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidities, clinical procedures and laboratory tests is analyzed. A temporal pattern mining technique called Frequent Subgraph Mining (FSM) is used. The Model for End-stage Liver Disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement in the area under the curve (AUC). The FSM technique itself does not improve the model significantly, but FSM together with an ensemble machine learning technique further improves model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk of higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. To the best of our knowledge, this is the first work to apply modern machine learning algorithms and data analysis methods to predicting the one-year mortality of cirrhotic patients, and it builds a model that predicts one-year mortality significantly more accurately than the MELD score. We have also tested the potential of FSM and provided a new perspective on the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia PDF Downloads 110
8598 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients

Authors: Karina Zaccari, Ernesto Cordeiro Marujo

Abstract:

This paper presents a Machine Learning (ML) approach to support meningitis diagnosis in patients at a children's hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of meningitis given the results of blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including Adaptive Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). The Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy on training and testing data, respectively. These results represent a significant aid to doctors in diagnosing meningitis as early as possible and in avoiding expensive and painful procedures in some children.

Keywords: machine learning, medical diagnosis, meningitis detection, pediatric research

Procedia PDF Downloads 126
8597 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems have been developed to help customers find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) that need a data-based, analytical approach to stock the right movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers' demographic, behavioral and social information to predict their movie genre preferences. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data, and their in-sample errors were compared. Out-of-sample error was also compared under different Vapnik–Chervonenkis (VC) dimensions of the learning algorithm to detect and prevent overfitting. The Gaussian kernel SVM prediction model correctly predicts movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use a machine learning approach to predict customers' preferences with a small data set and to design prediction tools for these enterprises.
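
The sketch below compares a Gaussian (RBF) kernel SVM with logistic regression, using the regularization parameter C as the practical knob for limiting model capacity (the abstract frames this as reducing the VC dimension to curb overfitting). The customer file and its columns are placeholders, not the study's data.

```python
# Minimal sketch: RBF-kernel SVM vs. logistic regression for genre preference,
# with in-sample and out-of-sample accuracy reported for each model.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("customers.csv")                     # hypothetical SME customer data
X, y = df.drop(columns=["prefers_genre"]), df["prefers_genre"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale", C=1.0))
logreg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

for name, model in [("rbf SVM", svm), ("logistic regression", logreg)]:
    model.fit(X_tr, y_tr)
    print(name, "in-sample:", model.score(X_tr, y_tr), "out-of-sample:", model.score(X_te, y_te))
```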

Keywords: computational social science, movie preference, machine learning, SVM

Procedia PDF Downloads 238
8596 Career Guidance System Using Machine Learning

Authors: Mane Darbinyan, Lusine Hayrapetyan, Elen Matevosyan

Abstract:

Artificial Intelligence in Education (AIED) has been created to help students get ready for the workforce, and over the past 25 years it has grown significantly, offering a variety of technologies to support academic, institutional, and administrative services. However, this remains challenging, especially considering the labor market's rapid change. While choosing a career, people face various obstacles because they do not take their own preferences into consideration, which can lead to many other problems such as shifting jobs, work stress, occupational infirmity, reduced productivity, and manual error. Besides preferences, people should properly evaluate their technical and non-technical skills, as well as their personalities. Professional counseling has become a difficult undertaking for counselors due to the wide range of career choices brought on by changing technological trends. It is necessary to close this gap by utilizing technology that makes sophisticated predictions about a person's career goals based on their personality. Hence, there is a need to create an automated model that helps in decision-making based on user inputs. Improving career guidance can be achieved by embedding machine learning into the career consulting ecosystem. Various career guidance systems work on the same logic: classifying applicants, matching applications with appropriate departments or jobs, making predictions, and providing suitable recommendations. Methodologies such as KNN, Neural Networks, K-means clustering, D-Tree, and many other advanced algorithms are applied to these data to help predict the right careers. Besides helping users with their career choice, these systems provide numerous opportunities which are very useful while making this hard decision. They help candidates recognize where they specifically lack sufficient skills so that they can improve those skills. They can also offer an e-learning platform that takes into account the user's lack of knowledge. Furthermore, users can be provided with details on a particular job, such as the abilities required to excel in that industry.

Keywords: career guidance system, machine learning, career prediction, predictive decision, data mining, technical and non-technical skills

Procedia PDF Downloads 58
8595 Career Guidance System Using Machine Learning

Authors: Mane Darbinyan, Lusine Hayrapetyan, Elen Matevosyan

Abstract:

Artificial Intelligence in Education (AIED) has been created to help students get ready for the workforce, and over the past 25 years it has grown significantly, offering a variety of technologies to support academic, institutional, and administrative services. However, this remains challenging, especially considering the labor market's rapid change. While choosing a career, people face various obstacles because they do not take their own preferences into consideration, which can lead to many other problems such as shifting jobs, work stress, occupational infirmity, reduced productivity, and manual error. Besides preferences, people should properly evaluate their technical and non-technical skills, as well as their personalities. Professional counseling has become a difficult undertaking for counselors due to the wide range of career choices brought on by changing technological trends. It is necessary to close this gap by utilizing technology that makes sophisticated predictions about a person's career goals based on their personality. Hence, there is a need to create an automated model that helps in decision-making based on user inputs. Improving career guidance can be achieved by embedding machine learning into the career consulting ecosystem. Various career guidance systems work on the same logic: classifying applicants, matching applications with appropriate departments or jobs, making predictions, and providing suitable recommendations. Methodologies such as KNN, neural networks, K-means clustering, D-Tree, and many other advanced algorithms are applied to these data to help predict the right careers. Besides helping users with their career choice, these systems provide numerous opportunities which are very useful while making this hard decision. They help candidates recognize where they specifically lack sufficient skills so that they can improve those skills. They are also capable of offering an e-learning platform that takes into account the user's lack of knowledge. Furthermore, users can be provided with details on a particular job, such as the abilities required to excel in that industry.

Keywords: career guidance system, machine learning, career prediction, predictive decision, data mining, technical and non-technical skills

Procedia PDF Downloads 46
8594 Cardiovascular Disease Prediction Using Machine Learning Approaches

Authors: P. Halder, A. Zaman

Abstract:

It is estimated that heart disease accounts for one in ten deaths worldwide, and deaths due to heart disease are among the leading causes of death in the United States according to the World Health Organization. Cardiovascular diseases (CVDs) account for one in four U.S. deaths, according to the Centers for Disease Control and Prevention (CDC). According to statistics, women are more likely than men to die from heart disease as a result of strokes. A 50% increase in men's mortality was reported by the World Health Organization in 2009. The consequences of cardiovascular disease are severe. The causes of heart disease include diabetes, high blood pressure, high cholesterol, abnormal pulse rates, etc. Machine learning (ML) can be used to make predictions and decisions in the healthcare industry; thus, scientists have turned to modern technologies like machine learning and data mining to predict diseases. The disease prediction here is based on four algorithms, and compared with the other boosting methods, AdaBoost is much more accurate.
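
Below is a minimal sketch comparing AdaBoost with the other classifiers named in the keywords (random forest, SVM, decision tree) on a heart-disease-style feature table; the CSV file and column names are assumptions for illustration.

```python
# Minimal sketch: cross-validated comparison of AdaBoost against other classifiers.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

df = pd.read_csv("heart.csv")                        # e.g. age, blood pressure, cholesterol, pulse
X, y = df.drop(columns=["target"]), df["target"]     # target: 1 = cardiovascular disease

models = {
    "AdaBoost": AdaBoostClassifier(n_estimators=200),
    "Random Forest": RandomForestClassifier(n_estimators=200),
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(),
}
for name, model in models.items():
    print(name, "CV accuracy:", cross_val_score(model, X, y, cv=5).mean().round(3))
```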

Keywords: heart disease, cardiovascular disease, coronary artery disease, feature selection, random forest, AdaBoost, SVM, decision tree

Procedia PDF Downloads 129
8593 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has seen big advances over recent years because of the spread of the Internet, which generates a tremendous volume of data every day, and because of immense advances in the technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determines the group to which each data instance belongs within a given dataset. They are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning-based, and each type has its own limits. Nowadays, data are becoming increasingly heterogeneous; consequently, current classification techniques encounter many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. Results of the proposed approach exceeded those of the most common classification techniques, with an F-measure exceeding 97% on the Iris dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 435
8592 A Machine Learning Framework Based on Biometric Measurements for Automatic Fetal Head Anomalies Diagnosis in Ultrasound Images

Authors: Hanene Sahli, Aymen Mouelhi, Marwa Hajji, Amine Ben Slama, Mounir Sayadi, Farhat Fnaiech, Radhwane Rachdi

Abstract:

Fetal abnormality remains a public health problem affecting both mother and baby. Head defects are among the highest-risk fetal deformities. Fetal head categorization is a sensitive task that needs considerable attention from neurological experts. In this sense, biometric measurements can be extracted by gynecologists and compared with ground truth charts to identify normal or abnormal growth. Fetal head biometric measurements such as the Biparietal Diameter (BPD), Occipito-Frontal Diameter (OFD) and Head Circumference (HC) need to be monitored, and experts must carry out their manual delineation. This work proposes a new approach to automatically compute BPD, OFD and HC based on morphological characteristics extracted from the head shape. The studied data, selected at the same Gestational Age (GA) from fetal ultrasound (US) images, are classified into two categories: normal and abnormal. The abnormal subjects include hydrocephalus, microcephaly and dolichocephaly anomalies. Using a support vector machine (SVM) method, this study achieved high classification accuracy for automated detection of anomalies. The proposed method is promising and does not require expert intervention.
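
As a worked illustration, one common way to relate HC to BPD and OFD is to treat the head outline as an ellipse and use Ramanujan's perimeter approximation; this is a standard obstetric approximation, not necessarily the exact formula used in the paper, and the tiny SVM example that follows uses made-up measurements and labels.

```python
# Minimal sketch: derive head circumference (HC) from BPD and OFD by treating the
# head outline as an ellipse (Ramanujan's perimeter approximation), then classify
# normal vs. abnormal with an SVM. Measurements and labels below are illustrative.
import math
import numpy as np
from sklearn.svm import SVC

def head_circumference(bpd_mm: float, ofd_mm: float) -> float:
    a, b = bpd_mm / 2.0, ofd_mm / 2.0          # semi-axes of the ellipse
    return math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))

print(round(head_circumference(85.0, 105.0), 1), "mm")   # roughly 300 mm

# Features per subject at the same gestational age: [BPD, OFD, HC];
# labels: 0 = normal, 1 = abnormal (hydrocephalus / microcephaly / dolichocephaly).
X = np.array([[85, 105, head_circumference(85, 105)],
              [95, 118, head_circumference(95, 118)],
              [60,  80, head_circumference(60, 80)],
              [78, 128, head_circumference(78, 128)]], dtype=float)
y = np.array([0, 1, 1, 1])
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
print(clf.predict([[86, 106, head_circumference(86, 106)]]))
```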

Keywords: biometric measurements, fetal head malformations, machine learning methods, US images

Procedia PDF Downloads 259
8591 Systematic and Meta-Analysis of Navigation in Oral and Maxillofacial Trauma and Impact of Machine Learning and AI in Management

Authors: Shohreh Ghasemi

Abstract:

Introduction: Managing oral and maxillofacial trauma is a multifaceted challenge, as it can have life-threatening consequences and a significant functional and aesthetic impact. Navigation techniques have been introduced to improve surgical precision to meet this challenge. Machine learning algorithms have also been developed to support clinical decision-making in the treatment of oral and maxillofacial trauma. Given these advances, this systematic meta-analysis aims to assess the efficacy of navigational techniques in treating oral and maxillofacial trauma and to explore the impact of machine learning on their management. Methods: A detailed and comprehensive analysis of studies published between January 2010 and September 2021 was conducted through a systematic meta-analysis. This included a thorough search of the Web of Science, Embase, and PubMed databases to identify studies evaluating the efficacy of navigational techniques and the impact of machine learning in managing oral and maxillofacial trauma. Studies that did not meet the established inclusion criteria were excluded. In addition, the overall quality of the included studies was evaluated using the Cochrane risk-of-bias tool and the Newcastle-Ottawa scale. Results: A total of 12 studies, including 869 patients with oral and maxillofacial trauma, met the inclusion criteria. The analysis revealed that navigation techniques effectively improve surgical accuracy and minimize the risk of complications. Additionally, machine learning algorithms have proven effective in predicting treatment outcomes and identifying patients at high risk of complications. Conclusion: The introduction of navigational technology has great potential to improve surgical precision in oral and maxillofacial trauma treatment. Furthermore, the development of machine learning algorithms offers opportunities to improve clinical decision-making and patient outcomes. Still, further studies are necessary to corroborate these results and establish the optimal use of these technologies in managing oral and maxillofacial trauma.

Keywords: trauma, machine learning, navigation, maxillofacial, management

Procedia PDF Downloads 40
8590 Improving Machine Learning Translation of Hausa Using Named Entity Recognition

Authors: Aishatu Ibrahim Birma, Aminu Tukur, Abdulkarim Abbass Gora

Abstract:

Machine translation plays a vital role in the field of Natural Language Processing (NLP), breaking down language barriers and enabling communication across diverse communities. In the context of Hausa, a language widely spoken in West Africa, mainly in Nigeria, effective translation systems are essential for enabling seamless communication and promoting cultural exchange. However, due to the unique linguistic characteristics of Hausa, accurate translation remains a challenging task. This research proposes an approach to improving machine learning translation of Hausa by integrating Named Entity Recognition (NER) techniques. Named entities, such as person names, locations, organizations, and dates, are critical components of a language's structure and meaning. Incorporating NER into the translation process can enhance the quality and accuracy of translations by preserving the integrity of named entities, maintaining consistency when translating entities (e.g., proper names), and addressing cultural references specific to Hausa. NER will be incorporated into Neural Machine Translation (NMT) for Hausa-to-English translation.
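
The abstract does not specify how NER is combined with the NMT model, so the sketch below shows one common integration strategy only: masking named entities with placeholder tokens before translation and restoring them afterwards so proper names survive unchanged. The gazetteer-based NER and the phrase-table translate() are toy stand-ins, not the paper's models.

```python
# Minimal sketch of one common NER/NMT integration strategy: mask named entities
# with placeholders before translation and restore them afterwards.
def toy_ner(text: str) -> list[str]:
    # Stand-in for a real Hausa NER model: match a tiny gazetteer of entities.
    gazetteer = ["Aminu", "Kano", "Abuja"]
    return [e for e in gazetteer if e in text]

def translate(masked_text: str) -> str:
    # Stand-in for the Hausa-to-English NMT model; here a toy phrase lookup.
    phrase_table = {"__ENT0__ yana zaune a __ENT1__": "__ENT0__ lives in __ENT1__"}
    return phrase_table.get(masked_text, masked_text)

def translate_with_ner(sentence: str) -> str:
    entities = toy_ner(sentence)
    masked = sentence
    for i, ent in enumerate(entities):                   # mask entities before NMT
        masked = masked.replace(ent, f"__ENT{i}__")
    translated = translate(masked)                       # NMT sees placeholders, not names
    for i, ent in enumerate(entities):                   # restore entities afterwards
        translated = translated.replace(f"__ENT{i}__", ent)
    return translated

print(translate_with_ner("Aminu yana zaune a Kano"))     # -> "Aminu lives in Kano"
```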

Keywords: machine translation, natural language processing (NLP), named entity recognition (NER), neural machine translation (NMT)

Procedia PDF Downloads 12
8589 Machine Learning Approach for Lateralization of Temporal Lobe Epilepsy

Authors: Samira-Sadat JamaliDinan, Haidar Almohri, Mohammad-Reza Nazem-Zadeh

Abstract:

Lateralization of temporal lobe epilepsy (TLE) is very important for positive surgical outcomes. We propose a machine learning framework to identify the epileptogenic hemisphere in TLE cases using magnetoencephalography (MEG) coherence source imaging (CSI) and diffusion tensor imaging (DTI). Unlike most studies, which use classification algorithms, we propose an effective clustering approach to distinguish between normal and TLE cases. We apply the well-known Minkowski weighted K-Means (MWK-Means) technique as the clustering framework. To overcome the problem of poor initialization in K-Means, we use particle swarm optimization (PSO) to effectively select the initial cluster centroids prior to applying MWK-Means. We demonstrate that, compared to K-Means and MWK-Means independently, this approach improves the results on a benchmark data set.

Keywords: temporal lobe epilepsy, machine learning, clustering, magnetoencephalography

Procedia PDF Downloads 125
8588 A System to Detect Inappropriate Messages in Online Social Networks

Authors: Shivani Singh, Shantanu Nakhare, Kalyani Nair, Rohan Shetty

Abstract:

As social networking is growing at a rapid pace today, it is vital that we work on improving its management. Research has shown that the content present in online social networks may have a significant influence on impressionable minds. If such platforms are misused, the consequences will be negative. Detecting insults or inappropriate messages continues to be one of the most challenging aspects of Online Social Networks (OSNs) today. We address this problem through a machine learning-based soft text classifier approach using the Support Vector Machine algorithm. The proposed system acts as a screening mechanism that alerts the user about such messages. Messages are classified according to their subject matter, and each comment is labeled for the presence of profanity and insults.
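
A minimal sketch of such a screening classifier is given below: TF-IDF features fed to a linear SVM that flags insulting messages. The training messages are toy placeholders, and LinearSVC stands in for the SVM algorithm named in the abstract.

```python
# Minimal sketch: TF-IDF + linear SVM text screener for inappropriate messages.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

messages = [
    "you are an idiot and everyone hates you",     # insult
    "what a stupid worthless comment",             # insult
    "great photo, thanks for sharing!",            # clean
    "see you at the meeting tomorrow",             # clean
]
labels = [1, 1, 0, 0]                              # 1 = inappropriate, 0 = acceptable

screener = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
screener.fit(messages, labels)

incoming = ["thanks, that was really helpful", "you worthless idiot"]
for msg, flag in zip(incoming, screener.predict(incoming)):
    print("ALERT:" if flag else "ok:   ", msg)
```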

Keywords: machine learning, online social networks, soft text classifier, support vector machine

Procedia PDF Downloads 478
8587 System for the Detecting of Fake Profiles on Online Social Networks Using Machine Learning and the Bio-Inspired Algorithms

Authors: Sekkal Nawel, Mahammed Nadir

Abstract:

The proliferation of online activity on Online Social Networks (OSNs) has captured significant user attention. However, this growth has been hindered by the emergence of fraudulent accounts that do not represent real individuals and violate privacy regulations within social network communities. Consequently, it is imperative to identify and remove these profiles to enhance the security of OSN users. In recent years, researchers have turned to machine learning (ML) to develop strategies and methods to tackle this issue. Numerous studies have compared various ML-based techniques in this field. However, the existing literature still lacks a comprehensive examination, especially one considering different OSN platforms, and the use of bio-inspired algorithms has been largely overlooked. Our study conducts an extensive comparative analysis of fake profile detection techniques in online social networks. The results indicate that supervised models, along with other machine learning techniques, as well as unsupervised models, are effective for detecting false profiles in social media. To achieve optimal results, we have incorporated six bio-inspired algorithms to enhance the performance of fake profile identification.

Keywords: machine learning, bio-inspired algorithm, detection, fake profile, system, social network

Procedia PDF Downloads 40
8586 Predicting the Frequencies of Tropical Cyclone-Induced Rainfall Events in the US Using a Machine-Learning Model

Authors: Elham Sharifineyestani, Mohammad Farshchin

Abstract:

Tropical cyclones are among the most expensive and deadliest natural disasters. They cause heavy rainfall and serious flash flooding that result in billions of dollars of damage and considerable mortality each year in the United States. Predicting the frequency of tropical cyclone-induced rainfall events can be helpful in emergency planning and flood risk management. In this study, we have developed a machine-learning model to predict the exceedance frequencies of tropical cyclone-induced rainfall events in the United States. Model results show satisfactory agreement with available observations. To examine the effectiveness of our approach, we have also compared our predictions with the exceedance frequencies predicted using a physics-based rainfall model by Feldmann.

Keywords: flash flooding, tropical cyclones, frequencies, machine learning, risk management

Procedia PDF Downloads 218
8585 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study is about innovations and developments in data science and gives an idea of how to use the available data efficiently. This study will help readers understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing; its main principle is to create an artificial system that can run independently of human-given programs and can function by analyzing data to understand the requirements of its users. Data science comprises business understanding, data analysis, ethical concerns, knowledge of programming languages, various fields and sources of data, skills, and more. The usage of data science has evolved over the years. In this review article, we cover one part of data science, namely machine learning. Machine learning uses data science for its work: machines learn through experience, which helps them do any work more efficiently. This article includes a comparative illustration of human understanding and machine understanding, along with the advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the lives of human beings. Since the advent of data science, we have seen its benefits: how it leads to a better understanding of people, how it addresses individual needs, how it has improved business strategies and the services provided, forecasting, and the ability to pursue sustainable development. This study also focuses on a better understanding of data science, which will help us create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 50
8584 Integration of Big Data to Predict Transportation for Smart Cities

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

Intelligent transportation systems are essential to building smarter cities. Machine learning-based transportation prediction is a highly promising approach because it makes invisible aspects of the system visible. In this context, this research aims to build a prototype model that predicts the transportation network by using big data and machine learning technology. Among urban transportation systems, this research focuses on the bus system. The research problem is that the existing headway model cannot respond to dynamic transportation conditions, so bus delays often occur. To overcome this problem, a prediction model is presented that finds patterns of bus delay by applying machine learning to the following data sets: traffic, weather, and bus status. This research presents a flexible headway model to predict bus delay and analyzes the results. The prototype model is built from real-time bus data. The data are gathered through public data portals and real-time Application Programming Interfaces (APIs) provided by the government. These data are the fundamental resources for organizing interval pattern models of bus operations together with traffic environment factors (road speeds, station conditions, weather, and real-time bus operating information). The prototype model was designed with a machine learning tool (RapidMiner Studio), and tests were conducted for bus delay prediction. This research presents experiments to increase prediction accuracy for bus headway by analyzing urban big data. Big data analysis is important for predicting the future and for finding correlations by processing huge amounts of data. Therefore, based on this analysis method, the research represents an effective use of machine learning and urban big data to understand urban dynamics.

Keywords: big data, machine learning, smart city, social cost, transportation network

Procedia PDF Downloads 231
8583 Loan Repayment Prediction Using Machine Learning: Model Development, Django Web Integration and Cloud Deployment

Authors: Seun Mayowa Sunday

Abstract:

Loan prediction is one of the most significant and recognised fields of research in the banking, insurance, and financial security industries. Some prediction systems on the market are built as static software; however, because static software operates only with strictly defined rules, it cannot aid customers beyond those limitations. The application of machine learning (ML) techniques is therefore required for loan prediction. Four separate machine learning models, random forest (RF), decision tree (DT), k-nearest neighbour (KNN), and logistic regression, are used to create the loan prediction model. Using the Anaconda Navigator and the required ML libraries, the models are created and evaluated using appropriate measurement metrics. From the findings, random forest performs with the highest accuracy, 80.17%, and this model was implemented within the Django framework. For real-time testing, the web application is deployed on Alibaba Cloud, which is among the four biggest cloud computing providers. Hence, to the best of our knowledge, this research serves as the first academic paper that combines model development and the Django framework with deployment on Alibaba Cloud.
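
The sketch below illustrates the modeling step described in this abstract: compare the four models, keep the best (Random Forest in the study), and serialize it so a Django view can load it for real-time predictions. The file name, column names, and accuracy values printed are assumptions, not the study's artifacts.

```python
# Minimal sketch: compare four loan-repayment models and serialize the best one
# so a Django view can later joblib.load() it for real-time predictions.
import joblib
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("loans.csv")                          # hypothetical applicant records
X, y = df.drop(columns=["repaid"]), df["repaid"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=7)

models = {
    "random_forest": RandomForestClassifier(n_estimators=300),
    "decision_tree": DecisionTreeClassifier(),
    "knn": KNeighborsClassifier(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}
best_model, best_acc = None, 0.0
for name, model in models.items():
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: accuracy = {acc:.4f}")
    if acc > best_acc:
        best_model, best_acc = model, acc

joblib.dump(best_model, "loan_model.joblib")   # load this file inside the Django view
```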

Keywords: k-nearest neighbor, random forest, logistic regression, decision tree, django, cloud computing, alibaba cloud

Procedia PDF Downloads 103
8582 Breast Cancer Diagnosing Based on Online Sequential Extreme Learning Machine Approach

Authors: Musatafa Abbas Abbood Albadr, Masri Ayob, Sabrina Tiun, Fahad Taha Al-Dhief, Mohammad Kamrul Hasan

Abstract:

Breast Cancer (BC) is considered one of the most frequent causes of cancer death in women between the ages of 40 and 55. BC is diagnosed using digital images of Fine Needle Aspirates (FNA) of the breast mass, covering both benign and malignant tumors. This work therefore proposes the Online Sequential Extreme Learning Machine (OSELM) algorithm for diagnosing BC using the tumor features of the breast mass. The current work uses the Wisconsin Diagnosis Breast Cancer (WDBC) dataset, which contains 569 samples (357 samples for the benign class and 212 samples for the malignant class). Further, numerous assessment measures were used to evaluate the proposed OSELM algorithm, such as specificity, precision, F-measure, accuracy, G-mean, MCC, and recall. According to the outcomes of the experiment, the best performance of the proposed OSELM was achieved with 97.66% accuracy, 98.39% recall, 95.31% precision, 97.25% specificity, 96.83% F-measure, 95.00% MCC, and 96.84% G-mean. The proposed OSELM algorithm demonstrates promising results in diagnosing BC. Moreover, the performance of the proposed OSELM algorithm was superior to all its comparatives with respect to classification rate.
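
For readers unfamiliar with OS-ELM, the sketch below shows the standard formulation (random fixed hidden weights, an initial least-squares solve, then recursive least-squares updates per data chunk), applied to the WDBC dataset that the abstract uses. The hidden-layer size, chunk sizes, and decision threshold are illustrative choices, not the authors' settings.

```python
# Minimal sketch of an Online Sequential Extreme Learning Machine (OS-ELM):
# random hidden weights, an initial batch solve, then per-chunk recursive updates.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X, y = load_breast_cancer(return_X_y=True)           # 569 samples, benign vs. malignant
X = StandardScaler().fit_transform(X)
T = y.reshape(-1, 1).astype(float)
X_tr, X_te, T_tr, T_te = train_test_split(X, T, test_size=0.2, random_state=0)

n_hidden = 100
rng = np.random.default_rng(0)
W = rng.normal(size=(X.shape[1], n_hidden))           # fixed random input weights
b = rng.normal(size=n_hidden)                         # fixed random biases
H = lambda data: sigmoid(data @ W + b)                # hidden-layer activations

# Initial batch (first 100 samples): regularized least squares for the output weights.
H0 = H(X_tr[:100])
P = np.linalg.inv(H0.T @ H0 + 1e-6 * np.eye(n_hidden))
beta = P @ H0.T @ T_tr[:100]

# Online phase: fold in the remaining data chunk by chunk (recursive least squares).
for start in range(100, len(X_tr), 50):
    Hk = H(X_tr[start:start + 50])
    Tk = T_tr[start:start + 50]
    K = np.linalg.inv(np.eye(len(Hk)) + Hk @ P @ Hk.T)
    P = P - P @ Hk.T @ K @ Hk @ P
    beta = beta + P @ Hk.T @ (Tk - Hk @ beta)

pred = (H(X_te) @ beta > 0.5).astype(int)
print("test accuracy:", (pred == T_te.astype(int)).mean())
```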

Keywords: breast cancer, machine learning, online sequential extreme learning machine, artificial intelligence

Procedia PDF Downloads 89