Search results for: statistical machine learning
12038 A Review of Machine Learning for Big Data
Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.
Abstract:
Big data are now rapidly expanding in all engineering and science and many other domains. The potential of large or massive data is undoubtedly significant, make sense to require new ways of thinking and learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. In this paper, the latest advances and advancements in the researches on machine learning for big data processing. First, the machine learning techniques methods in recent studies, such as deep learning, representation learning, transfer learning, active learning and distributed and parallel learning. Then focus on the challenges and possible solutions of machine learning for big data.Keywords: active learning, big data, deep learning, machine learning
Procedia PDF Downloads 44512037 Detect QOS Attacks Using Machine Learning Algorithm
Authors: Christodoulou Christos, Politis Anastasios
Abstract:
A large majority of users favoured to wireless LAN connection since it was so simple to use. A wireless network can be the target of numerous attacks. Class hijacking is a well-known attack that is fairly simple to execute and has significant repercussions on users. The statistical flow analysis based on machine learning (ML) techniques is a promising categorization methodology. In a given dataset, which in the context of this paper is a collection of components representing frames belonging to various flows, machine learning (ML) can offer a technique for identifying and characterizing structural patterns. It is possible to classify individual packets using these patterns. It is possible to identify fraudulent conduct, such as class hijacking, and take necessary action as a result. In this study, we explore a way to use machine learning approaches to thwart this attack.Keywords: wireless lan, quality of service, machine learning, class hijacking, EDCA remapping
Procedia PDF Downloads 6112036 TDApplied: An R Package for Machine Learning and Inference with Persistence Diagrams
Authors: Shael Brown, Reza Farivar
Abstract:
Persistence diagrams capture valuable topological features of datasets that other methods cannot uncover. Still, their adoption in data pipelines has been limited due to the lack of publicly available tools in R (and python) for analyzing groups of them with machine learning and statistical inference. In an easy-to-use and scalable R package called TDApplied, we implement several applied analysis methods tailored to groups of persistence diagrams. The two main contributions of our package are comprehensiveness (most functions do not have implementations elsewhere) and speed (shown through benchmarking against other R packages). We demonstrate applications of the tools on simulated data to illustrate how easily practical analyses of any dataset can be enhanced with topological information.Keywords: machine learning, persistence diagrams, R, statistical inference
Procedia PDF Downloads 8512035 Predictive Maintenance of Industrial Shredders: Efficient Operation through Real-Time Monitoring Using Statistical Machine Learning
Authors: Federico Pittino, Thomas Arnold
Abstract:
The shredding of waste materials is a key step in the recycling process towards the circular economy. Industrial shredders for waste processing operate in very harsh operating conditions, leading to the need for frequent maintenance of critical components. Maintenance optimization is particularly important also to increase the machine’s efficiency, thereby reducing the operational costs. In this work, a monitoring system has been developed and deployed on an industrial shredder located at a waste recycling plant in Austria. The machine has been monitored for one year, and methods for predictive maintenance have been developed for two key components: the cutting knives and the drive belt. The large amount of collected data is leveraged by statistical machine learning techniques, thereby not requiring very detailed knowledge of the machine or its live operating conditions. The results show that, despite the wide range of operating conditions, a reliable estimate of the optimal time for maintenance can be derived. Moreover, the trade-off between the cost of maintenance and the increase in power consumption due to the wear state of the monitored components of the machine is investigated. This work proves the benefits of real-time monitoring system for the efficient operation of industrial shredders.Keywords: predictive maintenance, circular economy, industrial shredder, cost optimization, statistical machine learning
Procedia PDF Downloads 12212034 Modern Machine Learning Conniptions for Automatic Speech Recognition
Authors: S. Jagadeesh Kumar
Abstract:
This expose presents a luculent of recent machine learning practices as employed in the modern and as pertinent to prospective automatic speech recognition schemes. The aspiration is to promote additional traverse ablution among the machine learning and automatic speech recognition factions that have transpired in the precedent. The manuscript is structured according to the chief machine learning archetypes that are furthermore trendy by now or have latency for building momentous hand-outs to automatic speech recognition expertise. The standards offered and convoluted in this article embraces adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, supervised and unsupervised learning. These learning archetypes are aggravated and conferred in the perspective of automatic speech recognition tools and functions. This manuscript bequeaths and surveys topical advances of deep learning and learning with sparse depictions; further limelight is on their incessant significance in the evolution of automatic speech recognition.Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning
Procedia PDF Downloads 44712033 Quantum Statistical Machine Learning and Quantum Time Series
Authors: Omar Alzeley, Sergey Utev
Abstract:
Minimizing a constrained multivariate function is the fundamental of Machine learning, and these algorithms are at the core of data mining and data visualization techniques. The decision function that maps input points to output points is based on the result of optimization. This optimization is the central of learning theory. One approach to complex systems where the dynamics of the system is inferred by a statistical analysis of the fluctuations in time of some associated observable is time series analysis. The purpose of this paper is a mathematical transition from the autoregressive model of classical time series to the matrix formalization of quantum theory. Firstly, we have proposed a quantum time series model (QTS). Although Hamiltonian technique becomes an established tool to detect a deterministic chaos, other approaches emerge. The quantum probabilistic technique is used to motivate the construction of our QTS model. The QTS model resembles the quantum dynamic model which was applied to financial data. Secondly, various statistical methods, including machine learning algorithms such as the Kalman filter algorithm, are applied to estimate and analyses the unknown parameters of the model. Finally, simulation techniques such as Markov chain Monte Carlo have been used to support our investigations. The proposed model has been examined by using real and simulated data. We establish the relation between quantum statistical machine and quantum time series via random matrix theory. It is interesting to note that the primary focus of the application of QTS in the field of quantum chaos was to find a model that explain chaotic behaviour. Maybe this model will reveal another insight into quantum chaos.Keywords: machine learning, simulation techniques, quantum probability, tensor product, time series
Procedia PDF Downloads 46912032 Tongue Image Retrieval Based Using Machine Learning
Authors: Ahmad FAROOQ, Xinfeng Zhang, Fahad Sabah, Raheem Sarwar
Abstract:
In Traditional Chinese Medicine, tongue diagnosis is a vital inspection tool (TCM). In this study, we explore the potential of machine learning in tongue diagnosis. It begins with the cataloguing of the various classifications and characteristics of the human tongue. We infer 24 kinds of tongues from the material and coating of the tongue, and we identify 21 attributes of the tongue. The next step is to apply machine learning methods to the tongue dataset. We use the Weka machine learning platform to conduct the experiment for performance analysis. The 457 instances of the tongue dataset are used to test the performance of five different machine learning methods, including SVM, Random Forests, Decision Trees, and Naive Bayes. Based on accuracy and Area under the ROC Curve, the Support Vector Machine algorithm was shown to be the most effective for tongue diagnosis (AUC).Keywords: medical imaging, image retrieval, machine learning, tongue
Procedia PDF Downloads 8112031 Optimize Data Evaluation Metrics for Fraud Detection Using Machine Learning
Authors: Jennifer Leach, Umashanger Thayasivam
Abstract:
The use of technology has benefited society in more ways than one ever thought possible. Unfortunately, though, as society’s knowledge of technology has advanced, so has its knowledge of ways to use technology to manipulate people. This has led to a simultaneous advancement in the world of fraud. Machine learning techniques can offer a possible solution to help decrease this advancement. This research explores how the use of various machine learning techniques can aid in detecting fraudulent activity across two different types of fraudulent data, and the accuracy, precision, recall, and F1 were recorded for each method. Each machine learning model was also tested across five different training and testing splits in order to discover which testing split and technique would lead to the most optimal results.Keywords: data science, fraud detection, machine learning, supervised learning
Procedia PDF Downloads 19512030 Machine Learning Development Audit Framework: Assessment and Inspection of Risk and Quality of Data, Model and Development Process
Authors: Jan Stodt, Christoph Reich
Abstract:
The usage of machine learning models for prediction is growing rapidly and proof that the intended requirements are met is essential. Audits are a proven method to determine whether requirements or guidelines are met. However, machine learning models have intrinsic characteristics, such as the quality of training data, that make it difficult to demonstrate the required behavior and make audits more challenging. This paper describes an ML audit framework that evaluates and reviews the risks of machine learning applications, the quality of the training data, and the machine learning model. We evaluate and demonstrate the functionality of the proposed framework by auditing an steel plate fault prediction model.Keywords: audit, machine learning, assessment, metrics
Procedia PDF Downloads 27112029 Inferring Human Mobility in India Using Machine Learning
Authors: Asra Yousuf, Ajaykumar Tannirkulum
Abstract:
Inferring rural-urban migration trends can help design effective policies that promote better urban planning and rural development. In this paper, we describe how machine learning algorithms can be applied to predict internal migration decisions of people. We consider data collected from household surveys in Tamil Nadu to train our model. To measure the performance of the model, we use data on past migration from National Sample Survey Organisation of India. The factors for training the model include socioeconomic characteristic of each individual like age, gender, place of residence, outstanding loans, strength of the household, etc. and his past migration history. We perform a comparative analysis of the performance of a number of machine learning algorithm to determine their prediction accuracy. Our results show that machine learning algorithms provide a stronger prediction accuracy as compared to statistical models. Our goal through this research is to propose the use of data science techniques in understanding human decisions and behaviour in developing countries.Keywords: development, migration, internal migration, machine learning, prediction
Procedia PDF Downloads 27112028 Students' Statistical Reasoning and Attitudes towards Statistics in Blended Learning, E-Learning and On-Campus Learning
Authors: Petros Roussos
Abstract:
The present study focused on students' statistical reasoning related to Null Hypothesis Statistical Testing and p-values. Its objective was to test the hypothesis that neither the place (classroom, at a distance, online) nor the medium that actually supports the learning (ICT, internet, books) has an effect on understanding of statistical concepts. In addition, it was expected that students' attitudes towards statistics would not predict understanding of statistical concepts. The sample consisted of 385 undergraduate and postgraduate students from six state and private universities (five in Greece and one in Cyprus). Students were administered two questionnaires: a) the Greek version of the Survey of Attitudes Toward Statistics, and b) a short instrument which measures students' understanding of statistical significance and p-values. Results suggest that attitudes towards statistics do not predict students' understanding of statistical concepts, whereas the medium did not have an effect.Keywords: attitudes towards statistics, blended learning, e-learning, statistical reasoning
Procedia PDF Downloads 31012027 Comparing Machine Learning Estimation of Fuel Consumption of Heavy-Duty Vehicles
Authors: Victor Bodell, Lukas Ekstrom, Somayeh Aghanavesi
Abstract:
Fuel consumption (FC) is one of the key factors in determining expenses of operating a heavy-duty vehicle. A customer may therefore request an estimate of the FC of a desired vehicle. The modular design of heavy-duty vehicles allows their construction by specifying the building blocks, such as gear box, engine and chassis type. If the combination of building blocks is unprecedented, it is unfeasible to measure the FC, since this would first r equire the construction of the vehicle. This paper proposes a machine learning approach to predict FC. This study uses around 40,000 vehicles specific and o perational e nvironmental c onditions i nformation, such as road slopes and driver profiles. A ll v ehicles h ave d iesel engines and a mileage of more than 20,000 km. The data is used to investigate the accuracy of machine learning algorithms Linear regression (LR), K-nearest neighbor (KNN) and Artificial n eural n etworks (ANN) in predicting fuel consumption for heavy-duty vehicles. Performance of the algorithms is evaluated by reporting the prediction error on both simulated data and operational measurements. The performance of the algorithms is compared using nested cross-validation and statistical hypothesis testing. The statistical evaluation procedure finds that ANNs have the lowest prediction error compared to LR and KNN in estimating fuel consumption on both simulated and operational data. The models have a mean relative prediction error of 0.3% on simulated data, and 4.2% on operational data.Keywords: artificial neural networks, fuel consumption, friedman test, machine learning, statistical hypothesis testing
Procedia PDF Downloads 17812026 A Machine Learning Approach for Anomaly Detection in Environmental IoT-Driven Wastewater Purification Systems
Authors: Giovanni Cicceri, Roberta Maisano, Nathalie Morey, Salvatore Distefano
Abstract:
The main goal of this paper is to present a solution for a water purification system based on an Environmental Internet of Things (EIoT) platform to monitor and control water quality and machine learning (ML) models to support decision making and speed up the processes of purification of water. A real case study has been implemented by deploying an EIoT platform and a network of devices, called Gramb meters and belonging to the Gramb project, on wastewater purification systems located in Calabria, south of Italy. The data thus collected are used to control the wastewater quality, detect anomalies and predict the behaviour of the purification system. To this extent, three different statistical and machine learning models have been adopted and thus compared: Autoregressive Integrated Moving Average (ARIMA), Long Short Term Memory (LSTM) autoencoder, and Facebook Prophet (FP). The results demonstrated that the ML solution (LSTM) out-perform classical statistical approaches (ARIMA, FP), in terms of both accuracy, efficiency and effectiveness in monitoring and controlling the wastewater purification processes.Keywords: environmental internet of things, EIoT, machine learning, anomaly detection, environment monitoring
Procedia PDF Downloads 15112025 Prediction of Disability-Adjustment Mental Illness Using Machine Learning
Authors: S. R. M. Krishna, R. Santosh Kumar, V. Kamakshi Prasad
Abstract:
Machine learning techniques are applied for the analysis of the impact of mental illness on the burden of disease. It is calculated using the disability-adjusted life year (DALY). DALYs for a disease is the sum of years of life lost due to premature mortality (YLLs) + No of years of healthy life lost due to disability (YLDs). The critical analysis is done based on the Data sources, machine learning techniques and feature extraction method. The reviewing is done based on major databases. The extracted data is examined using statistical analysis and machine learning techniques were applied. The prediction of the impact of mental illness on the population using machine learning techniques is an alternative approach to the old traditional strategies, which are time-consuming and may not be reliable. The approach makes it necessary for a comprehensive adoption, innovative algorithms, and an understanding of the limitations and challenges. The obtained prediction is a way of understanding the underlying impact of mental illness on the health of the people and it enables us to get a healthy life expectancy. The growing impact of mental illness and the challenges associated with the detection and treatment of mental disorders make it necessary for us to understand the complete effect of it on the majority of the population. Procedia PDF Downloads 3612024 Quantum Kernel Based Regressor for Prediction of Non-Markovianity of Open Quantum Systems
Authors: Diego Tancara, Raul Coto, Ariel Norambuena, Hoseein T. Dinani, Felipe Fanchini
Abstract:
Quantum machine learning is a growing research field that aims to perform machine learning tasks assisted by a quantum computer. Kernel-based quantum machine learning models are paradigmatic examples where the kernel involves quantum states, and the Gram matrix is calculated from the overlapping between these states. With the kernel at hand, a regular machine learning model is used for the learning process. In this paper we investigate the quantum support vector machine and quantum kernel ridge models to predict the degree of non-Markovianity of a quantum system. We perform digital quantum simulation of amplitude damping and phase damping channels to create our quantum dataset. We elaborate on different kernel functions to map the data and kernel circuits to compute the overlapping between quantum states. We observe a good performance of the models.Keywords: quantum, machine learning, kernel, non-markovianity
Procedia PDF Downloads 18012023 A Study on Big Data Analytics, Applications and Challenges
Authors: Chhavi Rana
Abstract:
The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, Healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organization's decision-making strategy can be enhanced using big data analytics and applying different machine learning techniques and statistical tools on such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of Analysis using different machine-learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.Keywords: big data, big data analytics, machine learning, review
Procedia PDF Downloads 8312022 A Study on Big Data Analytics, Applications, and Challenges
Authors: Chhavi Rana
Abstract:
The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organisation decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.Keywords: big data, big data analytics, machine learning, review
Procedia PDF Downloads 9512021 Enabling Non-invasive Diagnosis of Thyroid Nodules with High Specificity and Sensitivity
Authors: Sai Maniveer Adapa, Sai Guptha Perla, Adithya Reddy P.
Abstract:
Thyroid nodules can often be diagnosed with ultrasound imaging, although differentiating between benign and malignant nodules can be challenging for medical professionals. This work suggests a novel approach to increase the precision of thyroid nodule identification by combining machine learning and deep learning. The new approach first extracts information from the ultrasound pictures using a deep learning method known as a convolutional autoencoder. A support vector machine, a type of machine learning model, is then trained using these features. With an accuracy of 92.52%, the support vector machine can differentiate between benign and malignant nodules. This innovative technique may decrease the need for pointless biopsies and increase the accuracy of thyroid nodule detection.Keywords: thyroid tumor diagnosis, ultrasound images, deep learning, machine learning, convolutional auto-encoder, support vector machine
Procedia PDF Downloads 5812020 A Deep Learning Approach to Subsection Identification in Electronic Health Records
Authors: Nitin Shravan, Sudarsun Santhiappan, B. Sivaselvan
Abstract:
Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. In this work, we classify the text present in EHRs according to their information, using machine learning and deep learning techniques. We initially describe briefly about the problem and formulate it as a text classification problem. Then, we discuss upon the methods from the literature. We try two approaches - traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based Machine Learning Models.Keywords: deep learning, machine learning, semantic clinical classification, subsection identification, text classification
Procedia PDF Downloads 21712019 Deleterious SNP’s Detection Using Machine Learning
Authors: Hamza Zidoum
Abstract:
This paper investigates the impact of human genetic variation on the function of human proteins using machine-learning algorithms. Single-Nucleotide Polymorphism represents the most common form of human genome variation. We focus on the single amino-acid polymorphism located in the coding region as they can affect the protein function leading to pathologic phenotypic change. We use several supervised Machine Learning methods to identify structural properties correlated with increased risk of the missense mutation being damaging. SVM associated with Principal Component Analysis give the best performance.Keywords: single-nucleotide polymorphism, machine learning, feature selection, SVM
Procedia PDF Downloads 37712018 R Data Science for Technology Management
Authors: Sunghae Jun
Abstract:
Technology management (TM) is important issue in a company improving the competitiveness. Among many activities of TM, technology analysis (TA) is important factor, because most decisions for management of technology are decided by the results of TA. TA is to analyze the developed results of target technology using statistics or Delphi. TA based on Delphi is depended on the experts’ domain knowledge, in comparison, TA by statistics and machine learning algorithms use objective data such as patent or paper instead of the experts’ knowledge. Many quantitative TA methods based on statistics and machine learning have been studied, and these have been used for technology forecasting, technological innovation, and management of technology. They applied diverse computing tools and many analytical methods case by case. It is not easy to select the suitable software and statistical method for given TA work. So, in this paper, we propose a methodology for quantitative TA using statistical computing software called R and data science to construct a general framework of TA. From the result of case study, we also show how our methodology is applied to real field. This research contributes to R&D planning and technology valuation in TM areas.Keywords: technology management, R system, R data science, statistics, machine learning
Procedia PDF Downloads 45712017 Injury Prediction for Soccer Players Using Machine Learning
Authors: Amiel Satvedi, Richard Pyne
Abstract:
Injuries in professional sports occur on a regular basis. Some may be minor, while others can cause huge impact on a player's career and earning potential. In soccer, there is a high risk of players picking up injuries during game time. This research work seeks to help soccer players reduce the risk of getting injured by predicting the likelihood of injury while playing in the near future and then providing recommendations for intervention. The injury prediction tool will use a soccer player's number of minutes played on the field, number of appearances, distance covered and performance data for the current and previous seasons as variables to conduct statistical analysis and provide injury predictive results using a machine learning linear regression model.Keywords: injury predictor, soccer injury prevention, machine learning in soccer, big data in soccer
Procedia PDF Downloads 18212016 A Machine Learning Approach for the Leakage Classification in the Hydraulic Final Test
Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter
Abstract:
The widespread use of machine learning applications in production is significantly accelerated by improved computing power and increasing data availability. Predictive quality enables the assurance of product quality by using machine learning models as a basis for decisions on test results. The use of real Bosch production data based on geometric gauge blocks from machining, mating data from assembly and hydraulic measurement data from final testing of directional valves is a promising approach to classifying the quality characteristics of workpieces.Keywords: machine learning, classification, predictive quality, hydraulics, supervised learning
Procedia PDF Downloads 21312015 Using Machine Learning as an Alternative for Predicting Exchange Rates
Authors: Pedro Paulo Galindo Francisco, Eli Dhadad Junior
Abstract:
This study addresses the Meese-Rogoff Puzzle by introducing the latest machine learning techniques as alternatives for predicting the exchange rates. Using RMSE as a comparison metric, Meese and Rogoff discovered that economic models are unable to outperform the random walk model as short-term exchange rate predictors. Decades after this study, no statistical prediction technique has proven effective in overcoming this obstacle; although there were positive results, they did not apply to all currencies and defined periods. Recent advancements in artificial intelligence technologies have paved the way for a new approach to exchange rate prediction. Leveraging this technology, we applied five machine learning techniques to attempt to overcome the Meese-Rogoff puzzle. We considered daily data for the real, yen, British pound, euro, and Chinese yuan against the US dollar over a time horizon from 2010 to 2023. Our results showed that none of the presented techniques were able to produce an RMSE lower than the Random Walk model. However, the performance of some models, particularly LSTM and N-BEATS were able to outperform the ARIMA model. The results also suggest that machine learning models have untapped potential and could represent an effective long-term possibility for overcoming the Meese-Rogoff puzzle.Keywords: exchage rate, prediction, machine learning, deep learning
Procedia PDF Downloads 3112014 Empowering a New Frontier in Heart Disease Detection: Unleashing Quantum Machine Learning
Authors: Sadia Nasrin Tisha, Mushfika Sharmin Rahman, Javier Orduz
Abstract:
Machine learning is applied in a variety of fields throughout the world. The healthcare sector has benefited enormously from it. One of the most effective approaches for predicting human heart diseases is to use machine learning applications to classify data and predict the outcome as a classification. However, with the rapid advancement of quantum technology, quantum computing has emerged as a potential game-changer for many applications. Quantum algorithms have the potential to execute substantially faster than their classical equivalents, which can lead to significant improvements in computational performance and efficiency. In this study, we applied quantum machine learning concepts to predict coronary heart diseases from text data. We experimented thrice with three different features; and three feature sets. The data set consisted of 100 data points. We pursue to do a comparative analysis of the two approaches, highlighting the potential benefits of quantum machine learning for predicting heart diseases.Keywords: quantum machine learning, SVM, QSVM, matrix product state
Procedia PDF Downloads 9412013 Assessing the Effectiveness of Machine Learning Algorithms for Cyber Threat Intelligence Discovery from the Darknet
Authors: Azene Zenebe
Abstract:
Deep learning is a subset of machine learning which incorporates techniques for the construction of artificial neural networks and found to be useful for modeling complex problems with large dataset. Deep learning requires a very high power computational and longer time for training. By aggregating computing power, high performance computer (HPC) has emerged as an approach to resolving advanced problems and performing data-driven research activities. Cyber threat intelligence (CIT) is actionable information or insight an organization or individual uses to understand the threats that have, will, or are currently targeting the organization. Results of review of literature will be presented along with results of experimental study that compares the performance of tree-based and function-base machine learning including deep learning algorithms using secondary dataset collected from darknet.Keywords: deep-learning, cyber security, cyber threat modeling, tree-based machine learning, function-based machine learning, data science
Procedia PDF Downloads 15312012 Optimization of Machine Learning Regression Results: An Application on Health Expenditures
Authors: Songul Cinaroglu
Abstract:
Machine learning regression methods are recommended as an alternative to classical regression methods in the existence of variables which are difficult to model. Data for health expenditure is typically non-normal and have a heavily skewed distribution. This study aims to compare machine learning regression methods by hyperparameter tuning to predict health expenditure per capita. A multiple regression model was conducted and performance results of Lasso Regression, Random Forest Regression and Support Vector Machine Regression recorded when different hyperparameters are assigned. Lambda (λ) value for Lasso Regression, number of trees for Random Forest Regression, epsilon (ε) value for Support Vector Regression was determined as hyperparameters. Study results performed by using 'k' fold cross validation changed from 5 to 50, indicate the difference between machine learning regression results in terms of R², RMSE and MAE values that are statistically significant (p < 0.001). Study results reveal that Random Forest Regression (R² ˃ 0.7500, RMSE ≤ 0.6000 ve MAE ≤ 0.4000) outperforms other machine learning regression methods. It is highly advisable to use machine learning regression methods for modelling health expenditures.Keywords: machine learning, lasso regression, random forest regression, support vector regression, hyperparameter tuning, health expenditure
Procedia PDF Downloads 22612011 Breast Cancer Prediction Using Score-Level Fusion of Machine Learning and Deep Learning Models
Authors: Sam Khozama, Ali M. Mayya
Abstract:
Breast cancer is one of the most common types in women. Early prediction of breast cancer helps physicians detect cancer in its early stages. Big cancer data needs a very powerful tool to analyze and extract predictions. Machine learning and deep learning are two of the most efficient tools for predicting cancer based on textual data. In this study, we developed a fusion model of two machine learning and deep learning models. To obtain the final prediction, Long-Short Term Memory (LSTM) and ensemble learning with hyper parameters optimization are used, and score-level fusion is used. Experiments are done on the Breast Cancer Surveillance Consortium (BCSC) dataset after balancing and grouping the class categories. Five different training scenarios are used, and the tests show that the designed fusion model improved the performance by 3.3% compared to the individual models.Keywords: machine learning, deep learning, cancer prediction, breast cancer, LSTM, fusion
Procedia PDF Downloads 16112010 Using Greywolf Optimized Machine Learning Algorithms to Improve Accuracy for Predicting Hospital Readmission for Diabetes
Authors: Vincent Liu
Abstract:
Machine learning algorithms (ML) can achieve high accuracy in predicting outcomes compared to classical models. Metaheuristic, nature-inspired algorithms can enhance traditional ML algorithms by optimizing them such as by performing feature selection. We compare ten ML algorithms to predict 30-day hospital readmission rates for diabetes patients in the US using a dataset from UCI Machine Learning Repository with feature selection performed by Greywolf nature-inspired algorithm. The baseline accuracy for the initial random forest model was 65%. After performing feature engineering, SMOTE for class balancing, and Greywolf optimization, the machine learning algorithms showed better metrics, including F1 scores, accuracy, and confusion matrix with improvements ranging in 10%-30%, and a best model of XGBoost with an accuracy of 95%. Applying machine learning this way can improve patient outcomes as unnecessary rehospitalizations can be prevented by focusing on patients that are at a higher risk of readmission.Keywords: diabetes, machine learning, 30-day readmission, metaheuristic
Procedia PDF Downloads 6112009 Machine Learning Techniques to Develop Traffic Accident Frequency Prediction Models
Authors: Rodrigo Aguiar, Adelino Ferreira
Abstract:
Road traffic accidents are the leading cause of unnatural death and injuries worldwide, representing a significant problem of road safety. In this context, the use of artificial intelligence with advanced machine learning techniques has gained prominence as a promising approach to predict traffic accidents. This article investigates the application of machine learning algorithms to develop traffic accident frequency prediction models. Models are evaluated based on performance metrics, making it possible to do a comparative analysis with traditional prediction approaches. The results suggest that machine learning can provide a powerful tool for accident prediction, which will contribute to making more informed decisions regarding road safety.Keywords: machine learning, artificial intelligence, frequency of accidents, road safety
Procedia PDF Downloads 89