Search results for: machine learning invariants
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8215

7855 A Risk Assessment Tool for the Contamination of Aflatoxins on Dried Figs Based on Machine Learning Algorithms

Authors: Kottaridi Klimentia, Demopoulos Vasilis, Sidiropoulos Anastasios, Ihara Diego, Nikolaidis Vasileios, Antonopoulos Dimitrios

Abstract:

Aflatoxins are highly poisonous and carcinogenic compounds produced by species of the genus Aspergillus spp. that can infect a variety of agricultural foods, including dried figs. Biological and environmental factors, such as population, pathogenicity, and aflatoxinogenic capacity of the strains, topography, soil, and climate parameters of the fig orchards, are believed to have a strong effect on aflatoxin levels. Existing methods for aflatoxin detection and measurement, such as high performance liquid chromatography (HPLC) and enzyme-linked immunosorbent assay (ELISA), can provide accurate results, but the procedures are usually time-consuming, sample-destructive, and expensive. Predicting aflatoxin levels prior to crop harvest is useful for minimizing the health and financial impact of a contaminated crop. Consequently, there is interest in developing a tool that predicts aflatoxin levels based on topography and soil analysis data of fig orchards. This paper describes the development of a risk assessment tool for the contamination of aflatoxin on dried figs, based on the location and altitude of the fig orchards, the population of the fungus Aspergillus spp. in the soil, and soil parameters such as pH, saturation percentage (SP), electrical conductivity (EC), organic matter, particle size analysis (sand, silt, clay), the concentration of the exchangeable cations (Ca, Mg, K, Na), extractable P, and trace elements (B, Fe, Mn, Zn and Cu), by employing machine learning methods. In particular, our proposed method integrates three machine learning techniques, i.e., dimensionality reduction on the original dataset (principal component analysis), metric learning (Mahalanobis metric for clustering), and the k-nearest neighbors learning algorithm (KNN), into an enhanced model, with mean performance equal to 85% in terms of the Pearson correlation coefficient (PCC) between observed and predicted values.
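As an illustration of the evaluation described in the abstract above, the following minimal sketch pairs a k-nearest-neighbors regressor with the Pearson correlation coefficient between observed and predicted values. It is not the authors' implementation: the PCA and Mahalanobis metric learning steps are omitted, plain Euclidean distance is used as a stand-in, and the function names and toy values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def knn_predict(train_X, train_y, query, k=3):
    """Average the targets of the k training points nearest to `query`.

    Euclidean distance is an assumption made for this sketch; the paper
    describes a learned Mahalanobis metric instead.
    """
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Made-up feature vectors (e.g. two soil measurements) and target levels.
train_X = [(6.5, 0.8), (7.1, 1.2), (7.8, 1.9), (8.0, 2.3)]
train_y = [1.0, 2.0, 4.0, 5.0]
test_X = [(6.7, 0.9), (7.9, 2.0)]
observed = [1.4, 4.6]

predicted = [knn_predict(train_X, train_y, q, k=2) for q in test_X]
score = pearson(observed, predicted)
```

A PCC close to 1 indicates that predictions rise and fall with the observations; it does not by itself show that the predicted magnitudes are accurate.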

Keywords: aflatoxins, Aspergillus spp., dried figs, k-nearest neighbors, machine learning, prediction

Procedia PDF Downloads 152
7854 E-Learning Platform for School Kids

Authors: Gihan Thilakarathna, Fernando Ishara, Rathnayake Yasith, Bandara A. M. R. Y.

Abstract:

E-learning is a crucial component of intelligent education. Even in the midst of a pandemic, e-learning is becoming increasingly important in the educational system. Several e-learning programs are accessible to students. Here, we decided to create an e-learning framework for children. We have found a few issues that teachers are having with their online classes. When there are numerous students in an online classroom, how does a teacher recognize a student's focus on academics and below-the-surface behaviors? Some kids are not paying attention in class, and others are napping. The teacher is unable to keep track of each and every student. A key challenge in e-learning is online exams, because students can cheat easily during them. Hence, exam proctoring is needed. Here we propose an automated online exam cheating detection method using a web camera. The purpose of this project is to present an e-learning platform for math education and include games for kids as an alternative teaching method for math students. The game will be accessible via a web browser. The imagery in the game is drawn in a cartoonish style. This will help students learn math through games. Everything in this day and age is moving towards automation. However, automatic answer evaluation is only available for MCQ-based questions. As a result, the checker has a difficult time evaluating theory answers. The current system requires more manpower and takes a long time to evaluate responses. It is also possible for two identical responses to be marked differently and receive two different grades. As a result, this application employs machine learning techniques to provide an automatic evaluation of subjective responses based on the keywords provided to the computer as student input, resulting in a fair distribution of marks. In addition, it will save time and manpower. We used deep learning, machine learning, image processing and natural language technologies to develop these research components.

Keywords: math, education games, e-learning platform, artificial intelligence

Procedia PDF Downloads 129
7853 Neural Networks and Genetic Algorithms Approach for Word Correction and Prediction

Authors: Rodrigo S. Fonseca, Antônio C. P. Veiga

Abstract:

Aiming at helping people with some movement limitation that makes typing and communication difficult, there is a need to customize an assistive tool with a learning environment that helps the user in order to optimize text input, identifying the error and providing the correction and possibilities of choice in the Portuguese language. The work presents an Orthographic and Grammatical System that can be incorporated into writing environments, improving and facilitating the use of an alphanumeric keyboard, using a prototype built using a genetic algorithm in addition to carrying out the prediction, which can occur based on the quantity and position of the inserted letters and even placement in the sentence, ensuring the sequence of ideas using a Long Short Term Memory (LSTM) neural network. The prototype optimizes data entry, being a component of assistive technology for the textual formulation, detecting errors, seeking solutions and informing the user of accurate predictions quickly and effectively through machine learning.

Keywords: genetic algorithm, neural networks, word prediction, machine learning

Procedia PDF Downloads 167
7852 Presenting a Model Based on Artificial Neural Networks to Predict the Execution Time of Design Projects

Authors: Hamed Zolfaghari, Mojtaba Kord

Abstract:

After the feasibility study, the design phase begins, and the remaining phases depend heavily on it. Forecasting the duration of the design phase could therefore save a great deal of time. This study provides a fast and accurate machine learning (ML) and optimization framework, which allows a quick duration estimation of the project design phase, hence improving the operational efficiency and competitiveness of a design construction company. Three data sets covering three years of daily time spent on different design projects are used to train and validate the ML models across multiple projects. Our study concluded that the Artificial Neural Network (ANN) achieved an accuracy of 0.94.

Keywords: time estimation, machine learning, Artificial neural network, project design phase

Procedia PDF Downloads 59
7851 Load Forecasting in Microgrid Systems with R and Cortana Intelligence Suite

Authors: F. Lazzeri, I. Reiter

Abstract:

Energy production optimization has traditionally been very important for utilities in order to improve resource consumption. However, load forecasting is a challenging task, as there are a large number of relevant variables that must be considered, and several strategies have been used to deal with this complex problem. This is especially true in microgrids, where many elements have to adjust their performance depending on future generation and consumption conditions. The goal of this paper is to present a solution for short-term load forecasting in microgrids, based on three machine learning experiments developed in R and web services built and deployed with different components of Cortana Intelligence Suite: Azure Machine Learning, a fully managed cloud service that makes it easy to build, deploy, and share predictive analytics solutions; SQL database, a Microsoft database service for app developers; and PowerBI, a suite of business analytics tools to analyze data and share insights. Our results show that Boosted Decision Tree and Fast Forest Quantile regression methods can be very useful to predict hourly short-term consumption in microgrids; moreover, we found that for these types of forecasting models, weather data (temperature, wind, humidity and dew point) can play a crucial role in improving the accuracy of the forecasting solution. Data cleaning and feature engineering methods performed in R and different types of machine learning algorithms (Boosted Decision Tree, Fast Forest Quantile and ARIMA) will be presented, and results and performance metrics discussed.

Keywords: time-series, features engineering methods for forecasting, energy demand forecasting, Azure Machine Learning

Procedia PDF Downloads 277
7850 Implementation of Data Science in Field of Homologation

Authors: Shubham Bhonde, Nekzad Doctor, Shashwat Gawande

Abstract:

Homologation is required in many countries for the use and import of keys and ID transmitters, as well as body control modules with radio transmission. The final deliverables in the homologation of a product are certificates. In homologation worldwide, there are approximately 200 certificates per product, most of them in local languages. It is challenging to manually investigate each certificate and extract relevant data from it, such as expiry date, approval date, etc. Accurate data from the certificate is essential, as inaccuracy may lead to missing the re-homologation of certificates, which will result in non-compliance. There is scope for automating the reading of certificate data in the field of homologation. We are using deep learning as a tool for automation. We first trained a model using machine learning by providing basic data for all countries. We trained this model only once. We trained the model by feeding it pdf and jpg files using the ETL process. Over time, the trained model will give more accurate results. As an outcome, we will get the expiry date and approval date of the certificate with a single click. This will eventually help to implement automation features on a broader level in the database where certificates are stored. This automation will help to reduce human error to an almost negligible level.

Keywords: homologation, re-homologation, data science, deep learning, machine learning, ETL (extract transform loading)

Procedia PDF Downloads 140
7849 Correlation Analysis to Quantify Learning Outcomes for Different Teaching Pedagogies

Authors: Kanika Sood, Sijie Shang

Abstract:

A fundamental goal of education includes preparing students to become a part of the global workforce by making beneficial contributions to society. In this paper, we analyze student performance for multiple courses that involve different teaching pedagogies: a cooperative learning technique and an inquiry-based learning strategy. Student performance includes student engagement, grades, and attendance records. We perform this study in the Computer Science department for online and in-person courses for 450 students. We will perform correlation analysis to study the relationship between student scores and other parameters such as gender and mode of learning. We use natural language processing and machine learning to analyze student feedback data and performance data. We assess the learning outcomes of two teaching pedagogies for undergraduate and graduate courses to showcase the impact of pedagogical adoption and learning outcome as determinants of academic achievement. Early findings suggest that when using the specified pedagogies, students become experts on their topics and illustrate enhanced engagement with peers.

Keywords: bag-of-words, cooperative learning, education, inquiry-based learning, in-person learning, natural language processing, online learning, sentiment analysis, teaching pedagogy

Procedia PDF Downloads 51
7848 Energy Consumption Optimization of Electric Vehicle by Using Machine Learning: A Comparative Literature Review and Lessons Learned

Authors: Sholeh Motaghian, Pekka Toivanen, Keiji Haataja

Abstract:

The swift expansion of the transportation industry and its associated emissions have captured the focus of policymakers who are dedicated to upholding ecological sustainability. As a result, understanding the key contributors to transportation emissions is of utmost significance. Amidst the escalating transportation emissions, the significance of electric vehicles cannot be overstated. Electric vehicles play a critical role in steering us towards a low-carbon economy and a sustainable ecological setting. The effective integration of electric vehicles hinges on the development of energy consumption models capable of accurately and efficiently predicting energy usage. Enhancing the energy efficiency of electric vehicles will play a pivotal role in reducing driver concerns and establishing a vital framework for the efficient operation, planning, and management of charging infrastructure. In this article, the works done in this field are reviewed, and the advantages and disadvantages of each are stated.

Keywords: deep learning, electrical vehicle, energy consumption, machine learning, smart grid

Procedia PDF Downloads 43
7847 Validating Condition-Based Maintenance Algorithms through Simulation

Authors: Marcel Chevalier, Léo Dupont, Sylvain Marié, Frédérique Roffet, Elena Stolyarova, William Templier, Costin Vasile

Abstract:

Industrial end-users are currently facing an increasing need to reduce the risk of unexpected failures and optimize their maintenance. This calls for both short-term analysis and long-term ageing anticipation. At Schneider Electric, we tackle those two issues using both machine learning and first principles models. Machine learning models are incrementally trained from normal data to predict expected values and detect statistically significant short-term deviations. Ageing models are constructed by breaking down physical systems into sub-assemblies, then determining relevant degradation modes and associating each one to the right kinetic law. Validating such anomaly detection and maintenance models is challenging, both because actual incident and ageing data are rare and distorted by human interventions, and incremental learning depends on human feedback. To overcome these difficulties, we propose to simulate physics, systems, and humans -including asset maintenance operations- in order to validate the overall approaches in accelerated time and possibly choose between algorithmic alternatives.
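The idea of a model trained incrementally on normal data to flag statistically significant short-term deviations can be sketched as follows. This is a deliberately simple stand-in, not the approach used at Schneider Electric: it tracks a running mean and variance with Welford's algorithm and flags values whose z-score exceeds a threshold. The class name, the threshold of 3.0, and the warm-up length are all assumptions made for illustration.

```python
import math


class RunningDeviationDetector:
    """Tracks mean and variance incrementally and flags large deviations."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # hypothetical z-score cutoff
        self.warmup = warmup        # samples to observe before flagging
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Return True if x is anomalous; learn from x only when it is normal."""
        if self.n >= max(self.warmup, 2):
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                # Anomalies are not folded into the "normal" statistics.
                return True
        # Welford's online update of mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return False


detector = RunningDeviationDetector()
normal_flags = [detector.update(v) for v in [10.0, 10.1, 9.9, 10.0, 10.2, 9.8]]
spike_flag = detector.update(15.0)
```

Excluding flagged values from the update keeps a single incident from inflating the variance, at the cost of never adapting to a genuine regime change; a real system would need human feedback or a reset rule for that case, which is the kind of interaction the abstract proposes to simulate.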

Keywords: degradation models, ageing, anomaly detection, soft sensor, incremental learning

Procedia PDF Downloads 101
7846 Internet-Based Architecture for Machine-to-Machine Communication of a Public Security Network

Authors: Ogwueleka Francisca Nonyelum, Jiya Muhammad

Abstract:

Poor communication between the victims of the burglaries, road and fire accidents and the agencies, and lack of quick emergency response by the agencies is solved through Machine-to-Machine (M2M) communication. A distress caller is expected to make a call through a network to the respective agency for emergency response but due to some challenges, this often becomes arduous and futile. This research puts forth an Internet-based architecture for Machine-to-Machine (M2M) communication to enhance information dissemination in National Public Security Communication System (NPSCS) network. M2M enables the flow of data between machines and machines and ultimately machines and people with information flowing from a machine over a network, and then through a gateway to a system where it is reviewed and acted on. The research findings showed that Internet-based architecture for M2M communication is most suitable for deployment of a public security network which will allow machines to use Internet to talk to each other.

Keywords: machine-to-machine (M2M), internet-based architecture, network, gateway

Procedia PDF Downloads 453
7845 Enhancing Fall Detection Accuracy with a Transfer Learning-Aided Transformer Model Using Computer Vision

Authors: Sheldon McCall, Miao Yu, Liyun Gong, Shigang Yue, Stefanos Kollias

Abstract:

Falls are a significant health concern for older adults globally, and prompt identification is critical to providing necessary healthcare support. Our study proposes a new fall detection method using computer vision based on modern deep learning techniques. Our approach involves training a transformer model on a large 2D pose dataset for general action recognition, followed by transfer learning. Specifically, we freeze the first few layers of the trained transformer model and train only the last two layers for fall detection. Our experimental results demonstrate that our proposed method outperforms both classical machine learning and deep learning approaches in fall/non-fall classification. Overall, our study suggests that our proposed methodology could be a valuable tool for identifying falls.

Keywords: healthcare, fall detection, transformer, transfer learning

Procedia PDF Downloads 103
7844 Leveraging Learning Analytics to Inform Learning Design in Higher Education

Authors: Mingming Jiang

Abstract:

This literature review aims to offer an overview of existing research on learning analytics and learning design, the alignment between the two, and how learning analytics has been leveraged to inform learning design in higher education. Current research suggests a need to create more alignment and integration between learning analytics and learning design in order to not only ground learning analytics on learning sciences but also enable data-driven decisions in learning design to improve learning outcomes. In addition, multiple conceptual frameworks have been proposed to enhance the synergy and alignment between learning analytics and learning design. Future research should explore this synergy further in the unique context of higher education, identifying learning analytics metrics in higher education that can offer insight into learning processes, evaluating the effect of learning analytics outcomes on learning design decision-making in higher education, and designing learning environments in higher education that make the capturing and deployment of learning analytics outcomes more efficient.

Keywords: learning analytics, learning design, big data in higher education, online learning environments

Procedia PDF Downloads 134
7843 Distributed Cyber Physical Secure Framework for DC Microgrids: DC Ship Power System Applications

Authors: Grace karimi Muriithi, Behnaz Papari, Ali Arsalan, Christopher Shannon Edrington

Abstract:

The complexity and nonlinearity of control system design for DC microgrid applications increase when cyber considerations and their associated technology constraints are added to the picture. Controller functionality during critical operation modes must be guaranteed, especially for high-profile applications such as the Navy DC ship power system (SPS), a small-scale DC microgrid. The SPS is susceptible to cyber-attacks, which can have disastrous effects. In this study, a machine learning (ML) approach is demonstrated that offers promising SPS performance, providing effective and robust functionality during attacks. Analysis of the simulation results demonstrates that the proposed method can successfully improve controllability.

Keywords: controllability, cyber attacks, distributed control, machine learning

Procedia PDF Downloads 76
7842 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition

Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar

Abstract:

At the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN), logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features like Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and Mel spectrogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results.
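Two of the features named in the abstract above, Zero Crossing Rate and RMS value, are simple enough to sketch directly. The functions below are illustrative only and are not taken from the paper: the function names are hypothetical, the frame is a plain list of samples, and ZCR is defined here as the fraction of consecutive sample pairs whose signs differ (audio libraries may normalize slightly differently).

```python
import math


def zero_crossing_rate(frame):
    """Fraction of consecutive sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms(frame):
    """Root mean square amplitude of the frame."""
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


# A made-up frame: one cycle of a sine wave sampled at 16 points.
frame = [math.sin(2 * math.pi * i / 16) for i in range(16)]
features = [zero_crossing_rate(frame), rms(frame)]
```

In a pipeline like the one described, such per-frame values would typically be aggregated over an utterance and concatenated with the other features before being passed to a classifier.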

Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers

Procedia PDF Downloads 20
7841 Cognition of Driving Context for Driving Assistance

Authors: Manolo Dulva Hina, Clement Thierry, Assia Soukane, Amar Ramdane-Cherif

Abstract:

In this paper, we present our innovative way of determining the driving context for a driving assistance system. We invoke the fusion of all parameters that describe the context of the environment, the vehicle and the driver to obtain the driving context. We created a training set that stores driving situation patterns and which the system consults to determine the driving situation. A machine-learning algorithm predicts the driving situation. The driving situation is an input to the fission process that yields the action that must be implemented when the driver needs to be informed or assisted in the given driving situation. The action may be directed towards the driver, the vehicle or both. This is an ongoing work whose goal is to offer an alternative driving assistance system for safe driving, green driving and comfortable driving. Here, ontologies are used for knowledge representation.

Keywords: cognitive driving, intelligent transportation system, multimodal system, ontology, machine learning

Procedia PDF Downloads 338
7840 Liquid Biopsy Based Microbial Biomarker in Coronary Artery Disease Diagnosis

Authors: Eyup Ozkan, Ozkan U. Nalbantoglu, Aycan Gundogdu, Mehmet Hora, A. Emre Onuk

Abstract:

The human microbiome has been associated with cardiological conditions, and this relationship is beginning to be defined beyond the gastrointestinal tract. In this study, we investigate the alteration in circulatory microbiota in the context of Coronary Artery Disease (CAD). We received circulatory blood samples from suspected CAD patients and performed 16S ribosomal RNA sequencing to identify each patient’s microbiome. It was found that the Corynebacterium and Methanobacteria genera show statistically significant differences between healthy and CAD patients. The overall biodiversities of the groups were observed to be different, as revealed by machine learning classification models. We also demonstrate the performance of a diagnostic method using circulatory blood microbiome-based estimation.

Keywords: coronary artery disease, blood microbiome, machine learning, angiography, next-generation sequencing

Procedia PDF Downloads 132
7839 Application of Machine Learning Techniques in Forest Cover-Type Prediction

Authors: Saba Ebrahimi, Hedieh Ashrafi

Abstract:

Predicting the cover type of forests is a challenge for natural resource managers. In this project, we aim to perform a comprehensive comparative study of two well-known classification methods, support vector machine (SVM) and decision tree (DT). The comparison is first performed among different types of each classifier, and then the best of each classifier will be compared by considering different evaluation metrics. The effect of boosting and bagging for decision trees is also explored. Furthermore, the effect of principal component analysis (PCA) and feature selection is also investigated. During the project, the forest cover-type dataset from the remote sensing and GIS program is used in all computations.

Keywords: classification methods, support vector machine, decision tree, forest cover-type dataset

Procedia PDF Downloads 184
7838 Reinforcement Learning Optimization: Unraveling Trends and Advancements in Metaheuristic Algorithms

Authors: Rahul Paul, Kedar Nath Das

Abstract:

The field of machine learning (ML) is experiencing rapid development, resulting in a multitude of theoretical advancements and extensive practical implementations across various disciplines. The objective of ML is to facilitate the ability of machines to perform cognitive tasks by leveraging knowledge gained from prior experiences and effectively addressing complex problems, even in situations that deviate from previously encountered instances. Reinforcement Learning (RL) has emerged as a prominent subfield within ML and has gained considerable attention in recent times from researchers. This surge in interest can be attributed to the practical applications of RL, the increasing availability of data, and the rapid advancements in computing power. At the same time, optimization algorithms play a pivotal role in the field of ML and have attracted considerable interest from researchers. A multitude of proposals have been put forth to address optimization problems or improve optimization techniques within the domain of ML. The necessity of a thorough examination and implementation of optimization algorithms within the context of ML is of utmost importance in order to provide guidance for the advancement of research in both optimization and ML. This article provides a comprehensive overview of the application of metaheuristic evolutionary optimization algorithms in conjunction with RL to address a diverse range of scientific challenges. Furthermore, this article delves into the various challenges and unresolved issues pertaining to the optimization of RL models.

Keywords: machine learning, reinforcement learning, loss function, evolutionary optimization techniques

Procedia PDF Downloads 53
7837 Short-Term Forecast of Wind Turbine Production with Machine Learning Methods: Direct Approach and Indirect Approach

Authors: Mamadou Dione, Eric Matzner-lober, Philippe Alexandre

Abstract:

The Energy Transition Act defined by the French State has precise implications for renewable energies, in particular for their remuneration mechanism. Until now, a purchase obligation contract permitted the sale of wind-generated electricity at a fixed rate. In the future, it will be necessary to sell this electricity on the market (at variable rates) before obtaining additional compensation intended to reduce the risk. Selling on the market requires announcing in advance (about 48 hours before) the production that will be delivered to the network, and therefore being able to predict this production in the short term. The fundamental problem remains the variability of the wind, accentuated by the geographical situation. The objective of the project is to provide, every day, short-term forecasts (48-hour horizon) of wind production using weather data. The predictions of the GFS model and those of the ECMWF model are used as explanatory variables. The variable to be predicted is the production of a wind farm. We consider two approaches: a direct approach that predicts wind generation directly from weather data, and an indirect approach that estimates wind from weather data and converts it into wind power by power curves. We used machine learning techniques to predict this production. The models tested are random forests, CART + Bagging, CART + Boosting, and SVM (Support Vector Machine). The application is made on a wind farm of 22 MW (11 wind turbines) of the Compagnie du Vent (which became Engie Green France). Our results are very conclusive compared to the literature.
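The conversion step in the second approach, turning an estimated wind speed into power via a power curve, can be sketched as below. This is an illustrative curve, not the one used in the study: the cut-in, rated, and cut-out speeds are hypothetical values, the cubic ramp between cut-in and rated speed is a common textbook simplification, and the 2 MW rated power per turbine assumes the 22 MW capacity is split equally across the 11 turbines.

```python
def turbine_power_mw(wind_speed, cut_in=3.0, rated_speed=12.0,
                     cut_out=25.0, rated_power=2.0):
    """Idealized power curve for one turbine (speeds in m/s, power in MW).

    All default parameters are assumptions chosen for illustration.
    """
    if wind_speed < cut_in or wind_speed >= cut_out:
        return 0.0  # too little wind, or shut down for safety
    if wind_speed >= rated_speed:
        return rated_power  # output is capped at the rated power
    # Cubic ramp between cut-in and rated speed.
    ramp = (wind_speed ** 3 - cut_in ** 3) / (rated_speed ** 3 - cut_in ** 3)
    return rated_power * ramp


def farm_power_mw(wind_speed, n_turbines=11):
    """Farm output, assuming every turbine sees the same wind speed."""
    return n_turbines * turbine_power_mw(wind_speed)


# Convert a made-up 48-hour-ahead wind speed forecast into power values.
forecast_speeds = [2.0, 6.5, 9.0, 13.0, 26.0]
forecast_power = [farm_power_mw(speed) for speed in forecast_speeds]
```

Because the curve is steep between cut-in and rated speed, small errors in the wind speed estimate translate into large errors in power, which is one reason the direct approach is worth comparing against this two-step conversion.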

Keywords: forecast aggregation, machine learning, spatio-temporal dynamics modeling, wind power forecast

Procedia PDF Downloads 194
7836 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific and anatomical components, have been used and fed into the classifier algorithms. The dataset used in this study has been taken from the University of California, Irvine (UCI) machine learning repository. Feature weighting methods, including fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compared with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k-nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method based on the combination of subtractive clustering based feature weighting and the decision tree classifier obtained a classification accuracy of 100% in the screening of DR. These results have demonstrated that the proposed hybrid scheme is very promising for medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 306
7835 Market Index Trend Prediction using Deep Learning and Risk Analysis

Authors: Shervin Alaei, Reza Moradi

Abstract:

Trading in financial markets is subject to risks due to their high volatilities. Here, using an LSTM neural network, and by doing some risk-based feature engineering tasks, we developed a method that can accurately predict trends of the Tehran stock exchange market index from a few days ago. Our test results have shown that the proposed method with an average prediction accuracy of more than 94% is superior to the other common machine learning algorithms. To the best of our knowledge, this is the first work incorporating deep learning and risk factors to accurately predict market trends.

Keywords: deep learning, LSTM, trend prediction, risk management, artificial neural networks

Procedia PDF Downloads 120
7834 Land Suitability Prediction Modelling for Agricultural Crops Using Machine Learning Approach: A Case Study of Khuzestan Province, Iran

Authors: Saba Gachpaz, Hamid Reza Heidari

Abstract:

The sharp increase in population growth leads to more pressure on agricultural areas to satisfy the food supply. To achieve this, more resources must be consumed, which, alongside other environmental concerns, highlights the need for sustainable agricultural development. Land-use management is a crucial factor in obtaining optimum productivity. Machine learning is a widely used technique in the agricultural sector, from yield prediction to customer behavior. This method focuses on learning and provides patterns and correlations from our data set. In this study, nine physical control factors, namely, soil classification, electrical conductivity, normalized difference water index (NDWI), groundwater level, elevation, annual precipitation, pH of water, annual mean temperature, and slope in the alluvial plain in Khuzestan (an agricultural hotspot in Iran) are used to decide the best agricultural land use for both rainfed and irrigated agriculture for ten different crops. For this purpose, each variable was imported into ArcGIS, and a raster layer was obtained. In the next step, by using training samples, all layers were imported into the Python environment. A random forest model was applied, and the weight of each variable was specified. In the final step, results were visualized using a digital elevation model, and the importance of all factors for each one of the crops was obtained. Our results show that despite 62% of the study area being allocated to agricultural purposes, only 42.9% of these areas can be defined as a suitable class for cultivation purposes.

Keywords: land suitability, machine learning, random forest, sustainable agriculture

Procedia PDF Downloads 55
7833 Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine

Authors: Hira Lal Gope, Hidekazu Fukai

Abstract:

The aim of this study is to develop a system that can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is neither a defective bean nor a normal bean. It develops as a single, relatively round seed in a coffee cherry instead of the usual flat-sided pair of beans, and it has its own value and flavor. To improve the taste of the coffee, it is necessary to separate peaberries from normal beans before the green coffee beans are roasted; otherwise, the flavors are mixed and the taste suffers. During roasting, all beans must be uniform in shape, size, and weight; otherwise, the larger beans take more time to roast through. The peaberry has a different size and shape even when it has the same weight as normal beans, and it roasts more slowly than normal beans. Therefore, neither technique provides a good option to select the peaberries. Defective beans, e.g., sour, broken, black, and faded beans, are easy to check and pick out by hand. In contrast, picking out peaberries is very difficult even for trained specialists because the shape and color of the peaberry are similar to those of normal beans. In this study, we use image processing and machine learning techniques to discriminate between normal and peaberry beans as part of the sorting system. As the first step, we applied deep Convolutional Neural Networks (CNN) and a Support Vector Machine (SVM) as machine learning techniques to discriminate between peaberry and normal beans. As a result, better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The artificial neural network trained in this work on a high-performance CPU and GPU will simply be installed on an inexpensive, low-compute Raspberry Pi system.
We assume that this system will be used in underdeveloped countries. The study evaluates and compares the feasibility of the methods in terms of classification accuracy and processing speed.

Keywords: convolutional neural networks, coffee bean, peaberry, sorting, support vector machine

Procedia PDF Downloads 121
7832 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents

Authors: Chothmal, Basant Agarwal

Abstract:

Sentiment analysis classifies a given review document as having positive or negative polarity. Research on sentiment analysis has increased tremendously in recent times due to its large number of applications in industry and academia. Sentiment analysis models can be used to determine the opinion of the user towards any entity or product. E-commerce companies can use sentiment analysis models to improve their products on the basis of users’ opinions. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we first extract features from one class of documents and then use the one-class SVM model to test whether a new document lies within the model or is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.

Keywords: feature selection methods, machine learning, NB, one-class SVM, sentiment analysis, support vector machine

Procedia PDF Downloads 485
7831 Software Defect Analysis- Eclipse Dataset

Authors: Amrane Meriem, Oukid Salyha

Abstract:

The presence of defects or bugs in software can lead to costly setbacks, operational inefficiencies, and compromised user experiences. The integration of Machine Learning (ML) techniques has emerged to predict and preemptively address software defects. ML represents a proactive strategy aimed at identifying potential anomalies, errors, or vulnerabilities within code before they manifest as operational issues, by analyzing historical data such as code changes, feature implementations, and defect occurrences. This enables development teams to anticipate and mitigate these issues, thus enhancing software quality, reducing maintenance costs, and ensuring smoother user interactions. In this work, we used a recommendation system to improve the performance of ML models in terms of predicting the code severity and effort estimation.

Keywords: software engineering, machine learning, bugs detection, effort estimation

Procedia PDF Downloads 54
7830 Comparison of Deep Learning and Machine Learning Algorithms to Diagnose and Predict Breast Cancer

Authors: F. Ghazalnaz Sharifonnasabi, Iman Makhdoom

Abstract:

Breast cancer is a serious health concern that affects many people around the world. According to a study published in the Breast journal, the global burden of breast cancer is expected to increase significantly over the next few decades. The number of deaths from breast cancer has been increasing over the years, but the age-standardized mortality rate has decreased in some countries. It’s important to be aware of the risk factors for breast cancer and to get regular check-ups to catch it early if it does occur. Machine learning techniques have been used to aid in the early detection and diagnosis of breast cancer. These techniques, which have been shown to be effective in predicting and diagnosing the disease, have become a research hotspot. In this study, we consider two deep learning approaches: Multi-Layer Perceptron (MLP) and Convolutional Neural Network (CNN). We also consider five machine learning algorithms: Decision Tree (C4.5), Naïve Bayesian (NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and XGBoost (eXtreme Gradient Boosting), on the Breast Cancer Wisconsin Diagnostic dataset. We evaluated and compared the classifiers by selecting appropriate metrics to evaluate classifier performance and an appropriate tool to quantify this performance. The main purpose of the study is to predict and diagnose breast cancer by applying the mentioned algorithms and to discover the most effective one with respect to confusion matrix, accuracy, and precision. CNN outperformed all other classifiers and achieved the highest accuracy (0.982456). The work is implemented in the Anaconda environment using the Python programming language.
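The comparison criteria named above (confusion matrix, accuracy, precision) can be sketched for a binary classifier as follows. This is a minimal, dependency-free illustration with invented labels, not the authors' evaluation code; the function names and the choice of `1` as the positive (malignant) class are assumptions.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # correctly flagged positive
            else:
                fp += 1  # negative wrongly flagged positive
        else:
            if truth == positive:
                fn += 1  # positive that was missed
            else:
                tn += 1  # correctly left negative
    return tp, fp, tn, fn


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy labels, invented for the example (1 = positive class).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, tn, fn)   # 6 correct out of 8 -> 0.75
prec = precision(tp, fp)         # 3 of 4 predicted positives -> 0.75
```

Running the same functions on each classifier's predictions over a shared test split would give one row per model for a side-by-side comparison.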

Keywords: breast cancer, multi-layer perceptron, Naïve Bayesian, SVM, decision tree, convolutional neural network, XGBoost, KNN

Procedia PDF Downloads 46
7829 Integrating Natural Language Processing (NLP) and Machine Learning in Lung Cancer Diagnosis

Authors: Mehrnaz Mostafavi

Abstract:

The assessment and categorization of incidental lung nodules present a considerable challenge in healthcare, often necessitating resource-intensive multiple computed tomography (CT) scans for growth confirmation. This research addresses this issue by introducing a distinct computational approach leveraging radiomics and deep-learning methods. However, understanding local services is essential before implementing these advancements. With diverse tracking methods in place, there is a need for efficient and accurate identification approaches, especially in the context of managing lung nodules alongside pre-existing cancer scenarios. This study explores the integration of text-based algorithms in medical data curation, indicating their efficacy in conjunction with machine learning and deep-learning models for identifying lung nodules. Combining medical images with text data has demonstrated superior data retrieval compared to using each modality independently. While deep learning and text analysis show potential in detecting previously missed nodules, challenges persist, such as increased false positives. The presented research introduces a Structured-Query-Language (SQL) algorithm designed for identifying pulmonary nodules in a tertiary cancer center, externally validated at another hospital. Leveraging natural language processing (NLP) and machine learning, the algorithm categorizes lung nodule reports based on sentence features, aiming to facilitate research and assess clinical pathways. The hypothesis posits that the algorithm can accurately identify lung nodule CT scans and predict concerning nodule features using machine-learning classifiers. Through a retrospective observational study spanning a decade, CT scan reports were collected, and an algorithm was developed to extract and classify data. Results underscore the complexity of lung nodule cohorts in cancer centers, emphasizing the importance of careful evaluation before assuming a metastatic origin. 
The SQL and NLP algorithms demonstrated high accuracy in identifying lung nodule sentences, indicating potential for local service evaluation and research dataset creation. Machine-learning models exhibited strong accuracy in predicting concerning changes in lung nodule scan reports. While limitations include variability in disease group attribution, the potential for correlation rather than causality in clinical findings, and the need for further external validation, the algorithm's accuracy and potential to support clinical decision-making and healthcare automation represent a significant stride in lung nodule management and research.

Keywords: lung cancer diagnosis, structured-query-language (SQL), natural language processing (NLP), machine learning, CT scans

Procedia PDF Downloads 47
7828 Disease Level Assessment in Wheat Plots Using a Residual Deep Learning Algorithm

Authors: Felipe A. Guth, Shane Ward, Kevin McDonnell

Abstract:

The assessment of disease levels in crop fields is an important and time-consuming task that generally relies on the expert knowledge of trained individuals. Image classification in agricultural problems has historically been based on classical machine learning strategies that use hand-engineered features on top of a classification algorithm. This approach tends not to produce results with high accuracy and generalization to the classes classified by the system when the nature of the elements varies significantly. The advent of deep convolutional neural networks has revolutionized the field of machine learning, especially in computer vision tasks. These networks have a great capacity for learning and have been applied successfully to image classification and object detection tasks in recent years. The objective of this work was to propose a new method based on deep convolutional neural networks for the task of disease level monitoring. Common RGB images of winter wheat were obtained during a growing season. Five categories of disease level were produced, in collaboration with agronomists, for the algorithm classification. Disease level assessments performed by experts provided ground truth data for the disease score of the same winter wheat plots where RGB images were acquired. The system had an overall accuracy of 84% in discriminating the disease level classes.

Keywords: crop disease assessment, deep learning, precision agriculture, residual neural networks

Procedia PDF Downloads 299
7827 Machine Learning and Metaheuristic Algorithms in Short Femoral Stem Custom Design to Reduce Stress Shielding

Authors: Isabel Moscol, Carlos J. Díaz, Ciro Rodríguez

Abstract:

Hip replacement becomes necessary when a person suffers severe pain or considerable functional limitations and the best option to enhance their quality of life is through the replacement of the damaged joint. One of the main components in femoral prostheses is the stem which distributes the loads from the joint to the proximal femur. To preserve more bone stock and avoid weakening of the diaphysis, a short starting stem was selected, generated from the intramedullary morphology of the patient's femur. It ensures the implantability of the design and leads to geometric delimitation for personalized optimization with machine learning (ML) and metaheuristic algorithms. The present study attempts to design a cementless short stem to make the strain deviation before and after implantation close to zero, promoting its fixation and durability. Regression models developed to estimate the percentage change of maximum principal stresses were used as objective optimization functions by the metaheuristic algorithm. The latter evaluated different geometries of the short stem with the modification of certain parameters in oblique sections from the osteotomy plane. The optimized geometry reached a global stress shielding (SS) of 18.37% with a determination factor (R²) of 0.667. The predicted results favour implantability integration in the short stem optimization to effectively reduce SS in the proximal femur.
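For readers unfamiliar with the R² figure quoted above, the sketch below computes it under the assumption that the "determination factor" is the usual coefficient of determination, R² = 1 − SS_res / SS_tot. The function name and the toy values are illustrative only and are not data from the study.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after the regression model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy values, invented for the example.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
score = r_squared(observed, predicted)  # SS_res = 1.0, SS_tot = 5.0 -> 0.8
```

A value of 1.0 means the regression reproduces the observations exactly, while values nearer 0 mean it explains little more than the mean does.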

Keywords: machine learning techniques, metaheuristic algorithms, short-stem design, stress shielding, hip replacement

Procedia PDF Downloads 170
7826 Machine Learning Facing Behavioral Noise Problem in an Imbalanced Data Using One Side Behavioral Noise Reduction: Application to a Fraud Detection

Authors: Salma El Hajjami, Jamal Malki, Alain Bouju, Mohammed Berrada

Abstract:

With the expansion of machine learning and data mining in the context of Big Data analytics, a common problem that affects data is class imbalance. It refers to an imbalanced distribution of instances belonging to each class. This problem is present in many real-world applications such as fraud detection, network intrusion detection, medical diagnostics, etc. In these cases, data instances labeled negatively are significantly more numerous than the instances labeled positively. When this difference is too large, the learning system may face difficulty, since it is initially designed to work in relatively balanced class distribution scenarios. Another important problem, which usually accompanies imbalanced data, is the overlap of instances between the two classes. It is commonly referred to as noise or overlapping data. In this article, we propose an approach called One Side Behavioral Noise Reduction (OSBNR). This approach presents a way to deal with the problem of class imbalance in the presence of a high noise level. OSBNR is based on two steps. Firstly, a cluster analysis is applied to group similar instances from the minority class into several behavior clusters. Secondly, we select and eliminate the instances of the majority class, considered as behavioral noise, which overlap with behavior clusters of the minority class. The results of experiments carried out on a representative public dataset confirm that the proposed approach is efficient for the treatment of class imbalances in the presence of noise.
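The two steps described above can be sketched roughly as follows. The abstract does not say which clustering algorithm or overlap criterion OSBNR uses, so this sketch makes its own assumptions: plain k-means seeded with the first `k` minority points, and a majority instance counted as overlapping when it falls within a cluster's radius (the largest centroid-to-member distance). All function names are hypothetical, and the code requires Python 3.8+ for `math.dist`.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def _centroid(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty set of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points: Sequence[Point], k: int, iterations: int = 20) -> List[List[Point]]:
    """Plain k-means, seeded with the first k points for determinism."""
    centroids = list(points[:k])
    clusters: List[List[Point]] = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the previous centroid if a cluster ended up empty.
        centroids = [
            _centroid(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return [c for c in clusters if c]


def remove_overlapping_majority(
    minority: Sequence[Point], majority: Sequence[Point], k: int
) -> List[Point]:
    """Step 1: cluster the minority class. Step 2: drop majority points
    that fall inside any minority cluster's radius (assumed criterion)."""
    regions = []
    for cluster in kmeans(minority, k):
        center = _centroid(cluster)
        radius = max(math.dist(center, p) for p in cluster)
        regions.append((center, radius))
    return [
        m for m in majority
        if not any(math.dist(m, center) <= radius for center, radius in regions)
    ]


# Toy data, invented for the example: two minority "behavior" groups.
minority = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
majority = [(0.5, 0.5), (5.0, 5.0), (10.2, 10.4), (3.0, 3.0), (0.0, 3.0)]
cleaned = remove_overlapping_majority(minority, majority, k=2)
```

In this toy run, the two majority points that sit inside the minority groups are removed and the other three are kept; the minority class itself is left untouched, which is what makes the reduction one-sided.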

Keywords: machine learning, imbalanced data, data mining, big data

Procedia PDF Downloads 106