Search results for: statistical machine learning
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 12038

11738 Artificial Intelligence as a User of Copyrighted Work: Descriptive Study

Authors: Dominika Collett

Abstract:

AI applications, such as machine learning, require access to a vast amount of data in the training phase, and that data can often be the subject of copyright protection. During later use, the various content with which the application works, and on the basis of which it produces its output, can be recorded or made available. The EU has recently adopted new legislation to secure machine access to protected works under the DSM Directive, but the issue of machine use of copyright works is not clearly addressed. Such clarity is needed given the increasing importance of AI and its development. Therefore, this paper provides a basic background of the technology used in the development of applications in the field of computer creativity. The second part of the paper then focuses on a legal analysis of machine use of authors' works from the perspective of existing European and Czech legislation. The main results of the paper discuss the potential collision of existing legislation with regard to machine use of works, with special focus on exceptions and limitations. The legal regulation of machine use of copyright work will impact the development of AI technology.

Keywords: copyright, artificial intelligence, legal use, infringement, Czech law, EU law, text and data mining

Procedia PDF Downloads 123
11737 Adaption of the Design Thinking Method for Production Planning in the Meat Industry Using Machine Learning Algorithms

Authors: Alica Höpken, Hergen Pargmann

Abstract:

Resource-efficient planning of the complex production processes in the meat industry, together with the reduction of food waste, is a permanent challenge. The complexity of the production planning process occurs in every part of the supply chain, from agriculture to the end consumer. It arises from long and uncertain planning phases. Uncertainties such as stochastic yields, fluctuations in demand, and resource variability are part of this process. In the meat industry, waste mainly relates to incorrect storage, technical causes in production, or overproduction. Until now, the high amount of food waste along the complex supply chain in the meat industry could not be reduced by simple solutions. Therefore, resource-efficient production planning by conventional methods is currently only partially feasible. Intelligent, automated production planning is possible in principle through the application of machine learning algorithms, such as those of reinforcement learning. By applying the adapted design thinking method, machine learning methods (especially reinforcement learning algorithms) are used for the complex production planning process in the meat industry. This method represents a concretization for the application area. A resource-efficient production planning process is made available by adapting the design thinking method. In addition, the complex processes can be planned efficiently by using this method, since this standardized approach offers new possibilities for tackling the complexity and the high time consumption. It represents a tool to support efficient production planning in the meat industry. This paper shows an adaption of the design thinking method to apply reinforcement learning to a resource-efficient production planning process in the meat industry. Subsequently, the steps that are necessary to introduce machine learning algorithms into the production planning of the food industry are determined. This is achieved based on a case study which is part of the research project "REIF - Resource Efficient, Economic and Intelligent Food Chain", supported by the German Federal Ministry for Economic Affairs and Climate Action and the German Aerospace Center. Through this structured approach, significantly better planning results are achieved, which would be too complex or very time consuming using conventional methods.

Keywords: change management, design thinking method, machine learning, meat industry, reinforcement learning, resource-efficient production planning

Procedia PDF Downloads 128
11736 Towards Human-Interpretable, Automated Learning of Feedback Control for the Mixing Layer

Authors: Hao Li, Guy Y. Cornejo Maceda, Yiqing Li, Jianguo Tan, Marek Morzynski, Bernd R. Noack

Abstract:

We propose an automated analysis of the flow control behaviour from an ensemble of control laws and associated time-resolved flow snapshots. The input may be the rich database of machine learning control (MLC) optimizing a feedback law for a cost function in the plant. The proposed methodology provides (1) insights into the control landscape, which maps control laws to performance, including extrema and ridge-lines, (2) a catalogue of representative flow states and their contribution to cost function for investigated control laws and (3) visualization of the dynamics. Key enablers are classification and feature extraction methods of machine learning. The analysis is successfully applied to the stabilization of a mixing layer with sensor-based feedback driving an upstream actuator. The fluctuation energy is reduced by 26%. The control replaces unforced Kelvin-Helmholtz vortices with subsequent vortex pairing by higher-frequency Kelvin-Helmholtz structures of lower energy. These efforts target a human interpretable, fully automated analysis of MLC identifying qualitatively different actuation regimes, distilling corresponding coherent structures, and developing a digital twin of the plant.

Keywords: machine learning control, mixing layer, feedback control, model-free control

Procedia PDF Downloads 223
11735 Attributes That Influence Respondents When Choosing a Mate in Internet Dating Sites: An Innovative Matching Algorithm

Authors: Moti Zwilling, Srečko Natek

Abstract:

This paper presents an innovative predictive analytics approach for finding the best match between two consumers who seek a partner on internet dating sites. The methodology is based on an analysis of consumer preferences and involves data mining and machine learning search techniques. The study is composed of two parts. The first part uses descriptive statistics to examine the correlations between a set of parameters measured for men and women who intend to meet each other through social media, usually the internet. In this part, several hypotheses were examined and statistical analyses were performed. Results show a strong correlation between the affiliated attributes of men and women with respect to how they present themselves on social media such as "Facebook". One interesting finding is the strong desire among most respondents to develop a serious relationship. In the second part, the authors used common data mining algorithms to search for and classify the most important and effective attributes that affect the response rate of the other side. Results show that personal presentation and educational background are the most effective attributes for eliciting a positive attitude toward one's profile from a potential mate.

Keywords: dating sites, social networks, machine learning, decision trees, data mining

Procedia PDF Downloads 293
11734 Unsupervised Learning of Spatiotemporally Coherent Metrics

Authors: Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

Abstract:

Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity. We establish a connection between slow feature learning and metric learning and show that the trained encoder can be used to define a more temporally and semantically coherent metric.

Keywords: machine learning, pattern clustering, pooling, classification

Procedia PDF Downloads 456
11733 A Pilot Study to Investigate the Use of Machine Translation Post-Editing Training for Foreign Language Learning

Authors: Hong Zhang

Abstract:

The main purpose of this study is to show that machine translation (MT) post-editing (PE) training can help our Chinese students learn Spanish as a second language. Our hypothesis is that they might make better use of MT by learning PE skills specific to foreign language learning. We have developed PE training materials based on the data collected in a previous study. The training material included the special error types found in MT output and the error types that our Chinese students studying Spanish could not detect in last year's experiment. This year we performed a pilot study to evaluate the effectiveness of the PE training materials and the extent to which PE training helps Chinese students who study Spanish. We used screen recording to record the sessions and made note of every action taken by the students. Participants were speakers of Chinese with intermediate knowledge of Spanish. They were divided into two groups: Group A received PE training and Group B did not. We prepared a Chinese text for both groups; participants first translated it by themselves (human translation), then used Google Translate to translate the text, and were asked to post-edit the raw MT output. In the PE test, Group A identified and corrected errors faster than Group B. Group A did especially better in omission, word order, part of speech, terminology, mistranslation, official names, and formal register. From the results of this study, we can see that PE training can help Chinese students learn Spanish as a second language. In the future, we could focus on the students’ struggles during their Spanish studies and complete the PE training materials to teach Chinese students learning Spanish with machine translation.

Keywords: machine translation, post-editing, post-editing training, Chinese, Spanish, foreign language learning

Procedia PDF Downloads 144
11732 Machine Learning Approach for Mutation Testing

Authors: Michael Stewart

Abstract:

Mutation testing is a type of software testing, proposed in the 1970s, in which program statements are deliberately changed to introduce simple errors so that test cases can be validated by checking whether they detect the errors. Test cases are executed against the mutant code to determine whether at least one fails, thereby detecting the error and helping to ensure the program is correct. One major issue with this type of testing was that it became computationally intensive to generate and test all possible mutations for complex programs. This paper used reinforcement learning and parallel processing within the context of mutation testing for the selection of mutation operators and test cases that reduced the computational cost of testing and improved test suite effectiveness. Experiments were conducted using sample programs to determine how well the reinforcement learning-based algorithm performed with one live mutation, multiple live mutations and no live mutations. The experiments, measured by mutation score, were used to update the algorithm and improved the accuracy of predictions. The performance was then evaluated on multiple processor computers. With reinforcement learning, the mutation operators utilized were reduced by 50–100%.
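The mutation score mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's reinforcement learning algorithm; it only shows how a test suite "kills" mutants and how the score is the fraction of mutants killed. The function under test, the three mutants, and the test cases are all hypothetical and chosen for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical program under test.
def add(a: int, b: int) -> int:
    return a + b

# Hypothetical mutants: each one changes the original statement slightly.
def mutant_minus(a: int, b: int) -> int:
    return a - b          # operator replaced: + becomes -

def mutant_times(a: int, b: int) -> int:
    return a * b          # operator replaced: + becomes *

def mutant_swapped(a: int, b: int) -> int:
    return b + a          # equivalent mutant: behaves like the original

MUTANTS: List[Callable[[int, int], int]] = [mutant_minus, mutant_times, mutant_swapped]

# Each test case is (arguments, expected result of the original program).
TestCase = Tuple[Tuple[int, int], int]
TEST_CASES: List[TestCase] = [((2, 3), 5), ((0, 0), 0), ((2, 2), 4)]

def is_killed(mutant: Callable[[int, int], int], tests: List[TestCase]) -> bool:
    """A mutant is killed when at least one test case fails against it."""
    return any(mutant(*args) != expected for args, expected in tests)

def mutation_score(mutants: List[Callable[[int, int], int]], tests: List[TestCase]) -> float:
    """Fraction of mutants killed by the test suite (0.0 when there are no mutants)."""
    if not mutants:
        return 0.0
    killed = sum(1 for m in mutants if is_killed(m, tests))
    return killed / len(mutants)

if __name__ == "__main__":
    print(f"mutation score: {mutation_score(MUTANTS, TEST_CASES):.2f}")
```

In this sketch the swapped-operand mutant stays live because no input can distinguish it from the original, which is one reason exhaustive mutation testing is costly and why selecting operators and test cases matters.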

Keywords: automated-testing, machine learning, mutation testing, parallel processing, reinforcement learning, software engineering, software testing

Procedia PDF Downloads 198
11731 Classification Based on Deep Neural Cellular Automata Model

Authors: Yasser F. Hassan

Abstract:

Deep learning structure is a branch of machine learning science and a great achievement in research and applications. Cellular neural networks are regarded as an array of nonlinear analog processors, called cells, connected in a way that allows parallel computations. The paper discusses how to use a deep learning structure to represent a neural cellular automata model. The proposed learning technique in the cellular automata model is examined from the perspective of deep learning structure. A deep automata neural cellular system modifies each neuron based on the behavior of the individual and its decision as a result of multi-level deep structure learning. The paper presents the architecture of the model and the results of simulating the approach. Results from the implementation enrich the deep neural cellular automata system and shed light on the concept formulation of the model and the learning in it.

Keywords: cellular automata, neural cellular automata, deep learning, classification

Procedia PDF Downloads 198
11730 Factors Influencing Soil Organic Carbon Storage Estimation in Agricultural Soils: A Machine Learning Approach Using Remote Sensing Data Integration

Authors: O. Sunantha, S. Zhenfeng, S. Phattraporn, A. Zeeshan

Abstract:

The decline of soil organic carbon (SOC) in global agriculture is a critical issue requiring rapid and accurate estimation for informed policymaking. While it is recognized that SOC predictors vary significantly when derived from remote sensing data and environmental variables, identifying the specific parameters most suitable for accurately estimating SOC in diverse agricultural areas remains a challenge. This study utilizes remote sensing data to precisely estimate SOC and identify influential factors in diverse agricultural areas, such as paddy, corn, sugarcane, cassava, and perennial crops. Extreme gradient boosting (XGBoost), random forest (RF), and support vector regression (SVR) models are employed to analyze these factors' impact on SOC estimation. The results show key factors influencing SOC estimation include slope, vegetation indices (EVI), spectral reflectance indices (red index, red edge2), temperature, land use, and surface soil moisture, as indicated by their averaged importance scores across XGBoost, RF, and SVR models. Therefore, using different machine learning algorithms for SOC estimation reveals varying influential factors from remote sensing data and environmental variables. This approach emphasizes feature selection, as different machine learning algorithms identify various key factors from remote sensing data and environmental variables for accurate SOC estimation.

Keywords: factors influencing SOC estimation, remote sensing data, environmental variables, machine learning

Procedia PDF Downloads 35
11729 Data Mining of Students' Performance Using Artificial Neural Network: Turkish Students as a Case Study

Authors: Samuel Nii Tackie, Oyebade K. Oyedotun, Ebenezer O. Olaniyi, Adnan Khashman

Abstract:

Artificial neural networks have been used in different fields of artificial intelligence, and more specifically in machine learning. Although other machine learning options are feasible in most situations, the ease with which neural networks lend themselves to different problems, including pattern recognition, image compression, classification, computer vision, and regression, has earned them a remarkable place in the machine learning field. This research exploits neural networks as a data mining tool for predicting the number of times a student repeats a course, considering some attributes relating to the course itself, the teacher, and the particular student. Neural networks were used in this work to map the relationship between some attributes related to students’ course assessment and the number of times a student will possibly repeat a course before passing. It is hoped that the ability to predict students’ performance from such complex relationships can help facilitate the fine-tuning of academic systems and policies implemented in learning environments. To validate the power of neural networks in data mining, a database of Turkish students’ performance was used; feedforward and radial basis function networks were trained for this task, and the performance of these networks was evaluated in terms of achieved recognition rates and training time.

Keywords: artificial neural network, data mining, classification, students’ evaluation

Procedia PDF Downloads 613
11728 Innovative Approaches to Water Resources Management: Addressing Challenges through Machine Learning and Remote Sensing

Authors: Abdelrahman Elsehsah, Abdelazim Negm, Eid Ashour, Mohamed Elsahabi

Abstract:

Water resources management is a critical field that encompasses the planning, development, conservation, and allocation of water resources to meet societal needs while ensuring environmental sustainability. This paper reviews the key concepts and challenges in water resources management, emphasizing the significance of a holistic approach that integrates social, economic, and environmental factors. Traditional water management practices, characterized by supply-oriented strategies and centralized control, are increasingly inadequate in addressing contemporary challenges such as water scarcity, climate change impacts, and ecosystem degradation. Emerging technologies, particularly machine learning and remote sensing, offer innovative solutions to enhance decision-making processes in water management. Machine learning algorithms facilitate accurate water demand forecasting, quality monitoring, and leak detection, while remote sensing technologies provide vital data for assessing water availability and quality. This review highlights the need for integrated water management strategies that leverage these technologies to promote sustainable practices and foster resilience in water systems. Future research should focus on improving data quality, accessibility, and the integration of diverse datasets to optimize the benefits of these technological advancements.

Keywords: water resources management, water scarcity, climate change, machine learning, remote sensing, water quality, water governance, sustainable practices, ecosystem management

Procedia PDF Downloads 6
11727 Machine Learning-Based Workflow for the Analysis of Project Portfolio

Authors: Jean Marie Tshimula, Atsushi Togashi

Abstract:

We develop a data-science approach that provides interactive visualization and predictive models to find insights in projects' historical data, so that stakeholders can understand unseen opportunities in the African market that might otherwise escape them behind the online project portfolio of the African Development Bank. This machine learning-based web application identifies the market trends of the fastest growing economies across the continent as well as skyrocketing sectors which have a significant impact on the future of business in Africa. Owing to this, the approach is tailored to predict where investment is most needed. Moreover, we create a corpus that includes the descriptions of more than 1,200 projects covering approximately 14 sectors and designed for some of 53 African countries. Then, we sift through this large amount of semi-structured data to extract small details likely to contain directions to follow. In light of the foregoing, we have applied a combination of Latent Dirichlet Allocation and Random Forests in the analysis module of our methodology to highlight the most relevant topics that investors may focus on when investing in Africa.

Keywords: machine learning, topic modeling, natural language processing, big data

Procedia PDF Downloads 168
11726 Sentiment Analysis of Chinese Microblog Comments: Comparison between Support Vector Machine and Long Short-Term Memory

Authors: Xu Jiaqiao

Abstract:

Text sentiment analysis is an important branch of natural language processing. This technology is widely used in public opinion analysis and web surfing recommendations. At present, mainstream sentiment analysis methods fall into three categories: methods based on a sentiment dictionary, methods based on traditional machine learning, and methods based on deep learning. This paper mainly analyzes and compares the advantages and disadvantages of the SVM method of traditional machine learning and the Long Short-Term Memory (LSTM) method of deep learning in the field of Chinese sentiment analysis, using Chinese comments on Sina Microblog as the data set. First, this paper classifies and labels the original comment dataset obtained by the web crawler, and then uses Jieba word segmentation to segment the original dataset and remove stop words. After that, this paper extracts text feature vectors and builds document word vectors to facilitate the training of the model. Finally, SVM and LSTM models are trained. The accuracy of the LSTM model is 85.80%, while the accuracy of SVM is 91.07%. At the same time, the LSTM model needs only 2.57 seconds, while the SVM model needs 6.06 seconds. Therefore, this paper concludes that, compared with the SVM model, the LSTM model is worse in accuracy but faster in processing speed.

Keywords: sentiment analysis, support vector machine, long short-term memory, Chinese microblog comments

Procedia PDF Downloads 94
11725 Potassium-Phosphorus-Nitrogen Detection and Spectral Segmentation Analysis Using Polarized Hyperspectral Imagery and Machine Learning

Authors: Nicholas V. Scott, Jack McCarthy

Abstract:

Military, law enforcement, and counter terrorism organizations are often tasked with target detection and image characterization of scenes containing explosive materials in various types of environments where light scattering intensity is high. Mitigation of this photonic noise using classical digital filtration and signal processing can be difficult. This is partially due to the lack of robust image processing methods for photonic noise removal, which strongly influence high resolution target detection and machine learning-based pattern recognition. Such analysis is crucial to the delivery of reliable intelligence. Polarization filters are a possible method for ambient glare reduction by allowing only certain modes of the electromagnetic field to be captured, providing strong scene contrast. An experiment was carried out utilizing a polarization lens attached to a hyperspectral imagery camera for the purpose of exploring the degree to which an imaged polarized scene of potassium, phosphorus, and nitrogen mixture allows for improved target detection and image segmentation. Preliminary imagery results based on the application of machine learning algorithms, including competitive leaky learning and distance metric analysis, to polarized hyperspectral imagery, suggest that polarization filters provide a slight advantage in image segmentation. The results of this work have implications for understanding the presence of explosive material in dry, desert areas where reflective glare is a significant impediment to scene characterization.

Keywords: explosive material, hyperspectral imagery, image segmentation, machine learning, polarization

Procedia PDF Downloads 142
11724 Strategic Cyber Sentinel: A Paradigm Shift in Enhancing Cybersecurity Resilience

Authors: Ayomide Oyedele

Abstract:

In the dynamic landscape of cybersecurity, "Strategic Cyber Sentinel" emerges as a revolutionary framework, transcending traditional approaches. This paper pioneers a holistic strategy, weaving together threat intelligence, machine learning, and adaptive defenses. Through meticulous real-world simulations, we demonstrate the unprecedented resilience of our framework against evolving cyber threats. "Strategic Cyber Sentinel" redefines proactive threat mitigation, offering a robust defense architecture poised for the challenges of tomorrow.

Keywords: cybersecurity, resilience, threat intelligence, machine learning, adaptive defenses

Procedia PDF Downloads 83
11723 Model Observability – A Monitoring Solution for Machine Learning Models

Authors: Amreth Chandrasehar

Abstract:

Machine Learning (ML) models are developed and run in production to solve various use cases that help organizations be more efficient and drive the business. But this comes at a massive development cost and lost business opportunities. According to a Gartner report, 85% of data science projects fail, and one of the contributing factors is not paying attention to model observability. Model observability helps developers and operators pinpoint model performance issues and data drift, and helps identify the root cause of issues. This paper focuses on providing insights into incorporating model observability in model development and operationalizing it in production.

Keywords: model observability, monitoring, drift detection, ML observability platform

Procedia PDF Downloads 112
11722 Analysis of Production Forecasting in Unconventional Gas Resources Development Using Machine Learning and Data-Driven Approach

Authors: Dongkwon Han, Sangho Kim, Sunil Kwon

Abstract:

Unconventional gas resources have dramatically changed the future energy landscape. Unlike conventional gas resources, unconventional gas requires advanced approaches to production forecasting because of the uncertainty and complexity of fluid flow. In this study, an artificial neural network (ANN) model which integrates machine learning and a data-driven approach was developed to predict productivity in shale gas. A database of 129 wells in the Eagle Ford shale basin was used for testing and training of the ANN model. Input data related to hydraulic fracturing, well completion, and productivity of shale gas were selected, and the output data is cumulative production. The performance of the ANN using all data sets, the clustering models, and the variables importance (VI) models was compared in terms of the mean absolute percentage error (MAPE). The MAPE values obtained were 44.22% for the ANN model using all data sets; 10.08% (cluster 1), 5.26% (cluster 2), and 6.35% (cluster 3) for the clustering models; and 32.23% (ANN VI) and 23.19% (SVM VI) for the VI models. The results showed that the pre-trained ANN model provides more accurate results than the ANN model using all data sets.
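The comparison above rests on the mean absolute percentage error. As a minimal sketch of that metric only, using the standard definition MAPE = (100 / n) * sum(|actual - predicted| / |actual|), the function below computes it for two equal-length sequences. The function name and the sample numbers in the usage lines are illustrative assumptions and are not data from the study.

```python
from typing import Sequence

def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Uses the standard definition: 100/n * sum(|a - p| / |a|).
    Raises ValueError for empty or mismatched inputs, or when an actual
    value is zero (the percentage error is undefined in that case).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must not be empty")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

if __name__ == "__main__":
    # Illustrative cumulative production values (made up for this example).
    observed = [100.0, 200.0, 400.0]
    forecast = [90.0, 220.0, 400.0]
    print(f"MAPE = {mape(observed, forecast):.2f}%")
```

Because each error is divided by the observed value, wells with small cumulative production can dominate the average, which is one reason a model fitted per cluster of similar wells may report a much lower MAPE than a single model fitted to all wells.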

Keywords: unconventional gas, artificial neural network, machine learning, clustering, variables importance

Procedia PDF Downloads 196
11721 Intelligent Decision Support for Wind Park Operation: Machine-Learning Based Detection and Diagnosis of Anomalous Operating States

Authors: Angela Meyer

Abstract:

The operation and maintenance costs for wind parks make up a major fraction of a park’s overall lifetime cost. To minimize the cost and risk involved, an optimal operation and maintenance strategy requires continuous monitoring and analysis. In order to facilitate this, we present a decision support system that automatically scans the stream of telemetry sensor data generated by the turbines. By learning decision boundaries and normal reference operating states using machine learning algorithms, the decision support system can detect anomalous operating behavior in individual wind turbines and diagnose the turbine sub-systems involved. Operating personnel can be alerted if a normal operating state boundary is exceeded. The presented decision support system and method are applicable to any turbine type and manufacturer providing telemetry data of the turbine operating state. We demonstrate the successful detection and diagnosis of anomalous operating states in a case study at a German onshore wind park comprised of Vestas V112 turbines.
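The idea of learning a normal reference state and alerting when a boundary is exceeded can be sketched in a few lines. The abstract does not say which algorithm the authors use, so the sketch below substitutes a deliberately simple stand-in: a band of mean plus or minus k standard deviations estimated from a reference window of healthy readings. The function names, the choice of k = 3, and the temperature-like sample values are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple

def learn_normal_band(reference: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Estimate a normal operating band from healthy reference readings.

    The band is mean +/- k * sample standard deviation. This is a simple
    stand-in for a learned decision boundary, not the paper's method.
    """
    if len(reference) < 2:
        raise ValueError("need at least two reference readings")
    mu = mean(reference)
    sigma = stdev(reference)
    return mu - k * sigma, mu + k * sigma

def find_anomalies(stream: Sequence[float], band: Tuple[float, float]) -> List[int]:
    """Return the indices of readings that fall outside the normal band."""
    low, high = band
    return [i for i, value in enumerate(stream) if value < low or value > high]

if __name__ == "__main__":
    # Hypothetical sensor readings recorded while the turbine was healthy.
    healthy = [50.0, 51.0, 49.0, 50.5, 49.5]
    band = learn_normal_band(healthy)
    # Hypothetical new telemetry; two readings lie well outside the band.
    incoming = [50.2, 49.8, 58.0, 50.1, 44.0]
    for index in find_anomalies(incoming, band):
        print(f"alert: reading {index} = {incoming[index]} outside {band}")
```

A real system would likely condition the reference state on operating context such as wind speed and use one model per sensor or sub-system so that an exceeded boundary also points to the affected component.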

Keywords: anomaly detection, decision support, machine learning, monitoring, performance optimization, wind turbines

Procedia PDF Downloads 167
11720 Building a Scalable Telemetry Based Multiclass Predictive Maintenance Model in R

Authors: Jaya Mathew

Abstract:

Many organizations are faced with the challenge of how to analyze and build Machine Learning models using their sensitive telemetry data. In this paper, we discuss how users can leverage the power of R without having to move their big data around as well as a cloud based solution for organizations willing to host their data in the cloud. By using ScaleR technology to benefit from parallelization and remote computing or R Services on premise or in the cloud, users can leverage the power of R at scale without having to move their data around.

Keywords: predictive maintenance, machine learning, big data, cloud based, on premise solution, R

Procedia PDF Downloads 379
11719 Hand Gesture Interpretation Using Sensing Glove Integrated with Machine Learning Algorithms

Authors: Aqsa Ali, Aleem Mushtaq, Attaullah Memon, Monna

Abstract:

In this paper, we present a low-cost design for a smart glove that can perform sign language recognition to assist speech-impaired people. Specifically, we have designed and developed an Assistive Hand Gesture Interpreter that recognizes hand movements relevant to American Sign Language (ASL) and translates them into text for display on a Thin-Film-Transistor Liquid Crystal Display (TFT LCD) screen as well as into synthetic speech. Linear Bayes Classifiers and Multilayer Neural Networks have been used to classify 11 feature vectors obtained from the sensors on the glove into one of the 27 ASL alphabets and a predefined gesture for space. Three types of features are used: bending, measured with six bend sensors; orientation in three dimensions, measured with accelerometers; and contacts at vital points, measured with contact sensors. To gauge the performance of the presented design, the training database was prepared using five volunteers. The accuracy of the current version on the prepared dataset was found to be up to 99.3% for the target user. The solution combines electronics, e-textile technology, sensor technology, embedded systems, and machine learning techniques to build a low-cost wearable glove that is scrupulous, elegant and portable.

Keywords: American sign language, assistive hand gesture interpreter, human-machine interface, machine learning, sensing glove

Procedia PDF Downloads 301
11718 Analyzing the Quality of Cloud-Based E-Learning Systems on the Perception of the Learners and the Teachers

Authors: R. W. C. Devindi, S. M. Buddika Harshanath

Abstract:

E-learning is a widely used technology for learning in the modern world. With the pandemic, the popularity of e-learning has increased considerably. E-learning systems usually require both software and hardware resources, but most educational institutions find it hard to afford those resources. In addition, with the massive user load, e-learning has to expand its server-side resources as well. Therefore, cloud computing has been adopted in order to make e-learning systems more efficient. The researcher has analyzed the quality of e-learning systems as perceived by learners and teachers with the aid of hypotheses, and presents the results of the analysis and the discussion in this report. Future research will therefore be able to take further steps to increase the quality of online learning systems. In the case of e-learning, quality assurance and cost effectiveness are essential. A complex quality assurance system is used in the stated project. There are no well-defined standard evaluation measures in this field. As a result, accurately assessing the e-learning system's overall quality is challenging. The researcher has done the analysis with the aid of standard methods and software.

Keywords: LMS–learning management system, SPSS–statistical package for social sciences (software), eigen value, hypothesis

Procedia PDF Downloads 107
11717 The Intersection of Artificial Intelligence and Mathematics

Authors: Mitat Uysal, Aynur Uysal

Abstract:

Artificial Intelligence (AI) is fundamentally driven by mathematics, with many of its core algorithms rooted in mathematical principles such as linear algebra, probability theory, calculus, and optimization techniques. This paper explores the deep connection between AI and mathematics, highlighting the role of mathematical concepts in key AI techniques like machine learning, neural networks, and optimization. To demonstrate this connection, a case study involving the implementation of a neural network using Python is presented. This practical example illustrates the essential role that mathematics plays in training a model and solving real-world problems.

Keywords: AI, mathematics, machine learning, optimization techniques, image processing

Procedia PDF Downloads 14
11716 Fraud Detection in Credit Cards with Machine Learning

Authors: Anjali Chouksey, Riya Nimje, Jahanvi Saraf

Abstract:

Online transactions have increased dramatically in this new ‘social-distancing’ era, and fraud in online payments has increased significantly along with them. Fraud is a significant problem in various industries, such as insurance and banking. It includes the leaking of sensitive credit card information, which can be easily misused. With the government also promoting online transactions, e-commerce is booming. However, because of the increasing fraud in online payments, e-commerce companies are suffering a great loss of trust from their customers and are finding credit card fraud to be a big problem. People have started using online payment options and are thus becoming easy targets of credit card fraud. In this research paper, we discuss machine learning algorithms. We have used a decision tree, XGBOOST, k-nearest neighbour, logistic regression, random forest, and SVM on a dataset of online credit card transactions. We test all these algorithms for detecting fraud cases using the confusion matrix, the F1 score, and the accuracy score of each model to identify which algorithm can be used in detecting fraud.
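The evaluation measures named in this abstract (confusion matrix, accuracy, F1 score) can be illustrated with a minimal sketch. The function names and the toy labels below are hypothetical and are not taken from the paper; they only show how the three measures relate for a binary fraud/non-fraud task, and why accuracy alone can look good on imbalanced data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary task; `positive` marks the fraud class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy, made-up labels: 10 transactions, 2 of them fraudulent (label 1).
toy_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
toy_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
toy_scores = scores(toy_true, toy_pred)
```

On this toy example the accuracy is noticeably higher than the F1 score, because most transactions are legitimate; this is one reason an F1 score is often reported alongside accuracy for fraud data.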

Keywords: machine learning, fraud detection, artificial intelligence, decision tree, k nearest neighbour, random forest, XGBOOST, logistic regression, support vector machine

Procedia PDF Downloads 148
11715 Prediction of Mental Health: Heuristic Subjective Well-Being Model on Perceived Stress Scale

Authors: Ahmet Karakuş, Akif Can Kilic, Emre Alptekin

Abstract:

A growing number of studies have been conducted to determine how well-being may be predicted using well-designed models. It is necessary to investigate the backgrounds of features in order to construct a viable Subjective Well-Being (SWB) model. We have picked the suitable variables from the literature on SWB that are acceptable for real-world data instructions. The goal of this work is to evaluate the model by feeding it with SWB characteristics and then categorizing stress levels using machine learning methods to see how well it performs on a real dataset. Although this is a multiclass classification problem, we have achieved significant metric scores, which may be taken into account for a specific task.

Keywords: machine learning, multiclassification problem, subjective well-being, perceived stress scale

Procedia PDF Downloads 131
11714 Machine Learning for Disease Prediction Using Symptoms and X-Ray Images

Authors: Ravija Gunawardana, Banuka Athuraliya

Abstract:

Machine learning has emerged as a powerful tool for disease diagnosis and prediction. The use of machine learning algorithms has the potential to improve the accuracy of disease prediction, thereby enabling medical professionals to provide more effective and personalized treatments. This study focuses on developing a machine-learning model for disease prediction using symptoms and X-ray images. The importance of this study lies in its potential to assist medical professionals in accurately diagnosing diseases, thereby improving patient outcomes. Respiratory diseases are a significant cause of morbidity and mortality worldwide, and chest X-rays are commonly used in the diagnosis of these diseases. However, accurately interpreting X-ray images requires significant expertise and can be time-consuming, making it difficult to diagnose respiratory diseases in a timely manner. By incorporating machine learning algorithms, we can significantly enhance disease prediction accuracy, ultimately leading to better patient care. The study utilized the Mask R-CNN algorithm, which is a state-of-the-art method for object detection and segmentation in images, to process chest X-ray images. The model was trained and tested on a large dataset of patient information, which included both symptom data and X-ray images. The performance of the model was evaluated using a range of metrics, including accuracy, precision, recall, and F1-score. The results showed that the model achieved an accuracy rate of over 90%, indicating that it was able to accurately detect and segment regions of interest in the X-ray images. In addition to X-ray images, the study also incorporated symptoms as input data for disease prediction. The study used three different classifiers, namely Random Forest, K-Nearest Neighbor and Support Vector Machine, to predict diseases based on symptoms. These classifiers were trained and tested using the same dataset of patient information as the X-ray model. 
The results showed promising accuracy rates for predicting diseases using symptoms, with the ensemble learning techniques significantly improving the accuracy of disease prediction. The study's findings indicate that the use of machine learning algorithms can significantly enhance disease prediction accuracy, ultimately leading to better patient care. The model developed in this study has the potential to assist medical professionals in diagnosing respiratory diseases more accurately and efficiently. However, it is important to note that the accuracy of the model can be affected by several factors, including the quality of the X-ray images, the size of the dataset used for training, and the complexity of the disease being diagnosed. In conclusion, the study demonstrated the potential of machine learning algorithms for disease prediction using symptoms and X-ray images. The use of these algorithms can improve the accuracy of disease diagnosis, ultimately leading to better patient care. Further research is needed to validate the model's accuracy and effectiveness in a clinical setting and to expand its application to other diseases.

Keywords: K-nearest neighbor, mask R-CNN, random forest, support vector machine

Procedia PDF Downloads 154
11713 Enhancing the Recruitment Process through Machine Learning: An Automated CV Screening System

Authors: Kaoutar Ben Azzou, Hanaa Talei

Abstract:

Human resources is an important department in each organization, as it manages the life cycle of employees from recruitment and training to retirement or termination of contracts. The recruitment process starts with a job opening, followed by a selection of the best-fit candidates from all applicants. Matching the best profile to a job position requires manually reviewing many CVs, which takes hours of work and can sometimes lead to choosing a profile that is not the best. The work presented in this paper aims at reducing the workload of HR personnel by automating the preliminary stages of the candidate screening process, thereby fostering a more streamlined recruitment workflow. It introduces an automated system designed to help with the recruitment process by scanning candidates' CVs, extracting pertinent features, and employing machine learning algorithms to decide the most fitting job profile for each candidate. Our work employs natural language processing (NLP) techniques to identify and extract key features, such as education, work experience, and skills, from the unstructured text of a CV. Subsequently, the system uses these features to match candidates with job profiles, leveraging the power of classification algorithms.

Keywords: automated recruitment, candidate screening, machine learning, human resources management

Procedia PDF Downloads 56
11712 A Comparison of YOLO Family for Apple Detection and Counting in Orchards

Authors: Yuanqing Li, Changyi Lei, Zhaopeng Xue, Zhuo Zheng, Yanbo Long

Abstract:

In agricultural production and breeding, implementing automatic picking robots in orchard farming to reduce human labour and error is challenging. Their core function is automatic identification based on machine vision. This paper focuses on apple detection and counting in orchards and implements several deep learning methods. Extensive datasets are used, and a semi-automatic annotation method is proposed. The proposed deep learning models belong to the state-of-the-art YOLO family. Because the models use various backbones, a detailed multi-dimensional comparison is made in terms of counting accuracy, mAP and model memory, laying the foundation for realising automatic precision agriculture.

Keywords: agricultural object detection, deep learning, machine vision, YOLO family

Procedia PDF Downloads 197
11711 The Asymmetric Proximal Support Vector Machine Based on Multitask Learning for Classification

Authors: Qing Wu, Fei-Yan Li, Heng-Chang Zhang

Abstract:

Multitask learning support vector machines (SVMs) have recently attracted increasing research attention. Given several related tasks, single-task learning methods train each task separately and ignore the inner cross-relationship among tasks. Multitask learning, however, can capture the correlation information among tasks and achieve better performance by training all tasks simultaneously. In addition, the asymmetric squared loss function can improve the generalization ability of the models on most asymmetrically distributed data. In this paper, we first make two assumptions on the relatedness among tasks and propose two multitask learning proximal support vector machine algorithms, named MTL-a-PSVM and EMTL-a-PSVM, respectively. MTL-a-PSVM seeks a trade-off between the maximum expectile distance for each task model and the closeness of each task model to the general model. As an extension of MTL-a-PSVM, EMTL-a-PSVM can select appropriate kernel functions for shared information and private information. Besides, two corresponding special cases, named MTL-PSVM and EMTL-PSVM, are proposed by analyzing the asymmetric squared loss function; they can be easily implemented by solving linear systems. Experimental analysis on three classification datasets demonstrates the effectiveness and superiority of our proposed multitask learning algorithms.
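The abstract does not give the formula for the asymmetric squared loss. As a rough illustration only, the sketch below assumes the commonly used expectile form, in which positive residuals are weighted by tau and negative residuals by 1 - tau; the function name, the parameter name `tau`, and this exact form are assumptions and may differ from the definition used in the paper.

```python
def asymmetric_squared_loss(residual, tau=0.5):
    """Assumed expectile-style loss: tau * r^2 if r >= 0, else (1 - tau) * r^2.

    With tau = 0.5 the loss is symmetric (half the ordinary squared loss).
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    weight = tau if residual >= 0 else 1.0 - tau
    return weight * residual ** 2


# With tau = 0.8, a positive residual is penalized four times as heavily
# as a negative residual of the same magnitude.
loss_pos = asymmetric_squared_loss(2.0, tau=0.8)
loss_neg = asymmetric_squared_loss(-2.0, tau=0.8)
```

Under this assumed form, setting tau to 0.5 removes the asymmetry, which is consistent with the abstract's remark that symmetric special cases follow from analyzing the loss.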

Keywords: multitask learning, asymmetric squared loss, EMTL-a-PSVM, classification

Procedia PDF Downloads 134
11710 Urban Big Data: An Experimental Approach to Building-Value Estimation Using Web-Based Data

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

Current real-estate value estimation, which is difficult for laymen, is usually performed by specialists. This paper presents an automated estimation process based on big data and machine-learning technology that calculates the influence of building conditions on real-estate price measurement. The present study analyzed actual building sales sample data for Nonhyeon-dong, Gangnam-gu, Seoul, Korea, measuring the major influencing factors among the various building conditions. Further to that analysis, a prediction model was established and applied using RapidMiner Studio, a graphical user interface (GUI)-based tool for derivation of machine-learning prototypes. The prediction model is formulated by reference to previous examples. When new examples are applied, it analyses and predicts accordingly. The analysis process discerns the crucial factors affecting price increases by calculation of weighted values. The model was verified, and its accuracy determined, by comparing its predicted values with actual price increases.

Keywords: apartment complex, big data, life-cycle building value analysis, machine learning

Procedia PDF Downloads 374
11709 Assessing Online Learning Paths in an Learning Management Systems Using a Data Mining and Machine Learning Approach

Authors: Alvaro Figueira, Bruno Cabral

Abstract:

Nowadays, students are used to being assessed through an online platform. Educators have stepped up from a period in which they endured the transition from paper to digital. The use of a diversified set of question types that range from quizzes to open questions is currently common in most university courses. In many courses today, the evaluation methodology also fosters the students’ online participation in forums, the download and upload of modified files, or even the participation in group activities. At the same time, new pedagogy theories that promote the active participation of students in the learning process, and the systematic use of problem-based learning, are being adopted using an eLearning system for that purpose. However, although these activities can generate a lot of feedback for students, it is usually restricted to the assessment of well-defined online tasks. In this article, we propose an automatic system that informs students of abnormal deviations from a 'correct' learning path in the course. Our approach is based on the idea that obtaining this information earlier in the semester may give students and educators an opportunity to resolve an eventual problem regarding the student’s current online actions in the course. Our goal is to prevent situations that are likely to lead to a poor grade and, eventually, to failing. In the major learning management systems (LMS) currently available, the interaction between the students and the system itself is registered in log files in the form of records that mark the beginning of actions performed by the user. Our proposed system uses that logged information to derive new information: the time each student spends on each activity, the time and order of the resources used by the student and, finally, the online resource usage pattern. 
Then, using the grades assigned to the students in previous years, we built a learning dataset that is used to feed a machine learning meta classifier. The resulting classification model is then used to predict the grade a learning path is heading to in the current year. This approach serves not only the teacher but also the student, who receives automatic feedback on her current situation, with past years as a perspective. Our system can be applied to online courses that integrate the use of an online platform that stores user actions in a log file and that has access to other students’ evaluations. The system is based on a data mining process on the log files and on a self-feedback machine learning algorithm that works paired with the Moodle LMS.
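The log-derived feature described in this abstract, the time each student spends on each activity, can be sketched as follows. The abstract states only that log records mark the beginning of each action; the record layout `(student_id, start_seconds, activity)`, the function name, and the rule that an action lasts until the same student's next action begins are illustrative assumptions, not the authors' implementation or the Moodle log format.

```python
from collections import defaultdict


def time_per_activity(events):
    """Sum, per (student, activity), the seconds until that student's next event.

    `events` is an iterable of hypothetical (student_id, start_seconds, activity)
    records. The last event of each student has no known end and is skipped.
    """
    by_student = defaultdict(list)
    for student, start, activity in events:
        by_student[student].append((start, activity))

    totals = defaultdict(float)
    for student, rows in by_student.items():
        rows.sort()  # chronological order per student
        for (start, activity), (next_start, _) in zip(rows, rows[1:]):
            totals[(student, activity)] += next_start - start
    return dict(totals)


# Toy, made-up log records (timestamps in seconds).
toy_events = [
    ("s1", 0, "quiz"),
    ("s1", 120, "forum"),
    ("s1", 300, "quiz"),
    ("s1", 360, "logout"),
    ("s2", 10, "forum"),
    ("s2", 70, "quiz"),
]
toy_durations = time_per_activity(toy_events)
```

Per-student totals of this kind could then serve as input columns for a classifier trained on previous years' grades; how the authors actually encode them is not stated in the abstract.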

Keywords: data mining, e-learning, grade prediction, machine learning, student learning path

Procedia PDF Downloads 122