Search results for: physics guided machine learning
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9138

Search results for: physics guided machine learning

8808 Enhancing the Recruitment Process through Machine Learning: An Automated CV Screening System

Authors: Kaoutar Ben Azzou, Hanaa Talei

Abstract:

Human resources is an important department in each organization as it manages the life cycle of employees from recruitment training to retirement or termination of contracts. The recruitment process starts with a job opening, followed by a selection of the best-fit candidates from all applicants. Matching the best profile for a job position requires a manual way of looking at many CVs, which requires hours of work that can sometimes lead to choosing not the best profile. The work presented in this paper aims at reducing the workload of HR personnel by automating the preliminary stages of the candidate screening process, thereby fostering a more streamlined recruitment workflow. This tool introduces an automated system designed to help with the recruitment process by scanning candidates' CVs, extracting pertinent features, and employing machine learning algorithms to decide the most fitting job profile for each candidate. Our work employs natural language processing (NLP) techniques to identify and extract key features from unstructured text extracted from a CV, such as education, work experience, and skills. Subsequently, the system utilizes these features to match candidates with job profiles, leveraging the power of classification algorithms.

Keywords: automated recruitment, candidate screening, machine learning, human resources management

Procedia PDF Downloads 31
8807 Online Learning for Modern Business Models: Theoretical Considerations and Algorithms

Authors: Marian Sorin Ionescu, Olivia Negoita, Cosmin Dobrin

Abstract:

This scientific communication reports and discusses learning models adaptable to modern business problems and models specific to digital concepts and paradigms. In the PAC (probably approximately correct) learning model approach, in which the learning process begins by receiving a batch of learning examples, the set of learning processes is used to acquire a hypothesis, and when the learning process is fully used, this hypothesis is used in the prediction of new operational examples. For complex business models, a lot of models should be introduced and evaluated to estimate the induced results so that the totality of the results are used to develop a predictive rule, which anticipates the choice of new models. In opposition, for online learning-type processes, there is no separation between the learning (training) and predictive phase. Every time a business model is approached, a test example is considered from the beginning until the prediction of the appearance of a model considered correct from the point of view of the business decision. After choosing choice a part of the business model, the label with the logical value "true" is known. Some of the business models are used as examples of learning (training), which helps to improve the prediction mechanisms for future business models.

Keywords: machine learning, business models, convex analysis, online learning

Procedia PDF Downloads 122
8806 Principal Component Analysis Combined Machine Learning Techniques on Pharmaceutical Samples by Laser Induced Breakdown Spectroscopy

Authors: Kemal Efe Eseller, Göktuğ Yazici

Abstract:

Laser-induced breakdown spectroscopy (LIBS) is a rapid optical atomic emission spectroscopy which is used for material identification and analysis with the advantages of in-situ analysis, elimination of intensive sample preparation, and micro-destructive properties for the material to be tested. LIBS delivers short pulses of laser beams onto the material in order to create plasma by excitation of the material to a certain threshold. The plasma characteristics, which consist of wavelength value and intensity amplitude, depends on the material and the experiment’s environment. In the present work, medicine samples’ spectrum profiles were obtained via LIBS. Medicine samples’ datasets include two different concentrations for both paracetamol based medicines, namely Aferin and Parafon. The spectrum data of the samples were preprocessed via filling outliers based on quartiles, smoothing spectra to eliminate noise and normalizing both wavelength and intensity axis. Statistical information was obtained and principal component analysis (PCA) was incorporated to both the preprocessed and raw datasets. The machine learning models were set based on two different train-test splits, which were 70% training – 30% test and 80% training – 20% test. Cross-validation was preferred to protect the models against overfitting; thus the sample amount is small. The machine learning results of preprocessed and raw datasets were subjected to comparison for both splits. This is the first time that all supervised machine learning classification algorithms; consisting of Decision Trees, Discriminant, naïve Bayes, Support Vector Machines (SVM), k-NN(k-Nearest Neighbor) Ensemble Learning and Neural Network algorithms; were incorporated to LIBS data of paracetamol based pharmaceutical samples, and their different concentrations on preprocessed and raw dataset in order to observe the effect of preprocessing.

Keywords: machine learning, laser-induced breakdown spectroscopy, medicines, principal component analysis, preprocessing

Procedia PDF Downloads 69
8805 A Comparison of YOLO Family for Apple Detection and Counting in Orchards

Authors: Yuanqing Li, Changyi Lei, Zhaopeng Xue, Zhuo Zheng, Yanbo Long

Abstract:

In agricultural production and breeding, implementing automatic picking robot in orchard farming to reduce human labour and error is challenging. The core function of it is automatic identification based on machine vision. This paper focuses on apple detection and counting in orchards and implements several deep learning methods. Extensive datasets are used and a semi-automatic annotation method is proposed. The proposed deep learning models are in state-of-the-art YOLO family. In view of the essence of the models with various backbones, a multi-dimensional comparison in details is made in terms of counting accuracy, mAP and model memory, laying the foundation for realising automatic precision agriculture.

Keywords: agricultural object detection, deep learning, machine vision, YOLO family

Procedia PDF Downloads 173
8804 The Asymmetric Proximal Support Vector Machine Based on Multitask Learning for Classification

Authors: Qing Wu, Fei-Yan Li, Heng-Chang Zhang

Abstract:

Multitask learning support vector machines (SVMs) have recently attracted increasing research attention. Given several related tasks, the single-task learning methods trains each task separately and ignore the inner cross-relationship among tasks. However, multitask learning can capture the correlation information among tasks and achieve better performance by training all tasks simultaneously. In addition, the asymmetric squared loss function can better improve the generalization ability of the models on the most asymmetric distributed data. In this paper, we first make two assumptions on the relatedness among tasks and propose two multitask learning proximal support vector machine algorithms, named MTL-a-PSVM and EMTL-a-PSVM, respectively. MTL-a-PSVM seeks a trade-off between the maximum expectile distance for each task model and the closeness of each task model to the general model. As an extension of the MTL-a-PSVM, EMTL-a-PSVM can select appropriate kernel functions for shared information and private information. Besides, two corresponding special cases named MTL-PSVM and EMTLPSVM are proposed by analyzing the asymmetric squared loss function, which can be easily implemented by solving linear systems. Experimental analysis of three classification datasets demonstrates the effectiveness and superiority of our proposed multitask learning algorithms.

Keywords: multitask learning, asymmetric squared loss, EMTL-a-PSVM, classification

Procedia PDF Downloads 91
8803 Assessing Online Learning Paths in an Learning Management Systems Using a Data Mining and Machine Learning Approach

Authors: Alvaro Figueira, Bruno Cabral

Abstract:

Nowadays, students are used to be assessed through an online platform. Educators have stepped up from a period in which they endured the transition from paper to digital. The use of a diversified set of question types that range from quizzes to open questions is currently common in most university courses. In many courses, today, the evaluation methodology also fosters the students’ online participation in forums, the download, and upload of modified files, or even the participation in group activities. At the same time, new pedagogy theories that promote the active participation of students in the learning process, and the systematic use of problem-based learning, are being adopted using an eLearning system for that purpose. However, although there can be a lot of feedback from these activities to student’s, usually it is restricted to the assessments of online well-defined tasks. In this article, we propose an automatic system that informs students of abnormal deviations of a 'correct' learning path in the course. Our approach is based on the fact that by obtaining this information earlier in the semester, may provide students and educators an opportunity to resolve an eventual problem regarding the student’s current online actions towards the course. Our goal is to prevent situations that have a significant probability to lead to a poor grade and, eventually, to failing. In the major learning management systems (LMS) currently available, the interaction between the students and the system itself is registered in log files in the form of registers that mark beginning of actions performed by the user. Our proposed system uses that logged information to derive new one: the time each student spends on each activity, the time and order of the resources used by the student and, finally, the online resource usage pattern. Then, using the grades assigned to the students in previous years, we built a learning dataset that is used to feed a machine learning meta classifier. The produced classification model is then used to predict the grades a learning path is heading to, in the current year. Not only this approach serves the teacher, but also the student to receive automatic feedback on her current situation, having past years as a perspective. Our system can be applied to online courses that integrate the use of an online platform that stores user actions in a log file, and that has access to other student’s evaluations. The system is based on a data mining process on the log files and on a self-feedback machine learning algorithm that works paired with the Moodle LMS.

Keywords: data mining, e-learning, grade prediction, machine learning, student learning path

Procedia PDF Downloads 103
8802 Urban Big Data: An Experimental Approach to Building-Value Estimation Using Web-Based Data

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

Current real-estate value estimation, difficult for laymen, usually is performed by specialists. This paper presents an automated estimation process based on big data and machine-learning technology that calculates influences of building conditions on real-estate price measurement. The present study analyzed actual building sales sample data for Nonhyeon-dong, Gangnam-gu, Seoul, Korea, measuring the major influencing factors among the various building conditions. Further to that analysis, a prediction model was established and applied using RapidMiner Studio, a graphical user interface (GUI)-based tool for derivation of machine-learning prototypes. The prediction model is formulated by reference to previous examples. When new examples are applied, it analyses and predicts accordingly. The analysis process discerns the crucial factors effecting price increases by calculation of weighted values. The model was verified, and its accuracy determined, by comparing its predicted values with actual price increases.

Keywords: apartment complex, big data, life-cycle building value analysis, machine learning

Procedia PDF Downloads 353
8801 The Impact of E-Learning on the Performance of History Learners in Eswatini General Certificate of Secondary Education

Authors: Joseph Osodo, Motsa Thobekani Phila

Abstract:

The study investigated the impact of e-learning on the performance of history learners in Eswatini general certificate of secondary education in the Manzini region of Eswatini. The study was guided by the theory of connectivism. The study had three objectives which were to find out the significance of e-learning during the COVID-19 era in learning History subject; challenges faced by history teachers’ and learners’ in e-learning; and how the challenges were mitigated. The study used a qualitative research approach and descriptive research design. Purposive sampling was used to select eight History teachers and eight History learners from four secondary schools in the Manzini region. Data were collected using face to face interviews. The collected data were analyzed and presented in thematically. The findings showed that history teachers had good knowledge on what e-learning was, while students had little understanding of e-learning. Some of the forms of e-learning that were used during the pandemic in teaching history in secondary schools included TV, radio, computer, projectors, and social media especially WhatsApp. E-learning enabled the continuity of teaching and learning of history subject. The use of e-learning through the social media was more convenient to the teacher and the learners. It was concluded that in some secondary school in the Manzini region, history teacher and learners encountered challenges such as lack of finances to purchase e-learning gadgets and data bundles, lack of skills as well as access to the Internet. It was recommended that History teachers should create more time to offer additional learning support to students whose performance was affected by the COVID-19 pandemic effects.

Keywords: e-learning, performance, COVID-19, history, connectivism

Procedia PDF Downloads 54
8800 Machine Learning Assisted Prediction of Sintered Density of Binary W(MO) Alloys

Authors: Hexiong Liu

Abstract:

Powder metallurgy is the optimal method for the consolidation and preparation of W(Mo) alloys, which exhibit excellent application prospects at high temperatures. The properties of W(Mo) alloys are closely related to the sintered density. However, controlling the sintered density and porosity of these alloys is still challenging. In the past, the regulation methods mainly focused on time-consuming and costly trial-and-error experiments. In this study, the sintering data for more than a dozen W(Mo) alloys constituted a small-scale dataset, including both solid and liquid phases of sintering. Furthermore, simple descriptors were used to predict the sintered density of W(Mo) alloys based on the descriptor selection strategy and machine learning method (ML), where the ML algorithm included the least absolute shrinkage and selection operator (Lasso) regression, k-nearest neighbor (k-NN), random forest (RF), and multi-layer perceptron (MLP). The results showed that the interpretable descriptors extracted by our proposed selection strategy and the MLP neural network achieved a high prediction accuracy (R>0.950). By further predicting the sintered density of W(Mo) alloys using different sintering processes, the error between the predicted and experimental values was less than 0.063, confirming the application potential of the model.

Keywords: sintered density, machine learning, interpretable descriptors, W(Mo) alloy

Procedia PDF Downloads 55
8799 A Radiomics Approach to Predict the Evolution of Prostate Imaging Reporting and Data System Score 3/5 Prostate Areas in Multiparametric Magnetic Resonance

Authors: Natascha C. D'Amico, Enzo Grossi, Giovanni Valbusa, Ala Malasevschi, Gianpiero Cardone, Sergio Papa

Abstract:

Purpose: To characterize, through a radiomic approach, the nature of areas classified PI-RADS (Prostate Imaging Reporting and Data System) 3/5, recognized in multiparametric prostate magnetic resonance with T2-weighted (T2w), diffusion and perfusion sequences with paramagnetic contrast. Methods and Materials: 24 cases undergoing multiparametric prostate MR and biopsy were admitted to this pilot study. Clinical outcome of the PI-RADS 3/5 was found through biopsy, finding 8 malignant tumours. The analysed images were acquired with a Philips achieva 1.5T machine with a CE- T2-weighted sequence in the axial plane. Semi-automatic tumour segmentation was carried out on MR images using 3DSlicer image analysis software. 45 shape-based, intensity-based and texture-based features were extracted and represented the input for preprocessing. An evolutionary algorithm (a TWIST system based on KNN algorithm) was used to subdivide the dataset into training and testing set and select features yielding the maximal amount of information. After this pre-processing 20 input variables were selected and different machine learning systems were used to develop a predictive model based on a training testing crossover procedure. Results: The best machine learning system (three-layers feed-forward neural network) obtained a global accuracy of 90% ( 80 % sensitivity and 100% specificity ) with a ROC of 0.82. Conclusion: Machine learning systems coupled with radiomics show a promising potential in distinguishing benign from malign tumours in PI-RADS 3/5 areas.

Keywords: machine learning, MR prostate, PI-Rads 3, radiomics

Procedia PDF Downloads 167
8798 Machine Learning Invariants to Detect Anomalies in Secure Water Treatment

Authors: Jonathan Heng, Yoong Cheah Huei

Abstract:

A strategic model that does not trigger any false alarms to detect anomalies in Secure Water Treatment (SWaT) test bed is presented. This model uses machine learning invariants formulated from streamlining the general form of Auto-Regressive models with eXogenous input. A creative generalized CUSUM algorithm to integrate the invariants and the detection strategy technique is successfully developed and tested in the SWaT Programmable Logic Controllers (PLCs). Three steps to fine-tune parameters, b and τ in the generalized algorithm are stated and an example used to demonstrate the tuning process is discussed. This approach can swiftly and effectively detect various scopes of cyber-attacks such as multiple points single stage and multiple points multiple stages in SWaT. This technique can be applied in water treatment plants and other cyber physical systems like power and gas plants too.

Keywords: machine learning invariants, generalized CUSUM algorithm with invariants and detection strategy, scope of cyber attacks, strategic model, tuning parameters

Procedia PDF Downloads 160
8797 Towards a Balancing Medical Database by Using the Least Mean Square Algorithm

Authors: Kamel Belammi, Houria Fatrim

Abstract:

imbalanced data set, a problem often found in real world application, can cause seriously negative effect on classification performance of machine learning algorithms. There have been many attempts at dealing with classification of imbalanced data sets. In medical diagnosis classification, we often face the imbalanced number of data samples between the classes in which there are not enough samples in rare classes. In this paper, we proposed a learning method based on a cost sensitive extension of Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weight and some rules of thumb to determine those weights. After the balancing phase, we applythe different classifiers (support vector machine (SVM), k- nearest neighbor (KNN) and multilayer neuronal networks (MNN)) for balanced data set. We have also compared the obtained results before and after balancing method.

Keywords: multilayer neural networks, k- nearest neighbor, support vector machine, imbalanced medical data, least mean square algorithm, diabetes

Procedia PDF Downloads 510
8796 Modelling the Impact of Installation of Heat Cost Allocators in District Heating Systems Using Machine Learning

Authors: Danica Maljkovic, Igor Balen, Bojana Dalbelo Basic

Abstract:

Following the regulation of EU Directive on Energy Efficiency, specifically Article 9, individual metering in district heating systems has to be introduced by the end of 2016. These directions have been implemented in member state’s legal framework, Croatia is one of these states. The directive allows installation of both heat metering devices and heat cost allocators. Mainly due to bad communication and PR, the general public false image was created that the heat cost allocators are devices that save energy. Although this notion is wrong, the aim of this work is to develop a model that would precisely express the influence of installation heat cost allocators on potential energy savings in each unit within multifamily buildings. At the same time, in recent years, a science of machine learning has gain larger application in various fields, as it is proven to give good results in cases where large amounts of data are to be processed with an aim to recognize a pattern and correlation of each of the relevant parameter as well as in the cases where the problem is too complex for a human intelligence to solve. A special method of machine learning, decision tree method, has proven an accuracy of over 92% in prediction general building consumption. In this paper, a machine learning algorithms will be used to isolate the sole impact of installation of heat cost allocators on a single building in multifamily houses connected to district heating systems. Special emphasises will be given regression analysis, logistic regression, support vector machines, decision trees and random forest method.

Keywords: district heating, heat cost allocator, energy efficiency, machine learning, decision tree model, regression analysis, logistic regression, support vector machines, decision trees and random forest method

Procedia PDF Downloads 218
8795 Finetuned Transformers for Translating Multi Dialect Texts to MSA

Authors: Tahar Alimi, Rahma Boujelbane, Wiem Derouich, Lamia Hadrich Belguith

Abstract:

Machine translation task of low-resourced languages such as Arabic is a challenging task. Despite the appearance of sophisticated models based on the latest deep learning techniques, namely the transfer learning and transformers, all models prove incapable of carrying out an acceptable translation which includes Arabic dialects because they not official status. In this paper, a machine translation model designed to translate Arabic multidialectal content into Modern Standard Arabic (MSA), leveraging both new and existing parallel resources. The latter achieved the best results for both Levantine and Maghrebi dialects with BLEU score of 64.99.

Keywords: Arabic translation, dialect translation, fine-tune, msa translation, transformer, translation

Procedia PDF Downloads 25
8794 Random Access in IoT Using Naïve Bayes Classification

Authors: Alhusein Almahjoub, Dongyu Qiu

Abstract:

This paper deals with the random access procedure in next-generation networks and presents the solution to reduce total service time (TST) which is one of the most important performance metrics in current and future internet of things (IoT) based networks. The proposed solution focuses on the calculation of optimal transmission probability which maximizes the success probability and reduces TST. It uses the information of several idle preambles in every time slot, and based on it, it estimates the number of backlogged IoT devices using Naïve Bayes estimation which is a type of supervised learning in the machine learning domain. The estimation of backlogged devices is necessary since optimal transmission probability depends on it and the eNodeB does not have information about it. The simulations are carried out in MATLAB which verify that the proposed solution gives excellent performance.

Keywords: random access, LTE/LTE-A, 5G, machine learning, Naïve Bayes estimation

Procedia PDF Downloads 126
8793 Learning to Translate by Learning to Communicate to an Entailment Classifier

Authors: Szymon Rutkowski, Tomasz Korbak

Abstract:

We present a reinforcement-learning-based method of training neural machine translation models without parallel corpora. The standard encoder-decoder approach to machine translation suffers from two problems we aim to address. First, it needs parallel corpora, which are scarce, especially for low-resource languages. Second, it lacks psychological plausibility of learning procedure: learning a foreign language is about learning to communicate useful information, not merely learning to transduce from one language’s 'encoding' to another. We instead pose the problem of learning to translate as learning a policy in a communication game between two agents: the translator and the classifier. The classifier is trained beforehand on a natural language inference task (determining the entailment relation between a premise and a hypothesis) in the target language. The translator produces a sequence of actions that correspond to generating translations of both the hypothesis and premise, which are then passed to the classifier. The translator is rewarded for classifier’s performance on determining entailment between sentences translated by the translator to disciple’s native language. Translator’s performance thus reflects its ability to communicate useful information to the classifier. In effect, we train a machine translation model without the need for parallel corpora altogether. While similar reinforcement learning formulations for zero-shot translation were proposed before, there is a number of improvements we introduce. While prior research aimed at grounding the translation task in the physical world by evaluating agents on an image captioning task, we found that using a linguistic task is more sample-efficient. Natural language inference (also known as recognizing textual entailment) captures semantic properties of sentence pairs that are poorly correlated with semantic similarity, thus enforcing basic understanding of the role played by compositionality. It has been shown that models trained recognizing textual entailment produce high-quality general-purpose sentence embeddings transferrable to other tasks. We use stanford natural language inference (SNLI) dataset as well as its analogous datasets for French (XNLI) and Polish (CDSCorpus). Textual entailment corpora can be obtained relatively easily for any language, which makes our approach more extensible to low-resource languages than traditional approaches based on parallel corpora. We evaluated a number of reinforcement learning algorithms (including policy gradients and actor-critic) to solve the problem of translator’s policy optimization and found that our attempts yield some promising improvements over previous approaches to reinforcement-learning based zero-shot machine translation.

Keywords: agent-based language learning, low-resource translation, natural language inference, neural machine translation, reinforcement learning

Procedia PDF Downloads 106
8792 Optimizing the Scanning Time with Radiation Prediction Using a Machine Learning Technique

Authors: Saeed Eskandari, Seyed Rasoul Mehdikhani

Abstract:

Radiation sources have been used in many industries, such as gamma sources in medical imaging. These waves have destructive effects on humans and the environment. It is very important to detect and find the source of these waves because these sources cannot be seen by the eye. A portable robot has been designed and built with the purpose of revealing radiation sources that are able to scan the place from 5 to 20 meters away and shows the location of the sources according to the intensity of the waves on a two-dimensional digital image. The operation of the robot is done by measuring the pixels separately. By increasing the image measurement resolution, we will have a more accurate scan of the environment, and more points will be detected. But this causes a lot of time to be spent on scanning. In this paper, to overcome this challenge, we designed a method that can optimize this time. In this method, a small number of important points of the environment are measured. Hence the remaining pixels are predicted and estimated by regression algorithms in machine learning. The research method is based on comparing the actual values of all pixels. These steps have been repeated with several other radiation sources. The obtained results of the study show that the values estimated by the regression method are very close to the real values.

Keywords: regression, machine learning, scan radiation, robot

Procedia PDF Downloads 60
8791 Early Prediction of Disposable Addresses in Ethereum Blockchain

Authors: Ahmad Saleem

Abstract:

Ethereum is the second largest crypto currency in blockchain ecosystem. Along with standard transactions, it supports smart contracts and NFT’s. Current research trends are focused on analyzing the overall structure of the network its growth and behavior. Ethereum addresses are anonymous and can be created on fly. The nature of Ethereum network and addresses make it hard to predict their behavior. The activity period of an ethereum address is not much analyzed. Using machine learning we can make early prediction about the disposability of the address. In this paper we analyzed the lifetime of the addresses. We also identified and predicted the disposable addresses using machine learning models and compared the results.

Keywords: blockchain, Ethereum, cryptocurrency, prediction

Procedia PDF Downloads 75
8790 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools, and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represents another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Procedia PDF Downloads 201
8789 Stackelberg Security Game for Optimizing Security of Federated Internet of Things Platform Instances

Authors: Violeta Damjanovic-Behrendt

Abstract:

This paper presents an approach for optimal cyber security decisions to protect instances of a federated Internet of Things (IoT) platform in the cloud. The presented solution implements the repeated Stackelberg Security Game (SSG) and a model called Stochastic Human behaviour model with AttRactiveness and Probability weighting (SHARP). SHARP employs the Subjective Utility Quantal Response (SUQR) for formulating a subjective utility function, which is based on the evaluations of alternative solutions during decision-making. We augment the repeated SSG (including SHARP and SUQR) with a reinforced learning algorithm called Naïve Q-Learning. Naïve Q-Learning belongs to the category of active and model-free Machine Learning (ML) techniques in which the agent (either the defender or the attacker) attempts to find an optimal security solution. In this way, we combine GT and ML algorithms for discovering optimal cyber security policies. The proposed security optimization components will be validated in a collaborative cloud platform that is based on the Industrial Internet Reference Architecture (IIRA) and its recently published security model.

Keywords: security, internet of things, cloud computing, stackelberg game, machine learning, naive q-learning

Procedia PDF Downloads 331
8788 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity

Authors: Shaan Khosla, Jon Krohn

Abstract:

In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies for a candidate, and compile a set of similar candidates for any given candidate. While useful to create these models, validating their accuracy in a recommendation context is difficult due to a sparsity of data. In this report, we use network graph data to generate useful representations for candidates and vacancies. We use candidates and vacancies as network nodes and designate a bi-directional link between them based on the candidate interviewing for the vacancy. After using node2vec, the embeddings are used to construct a validation dataset with a ranked order, which will help validate new recommender systems.

Keywords: AI, machine learning, NLP, recruiting

Procedia PDF Downloads 64
8787 Validation of a Placebo Method with Potential for Blinding in Ultrasound-Guided Dry Needling

Authors: Johnson C. Y. Pang, Bo Peng, Kara K. L. Reeves, Allan C. L. Fud

Abstract:

Objective: Dry needling (DN) has long been used as a treatment method for various musculoskeletal pain conditions. However, the evidence level of the studies was low due to the limitations of the methodology. Lack of randomization and inappropriate blinding is potentially the main sources of bias. A method that can differentiate clinical results due to the targeted experimental procedure from its placebo effect is needed to enhance the validity of the trial. Therefore, this study aimed to validate the method as a placebo ultrasound(US)-guided DN for patients with knee osteoarthritis (KOA). Design: This is a randomized controlled trial (RCT). Ninety subjects (25 males and 65 females) aged between 51 and 80 (61.26 ± 5.57) with radiological KOA were recruited and randomly assigned into three groups with a computer program. Group 1 (G1) received real US-guided DN, Group 2 (G2) received placebo US-guided DN, and Group 3 (G3) was the control group. Both G1 and G2 subjects received the same procedure of US-guided DN, except the US monitor was turned off in G2, blinding the G2 subjects to the incorporation of faux US guidance. This arrangement created the placebo effect intended to permit comparison of their results to those who received actual US-guided DN. Outcome measures, including the visual analog scale (VAS) and Knee injury and Osteoarthritis Outcome Score (KOOS) subscales of pain, symptoms, and quality of life (QOL), were analyzed by repeated measures analysis of covariance (ANCOVA) for time effects and group effects. The data regarding the perception of receiving real US-guided DN or placebo US-guided DN were analyzed by the chi-squared test. The missing data were analyzed with the intention-to-treat (ITT) approach if more than 5% of the data were missing. Results: The placebo US-guided DN (G2) subjects had the same perceptions as the use of real US guidance in the advancement of DN (p<0.128). G1 had significantly higher pain reduction (VAS and KOOS-pain) than G2 and G3 at 8 weeks (both p<0.05) only. There was no significant difference between G2 and G3 at 8 weeks (both p>0.05). Conclusion: The method with the US monitor turned off during the application of DN is credible for blinding the participants and allowing researchers to incorporate faux US guidance. The validated placebo US-guided DN technique can aid in investigations of the effects of US-guided DN with short-term effects of pain reduction for patients with KOA. Acknowledgment: This work was supported by the Caritas Institute of Higher Education [grant number IDG200101].

Keywords: ultrasound-guided dry needling, dry needling, knee osteoarthritis, physiotheraphy

Procedia PDF Downloads 93
8786 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models

Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti

Abstract:

In the realm of cricket, particularly within the context of the Indian Premier League (IPL), the ability to predict team scores accurately holds significant importance for both cricket enthusiasts and stakeholders alike. This paper presents a comprehensive study on IPL score prediction utilizing various machine learning algorithms, including Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through meticulous data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Leveraging this data, we employed state-of-the-art machine learning techniques to train and evaluate the performance of each model. Notably, Multiple Regression emerged as the top-performing algorithm, achieving an impressive accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to the advancement of sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of advanced predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research endeavors aimed at enhancing the accuracy and interpretability of IPL score prediction models.

Keywords: indian premier league (IPL), cricket, score prediction, machine learning, support vector machines (SVM), xgboost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics

Procedia PDF Downloads 25
8785 A Machine Learning Based Framework for Education Levelling in Multicultural Countries: UAE as a Case Study

Authors: Shatha Ghareeb, Rawaa Al-Jumeily, Thar Baker

Abstract:

In Abu Dhabi, there are many different education curriculums where sector of private schools and quality assurance is supervising many private schools in Abu Dhabi for many nationalities. As there are many different education curriculums in Abu Dhabi to meet expats’ needs, there are different requirements for registration and success. In addition, there are different age groups for starting education in each curriculum. In fact, each curriculum has a different number of years, assessment techniques, reassessment rules, and exam boards. Currently, students that transfer curriculums are not being placed in the right year group due to different start and end dates of each academic year and their date of birth for each year group is different for each curriculum and as a result, we find students that are either younger or older for that year group which therefore creates gaps in their learning and performance. In addition, there is not a way of storing student data throughout their academic journey so that schools can track the student learning process. In this paper, we propose to develop a computational framework applicable in multicultural countries such as UAE in which multi-education systems are implemented. The ultimate goal is to use cloud and fog computing technology integrated with Artificial Intelligence techniques of Machine Learning to aid in a smooth transition when assigning students to their year groups, and provide leveling and differentiation information of students who relocate from a particular education curriculum to another, whilst also having the ability to store and access student data from anywhere throughout their academic journey.

Keywords: admissions, algorithms, cloud computing, differentiation, fog computing, levelling, machine learning

Procedia PDF Downloads 112
8784 Data-Driven Decision Making: A Reference Model for Organizational, Educational and Competency-Based Learning Systems

Authors: Emanuel Koseos

Abstract:

Data-Driven Decision Making (DDDM) refers to making decisions that are based on historical data in order to inform practice, develop strategies and implement policies that benefit organizational settings. In educational technology, DDDM facilitates the implementation of differential educational learning approaches such as Educational Data Mining (EDM) and Competency-Based Education (CBE), which commonly target university classrooms. There is a current need for DDDM models applied to middle and secondary schools from a concern for assessing the needs, progress and performance of students and educators with respect to regional standards, policies and evolution of curriculums. To address these concerns, we propose a DDDM reference model developed using educational key process initiatives as inputs to a machine learning framework implemented with statistical software (SAS, R) to provide a best-practices, complex-free and automated approach for educators at their regional level. We assessed the efficiency of the model over a six-year period using data from 45 schools and grades K-12 in the Langley, BC, Canada regional school district. We concluded that the model has wider appeal, such as business learning systems.

Keywords: competency-based learning, data-driven decision making, machine learning, secondary schools

Procedia PDF Downloads 150
8783 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations

Authors: Kuei-Ling Sun, Emily Chia-Yu Su

Abstract:

Allergy is a hypersensitive overreaction of the immune system to environmental stimuli, and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthmatic, shock, or other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal wool, and other allergens. Their development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body’s immune system. Immunoglobulin E antibodies will bind to an allergen and then transfer to a receptor on mast cells or basophils triggering the release of inflammatory chemicals such as histamine. Based on the increasingly serious problem of environmental change, changes in lifestyle, air pollution problem, and other factors, in this study, we both collect allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT) and neural networks (NN) to do the model comparison and determine the permutations of allergen amino acid’s sequence.

Keywords: allergy, classification, decision tree, logistic regression, machine learning

Procedia PDF Downloads 280
8782 An Assessment of Floodplain Vegetation Response to Groundwater Changes Using the Soil & Water Assessment Tool Hydrological Model, Geographic Information System, and Machine Learning in the Southeast Australian River Basin

Authors: Newton Muhury, Armando A. Apan, Tek N. Marasani, Gebiaw T. Ayele

Abstract:

The changing climate has degraded freshwater availability in Australia that influencing vegetation growth to a great extent. This study assessed the vegetation responses to groundwater using Terra’s moderate resolution imaging spectroradiometer (MODIS), Normalised Difference Vegetation Index (NDVI), and soil water content (SWC). A hydrological model, SWAT, has been set up in a southeast Australian river catchment for groundwater analysis. The model was calibrated and validated against monthly streamflow from 2001 to 2006 and 2007 to 2010, respectively. The SWAT simulated soil water content for 43 sub-basins and monthly MODIS NDVI data for three different types of vegetation (forest, shrub, and grass) were applied in the machine learning tool, Waikato Environment for Knowledge Analysis (WEKA), using two supervised machine learning algorithms, i.e., support vector machine (SVM) and random forest (RF). The assessment shows that different types of vegetation response and soil water content vary in the dry and wet seasons. The WEKA model generated high positive relationships (r = 0.76, 0.73, and 0.81) between NDVI values of all vegetation in the sub-basins against soil water content (SWC), the groundwater flow (GW), and the combination of these two variables, respectively, during the dry season. However, these responses were reduced by 36.8% (r = 0.48) and 13.6% (r = 0.63) against GW and SWC, respectively, in the wet season. Although the rainfall pattern is highly variable in the study area, the summer rainfall is very effective for the growth of the grass vegetation type. This study has enriched our knowledge of vegetation responses to groundwater in each season, which will facilitate better floodplain vegetation management.

Keywords: ArcSWAT, machine learning, floodplain vegetation, MODIS NDVI, groundwater

Procedia PDF Downloads 75
8781 Application of Machine Learning Models to Predict Couchsurfers on Free Homestay Platform Couchsurfing

Authors: Yuanxiang Miao

Abstract:

Couchsurfing is a free homestay and social networking service accessible via the website and mobile app. Couchsurfers can directly request free accommodations from others and receive offers from each other. However, it is typically difficult for people to make a decision that accepts or declines a request when they receive it from Couchsurfers because they do not know each other at all. People are expected to meet up with some Couchsurfers who are kind, generous, and interesting while it is unavoidable to meet up with someone unfriendly. This paper utilized classification algorithms of Machine Learning to help people to find out the Good Couchsurfers and Not Good Couchsurfers on the Couchsurfing website. By knowing the prior experience, like Couchsurfer’s profiles, the latest references, and other factors, it became possible to recognize what kind of the Couchsurfers, and furthermore, it helps people to make a decision that whether to host the Couchsurfers or not. The value of this research lies in a case study in Kyoto, Japan in where the author has hosted 54 Couchsurfers, and the author collected relevant data from the 54 Couchsurfers, finally build a model based on classification algorithms for people to predict Couchsurfers. Lastly, the author offered some feasible suggestions for future research.

Keywords: Couchsurfing, Couchsurfers prediction, classification algorithm, hospitality tourism platform, hospitality sciences, machine learning

Procedia PDF Downloads 103
8780 Machine Learning Assisted Performance Optimization in Memory Tiering

Authors: Derssie Mebratu

Abstract:

As a large variety of micro services, web services, social graphic applications, and media applications are continuously developed, it is substantially vital to design and build a reliable, efficient, and faster memory tiering system. Despite limited design, implementation, and deployment in the last few years, several techniques are currently developed to improve a memory tiering system in a cloud. Some of these techniques are to develop an optimal scanning frequency; improve and track pages movement; identify pages that recently accessed; store pages across each tiering, and then identify pages as a hot, warm, and cold so that hot pages can store in the first tiering Dynamic Random Access Memory (DRAM) and warm pages store in the second tiering Compute Express Link(CXL) and cold pages store in the third tiering Non-Volatile Memory (NVM). Apart from the current proposal and implementation, we also develop a new technique based on a machine learning algorithm in that the throughput produced 25% improved performance compared to the performance produced by the baseline as well as the latency produced 95% improved performance compared to the performance produced by the baseline.

Keywords: machine learning, bayesian optimization, memory tiering, CXL, DRAM

Procedia PDF Downloads 74
8779 Consumer Load Profile Determination with Entropy-Based K-Means Algorithm

Authors: Ioannis P. Panapakidis, Marios N. Moschakis

Abstract:

With the continuous increment of smart meter installations across the globe, the need for processing of the load data is evident. Clustering-based load profiling is built upon the utilization of unsupervised machine learning tools for the purpose of formulating the typical load curves or load profiles. The most commonly used algorithm in the load profiling literature is the K-means. While the algorithm has been successfully tested in a variety of applications, its drawback is the strong dependence in the initialization phase. This paper proposes a novel modified form of the K-means that addresses the aforementioned problem. Simulation results indicate the superiority of the proposed algorithm compared to the K-means.

Keywords: clustering, load profiling, load modeling, machine learning, energy efficiency and quality

Procedia PDF Downloads 140