Search results for: decision-making algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2039

Search results for: decision-making algorithms

1619 A Data Science Pipeline for Algorithmic Trading: A Comparative Study in Applications to Finance and Cryptoeconomics

Authors: Luyao Zhang, Tianyu Wu, Jiayi Li, Carlos-Gustavo Salas-Flores, Saad Lahrichi

Abstract:

Recent advances in AI have made algorithmic trading a central role in finance. However, current research and applications are disconnected information islands. We propose a generally applicable pipeline for designing, programming, and evaluating algorithmic trading of stock and crypto tokens. Moreover, we provide comparative case studies for four conventional algorithms, including moving average crossover, volume-weighted average price, sentiment analysis, and statistical arbitrage. Our study offers a systematic way to program and compare different trading strategies. Moreover, we implement our algorithms by object-oriented programming in Python3, which serves as open-source software for future academic research and applications.

Keywords: algorithmic trading, AI for finance, fintech, machine learning, moving average crossover, volume weighted average price, sentiment analysis, statistical arbitrage, pair trading, object-oriented programming, python3

Procedia PDF Downloads 146
1618 A Generalized Space-Efficient Algorithm for Quantum Bit String Comparators

Authors: Khuram Shahzad, Omar Usman Khan

Abstract:

Quantum bit string comparators (QBSC) operate on two sequences of n-qubits, enabling the determination of their relationships, such as equality, greater than, or less than. This is analogous to the way conditional statements are used in programming languages. Consequently, QBSCs play a crucial role in various algorithms that can be executed or adapted for quantum computers. The development of efficient and generalized comparators for any n-qubit length has long posed a challenge, as they have a high-cost footprint and lead to quantum delays. Comparators that are efficient are associated with inputs of fixed length. As a result, comparators without a generalized circuit cannot be employed at a higher level, though they are well-suited for problems with limited size requirements. In this paper, we introduce a generalized design for the comparison of two n-qubit logic states using just two ancillary bits. The design is examined on the basis of qubit requirements, ancillary bit usage, quantum cost, quantum delay, gate operations, and circuit complexity and is tested comprehensively on various input lengths. The work allows for sufficient flexibility in the design of quantum algorithms, which can accelerate quantum algorithm development.

Keywords: quantum comparator, quantum algorithm, space-efficient comparator, comparator

Procedia PDF Downloads 15
1617 Preliminary Study of Hand Gesture Classification in Upper-Limb Prosthetics Using Machine Learning with EMG Signals

Authors: Linghui Meng, James Atlas, Deborah Munro

Abstract:

There is an increasing demand for prosthetics capable of mimicking natural limb movements and hand gestures, but precise movement control of prosthetics using only electrode signals continues to be challenging. This study considers the implementation of machine learning as a means of improving accuracy and presents an initial investigation into hand gesture recognition using models based on electromyographic (EMG) signals. EMG signals, which capture muscle activity, are used as inputs to machine learning algorithms to improve prosthetic control accuracy, functionality and adaptivity. Using logistic regression, a machine learning classifier, this study evaluates the accuracy of classifying two hand gestures from the publicly available Ninapro dataset using two-time series feature extraction algorithms: Time Series Feature Extraction (TSFE) and Convolutional Neural Networks (CNNs). Trials were conducted using varying numbers of EMG channels from one to eight to determine the impact of channel quantity on classification accuracy. The results suggest that although both algorithms can successfully distinguish between hand gesture EMG signals, CNNs outperform TSFE in extracting useful information for both accuracy and computational efficiency. In addition, although more channels of EMG signals provide more useful information, they also require more complex and computationally intensive feature extractors and consequently do not perform as well as lower numbers of channels. The findings also underscore the potential of machine learning techniques in developing more effective and adaptive prosthetic control systems.

Keywords: EMG, machine learning, prosthetic control, electromyographic prosthetics, hand gesture classification, CNN, computational neural networks, TSFE, time series feature extraction, channel count, logistic regression, ninapro, classifiers

Procedia PDF Downloads 31
1616 A Numerical Description of a Fibre Reinforced Concrete Using a Genetic Algorithm

Authors: Henrik L. Funke, Lars Ulke-Winter, Sandra Gelbrich, Lothar Kroll

Abstract:

This work reports about an approach for an automatic adaptation of concrete formulations based on genetic algorithms (GA) to optimize a wide range of different fit-functions. In order to achieve the goal, a method was developed which provides a numerical description of a fibre reinforced concrete (FRC) mixture regarding the production technology and the property spectrum of the concrete. In a first step, the FRC mixture with seven fixed components was characterized by varying amounts of the components. For that purpose, ten concrete mixtures were prepared and tested. The testing procedure comprised flow spread, compressive and bending tensile strength. The analysis and approximation of the determined data was carried out by GAs. The aim was to obtain a closed mathematical expression which best describes the given seven-point cloud of FRC by applying a Gene Expression Programming with Free Coefficients (GEP-FC) strategy. The seven-parametric FRC-mixtures model which is generated according to this method correlated well with the measured data. The developed procedure can be used for concrete mixtures finding closed mathematical expressions, which are based on the measured data.

Keywords: concrete design, fibre reinforced concrete, genetic algorithms, GEP-FC

Procedia PDF Downloads 280
1615 Modified 'Perturb and Observe' with 'Incremental Conductance' Algorithm for Maximum Power Point Tracking

Authors: H. Fuad Usman, M. Rafay Khan Sial, Shahzaib Hamid

Abstract:

The trend of renewable energy resources has been amplified due to global warming and other environmental related complications in the 21st century. Recent research has very much emphasized on the generation of electrical power through renewable resources like solar, wind, hydro, geothermal, etc. The use of the photovoltaic cell has become very public as it is very useful for the domestic and commercial purpose overall the world. Although a single cell gives the low voltage output but connecting a number of cells in a series formed a complete module of the photovoltaic cells, it is becoming a financial investment as the use of it fetching popular. This also reduced the prices of the photovoltaic cell which gives the customers a confident of using this source for their electrical use. Photovoltaic cell gives the MPPT at single specific point of operation at a given temperature and level of solar intensity received at a given surface whereas the focal point changes over a large range depending upon the manufacturing factor, temperature conditions, intensity for insolation, instantaneous conditions for shading and aging factor for the photovoltaic cells. Two improved algorithms have been proposed in this article for the MPPT. The widely used algorithms are the ‘Incremental Conductance’ and ‘Perturb and Observe’ algorithms. To extract the maximum power from the source to the load, the duty cycle of the convertor will be effectively controlled. After assessing the previous techniques, this paper presents the improved and reformed idea of harvesting maximum power point from the photovoltaic cells. A thoroughly go through of the previous ideas has been observed before constructing the improvement in the traditional technique of MPP. Each technique has its own importance and boundaries at various weather conditions. An improved technique of implementing the use of both ‘Perturb and Observe’ and ‘Incremental Conductance’ is introduced.

Keywords: duty cycle, MPPT (Maximum Power Point Tracking), perturb and observe (P&O), photovoltaic module

Procedia PDF Downloads 176
1614 Blind Watermarking Using Discrete Wavelet Transform Algorithm with Patchwork

Authors: Toni Maristela C. Estabillo, Michaela V. Matienzo, Mikaela L. Sabangan, Rosette M. Tienzo, Justine L. Bahinting

Abstract:

This study is about blind watermarking on images with different categories and properties using two algorithms namely, Discrete Wavelet Transform and Patchwork Algorithm. A program is created to perform watermark embedding, extraction and evaluation. The evaluation is based on three watermarking criteria namely: image quality degradation, perceptual transparency and security. Image quality is measured by comparing the original properties with the processed one. Perceptual transparency is measured by a visual inspection on a survey. Security is measured by implementing geometrical and non-geometrical attacks through a pass or fail testing. Values used to measure the following criteria are mostly based on Mean Squared Error (MSE) and Peak Signal to Noise Ratio (PSNR). The results are based on statistical methods used to interpret and collect data such as averaging, z Test and survey. The study concluded that the combined DWT and Patchwork algorithms were less efficient and less capable of watermarking than DWT algorithm only.

Keywords: blind watermarking, discrete wavelet transform algorithm, patchwork algorithm, digital watermark

Procedia PDF Downloads 268
1613 The Best Prediction Data Mining Model for Breast Cancer Probability in Women Residents in Kabul

Authors: Mina Jafari, Kobra Hamraee, Saied Hossein Hosseini

Abstract:

The prediction of breast cancer disease is one of the challenges in medicine. In this paper we collected 528 records of women’s information who live in Kabul including demographic, life style, diet and pregnancy data. There are many classification algorithm in breast cancer prediction and tried to find the best model with most accurate result and lowest error rate. We evaluated some other common supervised algorithms in data mining to find the best model in prediction of breast cancer disease among afghan women living in Kabul regarding to momography result as target variable. For evaluating these algorithms we used Cross Validation which is an assured method for measuring the performance of models. After comparing error rate and accuracy of three models: Decision Tree, Naive Bays and Rule Induction, Decision Tree with accuracy of 94.06% and error rate of %15 is found the best model to predicting breast cancer disease based on the health care records.

Keywords: decision tree, breast cancer, probability, data mining

Procedia PDF Downloads 138
1612 DOA Estimation Using Golden Section Search

Authors: Niharika Verma, Sandeep Santosh

Abstract:

DOA technique is a localization technique used in the communication field. Various algorithms have been developed for direction of arrival estimation like MUSIC, ROOT MUSIC, etc. These algorithms depend on various parameters like antenna array elements, number of snapshots and various others. Basically the MUSIC spectrum is evaluated and peaks obtained are considered as the angle of arrivals. The angles evaluated using this process depends on the scanning interval chosen. The accuracy of the results obtained depends on the coarseness of the interval chosen. In this paper, golden section search is applied to the MUSIC algorithm and therefore, more accurate results are achieved. Initially the coarse DOA estimations is done using the MUSIC algorithm in the range -90 to 90 degree at the interval of 10 degree. After the peaks obtained then fine DOA estimation is done using golden section search. Also, the partitioning method is applied to estimate the number of signals incident on the antenna array. Dependency of the algorithm on the number of snapshots is also being explained. Hence, the accurate results are being determined using this algorithm.

Keywords: Direction of Arrival (DOA), golden section search, MUSIC, number of snapshots

Procedia PDF Downloads 446
1611 Conduction Transfer Functions for the Calculation of Heat Demands in Heavyweight Facade Systems

Authors: Mergim Gasia, Bojan Milovanovica, Sanjin Gumbarevic

Abstract:

Better energy performance of the building envelope is one of the most important aspects of energy savings if the goals set by the European Union are to be achieved in the future. Dynamic heat transfer simulations are being used for the calculation of building energy consumption because they give more realistic energy demands compared to the stationary calculations that do not take the building’s thermal mass into account. Software used for these dynamic simulation use methods that are based on the analytical models since numerical models are insufficient for longer periods. The analytical models used in this research fall in the category of the conduction transfer functions (CTFs). Two methods for calculating the CTFs covered by this research are the Laplace method and the State-Space method. The literature review showed that the main disadvantage of these methods is that they are inadequate for heavyweight façade elements and shorter time periods used for the calculation. The algorithms for both the Laplace and State-Space methods are implemented in Mathematica, and the results are compared to the results from EnergyPlus and TRNSYS since these software use similar algorithms for the calculation of the building’s energy demand. This research aims to check the efficiency of the Laplace and the State-Space method for calculating the building’s energy demand for heavyweight building elements and shorter sampling time, and it also gives the means for the improvement of the algorithms used by these methods. As the reference point for the boundary heat flux density, the finite difference method (FDM) is used. Even though the dynamic heat transfer simulations are superior to the calculation based on the stationary boundary conditions, they have their limitations and will give unsatisfactory results if not properly used.

Keywords: Laplace method, state-space method, conduction transfer functions, finite difference method

Procedia PDF Downloads 133
1610 A Comparative Analysis of Classification Models with Wrapper-Based Feature Selection for Predicting Student Academic Performance

Authors: Abdullah Al Farwan, Ya Zhang

Abstract:

In today’s educational arena, it is critical to understand educational data and be able to evaluate important aspects, particularly data on student achievement. Educational Data Mining (EDM) is a research area that focusing on uncovering patterns and information in data from educational institutions. Teachers, if they are able to predict their students' class performance, can use this information to improve their teaching abilities. It has evolved into valuable knowledge that can be used for a wide range of objectives; for example, a strategic plan can be used to generate high-quality education. Based on previous data, this paper recommends employing data mining techniques to forecast students' final grades. In this study, five data mining methods, Decision Tree, JRip, Naive Bayes, Multi-layer Perceptron, and Random Forest with wrapper feature selection, were used on two datasets relating to Portuguese language and mathematics classes lessons. The results showed the effectiveness of using data mining learning methodologies in predicting student academic success. The classification accuracy achieved with selected algorithms lies in the range of 80-94%. Among all the selected classification algorithms, the lowest accuracy is achieved by the Multi-layer Perceptron algorithm, which is close to 70.45%, and the highest accuracy is achieved by the Random Forest algorithm, which is close to 94.10%. This proposed work can assist educational administrators to identify poor performing students at an early stage and perhaps implement motivational interventions to improve their academic success and prevent educational dropout.

Keywords: classification algorithms, decision tree, feature selection, multi-layer perceptron, Naïve Bayes, random forest, students’ academic performance

Procedia PDF Downloads 166
1609 Design of a Graphical User Interface for Data Preprocessing and Image Segmentation Process in 2D MRI Images

Authors: Enver Kucukkulahli, Pakize Erdogmus, Kemal Polat

Abstract:

The 2D image segmentation is a significant process in finding a suitable region in medical images such as MRI, PET, CT etc. In this study, we have focused on 2D MRI images for image segmentation process. We have designed a GUI (graphical user interface) written in MATLABTM for 2D MRI images. In this program, there are two different interfaces including data pre-processing and image clustering or segmentation. In the data pre-processing section, there are median filter, average filter, unsharp mask filter, Wiener filter, and custom filter (a filter that is designed by user in MATLAB). As for the image clustering, there are seven different image segmentations for 2D MR images. These image segmentation algorithms are as follows: PSO (particle swarm optimization), GA (genetic algorithm), Lloyds algorithm, k-means, the combination of Lloyds and k-means, mean shift clustering, and finally BBO (Biogeography Based Optimization). To find the suitable cluster number in 2D MRI, we have designed the histogram based cluster estimation method and then applied to these numbers to image segmentation algorithms to cluster an image automatically. Also, we have selected the best hybrid method for each 2D MR images thanks to this GUI software.

Keywords: image segmentation, clustering, GUI, 2D MRI

Procedia PDF Downloads 377
1608 MPPT Control with (P&O) and (FLC) Algorithms of Solar Electric Generator

Authors: Dib Djalel, Mordjaoui Mourad

Abstract:

The current trend towards the exploitation of various renewable energy resources has become indispensable, so it is important to improve the efficiency and reliability of the GPV photovoltaic systems. Maximum Power Point Tracking (MPPT) plays an important role in photovoltaic power systems because it maximize the power output from a PV system for a given set of conditions. This paper presents a new fuzzy logic control based MPPT algorithm for solar panel. The solar panel is modeled and analyzed in Matlab/Simulink. The Solar panel can produce maximum power at a particular operating point called Maximum Power Point(MPP). To produce maximum power and to get maximum efficiency, the entire photovoltaic panel must operate at this particular point. Maximum power point of PV panel keeps on changing with changing environmental conditions such as solar irradiance and cell temperature. Thus, to extract maximum available power from a PV module, MPPT algorithms are implemented and Perturb and Observe (P&O) MPPT and fuzzy logic control FLC, MPPT are developed and compared. Simulation results show the effectiveness of the fuzzy control technique to produce a more stable power.

Keywords: MPPT, photovoltaic panel, fuzzy logic control, modeling, solar power

Procedia PDF Downloads 483
1607 Harnessing Artificial Intelligence and Machine Learning for Advanced Fraud Detection and Prevention

Authors: Avinash Malladhi

Abstract:

Forensic accounting is a specialized field that involves the application of accounting principles, investigative skills, and legal knowledge to detect and prevent fraud. With the rise of big data and technological advancements, artificial intelligence (AI) and machine learning (ML) algorithms have emerged as powerful tools for forensic accountants to enhance their fraud detection capabilities. In this paper, we review and analyze various AI/ML algorithms that are commonly used in forensic accounting, including supervised and unsupervised learning, deep learning, natural language processing Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Support Vector Machines (SVMs), Decision Trees, and Random Forests. We discuss their underlying principles, strengths, and limitations and provide empirical evidence from existing research studies demonstrating their effectiveness in detecting financial fraud. We also highlight potential ethical considerations and challenges associated with using AI/ML in forensic accounting. Furthermore, we highlight the benefits of these technologies in improving fraud detection and prevention in forensic accounting.

Keywords: AI, machine learning, forensic accounting & fraud detection, anti money laundering, Benford's law, fraud triangle theory

Procedia PDF Downloads 93
1606 Fuzzy-Machine Learning Models for the Prediction of Fire Outbreak: A Comparative Analysis

Authors: Uduak Umoh, Imo Eyoh, Emmauel Nyoho

Abstract:

This paper compares fuzzy-machine learning algorithms such as Support Vector Machine (SVM), and K-Nearest Neighbor (KNN) for the predicting cases of fire outbreak. The paper uses the fire outbreak dataset with three features (Temperature, Smoke, and Flame). The data is pre-processed using Interval Type-2 Fuzzy Logic (IT2FL) algorithm. Min-Max Normalization and Principal Component Analysis (PCA) are used to predict feature labels in the dataset, normalize the dataset, and select relevant features respectively. The output of the pre-processing is a dataset with two principal components (PC1 and PC2). The pre-processed dataset is then used in the training of the aforementioned machine learning models. K-fold (with K=10) cross-validation method is used to evaluate the performance of the models using the matrices – ROC (Receiver Operating Curve), Specificity, and Sensitivity. The model is also tested with 20% of the dataset. The validation result shows KNN is the better model for fire outbreak detection with an ROC value of 0.99878, followed by SVM with an ROC value of 0.99753.

Keywords: Machine Learning Algorithms , Interval Type-2 Fuzzy Logic, Fire Outbreak, Support Vector Machine, K-Nearest Neighbour, Principal Component Analysis

Procedia PDF Downloads 182
1605 Adapting the Chemical Reaction Optimization Algorithm to the Printed Circuit Board Drilling Problem

Authors: Taisir Eldos, Aws Kanan, Waleed Nazih, Ahmad Khatatbih

Abstract:

Chemical Reaction Optimization (CRO) is an optimization metaheuristic inspired by the nature of chemical reactions as a natural process of transforming the substances from unstable to stable states. Starting with some unstable molecules with excessive energy, a sequence of interactions takes the set to a state of minimum energy. Researchers reported successful application of the algorithm in solving some engineering problems, like the quadratic assignment problem, with superior performance when compared with other optimization algorithms. We adapted this optimization algorithm to the Printed Circuit Board Drilling Problem (PCBDP) towards reducing the drilling time and hence improving the PCB manufacturing throughput. Although the PCBDP can be viewed as instance of the popular Traveling Salesman Problem (TSP), it has some characteristics that would require special attention to the transactions that explore the solution landscape. Experimental test results using the standard CROToolBox are not promising for practically sized problems, while it could find optimal solutions for artificial problems and small benchmarks as a proof of concept.

Keywords: evolutionary algorithms, chemical reaction optimization, traveling salesman, board drilling

Procedia PDF Downloads 519
1604 Development and Investigation of Efficient Substrate Feeding and Dissolved Oxygen Control Algorithms for Scale-Up of Recombinant E. coli Cultivation Process

Authors: Vytautas Galvanauskas, Rimvydas Simutis, Donatas Levisauskas, Vykantas Grincas, Renaldas Urniezius

Abstract:

The paper deals with model-based development and implementation of efficient control strategies for recombinant protein synthesis in fed-batch E.coli cultivation processes. Based on experimental data, a kinetic dynamic model for cultivation process was developed. This model was used to determine substrate feeding strategies during the cultivation. The proposed feeding strategy consists of two phases – biomass growth phase and recombinant protein production phase. In the first process phase, substrate-limited process is recommended when the specific growth rate of biomass is about 90-95% of its maximum value. This ensures reduction of glucose concentration in the medium, improves process repeatability, reduces the development of secondary metabolites and other unwanted by-products. The substrate limitation can be enhanced to satisfy restriction on maximum oxygen transfer rate in the bioreactor and to guarantee necessary dissolved carbon dioxide concentration in culture media. In the recombinant protein production phase, the level of substrate limitation and specific growth rate are selected within the range to enable optimal target protein synthesis rate. To account for complex process dynamics, to efficiently exploit the oxygen transfer capability of the bioreactor, and to maintain the required dissolved oxygen concentration, adaptive control algorithms for dissolved oxygen control have been proposed. The developed model-based control strategies are useful in scale-up of cultivation processes and accelerate implementation of innovative biotechnological processes for industrial applications.

Keywords: adaptive algorithms, model-based control, recombinant E. coli, scale-up of bioprocesses

Procedia PDF Downloads 257
1603 A Comparison of Methods for Neural Network Aggregation

Authors: John Pomerat, Aviv Segev

Abstract:

Recently, deep learning has had many theoretical breakthroughs. For deep learning to be successful in the industry, however, there need to be practical algorithms capable of handling many real-world hiccups preventing the immediate application of a learning algorithm. Although AI promises to revolutionize the healthcare industry, getting access to patient data in order to train learning algorithms has not been easy. One proposed solution to this is data- sharing. In this paper, we propose an alternative protocol, based on multi-party computation, to train deep learning models while maintaining both the privacy and security of training data. We examine three methods of training neural networks in this way: Transfer learning, average ensemble learning, and series network learning. We compare these methods to the equivalent model obtained through data-sharing across two different experiments. Additionally, we address the security concerns of this protocol. While the motivating example is healthcare, our findings regarding multi-party computation of neural network training are purely theoretical and have use-cases outside the domain of healthcare.

Keywords: neural network aggregation, multi-party computation, transfer learning, average ensemble learning

Procedia PDF Downloads 162
1602 Reducing the Computational Overhead of Metaheuristics Parameterization with Exploratory Landscape Analysis

Authors: Iannick Gagnon, Alain April

Abstract:

The performance of a metaheuristic on a given problem class depends on the class itself and the choice of parameters. Parameter tuning is the most time-consuming phase of the optimization process after the main calculations and it often nullifies the speed advantage of metaheuristics over traditional optimization algorithms. Several off-the-shelf parameter tuning algorithms are available, but when the objective function is expensive to evaluate, these can be prohibitively expensive to use. This paper presents a surrogate-like method for finding adequate parameters using fitness landscape analysis on simple benchmark functions and real-world objective functions. The result is a simple compound similarity metric based on the empirical correlation coefficient and a measure of convexity. It is then used to find the best benchmark functions to serve as surrogates. The near-optimal parameter set is then found using fractional factorial design. The real-world problem of NACA airfoil lift coefficient maximization is used as a preliminary proof of concept. The overall aim of this research is to reduce the computational overhead of metaheuristics parameterization.

Keywords: metaheuristics, stochastic optimization, particle swarm optimization, exploratory landscape analysis

Procedia PDF Downloads 153
1601 Automated Heart Sound Classification from Unsegmented Phonocardiogram Signals Using Time Frequency Features

Authors: Nadia Masood Khan, Muhammad Salman Khan, Gul Muhammad Khan

Abstract:

Cardiologists perform cardiac auscultation to detect abnormalities in heart sounds. Since accurate auscultation is a crucial first step in screening patients with heart diseases, there is a need to develop computer-aided detection/diagnosis (CAD) systems to assist cardiologists in interpreting heart sounds and provide second opinions. In this paper different algorithms are implemented for automated heart sound classification using unsegmented phonocardiogram (PCG) signals. Support vector machine (SVM), artificial neural network (ANN) and cartesian genetic programming evolved artificial neural network (CGPANN) without the application of any segmentation algorithm has been explored in this study. The signals are first pre-processed to remove any unwanted frequencies. Both time and frequency domain features are then extracted for training the different models. The different algorithms are tested in multiple scenarios and their strengths and weaknesses are discussed. Results indicate that SVM outperforms the rest with an accuracy of 73.64%.

Keywords: pattern recognition, machine learning, computer aided diagnosis, heart sound classification, and feature extraction

Procedia PDF Downloads 263
1600 Assessment of DNA Sequence Encoding Techniques for Machine Learning Algorithms Using a Universal Bacterial Marker

Authors: Diego Santibañez Oyarce, Fernanda Bravo Cornejo, Camilo Cerda Sarabia, Belén Díaz Díaz, Esteban Gómez Terán, Hugo Osses Prado, Raúl Caulier-Cisterna, Jorge Vergara-Quezada, Ana Moya-Beltrán

Abstract:

The advent of high-throughput sequencing technologies has revolutionized genomics, generating vast amounts of genetic data that challenge traditional bioinformatics methods. Machine learning addresses these challenges by leveraging computational power to identify patterns and extract information from large datasets. However, biological sequence data, being symbolic and non-numeric, must be converted into numerical formats for machine learning algorithms to process effectively. So far, some encoding methods, such as one-hot encoding or k-mers, have been explored. This work proposes additional approaches for encoding DNA sequences in order to compare them with existing techniques and determine if they can provide improvements or if current methods offer superior results. Data from the 16S rRNA gene, a universal marker, was used to analyze eight bacterial groups that are significant in the pulmonary environment and have clinical implications. The bacterial genes included in this analysis are Prevotella, Abiotrophia, Acidovorax, Streptococcus, Neisseria, Veillonella, Mycobacterium, and Megasphaera. These data were downloaded from the NCBI database in Genbank file format, followed by a syntactic analysis to selectively extract relevant information from each file. For data encoding, a sequence normalization process was carried out as the first step. From approximately 22,000 initial data points, a subset was generated for testing purposes. Specifically, 55 sequences from each bacterial group met the length criteria, resulting in an initial sample of approximately 440 sequences. The sequences were encoded using different methods, including one-hot encoding, k-mers, Fourier transform, and Wavelet transform. Various machine learning algorithms, such as support vector machines, random forests, and neural networks, were trained to evaluate these encoding methods. The performance of these models was assessed using multiple metrics, including the confusion matrix, ROC curve, and F1 Score, providing a comprehensive evaluation of their classification capabilities. The results show that accuracies between encoding methods vary by up to approximately 15%, with the Fourier transform obtaining the best results for the evaluated machine learning algorithms. These findings, supported by the detailed analysis using the confusion matrix, ROC curve, and F1 Score, provide valuable insights into the effectiveness of different encoding methods and machine learning algorithms for genomic data analysis, potentially improving the accuracy and efficiency of bacterial classification and related genomic studies.

Keywords: DNA encoding, machine learning, Fourier transform, Fourier transformation

Procedia PDF Downloads 23
1599 Improvement of Cross Range Resolution in Through Wall Radar Imaging Using Bilateral Backprojection

Authors: Rashmi Yadawad, Disha Narayanan, Ravi Gautam

Abstract:

Through Wall Radar Imaging is gaining increasing importance now a days in the field of Defense and one of the most important criteria that forms the basis for the image quality obtained is the Cross-Range resolution of the image. In this research paper, the Bilateral Back projection algorithm has been implemented for Through Wall Radar Imaging. The sole purpose is to enhance the resolution in the cross range direction of the obtained Back projection image. Synthetic Data is generated for two targets which are placed at various locations in a room of dimensions 8 m by 6m. Two algorithms namely, simple back projection and Bilateral Back projection have been implemented, images are obtained and the obtained images are compared. Numerical simulations have been coded in MATLAB and experimental results of the two algorithms have been shown. Based on the comparison between the two images, it can be clearly seen that the ringing effect and chess board effect have been heavily reduced in the bilaterally back projected image and hence promising results are obtained giving a relatively sharper image with relatively well defined edges.

Keywords: through wall radar imaging, bilateral back projection, cross range resolution, synthetic data

Procedia PDF Downloads 347
1598 An Adaptive Hybrid Surrogate-Assisted Particle Swarm Optimization Algorithm for Expensive Structural Optimization

Authors: Xiongxiong You, Zhanwen Niu

Abstract:

Choosing an appropriate surrogate model plays an important role in surrogates-assisted evolutionary algorithms (SAEAs) since there are many types and different kernel functions in the surrogate model. In this paper, an adaptive selection of the best suitable surrogate model method is proposed to solve different kinds of expensive optimization problems. Firstly, according to the prediction residual error sum of square (PRESS) and different model selection strategies, the excellent individual surrogate models are integrated into multiple ensemble models in each generation. Then, based on the minimum root of mean square error (RMSE), the best suitable surrogate model is selected dynamically. Secondly, two methods with dynamic number of models and selection strategies are designed, which are used to show the influence of the number of individual models and selection strategy. Finally, some compared studies are made to deal with several commonly used benchmark problems, as well as a rotor system optimization problem. The results demonstrate the accuracy and robustness of the proposed method.

Keywords: adaptive selection, expensive optimization, rotor system, surrogates assisted evolutionary algorithms

Procedia PDF Downloads 141
1597 Enhancing the Recruitment Process through Machine Learning: An Automated CV Screening System

Authors: Kaoutar Ben Azzou, Hanaa Talei

Abstract:

Human resources is an important department in each organization as it manages the life cycle of employees from recruitment training to retirement or termination of contracts. The recruitment process starts with a job opening, followed by a selection of the best-fit candidates from all applicants. Matching the best profile for a job position requires a manual way of looking at many CVs, which requires hours of work that can sometimes lead to choosing not the best profile. The work presented in this paper aims at reducing the workload of HR personnel by automating the preliminary stages of the candidate screening process, thereby fostering a more streamlined recruitment workflow. This tool introduces an automated system designed to help with the recruitment process by scanning candidates' CVs, extracting pertinent features, and employing machine learning algorithms to decide the most fitting job profile for each candidate. Our work employs natural language processing (NLP) techniques to identify and extract key features from unstructured text extracted from a CV, such as education, work experience, and skills. Subsequently, the system utilizes these features to match candidates with job profiles, leveraging the power of classification algorithms.

Keywords: automated recruitment, candidate screening, machine learning, human resources management

Procedia PDF Downloads 56
1596 Predictive Maintenance of Electrical Induction Motors Using Machine Learning

Authors: Muhammad Bilal, Adil Ahmed

Abstract:

This study proposes an approach for electrical induction motor predictive maintenance utilizing machine learning algorithms. On the basis of a study of temperature data obtained from sensors put on the motor, the goal is to predict motor failures. The proposed models are trained to identify whether a motor is defective or not by utilizing machine learning algorithms like Support Vector Machines (SVM) and K-Nearest Neighbors (KNN). According to a thorough study of the literature, earlier research has used motor current signature analysis (MCSA) and vibration data to forecast motor failures. The temperature signal methodology, which has clear advantages over the conventional MCSA and vibration analysis methods in terms of cost-effectiveness, is the main subject of this research. The acquired results emphasize the applicability and effectiveness of the temperature-based predictive maintenance strategy by demonstrating the successful categorization of defective motors using the suggested machine learning models.

Keywords: predictive maintenance, electrical induction motors, machine learning, temperature signal methodology, motor failures

Procedia PDF Downloads 117
1595 Elemental Graph Data Model: A Semantic and Topological Representation of Building Elements

Authors: Yasmeen A. S. Essawy, Khaled Nassar

Abstract:

With the rapid increase of complexity in the building industry, professionals in the A/E/C industry were forced to adopt Building Information Modeling (BIM) in order to enhance the communication between the different project stakeholders throughout the project life cycle and create a semantic object-oriented building model that can support geometric-topological analysis of building elements during design and construction. This paper presents a model that extracts topological relationships and geometrical properties of building elements from an existing fully designed BIM, and maps this information into a directed acyclic Elemental Graph Data Model (EGDM). The model incorporates BIM-based search algorithms for automatic deduction of geometrical data and topological relationships for each building element type. Using graph search algorithms, such as Depth First Search (DFS) and topological sortings, all possible construction sequences can be generated and compared against production and construction rules to generate an optimized construction sequence and its associated schedule. The model is implemented in a C# platform.

Keywords: building information modeling (BIM), elemental graph data model (EGDM), geometric and topological data models, graph theory

Procedia PDF Downloads 382
1594 Agile Software Effort Estimation Using Regression Techniques

Authors: Mikiyas Adugna

Abstract:

Effort estimation is among the activities carried out in software development processes. An accurate model of estimation leads to project success. The method of agile effort estimation is a complex task because of the dynamic nature of software development. Researchers are still conducting studies on agile effort estimation to enhance prediction accuracy. Due to these reasons, we investigated and proposed a model on LASSO and Elastic Net regression to enhance estimation accuracy. The proposed model has major components: preprocessing, train-test split, training with default parameters, and cross-validation. During the preprocessing phase, the entire dataset is normalized. After normalization, a train-test split is performed on the dataset, setting training at 80% and testing set to 20%. We chose two different phases for training the two algorithms (Elastic Net and LASSO) regression following the train-test-split. In the first phase, the two algorithms are trained using their default parameters and evaluated on the testing data. In the second phase, the grid search technique (the grid is used to search for tuning and select optimum parameters) and 5-fold cross-validation to get the final trained model. Finally, the final trained model is evaluated using the testing set. The experimental work is applied to the agile story point dataset of 21 software projects collected from six firms. The results show that both Elastic Net and LASSO regression outperformed the compared ones. Compared to the proposed algorithms, LASSO regression achieved better predictive performance and has acquired PRED (8%) and PRED (25%) results of 100.0, MMRE of 0.0491, MMER of 0.0551, MdMRE of 0.0593, MdMER of 0.063, and MSE of 0.0007. The result implies LASSO regression algorithm trained model is the most acceptable, and higher estimation performance exists in the literature.

Keywords: agile software development, effort estimation, elastic net regression, LASSO

Procedia PDF Downloads 71
1593 Review of Hydrologic Applications of Conceptual Models for Precipitation-Runoff Process

Authors: Oluwatosin Olofintoye, Josiah Adeyemo, Gbemileke Shomade

Abstract:

The relationship between rainfall and runoff is an important issue in surface water hydrology therefore the understanding and development of accurate rainfall-runoff models and their applications in water resources planning, management and operation are of paramount importance in hydrological studies. This paper reviews some of the previous works on the rainfall-runoff process modeling. The hydrologic applications of conceptual models and artificial neural networks (ANNs) for the precipitation-runoff process modeling were studied. Gradient training methods such as error back-propagation (BP) and evolutionary algorithms (EAs) are discussed in relation to the training of artificial neural networks and it is shown that application of EAs to artificial neural networks training could be an alternative to other training methods. Therefore, further research interest to exploit the abundant expert knowledge in the area of artificial intelligence for the solution of hydrologic and water resources planning and management problems is needed.

Keywords: artificial intelligence, artificial neural networks, evolutionary algorithms, gradient training method, rainfall-runoff model

Procedia PDF Downloads 454
1592 A Survey on Intelligent Traffic Management with Cooperative Driving in Urban Roads

Authors: B. Karabuluter, O. Karaduman

Abstract:

Traffic management and traffic planning are important issues, especially in big cities. Due to the increase of personal vehicles and the physical constraints of urban roads, the problem of transportation especially in crowded cities over time is revealed. This situation reduces the living standards, and it can put human life at risk because the vehicles such as ambulance, fire department are prevented from reaching their targets. Even if the city planners take these problems into account, emergency planning and traffic management are needed to avoid cases such as traffic congestion, intersections, traffic jams caused by traffic accidents or roadworks. In this study, in smart traffic management issues, proposed solutions using intelligent vehicles acting in cooperation with urban roads are examined. Traffic management is becoming more difficult due to factors such as fatigue, carelessness, sleeplessness, social behavior patterns, and lack of education. However, autonomous vehicles, which remove the problems caused by human weaknesses by providing driving control, are increasing the success of practicing the algorithms developed in city traffic management. Such intelligent vehicles have become an important solution in urban life by using 'swarm intelligence' algorithms and cooperative driving methods to provide traffic flow, prevent traffic accidents, and increase living standards. In this study, studies conducted in this area have been dealt with in terms of traffic jam, intersections, regulation of traffic flow, signaling, prevention of traffic accidents, cooperation and communication techniques of vehicles, fleet management, transportation of emergency vehicles. From these concepts, some taxonomies were made out of the way. This work helps to develop new solutions and algorithms for cities where intelligent vehicles that can perform cooperative driving can take place, and at the same time emphasize the trend in this area.

Keywords: intelligent traffic management, cooperative driving, smart driving, urban road, swarm intelligence, connected vehicles

Procedia PDF Downloads 332
1591 Implementation of Successive Interference Cancellation Algorithms in the 5g Downlink

Authors: Mokrani Mohamed Amine

Abstract:

In this paper, we have implemented successive interference cancellation algorithms in the 5G downlink. We have calculated the maximum throughput in Frequency Division Duplex (FDD) mode in the downlink, where we have obtained a value equal to 836932 b/ms. The transmitter is of type Multiple Input Multiple Output (MIMO) with eight transmitting and receiving antennas. Each antenna among eight transmits simultaneously a data rate of 104616 b/ms that contains the binary messages of the three users; in this case, the Cyclic Redundancy Check CRC is negligible, and the MIMO category is the spatial diversity. The technology used for this is called Non-Orthogonal Multiple Access (NOMA) with a Quadrature Phase Shift Keying (QPSK) modulation. The transmission is done in a Rayleigh fading channel with the presence of obstacles. The MIMO Successive Interference Cancellation (SIC) receiver with two transmitting and receiving antennas recovers its binary message without errors for certain values of transmission power such as 50 dBm, with 0.054485% errors when the transmitted power is 20dBm and with 0.00286763% errors for a transmitted power of 32 dBm(in the case of user 1) as well as with 0.0114705% errors when the transmitted power is 20 dBm also with 0.00286763% errors for a power of 24 dBm(in the case of user2) by applying the steps involved in SIC.

Keywords: 5G, NOMA, QPSK, TBS, LDPC, SIC, capacity

Procedia PDF Downloads 103
1590 A Genetic Algorithm Approach to Solve a Weaving Job Scheduling Problem, Aiming Tardiness Minimization

Authors: Carolina Silva, João Nuno Oliveira, Rui Sousa, João Paulo Silva

Abstract:

This study uses genetic algorithms to solve a job scheduling problem in a weaving factory. The underline problem regards an NP-Hard problem concerning unrelated parallel machines, with sequence-dependent setup times. This research uses real data regarding a weaving industry located in the North of Portugal, with a capacity of 96 looms and a production, on average, of 440000 meters of fabric per month. Besides, this study includes a high level of complexity once most of the real production constraints are applied, and several real data instances are tested. Topics such as data analyses and algorithm performance are addressed and tested, to offer a solution that can generate reliable and due date results. All the approaches will be tested in the operational environment, and the KPIs monitored, to understand the solution's impact on the production, with a particular focus on the total number of weeks of late deliveries to clients. Thus, the main goal of this research is to develop a solution that allows for the production of automatically optimized production plans, aiming to the tardiness minimizing.

Keywords: genetic algorithms, textile industry, job scheduling, optimization

Procedia PDF Downloads 157