Search results for: Hash algorithm
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3563

Search results for: Hash algorithm

1943 Advanced Mouse Cursor Control and Speech Recognition Module

Authors: Prasad Kalagura, B. Veeresh kumar

Abstract:

We constructed an interface system that would allow a similarly paralyzed user to interact with a computer with almost full functional capability. A real-time tracking algorithm is implemented based on adaptive skin detection and motion analysis. The clicking of the mouse is activated by the user's eye blinking through a sensor. The keyboard function is implemented by voice recognition kit.

Keywords: embedded ARM7 processor, mouse pointer control, voice recognition

Procedia PDF Downloads 559
1942 Hardware-In-The-Loop Relative Motion Control: Theory, Simulation and Experimentation

Authors: O. B. Iskender, K. V. Ling, V. Dubanchet, L. Simonini

Abstract:

This paper presents a Guidance and Control (G&C) strategy to address spacecraft maneuvering problem for future Rendezvous and Docking (RVD) missions. The proposed strategy allows safe and propellant efficient trajectories for space servicing missions including tasks such as approaching, inspecting and capturing. This work provides the validation test results of the G&C laws using a Hardware-In-the-Loop (HIL) setup with two robotic mockups representing the chaser and the target spacecraft. Through this paper, the challenges of the relative motion control in space are first summarized, and in particular, the constraints imposed by the mission, spacecraft and, onboard processing capabilities. Second, the proposed algorithm is introduced by presenting the formulation of constrained Model Predictive Control (MPC) to optimize the fuel consumption and explicitly handle the physical and geometric constraints in the system, e.g. thruster or Line-Of-Sight (LOS) constraints. Additionally, the coupling between translational motion and rotational motion is addressed via dual quaternion based kinematic description and accordingly explained. The resulting convex optimization problem allows real-time implementation capability based on a detailed discussion on the computational time requirements and the obtained results with respect to the onboard computer and future trends of space processors capabilities. Finally, the performance of the algorithm is presented in the scope of a potential future mission and of the available equipment. The results also cover a comparison between the proposed algorithms with Linear–quadratic regulator (LQR) based control law to highlight the clear advantages of the MPC formulation.

Keywords: autonomous vehicles, embedded optimization, real-time experiment, rendezvous and docking, space robotics

Procedia PDF Downloads 105
1941 Comparison of Deep Learning and Machine Learning Algorithms to Diagnose and Predict Breast Cancer

Authors: F. Ghazalnaz Sharifonnasabi, Iman Makhdoom

Abstract:

Breast cancer is a serious health concern that affects many people around the world. According to a study published in the Breast journal, the global burden of breast cancer is expected to increase significantly over the next few decades. The number of deaths from breast cancer has been increasing over the years, but the age-standardized mortality rate has decreased in some countries. It’s important to be aware of the risk factors for breast cancer and to get regular check- ups to catch it early if it does occur. Machin learning techniques have been used to aid in the early detection and diagnosis of breast cancer. These techniques, that have been shown to be effective in predicting and diagnosing the disease, have become a research hotspot. In this study, we consider two deep learning approaches including: Multi-Layer Perceptron (MLP), and Convolutional Neural Network (CNN). We also considered the five-machine learning algorithm titled: Decision Tree (C4.5), Naïve Bayesian (NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN) Algorithm and XGBoost (eXtreme Gradient Boosting) on the Breast Cancer Wisconsin Diagnostic dataset. We have carried out the process of evaluating and comparing classifiers involving selecting appropriate metrics to evaluate classifier performance and selecting an appropriate tool to quantify this performance. The main purpose of the study is predicting and diagnosis breast cancer, applying the mentioned algorithms and also discovering of the most effective with respect to confusion matrix, accuracy and precision. It is realized that CNN outperformed all other classifiers and achieved the highest accuracy (0.982456). The work is implemented in the Anaconda environment based on Python programing language.

Keywords: breast cancer, multi-layer perceptron, Naïve Bayesian, SVM, decision tree, convolutional neural network, XGBoost, KNN

Procedia PDF Downloads 54
1940 Weight Estimation Using the K-Means Method in Steelmaking’s Overhead Cranes in Order to Reduce Swing Error

Authors: Seyedamir Makinejadsanij

Abstract:

One of the most important factors in the production of quality steel is to know the exact weight of steel in the steelmaking area. In this study, a calculation method is presented to estimate the exact weight of the melt as well as the objects transported by the overhead crane. Iran Alloy Steel Company's steelmaking area has three 90-ton cranes, which are responsible for transferring the ladles and ladle caps between 34 areas in the melt shop. Each crane is equipped with a Disomat Tersus weighing system that calculates and displays real-time weight. The moving object has a variable weight due to swinging, and the weighing system has an error of about +-5%. This means that when the object is moving by a crane, which weighs about 80 tons, the device (Disomat Tersus system) calculates about 4 tons more or 4 tons less, and this is the biggest problem in calculating a real weight. The k-means algorithm is an unsupervised clustering method that was used here. The best result was obtained by considering 3 centers. Compared to the normal average(one) or two, four, five, and six centers, the best answer is with 3 centers, which is logically due to the elimination of noise above and below the real weight. Every day, the standard weight is moved with working cranes to test and calibrate cranes. The results are shown that the accuracy is about 40 kilos per 60 tons (standard weight). As a result, with this method, the accuracy of moving weight is calculated as 99.95%. K-means is used to calculate the exact mean of objects. The stopping criterion of the algorithm is also the number of 1000 repetitions or not moving the points between the clusters. As a result of the implementation of this system, the crane operator does not stop while moving objects and continues his activity regardless of weight calculations. Also, production speed increased, and human error decreased.

Keywords: k-means, overhead crane, melt weight, weight estimation, swing problem

Procedia PDF Downloads 77
1939 A Study on the Different Components of a Typical Back-Scattered Chipless RFID Tag Reflection

Authors: Fatemeh Babaeian, Nemai Chandra Karmakar

Abstract:

Chipless RFID system is a wireless system for tracking and identification which use passive tags for encoding data. The advantage of using chipless RFID tag is having a planar tag which is printable on different low-cost materials like paper and plastic. The printed tag can be attached to different items in the labelling level. Since the price of chipless RFID tag can be as low as a fraction of a cent, this technology has the potential to compete with the conventional optical barcode labels. However, due to the passive structure of the tag, data processing of the reflection signal is a crucial challenge. The captured reflected signal from a tag attached to an item consists of different components which are the reflection from the reader antenna, the reflection from the item, the tag structural mode RCS component and the antenna mode RCS of the tag. All these components are summed up in both time and frequency domains. The effect of reflection from the item and the structural mode RCS component can distort/saturate the frequency domain signal and cause difficulties in extracting the desired component which is the antenna mode RCS. Therefore, it is required to study the reflection of the tag in both time and frequency domains to have a better understanding of the nature of the captured chipless RFID signal. The other benefits of this study can be to find an optimised encoding technique in tag design level and to find the best processing algorithm the chipless RFID signal in decoding level. In this paper, the reflection from a typical backscattered chipless RFID tag with six resonances is analysed, and different components of the signal are separated in both time and frequency domains. Moreover, the time domain signal corresponding to each resonator of the tag is studied. The data for this processing was captured from simulation in CST Microwave Studio 2017. The outcome of this study is understanding different components of a measured signal in a chipless RFID system and a discovering a research gap which is a need to find an optimum detection algorithm for tag ID extraction.

Keywords: antenna mode RCS, chipless RFID tag, resonance, structural mode RCS

Procedia PDF Downloads 168
1938 A Radiomics Approach to Predict the Evolution of Prostate Imaging Reporting and Data System Score 3/5 Prostate Areas in Multiparametric Magnetic Resonance

Authors: Natascha C. D'Amico, Enzo Grossi, Giovanni Valbusa, Ala Malasevschi, Gianpiero Cardone, Sergio Papa

Abstract:

Purpose: To characterize, through a radiomic approach, the nature of areas classified PI-RADS (Prostate Imaging Reporting and Data System) 3/5, recognized in multiparametric prostate magnetic resonance with T2-weighted (T2w), diffusion and perfusion sequences with paramagnetic contrast. Methods and Materials: 24 cases undergoing multiparametric prostate MR and biopsy were admitted to this pilot study. Clinical outcome of the PI-RADS 3/5 was found through biopsy, finding 8 malignant tumours. The analysed images were acquired with a Philips achieva 1.5T machine with a CE- T2-weighted sequence in the axial plane. Semi-automatic tumour segmentation was carried out on MR images using 3DSlicer image analysis software. 45 shape-based, intensity-based and texture-based features were extracted and represented the input for preprocessing. An evolutionary algorithm (a TWIST system based on KNN algorithm) was used to subdivide the dataset into training and testing set and select features yielding the maximal amount of information. After this pre-processing 20 input variables were selected and different machine learning systems were used to develop a predictive model based on a training testing crossover procedure. Results: The best machine learning system (three-layers feed-forward neural network) obtained a global accuracy of 90% ( 80 % sensitivity and 100% specificity ) with a ROC of 0.82. Conclusion: Machine learning systems coupled with radiomics show a promising potential in distinguishing benign from malign tumours in PI-RADS 3/5 areas.

Keywords: machine learning, MR prostate, PI-Rads 3, radiomics

Procedia PDF Downloads 172
1937 Logical-Probabilistic Modeling of the Reliability of Complex Systems

Authors: Sergo Tsiramua, Sulkhan Sulkhanishvili, Elisabed Asabashvili, Lazare Kvirtia

Abstract:

The paper presents logical-probabilistic methods, models, and algorithms for reliability assessment of complex systems, based on which a web application for structural analysis and reliability assessment of systems was created. It is important to design systems based on structural analysis, research, and evaluation of efficiency indicators. One of the important efficiency criteria is the reliability of the system, which depends on the components of the structure. Quantifying the reliability of large-scale systems is a computationally complex process, and it is advisable to perform it with the help of a computer. Logical-probabilistic modeling is one of the effective means of describing the structure of a complex system and quantitatively evaluating its reliability, which was the basis of our application. The reliability assessment process included the following stages, which were reflected in the application: 1) Construction of a graphical scheme of the structural reliability of the system; 2) Transformation of the graphic scheme into a logical representation and modeling of the shortest ways of successful functioning of the system; 3) Description of system operability condition with logical function in the form of disjunctive normal form (DNF); 4) Transformation of DNF into orthogonal disjunction normal form (ODNF) using the orthogonalization algorithm; 5) Replacing logical elements with probabilistic elements in ODNF, obtaining a reliability estimation polynomial and quantifying reliability; 6) Calculation of “weights” of elements of system. Using the logical-probabilistic methods, models and algorithms discussed in the paper, a special software was created, by means of which a quantitative assessment of the reliability of systems of a complex structure is produced. As a result, structural analysis of systems, research, and designing of optimal structure systems are carried out.

Keywords: complex systems, logical-probabilistic methods, orthogonalization algorithm, reliability of systems, “weights” of elements

Procedia PDF Downloads 45
1936 Category-Base Theory of the Optimum Signal Approximation Clarifying the Importance of Parallel Worlds in the Recognition of Human and Application to Secure Signal Communication with Feedback

Authors: Takuro Kida, Yuichi Kida

Abstract:

We show a base of the new trend of algorithm mathematically that treats a historical reason of continuous discrimination in the world as well as its solution by introducing new concepts of parallel world that includes an invisible set of errors as its companion. With respect to a matrix operator-filter bank that the matrix operator-analysis-filter bank H and the matrix operator-sampling-filter bank S are given, firstly, we introduce the detailed algorithm to derive the optimum matrix operator-synthesis-filter bank Z that minimizes all the worst-case measures of the matrix operator-error-signals E(ω) = F(ω) − Y(ω) between the matrix operator-input-signals F(ω) and the matrix operator-output signals Y(ω) of the matrix operator-filter bank at the same time. Further, feedback is introduced to the above approximation theory and it is indicated that introducing conversations with feedback does not superior automatically to the accumulation of existing knowledge of signal prediction. Secondly, the concept of category in the field of mathematics is applied to the above optimum signal approximation and is indicated that the category-based approximation theory is applied to the set-theoretic consideration of the recognition of humans. Based on this discussion, it is shown naturally why the narrow perception that tends to create isolation shows an apparent advantage in the short term and, often, why such narrow thinking becomes intimate with discriminatory action in a human group. Throughout these considerations, it is presented that, in order to abolish easy and intimate discriminatory behavior, it is important to create a parallel world of conception where we share the set of invisible error signals, including the words and the consciousness of both worlds.

Keywords: signal prediction, pseudo inverse matrix, artificial intelligence, conditional optimization

Procedia PDF Downloads 136
1935 Low Overhead Dynamic Channel Selection with Cluster-Based Spatial-Temporal Station Reporting in Wireless Networks

Authors: Zeyad Abdelmageid, Xianbin Wang

Abstract:

Choosing the operational channel for a WLAN access point (AP) in WLAN networks has been a static channel assignment process initiated by the user during the deployment process of the AP, which fails to cope with the dynamic conditions of the assigned channel at the station side afterward. However, the dramatically growing number of Wi-Fi APs and stations operating in the unlicensed band has led to dynamic, distributed, and often severe interference. This highlights the urgent need for the AP to dynamically select the best overall channel of operation for the basic service set (BSS) by considering the distributed and changing channel conditions at all stations. Consequently, dynamic channel selection algorithms which consider feedback from the station side have been developed. Despite the significant performance improvement, existing channel selection algorithms suffer from very high feedback overhead. Feedback latency from the STAs, due to the high overhead, can cause the eventually selected channel to no longer be optimal for operation due to the dynamic sharing nature of the unlicensed band. This has inspired us to develop our own dynamic channel selection algorithm with reduced overhead through the proposed low-overhead, cluster-based station reporting mechanism. The main idea behind the cluster-based station reporting is the observation that STAs which are very close to each other tend to have very similar channel conditions. Instead of requesting each STA to report on every candidate channel while causing high overhead, the AP divides STAs into clusters then assigns each STA in each cluster one channel to report feedback on. With the proper design of the cluster based reporting, the AP does not lose any information about the channel conditions at the station side while reducing feedback overhead. The simulation results show equal performance and, at times, better performance with a fraction of the overhead. We believe that this algorithm has great potential in designing future dynamic channel selection algorithms with low overhead.

Keywords: channel assignment, Wi-Fi networks, clustering, DBSCAN, overhead

Procedia PDF Downloads 95
1934 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data

Authors: K. Sathishkumar, V. Thiagarasu

Abstract:

Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low costs. Many scientists around the world use the advantage of this gene profiling to characterize complex biological circumstances and diseases. Microarray techniques that are used in genome-wide gene expression and genome mutation analysis help scientists and physicians in understanding of the pathophysiological mechanisms, in diagnoses and prognoses, and choosing treatment plans. DNA microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several clustering algorithms proposed to deals with the gene expression data effectively. The existing clustering algorithms like Support Vector Machine (SVM), K-means algorithm and evolutionary algorithm etc. are analyzed thoroughly to identify the advantages and limitations. The performance evaluation of the existing algorithms is carried out to determine the best approach. In order to improve the classification performance of the best approach in terms of Accuracy, Convergence Behavior and processing time, a hybrid clustering based optimization approach has been proposed.

Keywords: microarray technology, gene expression data, clustering, gene Selection

Procedia PDF Downloads 304
1933 An Exponential Field Path Planning Method for Mobile Robots Integrated with Visual Perception

Authors: Magdy Roman, Mostafa Shoeib, Mostafa Rostom

Abstract:

Global vision, whether provided by overhead fixed cameras, on-board aerial vehicle cameras, or satellite images can always provide detailed information on the environment around mobile robots. In this paper, an intelligent vision-based method of path planning and obstacle avoidance for mobile robots is presented. The method integrates visual perception with a new proposed field-based path-planning method to overcome common path-planning problems such as local minima, unreachable destination and unnecessary lengthy paths around obstacles. The method proposes an exponential angle deviation field around each obstacle that affects the orientation of a close robot. As the robot directs toward, the goal point obstacles are classified into right and left groups, and a deviation angle is exponentially added or subtracted to the orientation of the robot. Exponential field parameters are chosen based on Lyapunov stability criterion to guarantee robot convergence to the destination. The proposed method uses obstacles' shape and location, extracted from global vision system, through a collision prediction mechanism to decide whether to activate or deactivate obstacles field. In addition, a search mechanism is developed in case of robot or goal point is trapped among obstacles to find suitable exit or entrance. The proposed algorithm is validated both in simulation and through experiments. The algorithm shows effectiveness in obstacles' avoidance and destination convergence, overcoming common path planning problems found in classical methods.

Keywords: path planning, collision avoidance, convergence, computer vision, mobile robots

Procedia PDF Downloads 173
1932 A Multi-Modal Virtual Walkthrough of the Virtual Past and Present Based on Panoramic View, Crowd Simulation and Acoustic Heritage on Mobile Platform

Authors: Lim Chen Kim, Tan Kian Lam, Chan Yi Chee

Abstract:

This research presents a multi-modal simulation in the reconstruction of the past and the construction of present in digital cultural heritage on mobile platform. In bringing the present life, the virtual environment is generated through a presented scheme for rapid and efficient construction of 360° panoramic view. Then, acoustical heritage model and crowd model are presented and improvised into the 360° panoramic view. For the reconstruction of past life, the crowd is simulated and rendered in an old trading port. However, the keystone of this research is in a virtual walkthrough that shows the virtual present life in 2D and virtual past life in 3D, both in an environment of virtual heritage sites in George Town through mobile device. Firstly, the 2D crowd is modelled and simulated using OpenGL ES 1.1 on mobile platform. The 2D crowd is used to portray the present life in 360° panoramic view of a virtual heritage environment based on the extension of Newtonian Laws. Secondly, the 2D crowd is animated and rendered into 3D with improved variety and incorporated into the virtual past life using Unity3D Game Engine. The behaviours of the 3D models are then simulated based on the enhancement of the classical model of Boid algorithm. Finally, a demonstration system is developed and integrated with the models, techniques and algorithms of this research. The virtual walkthrough is demonstrated to a group of respondents and is evaluated through the user-centred evaluation by navigating around the demonstration system. The results of the evaluation based on the questionnaires have shown that the presented virtual walkthrough has been successfully deployed through a multi-modal simulation and such a virtual walkthrough would be particularly useful in a virtual tour and virtual museum applications.

Keywords: Boid Algorithm, Crowd Simulation, Mobile Platform, Newtonian Laws, Virtual Heritage

Procedia PDF Downloads 253
1931 Heliport Remote Safeguard System Based on Real-Time Stereovision 3D Reconstruction Algorithm

Authors: Ł. Morawiński, C. Jasiński, M. Jurkiewicz, S. Bou Habib, M. Bondyra

Abstract:

With the development of optics, electronics, and computers, vision systems are increasingly used in various areas of life, science, and industry. Vision systems have a huge number of applications. They can be used in quality control, object detection, data reading, e.g., QR-code, etc. A large part of them is used for measurement purposes. Some of them make it possible to obtain a 3D reconstruction of the tested objects or measurement areas. 3D reconstruction algorithms are mostly based on creating depth maps from data that can be acquired from active or passive methods. Due to the specific appliance in airfield technology, only passive methods are applicable because of other existing systems working on the site, which can be blinded on most spectral levels. Furthermore, reconstruction is required to work long distances ranging from hundreds of meters to tens of kilometers with low loss of accuracy even with harsh conditions such as fog, rain, or snow. In response to those requirements, HRESS (Heliport REmote Safeguard System) was developed; which main part is a rotational head with a two-camera stereovision rig gathering images around the head in 360 degrees along with stereovision 3D reconstruction and point cloud combination. The sub-pixel analysis introduced in the HRESS system makes it possible to obtain an increased distance measurement resolution and accuracy of about 3% for distances over one kilometer. Ultimately, this leads to more accurate and reliable measurement data in the form of a point cloud. Moreover, the program algorithm introduces operations enabling the filtering of erroneously collected data in the point cloud. All activities from the programming, mechanical and optical side are aimed at obtaining the most accurate 3D reconstruction of the environment in the measurement area.

Keywords: airfield monitoring, artificial intelligence, stereovision, 3D reconstruction

Procedia PDF Downloads 101
1930 Local Directional Encoded Derivative Binary Pattern Based Coral Image Classification Using Weighted Distance Gray Wolf Optimization Algorithm

Authors: Annalakshmi G., Sakthivel Murugan S.

Abstract:

This paper presents a local directional encoded derivative binary pattern (LDEDBP) feature extraction method that can be applied for the classification of submarine coral reef images. The classification of coral reef images using texture features is difficult due to the dissimilarities in class samples. In coral reef image classification, texture features are extracted using the proposed method called local directional encoded derivative binary pattern (LDEDBP). The proposed approach extracts the complete structural arrangement of the local region using local binary batten (LBP) and also extracts the edge information using local directional pattern (LDP) from the edge response available in a particular region, thereby achieving extra discriminative feature value. Typically the LDP extracts the edge details in all eight directions. The process of integrating edge responses along with the local binary pattern achieves a more robust texture descriptor than the other descriptors used in texture feature extraction methods. Finally, the proposed technique is applied to an extreme learning machine (ELM) method with a meta-heuristic algorithm known as weighted distance grey wolf optimizer (GWO) to optimize the input weight and biases of single-hidden-layer feed-forward neural networks (SLFN). In the empirical results, ELM-WDGWO demonstrated their better performance in terms of accuracy on all coral datasets, namely RSMAS, EILAT, EILAT2, and MLC, compared with other state-of-the-art algorithms. The proposed method achieves the highest overall classification accuracy of 94% compared to the other state of art methods.

Keywords: feature extraction, local directional pattern, ELM classifier, GWO optimization

Procedia PDF Downloads 145
1929 The Design and Implementation of an Enhanced 2D Mesh Switch

Authors: Manel Langar, Riad Bourguiba, Jaouhar Mouine

Abstract:

In this paper, we propose the design and implementation of an enhanced wormhole virtual channel on chip router. It is a heart of a mesh NoC using the XY deterministic routing algorithm. It is characterized by its simple virtual channel allocation strategy which allows reducing area and complexity of connections without affecting the performance. We implemented our router on a Tezzaron process to validate its performances. This router is a basic element that will be used later to design a 3D mesh NoC.

Keywords: NoC, mesh, router, 3D NoC

Procedia PDF Downloads 542
1928 Investigating the Algorithm to Maintain a Constant Speed in the Wankel Engine

Authors: Adam Majczak, Michał Bialy, Zbigniew Czyż, Zdzislaw Kaminski

Abstract:

Increasingly stringent emission standards for passenger cars require us to find alternative drives. The share of electric vehicles in the sale of new cars increases every year. However, their performance and, above all, range cannot be today successfully compared to those of cars with a traditional internal combustion engine. Battery recharging lasts hours, which can be hardly accepted due to the time needed to refill a fuel tank. Therefore, the ways to reduce the adverse features of cars equipped with electric motors only are searched for. One of the methods is a combination of an electric engine as a main source of power and a small internal combustion engine as an electricity generator. This type of drive enables an electric vehicle to achieve a radically increased range and low emissions of toxic substances. For several years, the leading automotive manufacturers like the Mazda and the Audi together with the best companies in the automotive industry, e.g., AVL have developed some electric drive systems capable of recharging themselves while driving, known as a range extender. An electricity generator is powered by a Wankel engine that has seemed to pass into history. This low weight and small engine with a rotating piston and a very low vibration level turned out to be an excellent source in such applications. Its operation as an energy source for a generator almost entirely eliminates its disadvantages like high fuel consumption, high emission of toxic substances, or short lifetime typical of its traditional application. The operation of the engine at a constant rotational speed enables a significant increase in its lifetime, and its small external dimensions enable us to make compact modules to drive even small urban cars like the Audi A1 or the Mazda 2. The algorithm to maintain a constant speed was investigated on the engine dynamometer with an eddy current brake and the necessary measuring apparatus. The research object was the Aixro XR50 rotary engine with the electronic power supply developed at the Lublin University of Technology. The load torque of the engine was altered during the research by means of the eddy current brake capable of giving any number of load cycles. The parameters recorded included speed and torque as well as a position of a throttle in an inlet system. Increasing and decreasing load did not significantly change engine speed, which means that control algorithm parameters are correctly selected. This work has been financed by the Polish Ministry of Science and Higher Education.

Keywords: electric vehicle, power generator, range extender, Wankel engine

Procedia PDF Downloads 138
1927 Ensemble Machine Learning Approach for Estimating Missing Data from CO₂ Time Series

Authors: Atbin Mahabbati, Jason Beringer, Matthias Leopold

Abstract:

To address the global challenges of climate and environmental changes, there is a need for quantifying and reducing uncertainties in environmental data, including observations of carbon, water, and energy. Global eddy covariance flux tower networks (FLUXNET), and their regional counterparts (i.e., OzFlux, AmeriFlux, China Flux, etc.) were established in the late 1990s and early 2000s to address the demand. Despite the capability of eddy covariance in validating process modelling analyses, field surveys and remote sensing assessments, there are some serious concerns regarding the challenges associated with the technique, e.g. data gaps and uncertainties. To address these concerns, this research has developed an ensemble model to fill the data gaps of CO₂ flux to avoid the limitations of using a single algorithm, and therefore, provide less error and decline the uncertainties associated with the gap-filling process. In this study, the data of five towers in the OzFlux Network (Alice Springs Mulga, Calperum, Gingin, Howard Springs and Tumbarumba) during 2013 were used to develop an ensemble machine learning model, using five feedforward neural networks (FFNN) with different structures combined with an eXtreme Gradient Boosting (XGB) algorithm. The former methods, FFNN, provided the primary estimations in the first layer, while the later, XGB, used the outputs of the first layer as its input to provide the final estimations of CO₂ flux. The introduced model showed slight superiority over each single FFNN and the XGB, while each of these two methods was used individually, overall RMSE: 2.64, 2.91, and 3.54 g C m⁻² yr⁻¹ respectively (3.54 provided by the best FFNN). The most significant improvement happened to the estimation of the extreme diurnal values (during midday and sunrise), as well as nocturnal estimations, which is generally considered as one of the most challenging parts of CO₂ flux gap-filling. The towers, as well as seasonality, showed different levels of sensitivity to improvements provided by the ensemble model. For instance, Tumbarumba showed more sensitivity compared to Calperum, where the differences between the Ensemble model on the one hand and the FFNNs and XGB, on the other hand, were the least of all 5 sites. Besides, the performance difference between the ensemble model and its components individually were more significant during the warm season (Jan, Feb, Mar, Oct, Nov, and Dec) compared to the cold season (Apr, May, Jun, Jul, Aug, and Sep) due to the higher amount of photosynthesis of plants, which led to a larger range of CO₂ exchange. In conclusion, the introduced ensemble model slightly improved the accuracy of CO₂ flux gap-filling and robustness of the model. Therefore, using ensemble machine learning models is potentially capable of improving data estimation and regression outcome when it seems to be no more room for improvement while using a single algorithm.

Keywords: carbon flux, Eddy covariance, extreme gradient boosting, gap-filling comparison, hybrid model, OzFlux network

Procedia PDF Downloads 117
1926 Feature Evaluation Based on Random Subspace and Multiple-K Ensemble

Authors: Jaehong Yu, Seoung Bum Kim

Abstract:

Clustering analysis can facilitate the extraction of intrinsic patterns in a dataset and reveal its natural groupings without requiring class information. For effective clustering analysis in high dimensional datasets, unsupervised dimensionality reduction is an important task. Unsupervised dimensionality reduction can generally be achieved by feature extraction or feature selection. In many situations, feature selection methods are more appropriate than feature extraction methods because of their clear interpretation with respect to the original features. The unsupervised feature selection can be categorized as feature subset selection and feature ranking method, and we focused on unsupervised feature ranking methods which evaluate the features based on their importance scores. Recently, several unsupervised feature ranking methods were developed based on ensemble approaches to achieve their higher accuracy and stability. However, most of the ensemble-based feature ranking methods require the true number of clusters. Furthermore, these algorithms evaluate the feature importance depending on the ensemble clustering solution, and they produce undesirable evaluation results if the clustering solutions are inaccurate. To address these limitations, we proposed an ensemble-based feature ranking method with random subspace and multiple-k ensemble (FRRM). The proposed FRRM algorithm evaluates the importance of each feature with the random subspace ensemble, and all evaluation results are combined with the ensemble importance scores. Moreover, FRRM does not require the determination of the true number of clusters in advance through the use of the multiple-k ensemble idea. Experiments on various benchmark datasets were conducted to examine the properties of the proposed FRRM algorithm and to compare its performance with that of existing feature ranking methods. The experimental results demonstrated that the proposed FRRM outperformed the competitors.

Keywords: clustering analysis, multiple-k ensemble, random subspace-based feature evaluation, unsupervised feature ranking

Procedia PDF Downloads 313
1925 Modified Weibull Approach for Bridge Deterioration Modelling

Authors: Niroshan K. Walgama Wellalage, Tieling Zhang, Richard Dwight

Abstract:

State-based Markov deterioration models (SMDM) sometimes fail to find accurate transition probability matrix (TPM) values, and hence lead to invalid future condition prediction or incorrect average deterioration rates mainly due to drawbacks of existing nonlinear optimization-based algorithms and/or subjective function types used for regression analysis. Furthermore, a set of separate functions for each condition state with age cannot be directly derived by using Markov model for a given bridge element group, which however is of interest to industrial partners. This paper presents a new approach for generating Homogeneous SMDM model output, namely, the Modified Weibull approach, which consists of a set of appropriate functions to describe the percentage condition prediction of bridge elements in each state. These functions are combined with Bayesian approach and Metropolis Hasting Algorithm (MHA) based Markov Chain Monte Carlo (MCMC) simulation technique for quantifying the uncertainty in model parameter estimates. In this study, factors contributing to rail bridge deterioration were identified. The inspection data for 1,000 Australian railway bridges over 15 years were reviewed and filtered accordingly based on the real operational experience. Network level deterioration model for a typical bridge element group was developed using the proposed Modified Weibull approach. The condition state predictions obtained from this method were validated using statistical hypothesis tests with a test data set. Results show that the proposed model is able to not only predict the conditions in network-level accurately but also capture the model uncertainties with given confidence interval.

Keywords: bridge deterioration modelling, modified weibull approach, MCMC, metropolis-hasting algorithm, bayesian approach, Markov deterioration models

Procedia PDF Downloads 708
1924 Graphical Theoretical Construction of Discrete time Share Price Paths from Matroid

Authors: Min Wang, Sergey Utev

Abstract:

The lessons from the 2007-09 global financial crisis have driven scientific research, which considers the design of new methodologies and financial models in the global market. The quantum mechanics approach was introduced in the unpredictable stock market modeling. One famous quantum tool is Feynman path integral method, which was used to model insurance risk by Tamturk and Utev and adapted to formalize the path-dependent option pricing by Hao and Utev. The research is based on the path-dependent calculation method, which is motivated by the Feynman path integral method. The path calculation can be studied in two ways, one way is to label, and the other is computational. Labeling is a part of the representation of objects, and generating functions can provide many different ways of representing share price paths. In this paper, the recent works on graphical theoretical construction of individual share price path via matroid is presented. Firstly, a study is done on the knowledge of matroid, relationship between lattice path matroid and Tutte polynomials and ways to connect points in the lattice path matroid and Tutte polynomials is suggested. Secondly, It is found that a general binary tree can be validly constructed from a connected lattice path matroid rather than general lattice path matroid. Lastly, it is suggested that there is a way to represent share price paths via a general binary tree, and an algorithm is developed to construct share price paths from general binary trees. A relationship is also provided between lattice integer points and Tutte polynomials of a transversal matroid. Use this way of connection together with the algorithm, a share price path can be constructed from a given connected lattice path matroid.

Keywords: combinatorial construction, graphical representation, matroid, path calculation, share price, Tutte polynomial

Procedia PDF Downloads 120
1923 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data Technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through Vader and Roberta model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and TFIDF Vectorization, and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: counter vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis

Procedia PDF Downloads 72
1922 Probabilistic Graphical Model for the Web

Authors: M. Nekri, A. Khelladi

Abstract:

The world wide web network is a network with a complex topology, the main properties of which are the distribution of degrees in power law, A low clustering coefficient and a weak average distance. Modeling the web as a graph allows locating the information in little time and consequently offering a help in the construction of the research engine. Here, we present a model based on the already existing probabilistic graphs with all the aforesaid characteristics. This work will consist in studying the web in order to know its structuring thus it will enable us to modelize it more easily and propose a possible algorithm for its exploration.

Keywords: clustering coefficient, preferential attachment, small world, web community

Procedia PDF Downloads 251
1921 Neural Network Based Control Algorithm for Inhabitable Spaces Applying Emotional Domotics

Authors: Sergio A. Navarro Tuch, Martin Rogelio Bustamante Bello, Leopoldo Julian Lechuga Lopez

Abstract:

In recent years, Mexico’s population has seen a rise of different physiological and mental negative states. Two main consequences of this problematic are deficient work performance and high levels of stress generating and important impact on a person’s physical, mental and emotional health. Several approaches, such as the use of audiovisual stimulus to induce emotions and modify a person’s emotional state, can be applied in an effort to decreases these negative effects. With the use of different non-invasive physiological sensors such as EEG, luminosity and face recognition we gather information of the subject’s current emotional state. In a controlled environment, a subject is shown a series of selected images from the International Affective Picture System (IAPS) in order to induce a specific set of emotions and obtain information from the sensors. The raw data obtained is statistically analyzed in order to filter only the specific groups of information that relate to a subject’s emotions and current values of the physical variables in the controlled environment such as, luminosity, RGB light color, temperature, oxygen level and noise. Finally, a neural network based control algorithm is given the data obtained in order to feedback the system and automate the modification of the environment variables and audiovisual content shown in an effort that these changes can positively alter the subject’s emotional state. During the research, it was found that the light color was directly related to the type of impact generated by the audiovisual content on the subject’s emotional state. Red illumination increased the impact of violent images and green illumination along with relaxing images decreased the subject’s levels of anxiety. Specific differences between men and women were found as to which type of images generated a greater impact in either gender. The population sample was mainly constituted by college students whose data analysis showed a decreased sensibility to violence towards humans. Despite the early stage of the control algorithm, the results obtained from the population sample give us a better insight into the possibilities of emotional domotics and the applications that can be created towards the improvement of performance in people’s lives. The objective of this research is to create a positive impact with the application of technology to everyday activities; nonetheless, an ethical problem arises since this can also be applied to control a person’s emotions and shift their decision making.

Keywords: data analysis, emotional domotics, performance improvement, neural network

Procedia PDF Downloads 121
1920 Separating Landform from Noise in High-Resolution Digital Elevation Models through Scale-Adaptive Window-Based Regression

Authors: Anne M. Denton, Rahul Gomes, David W. Franzen

Abstract:

High-resolution elevation data are becoming increasingly available, but typical approaches for computing topographic features, like slope and curvature, still assume small sliding windows, for example, of size 3x3. That means that the digital elevation model (DEM) has to be resampled to the scale of the landform features that are of interest. Any higher resolution is lost in this resampling. When the topographic features are computed through regression that is performed at the resolution of the original data, the accuracy can be much higher, and the reported result can be adjusted to the length scale that is relevant locally. Slope and variance are calculated for overlapping windows, meaning that one regression result is computed per raster point. The number of window centers per area is the same for the output as for the original DEM. Slope and variance are computed by performing regression on the points in the surrounding window. Such an approach is computationally feasible because of the additive nature of regression parameters and variance. Any doubling of window size in each direction only takes a single pass over the data, corresponding to a logarithmic scaling of the resulting algorithm as a function of the window size. Slope and variance are stored for each aggregation step, allowing the reported slope to be selected to minimize variance. The approach thereby adjusts the effective window size to the landform features that are characteristic to the area within the DEM. Starting with a window size of 2x2, each iteration aggregates 2x2 non-overlapping windows from the previous iteration. Regression results are stored for each iteration, and the slope at minimal variance is reported in the final result. As such, the reported slope is adjusted to the length scale that is characteristic of the landform locally. The length scale itself and the variance at that length scale are also visualized to aid in interpreting the results for slope. The relevant length scale is taken to be half of the window size of the window over which the minimum variance was achieved. The resulting process was evaluated for 1-meter DEM data and for artificial data that was constructed to have defined length scales and added noise. A comparison with ESRI ArcMap was performed and showed the potential of the proposed algorithm. The resolution of the resulting output is much higher and the slope and aspect much less affected by noise. Additionally, the algorithm adjusts to the scale of interest within the region of the image. These benefits are gained without additional computational cost in comparison with resampling the DEM and computing the slope over 3x3 images in ESRI ArcMap for each resolution. In summary, the proposed approach extracts slope and aspect of DEMs at the lengths scales that are characteristic locally. The result is of higher resolution and less affected by noise than existing techniques.

Keywords: high resolution digital elevation models, multi-scale analysis, slope calculation, window-based regression

Procedia PDF Downloads 109
1919 Applying Multiplicative Weight Update to Skin Cancer Classifiers

Authors: Animish Jain

Abstract:

This study deals with using Multiplicative Weight Update within artificial intelligence and machine learning to create models that can diagnose skin cancer using microscopic images of cancer samples. In this study, the multiplicative weight update method is used to take the predictions of multiple models to try and acquire more accurate results. Logistic Regression, Convolutional Neural Network (CNN), and Support Vector Machine Classifier (SVMC) models are employed within the Multiplicative Weight Update system. These models are trained on pictures of skin cancer from the ISIC-Archive, to look for patterns to label unseen scans as either benign or malignant. These models are utilized in a multiplicative weight update algorithm which takes into account the precision and accuracy of each model through each successive guess to apply weights to their guess. These guesses and weights are then analyzed together to try and obtain the correct predictions. The research hypothesis for this study stated that there would be a significant difference in the accuracy of the three models and the Multiplicative Weight Update system. The SVMC model had an accuracy of 77.88%. The CNN model had an accuracy of 85.30%. The Logistic Regression model had an accuracy of 79.09%. Using Multiplicative Weight Update, the algorithm received an accuracy of 72.27%. The final conclusion that was drawn was that there was a significant difference in the accuracy of the three models and the Multiplicative Weight Update system. The conclusion was made that using a CNN model would be the best option for this problem rather than a Multiplicative Weight Update system. This is due to the possibility that Multiplicative Weight Update is not effective in a binary setting where there are only two possible classifications. In a categorical setting with multiple classes and groupings, a Multiplicative Weight Update system might become more proficient as it takes into account the strengths of multiple different models to classify images into multiple categories rather than only two categories, as shown in this study. This experimentation and computer science project can help to create better algorithms and models for the future of artificial intelligence in the medical imaging field.

Keywords: artificial intelligence, machine learning, multiplicative weight update, skin cancer

Procedia PDF Downloads 53
1918 Development of Wave-Dissipating Block Installation Simulation for Inexperienced Worker Training

Authors: Hao Min Chuah, Tatsuya Yamazaki, Ryosui Iwasawa, Tatsumi Suto

Abstract:

In recent years, with the advancement of digital technology, the movement to introduce so-called ICT (Information and Communication Technology), such as computer technology and network technology, to civil engineering construction sites and construction sites is accelerating. As part of this movement, attempts are being made in various situations to reproduce actual sites inside computers and use them for designing and construction planning, as well as for training inexperienced engineers. The installation of wave-dissipating blocks on coasts, etc., is a type of work that has been carried out by skilled workers based on their years of experience and is one of the tasks that is difficult for inexperienced workers to carry out on site. Wave-dissipating blocks are structures that are designed to protect coasts, beaches, and so on from erosion by reducing the energy of ocean waves. Wave-dissipating blocks usually weigh more than 1 t and are installed by being suspended by a crane, so it would be time-consuming and costly for inexperienced workers to train on-site. In this paper, therefore, a block installation simulator is developed based on Unity 3D, a game development engine. The simulator computes porosity. Porosity is defined as the ratio of the total volume of the wave breaker blocks inside the structure to the final shape of the ideal structure. Using the evaluation of porosity, the simulator can determine how well the user is able to install the blocks. The voxelization technique is used to calculate the porosity of the structure, simplifying the calculations. Other techniques, such as raycasting and box overlapping, are employed for accurate simulation. In the near future, the simulator will install an automatic block installation algorithm based on combinatorial optimization solutions and compare the user-demonstrated block installation and the appropriate installation solved by the algorithm.

Keywords: 3D simulator, porosity, user interface, voxelization, wave-dissipating blocks

Procedia PDF Downloads 78
1917 Analysis of a IncResU-Net Model for R-Peak Detection in ECG Signals

Authors: Beatriz Lafuente Alcázar, Yash Wani, Amit J. Nimunkar

Abstract:

Cardiovascular Diseases (CVDs) are the leading cause of death globally, and around 80% of sudden cardiac deaths are due to arrhythmias or irregular heartbeats. The majority of these pathologies are revealed by either short-term or long-term alterations in the electrocardiogram (ECG) morphology. The ECG is the main diagnostic tool in cardiology. It is a non-invasive, pain free procedure that measures the heart’s electrical activity and that allows the detecting of abnormal rhythms and underlying conditions. A cardiologist can diagnose a wide range of pathologies based on ECG’s form alterations, but the human interpretation is subjective and it is contingent to error. Moreover, ECG records can be quite prolonged in time, which can further complicate visual diagnosis, and deeply retard disease detection. In this context, deep learning methods have risen as a promising strategy to extract relevant features and eliminate individual subjectivity in ECG analysis. They facilitate the computation of large sets of data and can provide early and precise diagnoses. Therefore, the cardiology field is one of the areas that can most benefit from the implementation of deep learning algorithms. In the present study, a deep learning algorithm is trained following a novel approach, using a combination of different databases as the training set. The goal of the algorithm is to achieve the detection of R-peaks in ECG signals. Its performance is further evaluated in ECG signals with different origins and features to test the model’s ability to generalize its outcomes. Performance of the model for detection of R-peaks for clean and noisy ECGs is presented. The model is able to detect R-peaks in the presence of various types of noise, and when presented with data, it has not been trained. It is expected that this approach will increase the effectiveness and capacity of cardiologists to detect divergences in the normal cardiac activity of their patients.

Keywords: arrhythmia, deep learning, electrocardiogram, machine learning, R-peaks

Procedia PDF Downloads 154
1916 Numerical Simulation of Filtration Gas Combustion: Front Propagation Velocity

Authors: Yuri Laevsky, Tatyana Nosova

Abstract:

The phenomenon of filtration gas combustion (FGC) had been discovered experimentally at the beginning of 80’s of the previous century. It has a number of important applications in such areas as chemical technologies, fire-explosion safety, energy-saving technologies, oil production. From the physical point of view, FGC may be defined as the propagation of region of gaseous exothermic reaction in chemically inert porous medium, as the gaseous reactants seep into the region of chemical transformation. The movement of the combustion front has different modes, and this investigation is focused on the low-velocity regime. The main characteristic of the process is the velocity of the combustion front propagation. Computation of this characteristic encounters substantial difficulties because of the strong heterogeneity of the process. The mathematical model of FGC is formed by the energy conservation laws for the temperature of the porous medium and the temperature of gas and the mass conservation law for the relative concentration of the reacting component of the gas mixture. In this case the homogenization of the model is performed with the use of the two-temperature approach when at each point of the continuous medium we specify the solid and gas phases with a Newtonian heat exchange between them. The construction of a computational scheme is based on the principles of mixed finite element method with the usage of a regular mesh. The approximation in time is performed by an explicit–implicit difference scheme. Special attention was given to determination of the combustion front propagation velocity. Straight computation of the velocity as grid derivative leads to extremely unstable algorithm. It is worth to note that the term ‘front propagation velocity’ makes sense for settled motion when some analytical formulae linking velocity and equilibrium temperature are correct. The numerical implementation of one of such formulae leading to the stable computation of instantaneous front velocity has been proposed. The algorithm obtained has been applied in subsequent numerical investigation of the FGC process. This way the dependence of the main characteristics of the process on various physical parameters has been studied. In particular, the influence of the combustible gas mixture consumption on the front propagation velocity has been investigated. It also has been reaffirmed numerically that there is an interval of critical values of the interfacial heat transfer coefficient at which a sort of a breakdown occurs from a slow combustion front propagation to a rapid one. Approximate boundaries of such an interval have been calculated for some specific parameters. All the results obtained are in full agreement with both experimental and theoretical data, confirming the adequacy of the model and the algorithm constructed. The presence of stable techniques to calculate the instantaneous velocity of the combustion wave allows considering the semi-Lagrangian approach to the solution of the problem.

Keywords: filtration gas combustion, low-velocity regime, mixed finite element method, numerical simulation

Procedia PDF Downloads 285
1915 The Location-Routing Problem with Pickup Facilities and Heterogeneous Demand: Formulation and Heuristics Approach

Authors: Mao Zhaofang, Xu Yida, Fang Kan, Fu Enyuan, Zhao Zhao

Abstract:

Nowadays, last-mile distribution plays an increasingly important role in the whole industrial chain delivery link and accounts for a large proportion of the whole distribution process cost. Promoting the upgrading of logistics networks and improving the layout of final distribution points has become one of the trends in the development of modern logistics. Due to the discrete and heterogeneous needs and spatial distribution of customer demand, which will lead to a higher delivery failure rate and lower vehicle utilization, last-mile delivery has become a time-consuming and uncertain process. As a result, courier companies have introduced a range of innovative parcel storage facilities, including pick-up points and lockers. The introduction of pick-up points and lockers has not only improved the users’ experience but has also helped logistics and courier companies achieve large-scale economy. Against the backdrop of the COVID-19 of the previous period, contactless delivery has become a new hotspot, which has also created new opportunities for the development of collection services. Therefore, a key issue for logistics companies is how to design/redesign their last-mile distribution network systems to create integrated logistics and distribution networks that consider pick-up points and lockers. This paper focuses on the introduction of self-pickup facilities in new logistics and distribution scenarios and the heterogeneous demands of customers. In this paper, we consider two types of demand, including ordinary products and refrigerated products, as well as corresponding transportation vehicles. We consider the constraints associated with self-pickup points and lockers and then address the location-routing problem with self-pickup facilities and heterogeneous demands (LRP-PFHD). To solve this challenging problem, we propose a mixed integer linear programming (MILP) model that aims to minimize the total cost, which includes the facility opening cost, the variable transport cost, and the fixed transport cost. Due to the NP-hardness of the problem, we propose a hybrid adaptive large-neighbourhood search algorithm to solve LRP-PFHD. We evaluate the effectiveness and efficiency of the proposed algorithm by using instances generated based on benchmark instances. The results demonstrate that the hybrid adaptive large neighbourhood search algorithm is more efficient than MILP solvers such as Gurobi for LRP-PFHD, especially for large-scale instances. In addition, we made a comprehensive analysis of some important parameters (e.g., facility opening cost and transportation cost) to explore their impacts on the results and suggested helpful managerial insights for courier companies.

Keywords: city logistics, last-mile delivery, location-routing, adaptive large neighborhood search

Procedia PDF Downloads 58
1914 A Comparative Analysis of Classification Models with Wrapper-Based Feature Selection for Predicting Student Academic Performance

Authors: Abdullah Al Farwan, Ya Zhang

Abstract:

In today’s educational arena, it is critical to understand educational data and be able to evaluate important aspects, particularly data on student achievement. Educational Data Mining (EDM) is a research area that focusing on uncovering patterns and information in data from educational institutions. Teachers, if they are able to predict their students' class performance, can use this information to improve their teaching abilities. It has evolved into valuable knowledge that can be used for a wide range of objectives; for example, a strategic plan can be used to generate high-quality education. Based on previous data, this paper recommends employing data mining techniques to forecast students' final grades. In this study, five data mining methods, Decision Tree, JRip, Naive Bayes, Multi-layer Perceptron, and Random Forest with wrapper feature selection, were used on two datasets relating to Portuguese language and mathematics classes lessons. The results showed the effectiveness of using data mining learning methodologies in predicting student academic success. The classification accuracy achieved with selected algorithms lies in the range of 80-94%. Among all the selected classification algorithms, the lowest accuracy is achieved by the Multi-layer Perceptron algorithm, which is close to 70.45%, and the highest accuracy is achieved by the Random Forest algorithm, which is close to 94.10%. This proposed work can assist educational administrators to identify poor performing students at an early stage and perhaps implement motivational interventions to improve their academic success and prevent educational dropout.

Keywords: classification algorithms, decision tree, feature selection, multi-layer perceptron, Naïve Bayes, random forest, students’ academic performance

Procedia PDF Downloads 145