Search results for: forecasting accuracy
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4131

3531 Uncertainty of the Brazilian Earth System Model for Solar Radiation

Authors: Elison Eduardo Jardim Bierhals, Claudineia Brazil, Deivid Pires, Rafael Haag, Elton Gimenez Rossini

Abstract:

This study evaluated the uncertainties in the solar radiation projections generated by the Brazilian Earth System Model (BESM) of the Weather and Climate Prediction Center (CPTEC), which belongs to the Coupled Model Intercomparison Project Phase 5 (CMIP5), with the aim of assessing the quality of the model's solar radiation projections and thereby establishing the viability of its use. Two scenarios elaborated by the Intergovernmental Panel on Climate Change (IPCC) were evaluated: RCP 4.5 (with more optimistic boundary conditions) and RCP 8.5 (with more pessimistic initial conditions). The accuracy of the model was verified using the Nash coefficient and the statistical bias, as these better represent the atmospheric patterns. The BESM showed a tendency to overestimate solar radiation in its projections for most regions of the state of Rio Grande do Sul, and under the validation methods adopted in this study, BESM did not achieve satisfactory accuracy.

Keywords: climate changes, projections, solar radiation, uncertainty

Procedia PDF Downloads 250
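
The two validation measures named in the abstract above can be sketched as follows. This is a minimal illustration that assumes "Nash coefficient" means the Nash-Sutcliffe efficiency and that the bias is the mean of (simulated minus observed). The function names and the sample values are hypothetical and are not taken from the study.

```python
from statistics import fmean


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    if len(observed) != len(simulated):
        raise ValueError("series must have the same length")
    mean_obs = fmean(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def mean_bias(observed, simulated):
    """Mean of (simulated - observed); a positive value means overestimation."""
    return fmean(s - o for o, s in zip(observed, simulated))


# Hypothetical radiation values, for illustration only
obs = [200.0, 220.0, 180.0, 210.0]
sim = [215.0, 230.0, 200.0, 220.0]
nse = nash_sutcliffe(obs, sim)
bias = mean_bias(obs, sim)
```

An efficiency of 1 indicates a perfect match, and values near or below 0 mean the model predicts no better than the observed mean. A positive bias corresponds to the overestimation the abstract describes.
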
3530 Mobile Platform’s Attitude Determination Based on Smoothed GPS Code Data and Carrier-Phase Measurements

Authors: Mohamed Ramdani, Hassen Abdellaoui, Abdenour Boudrassen

Abstract:

Attitude estimation approaches for mobile platforms are mainly based on combined positioning techniques and purpose-built algorithms, which aim to reach a fast and accurate solution. In this work, we describe the design and implementation of an attitude determination (AD) process using only measurements from GPS sensors. The approach is based on GPS code data smoothed with a Hatch filter and raw carrier-phase measurements, integrated into an attitude algorithm based on vector measurements using the least squares (LSQ) estimation method. A GPS dataset from a static experiment is used to investigate the effectiveness of the presented approach and consequently to check the accuracy of the attitude estimation algorithm. Attitude results from GPS multi-antenna over short baselines are introduced and analyzed. The 3D accuracy of estimated attitude parameters using smoothed measurements is over 0.27°.

Keywords: attitude determination, GPS code data smoothing, hatch filter, carrier-phase measurements, least-squares attitude estimation

Procedia PDF Downloads 155
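
The code smoothing step mentioned in the abstract above can be sketched with the textbook form of the Hatch filter. This is a minimal sketch, not the authors' implementation. It assumes code and carrier-phase observations are both expressed in metres, that no cycle slips occur, and that the window length is a free parameter. The name `hatch_filter` is hypothetical.

```python
def hatch_filter(code, phase, window=100):
    """Smooth code pseudoranges using epoch-to-epoch carrier-phase differences.

    code, phase: equal-length sequences in metres for one satellite.
    window: maximum smoothing length in epochs (assumed value).
    """
    if len(code) != len(phase):
        raise ValueError("code and phase must have the same length")
    smoothed = []
    for k, (p, phi) in enumerate(zip(code, phase)):
        if k == 0:
            smoothed.append(p)  # initialise with the first raw code value
            continue
        n = min(k + 1, window)
        # Propagate the previous smoothed range with the precise phase change
        predicted = smoothed[-1] + (phi - phase[k - 1])
        smoothed.append(p / n + (n - 1) / n * predicted)
    return smoothed


# Static case: noisy code, constant phase, so the output is a running mean
static = hatch_filter([10.0, 12.0, 8.0, 10.0], [0.0, 0.0, 0.0, 0.0])
```

A practical filter would also reset the recursion whenever a cycle slip is detected, which this sketch omits.
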
3529 Utilizing Temporal and Frequency Features in Fault Detection of Electric Motor Bearings with Advanced Methods

Authors: Mohammad Arabi

Abstract:

The development of advanced technologies in the field of signal processing and vibration analysis has enabled more accurate analysis and fault detection in electrical systems. This research investigates the application of temporal and frequency features in detecting faults in electric motor bearings, aiming to enhance fault detection accuracy and prevent unexpected failures. The use of methods such as deep learning algorithms and neural networks in this process can yield better results. The main objective of this research is to evaluate the efficiency and accuracy of methods based on temporal and frequency features in identifying faults in electric motor bearings to prevent sudden breakdowns and operational issues. Additionally, the feasibility of using techniques such as machine learning and optimization algorithms to improve the fault detection process is also considered. This research employed an experimental method and random sampling. Vibration signals were collected from electric motors under normal and faulty conditions. After standardizing the data, temporal and frequency features were extracted. These features were then analyzed using statistical methods such as analysis of variance (ANOVA) and t-tests, as well as machine learning algorithms like artificial neural networks and support vector machines (SVM). The results showed that using temporal and frequency features significantly improves the accuracy of fault detection in electric motor bearings. ANOVA indicated significant differences between normal and faulty signals. Additionally, t-tests confirmed statistically significant differences between the features extracted from normal and faulty signals. Machine learning algorithms such as neural networks and SVM also significantly increased detection accuracy, demonstrating high effectiveness in timely and accurate fault detection. 
This study demonstrates that using temporal and frequency features combined with machine learning algorithms can serve as an effective tool for detecting faults in electric motor bearings. This approach not only enhances fault detection accuracy but also simplifies and streamlines the detection process. However, challenges such as data standardization and the cost of implementing advanced monitoring systems must also be considered. Utilizing temporal and frequency features in fault detection of electric motor bearings, along with advanced machine learning methods, offers an effective solution for preventing failures and ensuring the operational health of electric motors. Given the promising results of this research, it is recommended that this technology be more widely adopted in industrial maintenance processes.

Keywords: electric motor, fault detection, frequency features, temporal features

Procedia PDF Downloads 47
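
As an illustration of what temporal and frequency features of a vibration signal can look like, the sketch below computes a few common ones with the standard library only. The abstract does not list the exact features used, so the choice here (RMS, peak-to-peak, kurtosis, dominant frequency) is an assumption, and the function names are hypothetical.

```python
import cmath
import math


def time_features(signal):
    """RMS, peak-to-peak amplitude and kurtosis of a sampled signal."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    kurtosis = sum((x - mean) ** 4 for x in signal) / n / var ** 2
    return {
        "rms": rms,
        "peak_to_peak": max(signal) - min(signal),
        "kurtosis": kurtosis,
    }


def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the largest non-DC bin of a naive O(n^2) DFT."""
    n = len(signal)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


# Synthetic 50 Hz tone sampled at 1 kHz, standing in for a vibration record
tone = [math.sin(2 * math.pi * 50 * i / 1000) for i in range(200)]
features = time_features(tone)
peak_hz = dominant_frequency(tone, sample_rate=1000)
```

Feature vectors of this kind could then be passed to a classifier such as an SVM. For long records, an FFT routine would replace the naive DFT.
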
3528 A Survey of Feature Selection and Feature Extraction Techniques in Machine Learning

Authors: Samina Khalid, Shamila Nasreen

Abstract:

Dimensionality reduction as a preprocessing step to machine learning is effective in removing irrelevant and redundant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase in the dimensionality of data poses a severe challenge to many existing feature selection and feature extraction methods with respect to efficiency and effectiveness. In the field of machine learning and pattern recognition, dimensionality reduction is an important area in which many approaches have been proposed. In this paper, some widely used feature selection and feature extraction techniques are analyzed to determine how effectively they can be used to achieve high performance of learning algorithms, which ultimately improves the predictive accuracy of the classifier. The paper also presents a brief analysis of the strengths and weaknesses of some widely used dimensionality reduction methods.

Keywords: age related macular degeneration, feature selection, feature subset selection, feature extraction/transformation, FSA’s, relief, correlation based method, PCA, ICA

Procedia PDF Downloads 496
3527 Modelling Structural Breaks in Stock Price Time Series Using Stochastic Differential Equations

Authors: Daniil Karzanov

Abstract:

This paper studies the effect of quarterly earnings reports on the stock price. The profitability of the stock is modeled by geometric Brownian diffusion and the Constant Elasticity of Variance model. We fit several variations of stochastic differential equations to the pre- and post-report periods using Maximum Likelihood Estimation and a grid search over the parameters. By examining the change in the model parameters after the reports’ publication, the study finds enough evidence to treat a report as a structural breakpoint, meaning that forecast models fitted before the report are no longer applicable for forecasting and should be refitted shortly afterwards.

Keywords: stock market, earnings reports, financial time series, structural breaks, stochastic differential equations

Procedia PDF Downloads 205
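
For the geometric Brownian motion case, the maximum likelihood fit described in the abstract above has a closed form in terms of log returns. The sketch below is an illustration only. It assumes equally spaced observations with step `dt` in years, the price windows are hypothetical, and the Constant Elasticity of Variance model (which needs numerical optimisation) is not covered.

```python
import math
from statistics import fmean


def fit_gbm(prices, dt):
    """Closed-form MLE of (mu, sigma) for dS = mu * S dt + sigma * S dW."""
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    m = fmean(log_returns)
    var = fmean((r - m) ** 2 for r in log_returns)  # MLE uses the 1/n variance
    sigma = math.sqrt(var / dt)
    mu = m / dt + 0.5 * sigma ** 2
    return mu, sigma


# Hypothetical daily closing prices before and after an earnings report
pre_report = [100.0, 101.0, 100.5, 101.5, 102.0]
post_report = [102.0, 106.0, 101.0, 107.0, 103.0]
mu_pre, sigma_pre = fit_gbm(pre_report, dt=1 / 252)
mu_post, sigma_post = fit_gbm(post_report, dt=1 / 252)
```

A large shift between the two parameter pairs is the kind of change the study interprets as a structural break. A formal test would need far more data than these toy windows.
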
3526 Countering the Bullwhip Effect by Absorbing It Downstream in the Supply Chain

Authors: Geng Cui, Naoto Imura, Katsuhiro Nishinari, Takahiro Ezaki

Abstract:

The bullwhip effect, which refers to the amplification of demand variance as one moves up the supply chain, has been observed in various industries and extensively studied through analytic approaches. Existing methods to mitigate the bullwhip effect, such as decentralized demand information, vendor-managed inventory, and the Collaborative Planning, Forecasting, and Replenishment System, rely on the willingness and ability of supply chain participants to share their information. However, in practice, information sharing is often difficult to realize due to privacy concerns. The purpose of this study is to explore new ways to mitigate the bullwhip effect without the need for information sharing. This paper proposes a 'bullwhip absorption strategy' (BAS) to alleviate the bullwhip effect by absorbing it downstream in the supply chain. To achieve this, a two-stage supply chain system was employed, consisting of a single retailer and a single manufacturer. In each time period, the retailer receives an order generated according to an autoregressive process. Upon receiving the order, the retailer depletes the ordered amount, forecasts future demand based on past records, and places an order with the manufacturer using the order-up-to replenishment policy. The manufacturer follows a similar process. In essence, the mechanism of the model is similar to that of the beer game. The BAS is implemented at the retailer's level to counteract the bullwhip effect. This strategy requires the retailer to reduce the uncertainty in its orders, thereby absorbing the bullwhip effect downstream in the supply chain. The advantage of the BAS is that upstream participants can benefit from a reduced bullwhip effect. Although the retailer may incur additional costs, if the gain in the upstream segment can compensate for the retailer's loss, the entire supply chain will be better off. 
Two indicators, order variance and inventory variance, were used to quantify the bullwhip effect in relation to the strength of absorption. It was found that implementing the BAS at the retailer's level results in a reduction in both the retailer's and the manufacturer's order variances. However, when examining the impact on inventory variances, a trade-off relationship was observed. The manufacturer's inventory variance monotonically decreases with an increase in absorption strength, while the retailer's inventory variance does not always decrease as the absorption strength grows. This is especially true when the autoregression coefficient has a high value, causing the retailer's inventory variance to become a monotonically increasing function of the absorption strength. Finally, numerical simulations were conducted for verification, and the results were consistent with our theoretical analysis.

Keywords: bullwhip effect, supply chain management, inventory management, demand forecasting, order-up-to policy

Procedia PDF Downloads 74
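
The baseline setting in the abstract above (autoregressive demand, forecasting from past records, order-up-to replenishment) can be simulated in a few lines to show how order variance exceeds demand variance. This sketch makes several assumptions that the abstract does not state: an AR(1) demand process, a moving-average forecast, no safety stock, and orders that are allowed to be negative. It does not implement the proposed BAS itself, whose details the abstract does not give.

```python
import random
from statistics import pvariance


def simulate_bullwhip(rho=0.5, mean_demand=100.0, lead_time=2, window=5,
                      periods=5000, seed=0):
    """Return Var(retailer orders) / Var(customer demand) for one retailer."""
    rng = random.Random(seed)
    demand = [mean_demand]
    for _ in range(periods + window):
        shock = rng.gauss(0.0, 10.0)
        demand.append(mean_demand + rho * (demand[-1] - mean_demand) + shock)

    orders = []
    prev_target = None
    for t in range(window, len(demand)):
        forecast = sum(demand[t - window + 1:t + 1]) / window
        target = (lead_time + 1) * forecast  # order-up-to level
        if prev_target is not None:
            # Replace what was sold and adjust for the change in target level
            orders.append(demand[t] + target - prev_target)
        prev_target = target
    return pvariance(orders) / pvariance(demand[window + 1:])


ratio = simulate_bullwhip()
```

A ratio above 1 indicates amplification at the retailer. An absorption strategy in the spirit of the abstract would aim to lower this ratio, possibly at the cost of higher retailer inventory variance.
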
3525 Using the Smith-Waterman Algorithm to Extract Features in the Classification of Obesity Status

Authors: Rosa Figueroa, Christopher Flores

Abstract:

Text categorization is the problem of assigning a new document to a set of predetermined categories, on the basis of a training set of free-text data that contains documents whose category membership is known. To train a classification model, it is necessary to extract characteristics in the form of tokens that facilitate the learning and classification process. In text categorization, the feature extraction process involves the use of word sequences, also known as N-grams. In general, it is expected that documents belonging to the same category share similar features. The Smith-Waterman (SW) algorithm is a dynamic programming algorithm that performs a local sequence alignment in order to determine similar regions between two strings or protein sequences. This work explores the use of the SW algorithm as an alternative for feature extraction in text categorization. The dataset used for this purpose contains 2,610 annotated documents with the classes Obese/Non-Obese. This dataset was represented in matrix form using the Bag of Words approach. The score selected to represent the occurrence of the tokens in each document was the term frequency-inverse document frequency (TF-IDF). In order to extract features for classification, four experiments were conducted: the first experiment used SW to extract features, the second one used unigrams (single words), the third one used bigrams (two-word sequences) and the last experiment used a combination of unigrams and bigrams to extract features for classification. To test the effectiveness of the extracted feature set for the four experiments, a Support Vector Machine (SVM) classifier was tuned using 20% of the dataset. The remaining 80% of the dataset together with 5-Fold Cross Validation were used to evaluate and compare the performance of the four experiments of feature extraction. Results from the tuning process suggest that SW performs better than the N-gram based feature extraction.
These results were confirmed by using the remaining 80% of the dataset, where SW performed the best (accuracy = 97.10%, weighted average F-measure = 97.07%). The second best was obtained by the combination of unigrams-bigrams (accuracy = 96.04%, weighted average F-measure = 95.97%), closely followed by the bigrams (accuracy = 94.56%, weighted average F-measure = 94.46%) and finally unigrams (accuracy = 92.96%, weighted average F-measure = 92.90%).

Keywords: comorbidities, machine learning, obesity, Smith-Waterman algorithm

Procedia PDF Downloads 297
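
The local alignment at the core of the Smith-Waterman algorithm can be sketched on word tokens instead of protein residues. This is a minimal illustration with a linear gap penalty. The scoring values (match +2, mismatch -1, gap -1) and the example sentences are assumptions, and the abstract does not describe how the authors turned alignments into features.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two token sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]  # first row and column stay at 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                    # start a new local alignment
                h[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                h[i - 1][j] + gap,       # gap in b
                h[i][j - 1] + gap,       # gap in a
            )
            best = max(best, h[i][j])
    return best


doc_a = "patient is morbidly obese".split()
doc_b = "the patient is obese".split()
score = smith_waterman_score(doc_a, doc_b)  # 5: two matches, one gap, one match
```

Unlike fixed-length N-grams, the alignment tolerates the inserted word "morbidly" and still links "patient is" with "obese", which hints at why it can capture shared phrases of varying length.
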
3524 Small Target Recognition Based on Trajectory Information

Authors: Saad Alkentar, Abdulkareem Assalem

Abstract:

Recognizing small targets has always posed a significant challenge in image analysis. Over long distances, the image signal-to-noise ratio tends to be low, limiting the amount of useful information available to detection systems. Consequently, visual target recognition becomes an intricate task to tackle. In this study, we introduce a Track Before Detect (TBD) approach that leverages target trajectory information (coordinates) to effectively distinguish between noise and potential targets. By reframing the problem as a multivariate time series classification, we have achieved remarkable results. Specifically, our TBD method achieves an impressive 97% accuracy in separating target signals from noise within a mere half-second time span (consisting of 10 data points). Furthermore, when classifying the identified targets into our predefined categories—airplane, drone, and bird—we achieve an outstanding classification accuracy of 96% over a more extended period of 1.5 seconds (comprising 30 data points).

Keywords: small targets, drones, trajectory information, TBD, multivariate time series

Procedia PDF Downloads 47
3523 The Effects of Big 6+6 Skill Training on Daily Living Skills for an Adolescent with Intellectual Disability

Authors: Luca Vascelli, Silvia Iacomini, Giada Gueli, Francesca Cavallini, Carlo Cavallini, Federica Berardo

Abstract:

The study was conducted to evaluate the effect of training on Big 6 + 6 motor skills to promote daily living skills. Precision teaching (PT) suggests that improved speed of the component behaviors can lead to better performance of composite skills. This study assessed the effects of the repeated timed practice of component motor skills on the speed and accuracy of composite skills related to daily living. An 18-year-old adolescent with intellectual disability participated. A pre-post probe single-subject design was used. The results suggest that the participant was able to perform the component skills at his individual aims (endurance was assessed). The speed and accuracy of composite skills increased; stability and retention were also measured for the composite skill after the training.

Keywords: big 6+6, daily living skills, intellectual disability, precision teaching

Procedia PDF Downloads 154
3522 Evaluation of Ensemble Classifiers for Intrusion Detection

Authors: M. Govindarajan

Abstract:

One of the major developments in machine learning in the past decade is the ensemble method, which finds a highly accurate classifier by combining many moderately accurate component classifiers. In this research work, new ensemble classification methods are proposed, with a homogeneous ensemble classifier using bagging and a heterogeneous ensemble classifier using arcing, and their performance is analyzed in terms of accuracy. A classifier ensemble is designed using Radial Basis Function (RBF) and Support Vector Machine (SVM) as base classifiers. The feasibility and the benefits of the proposed approaches are demonstrated by means of standard datasets of intrusion detection. The main originality of the proposed approach is based on three main parts: the preprocessing phase, the classification phase, and the combining phase. A wide range of comparative experiments is conducted for standard datasets of intrusion detection. The performance of the proposed homogeneous and heterogeneous ensemble classifiers is compared to the performance of other standard homogeneous and heterogeneous ensemble methods. The standard homogeneous ensemble methods include error-correcting output codes (ECOC) and Dagging, and the heterogeneous ensemble methods include majority voting and stacking. The proposed ensemble methods provide a significant improvement in accuracy compared to individual classifiers: the proposed bagged RBF and SVM perform significantly better than ECOC and Dagging, and the proposed hybrid RBF-SVM performs significantly better than voting and stacking. Heterogeneous models also exhibit better results than homogeneous models for standard datasets of intrusion detection.

Keywords: data mining, ensemble, radial basis function, support vector machine, accuracy

Procedia PDF Downloads 248
3521 Using Open Source Data and GIS Techniques to Overcome Data Deficiency and Accuracy Issues in the Construction and Validation of Transportation Network: Case of Kinshasa City

Authors: Christian Kapuku, Seung-Young Kho

Abstract:

An accurate representation of the transportation system serving a region is one of the important aspects of transportation modeling. Such a representation often requires developing an abstract model of the system elements, which in turn requires a considerable amount of data, surveys and time. However, in some cases, such as in developing countries, data deficiencies and time and budget constraints do not always allow such an accurate representation, leaving room for assumptions that may negatively affect the quality of the analysis. With the emergence of open source data on the Internet, especially in mapping technologies, as well as advances in Geographic Information Systems, opportunities to tackle these issues have arisen. Therefore, the objective of this paper is to demonstrate such an application through a practical case: the development of the transportation network for the city of Kinshasa. GIS geo-referencing was used to construct the digitized map of Transportation Analysis Zones from available scanned images. Centroids were then dynamically placed at the center of activities using an activity density map. Next, the road network with its characteristics was built using OpenStreet data and other official road inventory data by intersecting their layers and cleaning up unnecessary links such as residential streets. The accuracy of the final network was then checked by comparing it with satellite images from Google and Bing. For the validation, the final network was exported into Emme3 to check for potential network coding issues. Results show a high level of agreement between the built network and satellite images, which can mostly be attributed to the use of open source data.

Keywords: geographic information system (GIS), network construction, transportation database, open source data

Procedia PDF Downloads 167
3520 Software-Defined Architecture and Front-End Optimization for DO-178B Compliant Distance Measuring Equipment

Authors: Farzan Farhangian, Behnam Shakibafar, Bobda Cedric, Rene Jr. Landry

Abstract:

Many air navigation technologies are capable of increasing aviation sustainability as well as improving accuracy in Alternative Positioning, Navigation, and Timing (APNT), especially avionics Distance Measuring Equipment (DME), Very High-Frequency Omni-directional Range (VOR), etc. The integration of these air navigation solutions could provide robust and efficient accuracy in air mobility, air traffic management and autonomous operations. Designing a proper RF front-end, power amplifier and software-defined transponder could pave the way for reaching an optimized avionics navigation solution. In this article, the possibility of reaching an optimum front-end to be used with a single low-cost Software-Defined Radio (SDR) has been investigated in order to reach a software-defined DME architecture. Our software-defined approach uses the firmware possibilities to design a real-time software architecture compatible with a Multi Input Multi Output (MIMO) BladeRF to accurately estimate the time delay between the transmission (Tx) and reception (Rx) channels using synchronous scheduled communication. We were able to design a novel power amplifier for the transmission channel of the DME to meet the minimum transmission power. This article also investigates designing proper pulse pairs based on the DO-178B avionics standard. Various guidelines have been tested, and the possibility of passing the certification process for each standard term has been analyzed. Finally, the performance of the DME was tested in the laboratory environment using an IFR6000, which showed that the proposed architecture reached an accuracy of less than 0.23 nautical mile (Nmi) with 98% probability.

Keywords: avionics, DME, software defined radio, navigation

Procedia PDF Downloads 79
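
The Tx-to-Rx time delay that the DME architecture above estimates maps to a distance through a simple relation, sketched below. This illustration assumes the conventional fixed 50 µs transponder reply delay, which the abstract does not state, and free-space propagation at the speed of light. The function name is hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
METRES_PER_NAUTICAL_MILE = 1852.0


def dme_slant_range_nmi(round_trip_s, reply_delay_s=50e-6):
    """Slant range in nautical miles from a measured interrogation-reply delay.

    reply_delay_s is the fixed delay added by the ground transponder
    (assumed to be 50 microseconds here).
    """
    one_way_s = (round_trip_s - reply_delay_s) / 2.0
    return SPEED_OF_LIGHT_M_S * one_way_s / METRES_PER_NAUTICAL_MILE


# Delay that corresponds to a 10 Nmi slant range under the assumptions above
delay_10_nmi = 50e-6 + 2 * 10 * METRES_PER_NAUTICAL_MILE / SPEED_OF_LIGHT_M_S
range_nmi = dme_slant_range_nmi(delay_10_nmi)
```

Under these assumptions, each nautical mile adds roughly 12.36 µs of round-trip delay, so an accuracy of 0.23 Nmi corresponds to a timing error of a few microseconds.
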
3519 A Ratio-Weighted Decision Tree Algorithm for Imbalance Dataset Classification

Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike

Abstract:

Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased on imbalanced datasets because of the skewed distribution of such datasets. To overcome this problem, this study proposes a weighted decision tree algorithm that aims to remove the bias toward the majority class and to prevent the reduction of majority observations in imbalanced dataset classification. The proposed weighted decision tree algorithm was tested on three imbalanced datasets: a cancer dataset, the German credit dataset, and a banknote dataset. The specificity, sensitivity, and accuracy metrics were used to evaluate the performance of the proposed decision tree algorithm on the datasets. The evaluation results show that, for some of the weights of the proposed decision tree, the specificity, sensitivity, and accuracy metrics gave better results than those of the ID3 decision tree and the decision tree induced with minority entropy for all three datasets.

Keywords: data mining, decision tree, classification, imbalance dataset

Procedia PDF Downloads 136
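
The three evaluation metrics named in the abstract above follow directly from the confusion matrix. The sketch below uses hypothetical labels to show why accuracy alone can mislead on an imbalanced dataset. The function name is an assumption, and the weighting scheme of the proposed tree is not reproduced here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical imbalanced case: 2 minority (positive) and 8 majority samples,
# scored against a classifier that always predicts the majority class
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
metrics = binary_metrics(y_true, y_pred)
```

Here accuracy is 0.8 although no minority sample is detected (sensitivity 0.0), which is the kind of majority-class bias that reporting all three metrics exposes.
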
3518 A Neural Network Classifier for Identifying Duplicate Image Entries in Real-Estate Databases

Authors: Sergey Ermolin, Olga Ermolin

Abstract:

A Deep Convolutional Neural Network with Triplet Loss is used to identify duplicate images in real-estate advertisements in the presence of image artifacts such as watermarking, cropping, hue/brightness adjustment, and others. The effects of batch normalization, spatial dropout, and various convergence methodologies on the resulting detection accuracy are discussed. For a comparative Return-on-Investment study (per industry request), end-to-end performance is benchmarked on both Nvidia Titan GPUs and Intel’s Xeon CPUs. A new real-estate dataset from the San Francisco Bay Area is used for this work. Sufficient duplicate detection accuracy is achieved to supplement other database-grounded methods of duplicate removal. The implemented method is used in a Proof-of-Concept project in the real-estate industry.

Keywords: visual recognition, convolutional neural networks, triplet loss, spatial batch normalization with dropout, duplicate removal, advertisement technologies, performance benchmarking

Procedia PDF Downloads 338
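
The triplet loss mentioned in the abstract above can be written in a few lines for a single triplet of embeddings. This sketch uses plain Euclidean distances and an assumed margin of 0.2. Some formulations use squared distances instead, and the abstract does not say which variant or margin the authors used.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings: the duplicate sits close to the anchor,
# the unrelated image sits far away
anchor = (0.0, 0.0)
duplicate = (0.0, 1.0)
unrelated = (3.0, 4.0)
easy = triplet_loss(anchor, duplicate, unrelated)  # margin already satisfied
hard = triplet_loss(anchor, unrelated, duplicate)  # roles swapped
```

Training pushes the loss toward zero, so that duplicates (for example a watermarked or cropped copy) end up closer to the anchor than any unrelated image by at least the margin.
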
3517 Credit Risk Prediction Based on Bayesian Estimation of Logistic Regression Model with Random Effects

Authors: Sami Mestiri, Abdeljelil Farhat

Abstract:

The aim of this paper is to predict the credit risk of banks in Tunisia over the period 2000-2005. For this purpose, two methods for estimating the logistic regression model with random effects are applied: the Penalized Quasi Likelihood (PQL) method and the Gibbs Sampler algorithm. Using information on a sample of 528 Tunisian firms and 26 financial ratios, we show that the Bayesian approach improves the quality of model predictions in terms of correct classification as well as the ROC curve result.

Keywords: forecasting, credit risk, Penalized Quasi Likelihood, Gibbs Sampler, logistic regression with random effects, ROC curve

Procedia PDF Downloads 542
3516 Performance of Environmental Efficiency of Energy Iran and Other Middle East Countries

Authors: Bahram Fathi, Mahdi Khodaparast Mashhadi, Masuod Homayounifar

Abstract:

According to the 1404 forecasting documentation, among the most fundamental ways for Iran to succeed in competition with other regional countries are innovation, efficiency enhancement and domestic productivity. Therefore, in this study, the energy consumption efficiency of Iran and the neighboring countries was measured for the period 2007-2012, considering economic activity, CO2 emissions, and energy consumption simultaneously, through data envelopment analysis with undesirable output. The results indicated that energy efficiency in both Iran and, on average, the neighboring countries has been on a descending trend, and that Iran’s energy efficiency status is not desirable compared to the other countries in the region.

Keywords: energy efficiency, environmental, undesirable output, data envelopment analysis

Procedia PDF Downloads 448
3515 Stock Movement Prediction Using Price Factor and Deep Learning

Authors: Hy Dang, Bo Mei

Abstract:

The development of machine learning methods and techniques has opened doors for investigation in many areas such as medicine, economics, finance, etc. One active research area involving machine learning is stock market prediction. This research paper considers multiple techniques and methods for stock movement prediction using historical prices or price factors. The paper explores the effectiveness of some deep learning frameworks for forecasting stock movements. Moreover, an architecture (TimeStock) is proposed which takes the representation of time into account in addition to the price information itself. Our model achieves a promising result that shows a potential approach for the stock movement prediction problem.

Keywords: classification, machine learning, time representation, stock prediction

Procedia PDF Downloads 147
3514 Rainfall-Runoff Forecasting Utilizing Genetic Programming Technique

Authors: Ahmed Najah Ahmed Al-Mahfoodh, Ali Najah Ahmed Al-Mahfoodh, Ahmed Al-Shafie

Abstract:

In this study, the genetic programming (GP) technique has been investigated for the prediction of a set of rainfall-runoff data. To assess the effect of input parameters on the model, a sensitivity analysis was adopted. To evaluate the performance of the proposed model, three statistical indexes were used, namely Correlation Coefficient (CC), Mean Square Error (MSE) and Correlation of Efficiency (CE). The principal aim of this study is to develop a computationally efficient and robust approach for predicting rainfall-runoff, which could reduce the cost and labour of measuring these parameters. This research concentrates on the Johor River in Johor State, Malaysia.

Keywords: genetic programming, prediction, rainfall-runoff, Malaysia

Procedia PDF Downloads 481
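
Two of the indexes named in the abstract above, the correlation coefficient and the mean square error, are sketched below. The sketch assumes CC means the Pearson correlation, and the runoff values are hypothetical. The example is chosen to show that the two indexes measure different things.

```python
import math


def mean_square_error(observed, predicted):
    """Average of squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def correlation_coefficient(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    norm_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (norm_o * norm_p)


# Hypothetical runoff values: the prediction is exactly twice the observation
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
cc = correlation_coefficient(observed, predicted)
mse = mean_square_error(observed, predicted)
```

The correlation is perfect even though every prediction is off by a factor of two, while the MSE of 7.5 exposes the error. This is one reason a study would report several indexes together.
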
3513 The Relationship between Iranian EFL Learners' Multiple Intelligences and Their Performance on Grammar Tests

Authors: Rose Shayeghi, Pejman Hosseinioun

Abstract:

The Multiple Intelligences theory characterizes human intelligence as a multifaceted entity that exists in all human beings to varying degrees. The most important contribution of this theory to the field of English Language Teaching (ELT) is its role in identifying individual differences and designing more learner-centered programs. The present study aims at investigating the relationship between different elements of multiple intelligence and grammar scores. To this end, 63 female Iranian EFL learners selected from among intermediate students participated in the study. The instruments employed were a Nelson English language test, the Michigan Grammar Test, and the Teele Inventory for Multiple Intelligences (TIMI). The results of Pearson Product-Moment Correlation revealed a significant positive correlation between grammatical accuracy and linguistic as well as interpersonal intelligence. The results of Stepwise Multiple Regression indicated that linguistic intelligence contributed to the prediction of grammatical accuracy.

Keywords: multiple intelligence, grammar, ELT, EFL, TIMI

Procedia PDF Downloads 490
3512 Face Tracking and Recognition Using Deep Learning Approach

Authors: Degale Desta, Cheng Jian

Abstract:

The most important factor in identifying a person is their face. Even identical twins have their own distinct faces. As a result, identification and face recognition are needed to tell one person from another. A face recognition system is a verification tool used to establish a person's identity using biometrics. Nowadays, face recognition is a common technique used in a variety of applications, including home security systems, criminal identification, and phone unlock systems. This system is more secure because it only requires a facial image instead of other dependencies like a key or card. Face detection and face identification are the two phases that typically make up a human recognition system. This paper explains the idea behind designing and creating a face recognition system using deep learning with Azure ML and Python's OpenCV. Face recognition is a task that can be accomplished using deep learning, and given the accuracy of this method, it appears to be a suitable approach. To show how accurate the suggested face recognition system is, experimental results are given: the system reaches 98.46% accuracy using Fast-RCNN, and the performance of the algorithms under different training conditions is reported.

Keywords: deep learning, face recognition, identification, fast-RCNN

Procedia PDF Downloads 140
3511 Comparison of Different Machine Learning Algorithms for Solubility Prediction

Authors: Muhammet Baldan, Emel Timuçin

Abstract:

Molecular solubility prediction plays a crucial role in various fields, such as drug discovery, environmental science, and material science. In this study, we compare the performance of five machine learning algorithms (linear regression, support vector machines (SVM), random forests, gradient boosting machines (GBM), and neural networks) for predicting molecular solubility using the AqSolDB dataset. The dataset consists of 9981 data points with their corresponding solubility values. MACCS keys (166 bits), RDKit properties (20 properties), and structural properties (3) are extracted as features for every SMILES representation in the dataset, giving a total of 189 features per molecule for training and testing. Each algorithm is trained on a subset of the dataset and evaluated using accuracy scores. Additionally, computational time for training and testing is recorded to assess the efficiency of each algorithm. Our results demonstrate that the random forest model outperformed the other algorithms in terms of predictive accuracy, achieving a 0.93 accuracy score. Gradient boosting machines and neural networks also exhibit strong performance, closely followed by support vector machines. Linear regression, while simpler in nature, demonstrates competitive performance but with slightly higher errors compared to ensemble methods. Overall, this study provides valuable insights into the performance of machine learning algorithms for molecular solubility prediction, highlighting the importance of algorithm selection in achieving accurate and efficient predictions in practical applications.

Keywords: random forest, machine learning, comparison, feature extraction

Procedia PDF Downloads 40
3510 Implementation of an Image Processing System Using Artificial Intelligence for the Diagnosis of Malaria Disease

Authors: Mohammed Bnebaghdad, Feriel Betouche, Malika Semmani

Abstract:

Image processing has become more sophisticated over time due to technological advances, especially artificial intelligence (AI) technology. Currently, AI image processing is used in many areas, including surveillance, industry, science, and medicine. AI in medical image processing can help doctors diagnose diseases faster, with minimal mistakes, and with less effort. Among these diseases is malaria, which remains a major public health challenge in many parts of the world. It affects millions of people every year, particularly in tropical and subtropical regions. Early detection of malaria is essential to prevent serious complications and reduce the burden of the disease. In this paper, we propose and implement a scheme based on AI image processing to enhance malaria diagnosis through automated analysis of blood smear images. The scheme is based on the convolutional neural network (CNN) method. We developed a model that classifies infected and uninfected single red cells using images available on Kaggle, as well as real blood smear images obtained from the Central Laboratory of Medical Biology EHS Laadi Flici (formerly El Kettar) in Algeria. The real images were segmented into individual cells using the watershed algorithm in order to match the images from the Kaggle dataset. The model was trained and tested, achieving an accuracy of 99%, and reached 97% accuracy on new real images. This validates that the model performs well with new real images, although with slightly lower accuracy. Additionally, the model has been embedded in a Raspberry Pi 4, and a graphical user interface (GUI) was developed to visualize the malaria diagnostic results and facilitate user interaction.

Keywords: medical image processing, malaria parasite, classification, CNN, artificial intelligence

Procedia PDF Downloads 19
3509 Forecasting Market Share of Electric Vehicles in Taiwan Using Conjoint Models and Monte Carlo Simulation

Authors: Li-hsing Shih, Wei-Jen Hsu

Abstract:

Recently, sales of electric vehicles (EVs) have increased dramatically due to maturing technology and decreasing cost. Governments of many countries have made regulations and policies in favor of EVs due to their long-term commitment to net zero carbon emissions. However, due to uncertain factors such as the future price of EVs, forecasting the future market share of EVs is a challenging subject for both the auto industry and local government. This study tries to forecast the market share of EVs using conjoint models and Monte Carlo simulation. The research is conducted in three phases. (1) A conjoint model is established to represent the customer preference structure on purchasing vehicles, with five product attributes selected for both EVs and internal combustion engine vehicles (ICEV). A questionnaire survey is conducted to collect responses from Taiwanese consumers and estimate the part-worth utility functions of all respondents. The resulting part-worth utility functions can be used to estimate the market share, assuming each respondent will purchase the product with the highest total utility. For example, given the attribute values of an ICEV and a competing EV, the total utilities of the two vehicles are calculated for a respondent, which reveals his/her choice. Once the choices of all respondents are known, an estimate of market share can be obtained. (2) Among the attributes, future price is the key attribute that dominates consumers’ choice. This study adopts the assumption of a learning curve to predict the future price of EVs. Based on the learning curve method and past price data of EVs, a regression model is established and the probability distribution function of the price of EVs in 2030 is obtained. (3) Since the future price is a random variable from the results of phase 2, a Monte Carlo simulation is then conducted to simulate the choices of all respondents by using their part-worth utility functions. For instance, using one thousand generated future prices of an EV together with other forecasted attribute values of the EV and an ICEV, one thousand market shares can be obtained with a Monte Carlo simulation. The resulting probability distribution of the market share of EVs provides more information than a fixed number forecast, reflecting the uncertain nature of the future development of EVs. The research results can help the auto industry and local government make more appropriate decisions and future action plans.
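The third phase can be illustrated with a minimal sketch. It is not the authors' model: the linear utility form, the normal price distribution, the respondent fields (`ev_base`, `icev_base`, `price_coef`), and all numbers are assumptions made for illustration. The abstract derives the price distribution from a learning-curve regression and uses five attributes, not one.

```python
import random


def simulate_market_shares(respondents, icev_price, price_mean, price_sd,
                           n_draws=1000, seed=0):
    """Return one simulated EV market share per random future EV price."""
    rng = random.Random(seed)
    shares = []
    for _ in range(n_draws):
        # Assumption: the future EV price is normally distributed.
        ev_price = rng.gauss(price_mean, price_sd)
        ev_buyers = 0
        for r in respondents:
            # Assumption: total utility = base utility + price part-worth.
            u_ev = r["ev_base"] + r["price_coef"] * ev_price
            u_icev = r["icev_base"] + r["price_coef"] * icev_price
            # Each respondent buys the vehicle with the highest total utility.
            if u_ev > u_icev:
                ev_buyers += 1
        shares.append(ev_buyers / len(respondents))
    return shares


# Hypothetical respondents with made-up part-worths (prices in arbitrary units).
_rng = random.Random(1)
respondents = [
    {"ev_base": _rng.uniform(0.0, 2.0),
     "icev_base": _rng.uniform(0.0, 2.0),
     "price_coef": -_rng.uniform(0.5, 1.5)}
    for _ in range(200)
]

shares = simulate_market_shares(respondents, icev_price=1.0,
                                price_mean=1.1, price_sd=0.2)
ordered = sorted(shares)
print("mean share:", sum(shares) / len(shares))
print("5th to 95th percentile:", ordered[50], ordered[949])
```

Reporting a percentile range alongside the mean reflects the abstract's point that a distribution of market shares carries more information than a single forecast.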

Keywords: conjoint model, electrical vehicle, learning curve, Monte Carlo simulation

Procedia PDF Downloads 69
3508 An ALM Matrix Completion Algorithm for Recovering Weather Monitoring Data

Authors: Yuqing Chen, Ying Xu, Renfa Li

Abstract:

The development of matrix completion theory provides new approaches for data gathering in Wireless Sensor Networks (WSN). The existing matrix completion algorithms for WSN mainly consider how to reduce the number of samples, without considering real-time performance when recovering the data matrix. In order to guarantee the recovery accuracy while also reducing the recovery time, we propose a new ALM algorithm to recover weather monitoring data. Extensive experiments have been carried out to investigate the performance of the proposed ALM algorithm using different parameter settings, sampling rates, and sampling models. In addition, we compare the proposed ALM algorithm with some existing algorithms in the literature. Experimental results show that the ALM algorithm obtains better overall recovery accuracy with less computing time, which demonstrates that the ALM algorithm is an effective and efficient approach for recovering real-world weather monitoring data in WSN.

Keywords: wireless sensor network, matrix completion, singular value thresholding, augmented Lagrange multiplier

Procedia PDF Downloads 384
3507 Artificial Intelligence-Based Detection of Individuals Suffering from Vestibular Disorder

Authors: Dua Hişam, Serhat İkizoğlu

Abstract:

Identifying the problem behind balance disorder is one of the most interesting topics in the medical literature. This study advances the development of artificial intelligence (AI) algorithms by applying multiple machine learning (ML) models to sensory data on gait collected from humans, in order to distinguish healthy individuals from those suffering from Vestibular System (VS) problems. Although AI is widely utilized as a diagnostic tool in medicine, AI models have not been used to perform feature extraction and identify VS disorders through training on raw data. In this study, three ML models, the Random Forest Classifier (RF), Extreme Gradient Boosting (XGB), and K-Nearest Neighbor (KNN), have been trained to detect VS disorder, and the performance of the algorithms has been compared using accuracy, recall, precision, and F1-score. With an accuracy of 95.28%, the Random Forest Classifier (RF) was the most accurate model.
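The four comparison metrics named above can be computed from true and predicted labels as in the following sketch. The labels are made up for illustration (1 standing for a VS disorder, 0 for a healthy subject) and are not data from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 3 true positives, 3 true negatives, 1 miss, 1 false alarm.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a diagnostic setting, recall is often reported alongside accuracy because a missed disorder (false negative) and a false alarm (false positive) carry different costs.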

Keywords: vestibular disorder, machine learning, random forest classifier, k-nearest neighbor, extreme gradient boosting

Procedia PDF Downloads 69
3506 Implementation of a Low-Cost Instrumentation for an Open Cycle Wind Tunnel to Evaluate Pressure Coefficient

Authors: Cristian P. Topa, Esteban A. Valencia, Victor H. Hidalgo, Marco A. Martinez

Abstract:

Wind tunnel experiments on aerodynamic profiles offer numerous advantages, such as clean, steady laminar flow, controlled environmental conditions, streamline visualization, and real data acquisition. However, the experimental instrumentation is usually expensive, and hence each test increases the design cost. The aim of this work is to select and implement a low-cost static pressure data acquisition system for a NACA 2412 airfoil in an open cycle wind tunnel. This work compares the wind tunnel experiment with Computational Fluid Dynamics (CFD) simulation and parametric analysis. The experiment was evaluated at a Reynolds number of 1.65 × 10^5, with angles increasing from -5° to 15°. The comparison shows good agreement between the experiment and CFD. The parametric analysis results, in contrast, differ widely from the other methods, which is consistent with the lower accuracy of the latter approach due to its simplicity.
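The pressure coefficient evaluated from static pressure taps follows the standard definition Cp = (p - p_inf) / (0.5 * rho * V_inf^2). The sketch below applies it to made-up readings; the air density, freestream speed, and tap pressures are assumptions for illustration, not measurements from this work.

```python
def pressure_coefficient(p, p_inf, rho, v_inf):
    """Cp = (p - p_inf) / (0.5 * rho * v_inf**2), all in SI units."""
    dynamic_pressure = 0.5 * rho * v_inf ** 2
    if dynamic_pressure <= 0.0:
        raise ValueError("freestream dynamic pressure must be positive")
    return (p - p_inf) / dynamic_pressure


# Assumed conditions: sea-level air density and a 20 m/s freestream.
RHO = 1.225         # kg/m^3
V_INF = 20.0        # m/s
P_INF = 101325.0    # Pa, freestream static pressure

# Hypothetical tap readings in Pa (a suction peak and a stagnation point).
cp_suction = pressure_coefficient(101080.0, P_INF, RHO, V_INF)
cp_stagnation = pressure_coefficient(101570.0, P_INF, RHO, V_INF)
print(cp_suction, cp_stagnation)
```

With these numbers the dynamic pressure is 245 Pa, so a tap reading 245 Pa below freestream gives Cp close to -1 and a reading 245 Pa above gives Cp close to 1, the incompressible stagnation value.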

Keywords: wind tunnel, low cost instrumentation, experimental testing, CFD simulation

Procedia PDF Downloads 180
3505 Effects of Listening to Pleasant Thai Classical Music on Increasing Working Memory in Elderly: An Electroencephalogram Study

Authors: Anchana Julsiri, Seree Chadcham

Abstract:

The present study determined the effects of listening to pleasant Thai classical music on increasing working memory in the elderly. Thai classical music without lyrics, which participants found enjoyable and arousing, was played in the experiment for 3.19-5.40 minutes. The accuracy scores of the Counting Span Task (CST), upper alpha ERD%, and theta ERS% were used to assess the working memory of participants both before and after listening to pleasant Thai classical music. The results showed that the accuracy scores of CST and upper alpha ERD% in the frontal area of participants after listening to Thai classical music were significantly higher than before listening (p < .05). Theta ERS% in the fronto-parietal network of participants after listening to Thai classical music was significantly lower than before listening (p < .05).

Keywords: brain wave, elderly, pleasant Thai classical music, working memory

Procedia PDF Downloads 459
3504 The Sequential Estimation of the Seismoacoustic Source Energy in C-OTDR Monitoring Systems

Authors: Andrey V. Timofeev, Dmitry V. Egorov

Abstract:

A practical and efficient approach is suggested for estimating the energy of seismoacoustic sources in C-OTDR monitoring systems. This approach is a sequential plan for confidence estimation of both the seismoacoustic source energy and the absorption coefficient of the soil. The sequential plan delivers non-asymptotic guaranteed accuracy of the obtained estimates in the form of non-asymptotic confidence regions with prescribed sizes. These confidence regions are valid for a finite sample size when the distributions of the observations are unknown. Thus, the suggested estimates are non-asymptotic and nonparametric, and they guarantee the prescribed estimation accuracy in the form of a prescribed size of the confidence regions and a prescribed confidence coefficient value.

Keywords: nonparametric estimation, sequential confidence estimation, multichannel monitoring systems, C-OTDR-system, nonlinear regression

Procedia PDF Downloads 356
3503 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest

Authors: Bharatendra Rai

Abstract:

Predictive data analysis and modeling involving machine learning techniques become challenging in the presence of too many explanatory variables or features. Too many features are known not only to slow algorithms down but also to decrease model prediction accuracy. This study involves a housing dataset with 79 quantitative and qualitative features that describe various aspects people consider when buying a new house. The Boruta algorithm, which supports feature selection using a wrapper approach built around random forest, is used in this study. This feature selection process leads to 49 confirmed features, which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios, and their impact on model accuracy is captured using the coefficient of determination (R-squared) and root mean square error (RMSE).
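The two accuracy measures named above can be computed as in the following sketch, using their standard definitions. The observed and predicted values are made up for illustration and are not results from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted values (arbitrary units).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
print("RMSE:", rmse(y_true, y_pred))
print("R-squared:", r_squared(y_true, y_pred))
```

Here the residual sum of squares is 2 and the total sum of squares is 20, so R-squared is 0.9 and RMSE is the square root of 0.5. Evaluating both on the held-out part of each partition is one way to compare the partitioning ratios.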

Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error

Procedia PDF Downloads 323
3502 Research on Reservoir Lithology Prediction Based on Residual Neural Network and Squeeze-and-Excitation Neural Network

Authors: Li Kewen, Su Zhaoxin, Wang Xingmou, Zhu Jian Bing

Abstract:

Conventional reservoir prediction methods are not sufficient to explore the implicit relations between seismic attributes, and thus data utilization is low. In order to improve the predictive classification accuracy of reservoir lithology, this paper proposes a deep learning lithology prediction method based on ResNet (Residual Neural Network) and SENet (Squeeze-and-Excitation Neural Network). The neural network model is built and trained using seismic attribute data and lithology data from the Shengli oilfield, and the nonlinear mapping relationship between seismic attributes and lithology markers is established. The experimental results show that this method can significantly improve the classification performance for reservoir lithology, with a classification accuracy close to 70%. This study can effectively predict the lithology of undrilled areas and provide support for exploration and development.

Keywords: convolutional neural network, lithology, prediction of reservoir, seismic attributes

Procedia PDF Downloads 177