Search results for: regression coefficients
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3939

Search results for: regression coefficients

3579 Glucose Monitoring System Using Machine Learning Algorithms

Authors: Sangeeta Palekar, Neeraj Rangwani, Akash Poddar, Jayu Kalambe

Abstract:

The bio-medical analysis is an indispensable procedure for identifying health-related diseases like diabetes. Monitoring the glucose level in our body regularly helps us identify hyperglycemia and hypoglycemia, which can cause severe medical problems like nerve damage or kidney diseases. This paper presents a method for predicting the glucose concentration in blood samples using image processing and machine learning algorithms. The glucose solution is prepared by the glucose oxidase (GOD) and peroxidase (POD) method. An experimental database is generated based on the colorimetric technique. The image of the glucose solution is captured by the raspberry pi camera and analyzed using image processing by extracting the RGB, HSV, LUX color space values. Regression algorithms like multiple linear regression, decision tree, RandomForest, and XGBoost were used to predict the unknown glucose concentration. The multiple linear regression algorithm predicts the results with 97% accuracy. The image processing and machine learning-based approach reduce the hardware complexities of existing platforms.

Keywords: artificial intelligence glucose detection, glucose oxidase, peroxidase, image processing, machine learning

Procedia PDF Downloads 187
3578 Statistical Analysis of the Impact of Maritime Transport Gross Domestic Product (GDP) on Nigeria’s Economy

Authors: Kehinde Peter Oyeduntan, Kayode Oshinubi

Abstract:

Nigeria is referred as the ‘Giant of Africa’ due to high population, land mass and large economy. However, it still trails far behind many smaller economies in the continent in terms of maritime operations. As we have seen that the maritime industry is the spark plug for national growth, because it houses the most crucial infrastructure that generates wealth for a nation, it is worrisome that a nation with six seaports lag in maritime activities. In this research, we have studied how the Gross Domestic Product (GDP) of the maritime transport influences the Nigerian economy. To do this, we applied Simple Linear Regression (SLR), Support Vector Machine (SVM), Polynomial Regression Model (PRM), Generalized Additive Model (GAM) and Generalized Linear Mixed Model (GLMM) to model the relationship between the nation’s Total GDP (TGDP) and the Maritime Transport GDP (MGDP) using a time series data of 20 years. The result showed that the MGDP is statistically significant to the Nigerian economy. Amongst the statistical tool applied, the PRM of order 4 describes the relationship better when compared to other methods. The recommendations presented in this study will guide policy makers and help improve the economy of Nigeria in terms of its GDP.

Keywords: maritime transport, economy, GDP, regression, port

Procedia PDF Downloads 143
3577 The Effect of Accounting Conservatism on Cost of Capital: A Quantile Regression Approach for MENA Countries

Authors: Maha Zouaoui Khalifa, Hakim Ben Othman, Hussaney Khaled

Abstract:

Prior empirical studies have investigated the economic consequences of accounting conservatism by examining its impact on the cost of equity capital (COEC). However, findings are not conclusive. We assume that inconsistent results of such association may be attributed to the regression models used in data analysis. To address this issue, we re-examine the effect of different dimension of accounting conservatism: unconditional conservatism (U_CONS) and conditional conservatism (C_CONS) on the COEC for a sample of listed firms from Middle Eastern and North Africa (MENA) countries, applying quantile regression (QR) approach developed by Koenker and Basset (1978). While classical ordinary least square (OLS) method is widely used in empirical accounting research, however it may produce inefficient and bias estimates in the case of departures from normality or long tail error distribution. QR method is more powerful than OLS to handle this kind of problem. It allows the coefficient on the independent variables to shift across the distribution of the dependent variable whereas OLS method only estimates the conditional mean effects of a response variable. We find as predicted that U_CONS has a significant positive effect on the COEC however, C_CONS has a negative impact. Findings suggest also that the effect of the two dimensions of accounting conservatism differs considerably across COEC quantiles. Comparing results from QR method with those of OLS, this study throws more lights on the association between accounting conservatism and COEC.

Keywords: unconditional conservatism, conditional conservatism, cost of equity capital, OLS, quantile regression, emerging markets, MENA countries

Procedia PDF Downloads 345
3576 Optimizing the Scanning Time with Radiation Prediction Using a Machine Learning Technique

Authors: Saeed Eskandari, Seyed Rasoul Mehdikhani

Abstract:

Radiation sources have been used in many industries, such as gamma sources in medical imaging. These waves have destructive effects on humans and the environment. It is very important to detect and find the source of these waves because these sources cannot be seen by the eye. A portable robot has been designed and built with the purpose of revealing radiation sources that are able to scan the place from 5 to 20 meters away and shows the location of the sources according to the intensity of the waves on a two-dimensional digital image. The operation of the robot is done by measuring the pixels separately. By increasing the image measurement resolution, we will have a more accurate scan of the environment, and more points will be detected. But this causes a lot of time to be spent on scanning. In this paper, to overcome this challenge, we designed a method that can optimize this time. In this method, a small number of important points of the environment are measured. Hence the remaining pixels are predicted and estimated by regression algorithms in machine learning. The research method is based on comparing the actual values of all pixels. These steps have been repeated with several other radiation sources. The obtained results of the study show that the values estimated by the regression method are very close to the real values.

Keywords: regression, machine learning, scan radiation, robot

Procedia PDF Downloads 67
3575 Wheeled Robot Stable Braking Process under Asymmetric Traction Coefficients

Authors: Boguslaw Schreyer

Abstract:

During the wheeled robot’s braking process, the extra dynamic vertical forces act on all wheels: left, right, front or rear. Those forces are directed downward on the front wheels while directed upward on the rear wheels. In order to maximize the deceleration, therefore, minimize the braking time and braking distance, we need to calculate a correct torque distribution: the front braking torque should be increased, and rear torque should be decreased. At the same time, we need to provide better transversal stability. In a simple case of all adhesion coefficients being the same under all wheels, the torque distribution may secure the optimal (maximal) control of the robot braking process, securing the minimum braking distance and a minimum braking time. At the same time, the transversal stability is relatively good. At any time, we control the transversal acceleration. In the case of the transversal movement, we stop the braking process and re-apply braking torque after a defined period of time. If we correctly calculate the value of the torques, we may secure the traction coefficient under the front and rear wheels close to its maximum. Also, in order to provide an optimum braking control, we need to calculate the timing of the braking torque application and the timing of its release. The braking torques should be released shortly after the wheels passed a maximum traction coefficient (while a wheels’ slip increases) and applied again after the wheels pass a maximum of traction coefficient (while the slip decreases). The correct braking torque distribution secures the front and rear wheels, passing this maximum at the same time. It guarantees an optimum deceleration control, therefore, minimum braking time. In order to calculate a correct torque distribution, a control unit should receive the input signals of a rear torque value (which changes independently), the robot’s deceleration, and values of the vertical front and rear forces. In order to calculate the timing of torque application and torque release, more signals are needed: speed of the robot: angular speed, and angular deceleration of the wheels. In case of different adhesion coefficients under the left and right wheels, but the same under each pair of wheels- the same under right wheels and the same under left wheels, the Select-Low (SL) and select high (SH) methods are applied. The SL method is suggested if transversal stability is more important than braking efficiency. Often in the case of the robot, more important is braking efficiency; therefore, the SH method is applied with some control of the transversal stability. In the case that all adhesion coefficients are different under all wheels, the front-rear torque distribution is maintained as in all previous cases. However, the timing of the braking torque application and release is controlled by the rear wheels’ lowest adhesion coefficient. The Lagrange equations have been used to describe robot dynamics. Matlab has been used in order to simulate the process of wheeled robot braking, and in conclusion, the braking methods have been selected.

Keywords: wheeled robots, braking, traction coefficient, asymmetric

Procedia PDF Downloads 154
3574 Chemometric Regression Analysis of Radical Scavenging Ability of Kombucha Fermented Kefir-Like Products

Authors: Strahinja Kovacevic, Milica Karadzic Banjac, Jasmina Vitas, Stefan Vukmanovic, Radomir Malbasa, Lidija Jevric, Sanja Podunavac-Kuzmanovic

Abstract:

The present study deals with chemometric regression analysis of quality parameters and the radical scavenging ability of kombucha fermented kefir-like products obtained with winter savory (WS), peppermint (P), stinging nettle (SN) and wild thyme tea (WT) kombucha inoculums. Each analyzed sample was described by milk fat content (MF, %), total unsaturated fatty acids content (TUFA, %), monounsaturated fatty acids content (MUFA, %), polyunsaturated fatty acids content (PUFA, %), the ability of free radicals scavenging (RSA Dₚₚₕ, % and RSA.ₒₕ, %) and pH values measured after each hour from the start until the end of fermentation. The aim of the conducted regression analysis was to establish chemometric models which can predict the radical scavenging ability (RSA Dₚₚₕ, % and RSA.ₒₕ, %) of the samples by correlating it with the MF, TUFA, MUFA, PUFA and the pH value at the beginning, in the middle and at the end of fermentation process which lasted between 11 and 17 hours, until pH value of 4.5 was reached. The analysis was carried out applying univariate linear (ULR) and multiple linear regression (MLR) methods on the raw data and the data standardized by the min-max normalization method. The obtained models were characterized by very limited prediction power (poor cross-validation parameters) and weak statistical characteristics. Based on the conducted analysis it can be concluded that the resulting radical scavenging ability cannot be precisely predicted only on the basis of MF, TUFA, MUFA, PUFA content, and pH values, however, other quality parameters should be considered and included in the further modeling. This study is based upon work from project: Kombucha beverages production using alternative substrates from the territory of the Autonomous Province of Vojvodina, 142-451-2400/2019-03, supported by Provincial Secretariat for Higher Education and Scientific Research of AP Vojvodina.

Keywords: chemometrics, regression analysis, kombucha, quality control

Procedia PDF Downloads 131
3573 Definition of Aerodynamic Coefficients for Microgravity Unmanned Aerial System

Authors: Gamaliel Salazar, Adriana Chazaro, Oscar Madrigal

Abstract:

The evolution of Unmanned Aerial Systems (UAS) has made it possible to develop new vehicles capable to perform microgravity experiments which due its cost and complexity were beyond the reach for many institutions. In this study, the aerodynamic behavior of an UAS is studied through its deceleration stage after an initial free fall phase (where the microgravity effect is generated) using Computational Fluid Dynamics (CFD). Due to the fact that the payload would be analyzed under a microgravity environment and the nature of the payload itself, the speed of the UAS must be reduced in a smoothly way. Moreover, the terminal speed of the vehicle should be low enough to preserve the integrity of the payload and vehicle during the landing stage. The UAS model is made by a study pod, control surfaces with fixed and mobile sections, landing gear and two semicircular wing sections. The speed of the vehicle is decreased by increasing the angle of attack (AoA) of each wing section from 2° (where the airfoil S1091 has its greatest aerodynamic efficiency) to 80°, creating a circular wing geometry. Drag coefficients (Cd) and forces (Fd) are obtained employing CFD analysis. A simplified 3D model of the vehicle is analyzed using Ansys Workbench 16. The distance between the object of study and the walls of the control volume is eight times the length of the vehicle. The domain is discretized using an unstructured mesh based on tetrahedral elements. The refinement of the mesh is made by defining an element size of 0.004 m in the wing and control surfaces in order to figure out the fluid behavior in the most important zones, as well as accurate approximations of the Cd. The turbulent model k-epsilon is selected to solve the governing equations of the fluids while a couple of monitors are placed in both wing and all-body vehicle to visualize the variation of the coefficients along the simulation process. Employing a statistical approximation response surface methodology the case of study is parametrized considering the AoA of the wing as the input parameter and Cd and Fd as output parameters. Based on a Central Composite Design (CCD), the Design Points (DP) are generated so the Cd and Fd for each DP could be estimated. Applying a 2nd degree polynomial approximation the drag coefficients for every AoA were determined. Using this values, the terminal speed at each position is calculated considering a specific Cd. Additionally, the distance required to reach the terminal velocity at each AoA is calculated, so the minimum distance for the entire deceleration stage without comprising the payload could be determine. The Cd max of the vehicle is 1.18, so its maximum drag will be almost like the drag generated by a parachute. This guarantees that aerodynamically the vehicle can be braked, so it could be utilized for several missions allowing repeatability of microgravity experiments.

Keywords: microgravity effect, response surface, terminal speed, unmanned system

Procedia PDF Downloads 162
3572 Enhancing Spatial Interpolation: A Multi-Layer Inverse Distance Weighting Model for Complex Regression and Classification Tasks in Spatial Data Analysis

Authors: Yakin Hajlaoui, Richard Labib, Jean-François Plante, Michel Gamache

Abstract:

This study introduces the Multi-Layer Inverse Distance Weighting Model (ML-IDW), inspired by the mathematical formulation of both multi-layer neural networks (ML-NNs) and Inverse Distance Weighting model (IDW). ML-IDW leverages ML-NNs' processing capabilities, characterized by compositions of learnable non-linear functions applied to input features, and incorporates IDW's ability to learn anisotropic spatial dependencies, presenting a promising solution for nonlinear spatial interpolation and learning from complex spatial data. it employ gradient descent and backpropagation to train ML-IDW, comparing its performance against conventional spatial interpolation models such as Kriging and standard IDW on regression and classification tasks using simulated spatial datasets of varying complexity. the results highlight the efficacy of ML-IDW, particularly in handling complex spatial datasets, exhibiting lower mean square error in regression and higher F1 score in classification.

Keywords: deep learning, multi-layer neural networks, gradient descent, spatial interpolation, inverse distance weighting

Procedia PDF Downloads 30
3571 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models

Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti

Abstract:

In the realm of cricket, particularly within the context of the Indian Premier League (IPL), the ability to predict team scores accurately holds significant importance for both cricket enthusiasts and stakeholders alike. This paper presents a comprehensive study on IPL score prediction utilizing various machine learning algorithms, including Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through meticulous data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Leveraging this data, we employed state-of-the-art machine learning techniques to train and evaluate the performance of each model. Notably, Multiple Regression emerged as the top-performing algorithm, achieving an impressive accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to the advancement of sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of advanced predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research endeavors aimed at enhancing the accuracy and interpretability of IPL score prediction models.

Keywords: indian premier league (IPL), cricket, score prediction, machine learning, support vector machines (SVM), xgboost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics

Procedia PDF Downloads 31
3570 The Impact of Unconditional and Conditional Conservatism on Cost of Equity Capital: A Quantile Regression Approach for MENA Countries

Authors: Khalifa Maha, Ben Othman Hakim, Khaled Hussainey

Abstract:

Prior empirical studies have investigated the economic consequences of accounting conservatism by examining its impact on the cost of equity capital (COEC). However, findings are not conclusive. We assume that inconsistent results of such association may be attributed to the regression models used in data analysis. To address this issue, we re-examine the effect of different dimension of accounting conservatism: unconditional conservatism (U_CONS) and conditional conservatism (C_CONS) on the COEC for a sample of listed firms from Middle Eastern and North Africa (MENA) countries, applying quantile regression (QR) approach developed by Koenker and Basset (1978). While classical ordinary least square (OLS) method is widely used in empirical accounting research, however it may produce inefficient and bias estimates in the case of departures from normality or long tail error distribution. QR method is more powerful than OLS to handle this kind of problem. It allows the coefficient on the independent variables to shift across the distribution of the dependent variable whereas OLS method only estimates the conditional mean effects of a response variable. We find as predicted that U_CONS has a significant positive effect on the COEC however, C_CONS has a negative impact. Findings suggest also that the effect of the two dimensions of accounting conservatism differs considerably across COEC quantiles. Comparing results from QR method with those of OLS, this study throws more lights on the association between accounting conservatism and COEC.

Keywords: unconditional conservatism, conditional conservatism, cost of equity capital, OLS, quantile regression, emerging markets, MENA countries

Procedia PDF Downloads 351
3569 The Comparison and Optimization of the Analytic Method for Canthaxanthin, Food Colorants

Authors: Hee-Jae Suh, Kyung-Su Kim, Min-Ji Kim, Yeon-Seong Jeong, Ok-Hwan Lee, Jae-Wook Shin, Hyang-Sook Chun, Chan Lee

Abstract:

Canthaxanthin is keto-carotenoid produced from beta-carotene and it has been approved to be used in many countries as a food coloring agent. Canthaxanthin has been analyzed using High Performance Liquid Chromatography (HPLC) system with various ways of pretreatment methods. Four official methods for verification of canthaxanthin at FSA (UK), AOAC (US), EFSA (EU) and MHLW (Japan) were compared to improve its analytical and the pretreatment method. The Linearity, the limit of detection (LOD), the limit of quantification (LOQ), the accuracy, the precision and the recovery ratio were determined from each method with modification in pretreatment method. All HPLC methods exhibited correlation coefficients of calibration curves for canthaxanthin as 0.9999. The analysis methods from FSA, AOAC, and MLHW showed the LOD of 0.395 ppm, 0.105 ppm, and 0.084 ppm, and the LOQ of 1.196 ppm, 0.318 ppm, 0.254 ppm, respectively. Among tested methods, HPLC method of MHLW with modification in pretreatments was finally selected for the analysis of canthaxanthin in lab, because it exhibited the resolution factor of 4.0 and the selectivity of 1.30. This analysis method showed a correlation coefficients value of 0.9999 and the lowest LOD and LOQ. Furthermore, the precision ratio was lower than 1 and the accuracy was almost 100%. The method presented the recovery ratio of 90-110% with modification in pretreatment method. The cross-validation of coefficient variation was 5 or less among tested three institutions in Korea.

Keywords: analytic method, canthaxanthin, food colorants, pretreatment method

Procedia PDF Downloads 671
3568 Organ Dose Calculator for Fetus Undergoing Computed Tomography

Authors: Choonsik Lee, Les Folio

Abstract:

Pregnant patients may undergo CT in emergencies unrelated with pregnancy, and potential risk to the developing fetus is of concern. It is critical to accurately estimate fetal organ doses in CT scans. We developed a fetal organ dose calculation tool using pregnancy-specific computational phantoms combined with Monte Carlo radiation transport techniques. We adopted a series of pregnancy computational phantoms developed at the University of Florida at the gestational ages of 8, 10, 15, 20, 25, 30, 35, and 38 weeks (Maynard et al. 2011). More than 30 organs and tissues and 20 skeletal sites are defined in each fetus model. We calculated fetal organ dose-normalized by CTDIvol to derive organ dose conversion coefficients (mGy/mGy) for the eight fetuses for consequential slice locations ranging from the top to the bottom of the pregnancy phantoms with 1 cm slice thickness. Organ dose from helical scans was approximated by the summation of doses from multiple axial slices included in the given scan range of interest. We then compared dose conversion coefficients for major fetal organs in the abdominal-pelvis CT scan of pregnancy phantoms with the uterine dose of a non-pregnant adult female computational phantom. A comprehensive library of organ conversion coefficients was established for the eight developing fetuses undergoing CT. They were implemented into an in-house graphical user interface-based computer program for convenient estimation of fetal organ doses by inputting CT technical parameters as well as the age of the fetus. We found that the esophagus received the least dose, whereas the kidneys received the greatest dose in all fetuses in AP scans of the pregnancy phantoms. We also found that when the uterine dose of a non-pregnant adult female phantom is used as a surrogate for fetal organ doses, root-mean-square-error ranged from 0.08 mGy (8 weeks) to 0.38 mGy (38 weeks). The uterine dose was up to 1.7-fold greater than the esophagus dose of the 38-week fetus model. The calculation tool should be useful in cases requiring fetal organ dose in emergency CT scans as well as patient dose monitoring.

Keywords: computed tomography, fetal dose, pregnant women, radiation dose

Procedia PDF Downloads 124
3567 Approach to Formulate Intuitionistic Fuzzy Regression Models

Authors: Liang-Hsuan Chen, Sheng-Shing Nien

Abstract:

This study aims to develop approaches to formulate intuitionistic fuzzy regression (IFR) models for many decision-making applications in the fuzzy environments using intuitionistic fuzzy observations. Intuitionistic fuzzy numbers (IFNs) are used to characterize the fuzzy input and output variables in the IFR formulation processes. A mathematical programming problem (MPP) is built up to optimally determine the IFR parameters. Each parameter in the MPP is defined as a couple of alternative numerical variables with opposite signs, and an intuitionistic fuzzy error term is added to the MPP to characterize the uncertainty of the model. The IFR model is formulated based on the distance measure to minimize the total distance errors between estimated and observed intuitionistic fuzzy responses in the MPP resolution processes. The proposed approaches are simple/efficient in the formulation/resolution processes, in which the sign of parameters can be determined so that the problem to predetermine the sign of parameters is avoided. Furthermore, the proposed approach has the advantage that the spread of the predicted IFN response will not be over-increased, since the parameters in the established IFR model are crisp. The performance of the obtained models is evaluated and compared with the existing approaches.

Keywords: fuzzy sets, intuitionistic fuzzy number, intuitionistic fuzzy regression, mathematical programming method

Procedia PDF Downloads 126
3566 A Preliminary Study of the Subcontractor Evaluation System for the International Construction Market

Authors: Hochan Seok, Woosik Jang, Seung-Heon Han

Abstract:

The stagnant global construction market has intensified competition since 2008 among firms that aim to win overseas contracts. Against this backdrop, subcontractor selection is identified as one of the most critical success factors in overseas construction project. However, it is difficult to select qualified subcontractors due to the lack of evaluation standards and reliability. This study aims to identify the problems associated with existing subcontractor evaluations using a correlations analysis and a multiple regression analysis with pre-qualification and performance evaluation of 121 firms in six countries.

Keywords: subcontractor evaluation system, pre-qualification, performance evaluation, correlation analysis, multiple regression analysis

Procedia PDF Downloads 353
3565 Liquid Chromatography Microfluidics for Detection and Quantification of Urine Albumin Using Linear Regression Method

Authors: Patricia B. Cruz, Catrina Jean G. Valenzuela, Analyn N. Yumang

Abstract:

Nearly a hundred per million of the Filipino population is diagnosed with Chronic Kidney Disease (CKD). The early stage of CKD has no symptoms and can only be discovered once the patient undergoes urinalysis. Over the years, different methods were discovered and used for the quantification of the urinary albumin such as the immunochemical assays where most of these methods require large machinery that has a high cost in maintenance and resources, and a dipstick test which is yet to be proven and is still debated as a reliable method in detecting early stages of microalbuminuria. This research study involves the use of the liquid chromatography concept in microfluidic instruments with biosensor as a means of separation and detection respectively, and linear regression to quantify human urinary albumin. The researchers’ main objective was to create a miniature system that quantifies and detect patients’ urinary albumin while reducing the amount of volume used per five test samples. For this study, 30 urine samples of unknown albumin concentrations were tested using VITROS Analyzer and the microfluidic system for comparison. Based on the data shared by both methods, the actual vs. predicted regression were able to create a positive linear relationship with an R2 of 0.9995 and a linear equation of y = 1.09x + 0.07, indicating that the predicted values and actual values are approximately equal. Furthermore, the microfluidic instrument uses 75% less in total volume – sample and reagents combined, compared to the VITROS Analyzer per five test samples.

Keywords: Chronic Kidney Disease, Linear Regression, Microfluidics, Urinary Albumin

Procedia PDF Downloads 124
3564 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations

Authors: Kuei-Ling Sun, Emily Chia-Yu Su

Abstract:

Allergy is a hypersensitive overreaction of the immune system to environmental stimuli, and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthmatic, shock, or other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal wool, and other allergens. Their development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body’s immune system. Immunoglobulin E antibodies will bind to an allergen and then transfer to a receptor on mast cells or basophils triggering the release of inflammatory chemicals such as histamine. Based on the increasingly serious problem of environmental change, changes in lifestyle, air pollution problem, and other factors, in this study, we both collect allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT) and neural networks (NN) to do the model comparison and determine the permutations of allergen amino acid’s sequence.

Keywords: allergy, classification, decision tree, logistic regression, machine learning

Procedia PDF Downloads 290
3563 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila , V. Mahesh

Abstract:

Pulmonary Function Tests are important non-invasive diagnostic tests to assess respiratory impairments and provides quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients esulting in incomplete data sets. This paper presents, a nonlinear model built using Multivariate adaptive regression splines and Random forest regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N= 198 subjects. The ranked order of feature importance index calculated by the random forests model shows that the spirometric features FVC, FEF 25, PEF,FEF 25-75, FEF50, and the demographic parameter height are the important descriptors. A comparison of performance assessment of both models prove that, the prediction ability of MARS with the `top two ranked features namely the FVC and FEF 25 is higher, yielding a model fit of R2= 0.96 and R2= 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with a intelligence support system in the medical diagnosis and improvement of clinical care.

Keywords: FEV, multivariate adaptive regression splines pulmonary function test, random forest

Procedia PDF Downloads 298
3562 On Improving Breast Cancer Prediction Using GRNN-CP

Authors: Kefaya Qaddoum

Abstract:

The aim of this study is to predict breast cancer and to construct a supportive model that will stimulate a more reliable prediction as a factor that is fundamental for public health. In this study, we utilize general regression neural networks (GRNN) to replace the normal predictions with prediction periods to achieve a reasonable percentage of confidence. The mechanism employed here utilises a machine learning system called conformal prediction (CP), in order to assign consistent confidence measures to predictions, which are combined with GRNN. We apply the resulting algorithm to the problem of breast cancer diagnosis. The results show that the prediction constructed by this method is reasonable and could be useful in practice.

Keywords: neural network, conformal prediction, cancer classification, regression

Procedia PDF Downloads 271
3561 Bayesian Reliability of Weibull Regression with Type-I Censored Data

Authors: Al Omari Moahmmed Ahmed

Abstract:

In the Bayesian, we developed an approach by using non-informative prior with covariate and obtained by using Gauss quadrature method to estimate the parameters of the covariate and reliability function of the Weibull regression distribution with Type-I censored data. The maximum likelihood seen that the estimators obtained are not available in closed forms, although they can be solved it by using Newton-Raphson methods. The comparison criteria are the MSE and the performance of these estimates are assessed using simulation considering various sample size, several specific values of shape parameter. The results show that Bayesian with non-informative prior is better than Maximum Likelihood Estimator.

Keywords: non-informative prior, Bayesian method, type-I censoring, Gauss quardature

Procedia PDF Downloads 490
3560 Walmart Sales Forecasting using Machine Learning in Python

Authors: Niyati Sharma, Om Anand, Sanjeev Kumar Prasad

Abstract:

Assuming future sale value for any of the organizations is one of the major essential characteristics of tactical development. Walmart Sales Forecasting is the finest illustration to work with as a beginner; subsequently, it has the major retail data set. Walmart uses this sales estimate problem for hiring purposes also. We would like to analyzing how the internal and external effects of one of the largest companies in the US can walk out their Weekly Sales in the future. Demand forecasting is the planned prerequisite of products or services in the imminent on the basis of present and previous data and different stages of the market. Since all associations is facing the anonymous future and we do not distinguish in the future good demand. Hence, through exploring former statistics and recent market statistics, we envisage the forthcoming claim and building of individual goods, which are extra challenging in the near future. As a result of this, we are producing the required products in pursuance of the petition of the souk in advance. We will be using several machine learning models to test the exactness and then lastly, train the whole data by Using linear regression and fitting the training data into it. Accuracy is 8.88%. The extra trees regression model gives the best accuracy of 97.15%.

Keywords: random forest algorithm, linear regression algorithm, extra trees classifier, mean absolute error

Procedia PDF Downloads 136
3559 Using the Nerlovian Adjustment Model to Assess the Response of Farmers to Price and Other Related Factors: Evidence from Sierra Leone Rice Cultivation

Authors: Alhaji M. H. Conteh, Xiangbin Yan, Alfred V. Gborie

Abstract:

The goal of this study was to increase the awareness of the description and assessments of rice acreage response and to offer mechanisms for agricultural policy scrutiny. The Ordinary Least Square (OLS) technique was utilized to determine the coefficients of acreage response models for the rice varieties. The magnitudes of the coefficients (λ) of both the ROK lagged and NERICA lagged acreages were found positive and highly significant, which indicates that farmers’ adjustment rate was very low. Regarding lagged actual price for both the ROK and NERICE rice varieties, the short-run price elasticities were lower than long-run, which is suggesting a long-term adjustment of the acreage, is under the crop. However, the apparent recommendations for policy transformation are to open farm gate prices and to decrease government’s involvement in agricultural sector especially in the acquisition of agricultural inputs. Impending research have to be centred on how this might be better realized. Necessary conditions should be made available to the private sector by means of minimizing price volatility. In accordance with structural reforms, it is necessary to convey output prices to farmers with minimum distortion. There is a need to eradicate price subsidies and control, which generate distortion in the market in addition to huge financial costs.

Keywords: acreage response, rate of adjustment, rice varieties, Sierra Leone

Procedia PDF Downloads 314
3558 Statistical Model of Water Quality in Estero El Macho, Machala-El Oro

Authors: Rafael Zhindon Almeida

Abstract:

Surface water quality is an important concern for the evaluation and prediction of water quality conditions. The objective of this study is to develop a statistical model that can accurately predict the water quality of the El Macho estuary in the city of Machala, El Oro province. The methodology employed in this study is of a basic type that involves a thorough search for theoretical foundations to improve the understanding of statistical modeling for water quality analysis. The research design is correlational, using a multivariate statistical model involving multiple linear regression and principal component analysis. The results indicate that water quality parameters such as fecal coliforms, biochemical oxygen demand, chemical oxygen demand, iron and dissolved oxygen exceed the allowable limits. The water of the El Macho estuary is determined to be below the required water quality criteria. The multiple linear regression model, based on chemical oxygen demand and total dissolved solids, explains 99.9% of the variance of the dependent variable. In addition, principal component analysis shows that the model has an explanatory power of 86.242%. The study successfully developed a statistical model to evaluate the water quality of the El Macho estuary. The estuary did not meet the water quality criteria, with several parameters exceeding the allowable limits. The multiple linear regression model and principal component analysis provide valuable information on the relationship between the various water quality parameters. The findings of the study emphasize the need for immediate action to improve the water quality of the El Macho estuary to ensure the preservation and protection of this valuable natural resource.

Keywords: statistical modeling, water quality, multiple linear regression, principal components, statistical models

Procedia PDF Downloads 78
3557 Analysis of Ferroresonant Overvoltages in Cable-fed Transformers

Authors: George Eduful, Ebenezer A. Jackson, Kingsford A. Atanga

Abstract:

This paper investigates the impacts of cable length and capacity of transformer on ferroresonant overvoltage in cable-fed transformers. The study was conducted by simulation using the EMTP RV. Results show that ferroresonance can cause dangerous overvoltages ranging from 2 to 5 per unit. These overvoltages impose stress on insulations of transformers and cables and subsequently result in system failures. Undertaking Basic Multiple Regression Analysis (BMR) on the results obtained, a statistical model was obtained in terms of cable length and transformer capacity. The model is useful for ferroresonant prediction and control in cable-fed transformers.

Keywords: ferroresonance, cable-fed transformers, EMTP RV, regression analysis

Procedia PDF Downloads 520
3556 Analysis of a Self-Acting Air Journal Bearing: Effect of Dynamic Deformation of Bump Foil

Authors: H. Bensouilah, H. Boucherit, M. Lahmar

Abstract:

A theoretical investigation on the effects of both steady-state and dynamic deformations of the foils on the dynamic performance characteristics of a self-acting air foil journal bearing operating under small harmonic vibrations is proposed. To take into account the dynamic deformations of foils, the perturbation method is used for determining the gas-film stiffness and damping coefficients for given values of excitation frequency, compressibility number, and compliance factor of the bump foil. The nonlinear stationary Reynolds’ equation is solved by means of the Galerkins’ finite element formulation while the finite differences method are used to solve the first order complex dynamic equations resulting from the perturbation of the nonlinear transient compressible Reynolds’ equation. The stiffness of a bump is uniformly distributed throughout the bearing surface (generation I bearing). It was found that the dynamic properties of the compliant finite length journal bearing are significantly affected by the compliance of foils especially when the dynamic deformation of foils is considered in addition to the static one by applying the principle of superposition.

Keywords: elasto-aerodynamic lubrication, air foil bearing, steady-state deformation, dynamic deformation, stiffness and damping coefficients, perturbation method, fluid-structure interaction, Galerk infinite element method, finite difference method

Procedia PDF Downloads 384
3555 Marginalized Two-Part Joint Models for Generalized Gamma Family of Distributions

Authors: Mohadeseh Shojaei Shahrokhabadi, Ding-Geng (Din) Chen

Abstract:

Positive continuous outcomes with a substantial number of zero values and incomplete longitudinal follow-up are quite common in medical cost data. To jointly model semi-continuous longitudinal cost data and survival data and to provide marginalized covariate effect estimates, a marginalized two-part joint model (MTJM) has been developed for outcome variables with lognormal distributions. In this paper, we propose MTJM models for outcome variables from a generalized gamma (GG) family of distributions. The GG distribution constitutes a general family that includes approximately all of the most frequently used distributions like the Gamma, Exponential, Weibull, and Log Normal. In the proposed MTJM-GG model, the conditional mean from a conventional two-part model with a three-parameter GG distribution is parameterized to provide the marginal interpretation for regression coefficients. In addition, MTJM-gamma and MTJM-Weibull are developed as special cases of MTJM-GG. To illustrate the applicability of the MTJM-GG, we applied the model to a set of real electronic health record data recently collected in Iran, and we provided SAS code for application. The simulation results showed that when the outcome distribution is unknown or misspecified, which is usually the case in real data sets, the MTJM-GG consistently outperforms other models. The GG family of distribution facilitates estimating a model with improved fit over the MTJM-gamma, standard Weibull, or Log-Normal distributions.

Keywords: marginalized two-part model, zero-inflated, right-skewed, semi-continuous, generalized gamma

Procedia PDF Downloads 164
3554 Research on the Two-Way Sound Absorption Performance of Multilayer Material

Authors: Yang Song, Xiaojun Qiu

Abstract:

Multilayer materials are applied to much acoustics area. Multilayer porous materials are dominant in room absorber. Multilayer viscoelastic materials are the basic parts in underwater absorption coating. In most cases, the one-way sound absorption performance of multilayer material is concentrated according to the sound source site. But the two-way sound absorption performance is also necessary to be known in some special cases which sound is produced in both sides of the material and the both sides especially might contact with different media. In this article, this kind of case was research. The multilayer material was composed of viscoelastic layer and steel plate and the porous layer. The two sides of multilayer material contact with water and air, respectively. A theory model was given to describe the sound propagation and impedance in multilayer absorption material. The two-way sound absorption properties of several multilayer materials were calculated whose two sides all contacted with different media. The calculated results showed that the difference of two-way sound absorption coefficients is obvious. The frequency, the relation of layers thickness and parameters of multilayer materials all have an influence on the two-way sound absorption coefficients. But the degrees of influence are varied. All these simulation results were analyzed in the article. It was obtained that two-way sound absorption at different frequencies can be promoted by optimizing the configuration parameters. This work will improve the performance of underwater sound absorption coating which can absorb incident sound from the water and reduce the noise radiation from inside space.

Keywords: different media, multilayer material, sound absorption coating, two-way sound absorption

Procedia PDF Downloads 526
3553 Evaluation of the CRISP-DM Business Understanding Step: An Approach for Assessing the Predictive Power of Regression versus Classification for the Quality Prediction of Hydraulic Test Results

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Digitalisation in production technology is a driver for the application of machine learning methods. Through the application of predictive quality, the great potential for saving necessary quality control can be exploited through the data-based prediction of product quality and states. However, the serial use of machine learning applications is often prevented by various problems. Fluctuations occur in real production data sets, which are reflected in trends and systematic shifts over time. To counteract these problems, data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets to extract stable features. Successful process control of the target variables aims to centre the measured values around a mean and minimise variance. Competitive leaders claim to have mastered their processes. As a result, much of the real data has a relatively low variance. For the training of prediction models, the highest possible generalisability is required, which is at least made more difficult by this data availability. The implementation of a machine learning application can be interpreted as a production process. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that describes the life cycle of data science. As in any process, the costs to eliminate errors increase significantly with each advancing process phase. For the quality prediction of hydraulic test steps of directional control valves, the question arises in the initial phase whether a regression or a classification is more suitable. In the context of this work, the initial phase of the CRISP-DM, the business understanding, is critically compared for the use case at Bosch Rexroth with regard to regression and classification. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. Suitable methods for leakage volume flow regression and classification for inspection decision are applied. Impressively, classification is clearly superior to regression and achieves promising accuracies.

Keywords: classification, CRISP-DM, machine learning, predictive quality, regression

Procedia PDF Downloads 132
3552 Statistical Model to Examine the Impact of the Inflation Rate and Real Interest Rate on the Bahrain Economy

Authors: Ghada Abo-Zaid

Abstract:

Introduction: Oil is one of the most income source in Bahrain. Low oil price influence on the economy growth and the investment rate in Bahrain. For example, the economic growth was 3.7% in 2012, and it reduced to 2.9% in 2015. Investment rate was 9.8% in 2012, and it is reduced to be 5.9% and -12.1% in 2014 and 2015, respectively. The inflation rate is increased to the peak point in 2013 with 3.3 %. Objectives: The objectives here are to build statistical models to examine the effect of the interest rate inflation rate on the growth economy in Bahrain from 2000 to 2018. Methods: This study based on 18 years, and the multiple regression model is used for the analysis. All of the missing data are omitted from the analysis. Results: Regression model is used to examine the association between the Growth national product (GNP), the inflation rate, and real interest rate. We found that (i) Increase the real interest rate decrease the GNP. (ii) Increase the inflation rate does not effect on the growth economy in Bahrain since the average of the inflation rate was almost 2%, and this is considered as a low percentage. Conclusion: There is a positive impact of the real interest rate on the GNP in Bahrain. While the inflation rate does not show any negative influence on the GNP as the inflation rate was not large enough to effect negatively on the economy growth rate in Bahrain.

Keywords: growth national product, egypt, regression model, interest rate

Procedia PDF Downloads 147
3551 Support Vector Regression with Weighted Least Absolute Deviations

Authors: Kang-Mo Jung

Abstract:

Least squares support vector machine (LS-SVM) is a penalized regression which considers both fitting and generalization ability of a model. However, the squared loss function is very sensitive to even single outlier. We proposed a weighted absolute deviation loss function for the robustness of the estimates in least absolute deviation support vector machine. The proposed estimates can be obtained by a quadratic programming algorithm. Numerical experiments on simulated datasets show that the proposed algorithm is competitive in view of robustness to outliers.

Keywords: least absolute deviation, quadratic programming, robustness, support vector machine, weight

Procedia PDF Downloads 510
3550 The Prediction of Effective Equation on Drivers' Behavioral Characteristics of Lane Changing

Authors: Khashayar Kazemzadeh, Mohammad Hanif Dasoomi

Abstract:

According to the increasing volume of traffic, lane changing plays a crucial role in traffic flow. Lane changing in traffic depends on several factors including road geometrical design, speed, drivers’ behavioral characteristics, etc. A great deal of research has been carried out regarding these fields. Despite of the other significant factors, the drivers’ behavioral characteristics of lane changing has been emphasized in this paper. This paper has predicted the effective equation based on personal characteristics of lane changing by regression models.

Keywords: effective equation, lane changing, drivers’ behavioral characteristics, regression models

Procedia PDF Downloads 433