Search results for: poisson regression model
18616 Artificial Neural Network and Statistical Method
Authors: Tomas Berhanu Bekele
Abstract:
Traffic congestion is one of the main problems related to transportation in developed as well as developing countries. Traffic control systems are based on the idea of avoiding traffic instabilities and homogenizing traffic flow in such a way that the risk of accidents is minimized and traffic flow is maximized. Lately, Intelligent Transport Systems (ITS) have become an important area of research for solving such road traffic-related issues and making smart decisions. ITS links people, roads and vehicles together using communication technologies to increase safety and mobility. Moreover, accurate prediction of road traffic is important for managing traffic congestion. The aim of this study is to develop an ANN model for the prediction of traffic flow and to compare the ANN model with a linear regression model of traffic flow prediction. Data extraction was carried out at intervals of 15 minutes from the video player. Video of mixed traffic flow was recorded, and vehicles were counted during office hours in order to determine the traffic volume. Vehicles were classified into six categories, namely Car, Motorcycle, Minibus, Mid-bus, Bus, and Truck. The average time taken by each vehicle type to travel the trap length was measured from the time displayed on the video screen.
Keywords: intelligent transport system (ITS), traffic flow prediction, artificial neural network (ANN), linear regression
Procedia PDF Downloads 67
18615 In and Out-of-Sample Performance of Non-Symmetric Models in International Price Differential Forecasting in a Commodity Country Framework
Authors: Nicola Rubino
Abstract:
This paper presents an analysis of the nominal exchange rate movements of a group of commodity-exporting countries relative to the US dollar. Using a series of Unrestricted Self-Exciting Threshold Autoregressive (SETAR) models, we model and evaluate sixteen national CPI price differentials relative to the US dollar CPI. Out-of-sample forecast accuracy is evaluated through calculation of mean absolute error measures on the basis of two hundred and fifty-three months of rolling-window forecasts and extended to three additional models, namely a logistic smooth transition regression (LSTAR), an additive non-linear autoregressive model (AAR) and a simple linear neural network model (NNET). Our preliminary results confirm the presence of some form of TAR non-linearity in the majority of the countries analyzed, with a relatively higher goodness of fit, with respect to the linear AR(1) benchmark, in five countries out of the sixteen considered. Although no model appears to statistically prevail over the others, our final out-of-sample forecast exercise shows that SETAR models tend to have quite poor relative forecasting performance, especially when compared to alternative non-linear specifications. Finally, by analyzing the implied half-lives of the coefficients, our results confirm the presence, in the spirit of arbitrage band adjustment, of band convergence with an inner unit root behaviour in five of the sixteen countries analyzed.
Keywords: transition regression model, real exchange rate, nonlinearities, price differentials, PPP, commodity points
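As an illustration of the kind of threshold specification referred to here, the following is a minimal sketch (not the authors' code) of fitting a two-regime SETAR model with one lag to a price-differential series by grid search over the threshold; the simulated series, delay of one lag, and trimming fraction are assumptions made for the example.

```python
import numpy as np

def fit_setar1(y, trim=0.15):
    """Fit a two-regime SETAR(1) model by grid search over the threshold value.
    Returns (threshold, per-regime coefficients, residual sum of squares)."""
    y = np.asarray(y, dtype=float)
    y_lag, y_cur = y[:-1], y[1:]
    # Candidate thresholds: interior quantiles of the lagged (threshold) variable
    candidates = np.quantile(y_lag, np.linspace(trim, 1 - trim, 50))
    best = None
    for c in candidates:
        rss, coefs = 0.0, {}
        for name, mask in (("low", y_lag <= c), ("high", y_lag > c)):
            X = np.column_stack([np.ones(mask.sum()), y_lag[mask]])
            beta, *_ = np.linalg.lstsq(X, y_cur[mask], rcond=None)
            rss += float(np.sum((y_cur[mask] - X @ beta) ** 2))
            coefs[name] = beta
        if best is None or rss < best[2]:
            best = (c, coefs, rss)
    return best

# Toy usage: a simulated series whose persistence depends on the regime
rng = np.random.default_rng(0)
y = np.zeros(300)
for t in range(1, 300):
    phi = 0.9 if y[t - 1] <= 0 else 0.3   # regime-dependent autoregressive coefficient
    y[t] = phi * y[t - 1] + rng.normal(scale=0.1)

threshold, coefs, rss = fit_setar1(y)
print("estimated threshold:", round(threshold, 3))
print("low-regime (const, AR):", coefs["low"], " high-regime:", coefs["high"])
```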
Procedia PDF Downloads 278
18614 Adaptive Neuro Fuzzy Inference System Model Based on Support Vector Regression for Stock Time Series Forecasting
Authors: Anita Setianingrum, Oki S. Jaya, Zuherman Rustam
Abstract:
Forecasting stock prices is a challenging task due to the complex time series of the data. The complexity arises from the many variables that affect the stock market. Many time series models have been proposed before, but those previous models still have some problems: 1) they depend on the subjective choice of technical indicators, and 2) they rely upon assumptions about the variables, so they are limited in their application to all datasets. Therefore, this paper studied a novel Adaptive Neuro-Fuzzy Inference System (ANFIS) time series model based on Support Vector Regression (SVR) for forecasting the stock market. In order to evaluate the performance of the proposed models, stock market transaction data of TAIEX and HSI from January to December 2015 were collected as experimental datasets. As a result, the method outperformed its counterparts in terms of accuracy.
Keywords: ANFIS, fuzzy time series, stock forecasting, SVR
Procedia PDF Downloads 247
18613 Electricity Load Modeling: An Application to Italian Market
Authors: Giovanni Masala, Stefania Marica
Abstract:
Forecasting electricity load plays a crucial role in decision making and planning for economic purposes. Besides, in the light of the recent privatization and deregulation of the power industry, the forecasting of future electricity load has turned out to be a very challenging problem. Empirical data about electricity load highlight a clear seasonal behavior (higher load during the winter season), which is partly due to climatic effects. We also emphasize the presence of load periodicity on a weekly basis (electricity load is usually lower on weekends or holidays) and on a daily basis (electricity load is clearly influenced by the hour). Finally, a long-term trend may depend on the general economic situation (for example, industrial production affects electricity load). All these features must be captured by the model. The purpose of this paper is then to build an hourly electricity load model. The deterministic component of the model requires non-linear regression and Fourier series, while we investigate the stochastic component through econometric tools. The calibration of the model's parameters is performed using data from the Italian market over a six-year period (2007-2012). Then, we perform a Monte Carlo simulation in order to compare the simulated data with the real data (both in-sample and out-of-sample inspection). The reliability of the model is deduced from standard tests which highlight a good fit of the simulated values.
Keywords: ARMA-GARCH process, electricity load, fitting tests, Fourier series, Monte Carlo simulation, non-linear regression
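A minimal sketch of the kind of deterministic component described (a polynomial trend plus Fourier terms for yearly, weekly and daily periodicity), fitted by least squares on an hourly index; the variable names, number of harmonics and simulated load are illustrative assumptions, and the stochastic ARMA-GARCH part applied to the residuals is not shown.

```python
import numpy as np

def fourier_design(t, period, n_harmonics):
    """Sine/cosine columns for a given period (in hours)."""
    cols = []
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

# Hourly index over six years (toy data, not the Italian market series)
t = np.arange(6 * 365 * 24, dtype=float)
rng = np.random.default_rng(1)
load = (50 + 0.5 * t / (365 * 24)                     # slow long-term trend
        + 10 * np.sin(2 * np.pi * t / (365 * 24))     # yearly seasonality
        + 5 * np.sin(2 * np.pi * t / (7 * 24))        # weekly periodicity
        + 3 * np.sin(2 * np.pi * t / 24)              # daily profile
        + rng.normal(scale=2, size=t.size))

# Deterministic component: polynomial trend + Fourier terms for year, week, day
t_years = t / (365 * 24)
X = np.column_stack([
    np.ones_like(t), t_years, t_years ** 2,
    fourier_design(t, 365 * 24, 2),
    fourier_design(t, 7 * 24, 2),
    fourier_design(t, 24, 2),
])
beta, *_ = np.linalg.lstsq(X, load, rcond=None)
residuals = load - X @ beta   # residuals would then be modelled with an ARMA-GARCH process
print("trend coefficients:", np.round(beta[:3], 3), " residual std:", round(residuals.std(), 3))
```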
Procedia PDF Downloads 395
18612 Life in Bequia in the Era of Climate Change: Societal Perception of Adaptation and Vulnerability
Authors: Sherry Ann Ganase, Sandra Sookram
Abstract:
This study examines adaptation measures and the factors that influence adaptation decisions in Bequia by using multiple linear regression and a structural equation model. Using survey data, the results suggest that households are knowledgeable and concerned about climate change but lack knowledge about the measures needed to adapt. The findings from the SEM suggest that a positive relationship exists between vulnerability and adaptation and between vulnerability and perception, along with a negative relationship between perception and adaptation. This suggests that being aware of the terms associated with climate change and having knowledge about climate change are insufficient for implementing adaptation measures; instead, the risk and importance placed on climate change, the vulnerability experienced with household flooding and drainage, and the expected threat of future sea levels are the main factors that influence the adaptation decision. The results obtained in this study are beneficial to all, as adaptation requires a collective effort by stakeholders.
Keywords: adaptation, Bequia, multiple linear regression, structural equation model
Procedia PDF Downloads 463
18611 Growth Curves Genetic Analysis of Native South Caspian Sea Poultry Using Bayesian Statistics
Authors: Jamal Fayazi, Farhad Anoosheh, Mohammad R. Ghorbani, Ali R. Paydar
Abstract:
In this study, to determine the best non-linear regression model describing the growth curve of native poultry, 9657 chicks of generations 18, 19, and 20 raised in the Mazandaran breeding center were used. Fowls and roosters of this center are distributed in the southern Caspian Sea region. To estimate the genetic variability of the non-linear regression parameters of the growth traits, Gibbs sampling within a Bayesian analysis was used. The average body weights at the first day (BW1), eighth week (BW8) and twelfth week (BW12) were estimated as 36.05, 763.03, and 1194.98 grams, respectively. Based on the coefficient of determination, mean squares of error and Akaike information criterion, the Gompertz model was selected as the best growth descriptive function. In the Gompertz model, the estimated values for the parameters of maturity weight (A), integration constant (B) and maturity rate (K) were 1734.4, 3.986, and 0.282, respectively. The direct heritabilities of BW1, BW8 and BW12 were reported to be 0.378, 0.3709, 0.316, 0.389, 0.43, 0.09 and 0.07, respectively. With regard to the estimated parameters, the results of this study indicate that there is a possibility to improve some properties of the growth curve using appropriate selection programs.
Keywords: direct heritability, Gompertz, growth traits, maturity weight, native poultry
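For reference, the Gompertz function used here is W(t) = A·exp(−B·exp(−K·t)). Below is a minimal sketch (not the study's code) of fitting it to weekly body-weight data with non-linear least squares; the age/weight points are illustrative, not the Mazandaran records.

```python
import numpy as np
from scipy.optimize import curve_fit

def gompertz(t, A, B, K):
    """Gompertz growth curve: A = maturity weight, B = integration constant, K = maturity rate."""
    return A * np.exp(-B * np.exp(-K * t))

# Illustrative age (weeks) and body weight (grams) observations
age = np.array([0, 2, 4, 6, 8, 10, 12], dtype=float)
weight = np.array([36, 150, 340, 560, 760, 990, 1195], dtype=float)

params, cov = curve_fit(gompertz, age, weight, p0=[1700, 4.0, 0.3])
A, B, K = params
pred = gompertz(age, *params)
r2 = 1 - np.sum((weight - pred) ** 2) / np.sum((weight - weight.mean()) ** 2)
print(f"A = {A:.1f} g, B = {B:.3f}, K = {K:.3f}, R^2 = {r2:.4f}")
```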
Procedia PDF Downloads 265
18610 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques
Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas
Abstract:
The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. The data need to be properly preprocessed for this research by preparing pipelines that best fit each algorithm. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.
Keywords: artificial neural network, competitive dynamics, logistic regression, text classification, text mining
Procedia PDF Downloads 121
18609 Predicting Bridge Pier Scour Depth with SVM
Authors: Arun Goel
Abstract:
Prediction of maximum local scour is necessary for the safe and economical design of bridges. A number of equations have been developed over the years to predict local scour depth using laboratory data, and a few pier equations have also been proposed using field data. Most of these equations are empirical in nature, as indicated by past publications. In this paper, attempts have been made to compute the local depth of scour around bridge piers in dimensional and non-dimensional form by using linear regression, simple regression and SVM (Poly and Rbf) techniques, along with a few conventional empirical equations. The outcome of this study suggests that SVM (Poly and Rbf) based modeling can be employed as an alternative to linear regression, simple regression and the conventional empirical equations in predicting the scour depth of bridge piers. The results of the present study, on the basis of the non-dimensional form of bridge pier scour, indicate an improvement in the performance of SVM (Poly and Rbf) in comparison to the dimensional form of scour.
Keywords: modeling, pier scour, regression, prediction, SVM (Poly and Rbf kernels)
Procedia PDF Downloads 451
18608 Development of a Turbulent Boundary Layer Wall-Pressure Fluctuations Power Spectrum Model Using a Stepwise Regression Algorithm
Authors: Zachary Huffman, Joana Rocha
Abstract:
Wall-pressure fluctuations induced by the turbulent boundary layer (TBL) developed over aircraft are a significant source of aircraft cabin noise. Since the power spectral density (PSD) of these pressure fluctuations is directly correlated with the amount of sound radiated into the cabin, the development of accurate empirical models that predict the PSD has been an important ongoing research topic. The emitted sound can be represented from the pressure fluctuation term in the Reynolds-averaged Navier-Stokes (RANS) equations. Therefore, early TBL empirical models (including those from Lowson, Robertson, Chase, and Howe) were primarily derived by simplifying and solving the RANS for the pressure fluctuations and adding appropriate scales. Most subsequent models (including the Goody, Efimtsov, Laganelli, Smol'yakov, and Rackl and Weston models) were derived by making modifications to these early models or from physical principles. Overall, these models have had varying levels of accuracy, but, in general, they are most accurate under the specific Reynolds and Mach numbers they were developed for, while being less accurate under other flow conditions. Despite this, recent research into the possibility of using alternative methods for deriving the models has been rather limited. More recent studies have demonstrated that an artificial neural network model was more accurate than traditional models and could be applied more generally, but the accuracy of other machine learning techniques has not been explored. In the current study, an original model is derived using a stepwise regression algorithm in the statistical programming language R and TBL wall-pressure fluctuation PSD data gathered at the Carleton University wind tunnel. The theoretical advantage of a stepwise regression approach is that it will automatically filter out redundant or uncorrelated input variables (through the process of feature selection), and it is computationally faster than machine learning. The main disadvantage is the potential risk of overfitting. The accuracy of the developed model is assessed by comparing it to independently sourced datasets.
Keywords: aircraft noise, machine learning, power spectral density models, regression models, turbulent boundary layer wall-pressure fluctuations
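The abstract's stepwise procedure was implemented in R; purely as an illustration of the idea (greedily adding the predictor that most improves an information criterion, and stopping when nothing improves), here is a minimal forward-selection sketch in Python. The predictor names and the Gaussian-AIC criterion are assumptions for the example, not the authors' exact algorithm.

```python
import numpy as np

def aic(y, y_hat, n_params):
    """Gaussian AIC for a least-squares fit."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    return n * np.log(rss / n) + 2 * n_params

def forward_stepwise(X, y, names):
    """Greedily add the predictor that lowers AIC the most; stop when no improvement."""
    selected, remaining = [], list(range(X.shape[1]))
    best_aic = aic(y, np.full_like(y, y.mean()), 1)
    improved = True
    while improved and remaining:
        improved = False
        scores = []
        for j in remaining:
            cols = selected + [j]
            A = np.column_stack([np.ones(len(y)), X[:, cols]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            scores.append((aic(y, A @ beta, len(cols) + 1), j))
        score, j = min(scores)
        if score < best_aic:
            best_aic, improved = score, True
            selected.append(j)
            remaining.remove(j)
    return [names[j] for j in selected], best_aic

# Toy data: only the first two (hypothetical) flow variables actually matter
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)
print(forward_stepwise(X, y, ["Ue", "delta", "q", "Re_theta", "Mach"]))
```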
Procedia PDF Downloads 135
18607 Satellite LiDAR-Based Digital Terrain Model Correction Using Gaussian Process Regression
Authors: Keisuke Takahata, Hiroshi Suetsugu
Abstract:
Forest height is an important parameter for forest biomass estimation, and precise elevation data are essential for accurate forest height estimation. There are several globally or nationally available digital elevation models (DEMs), such as SRTM and ASTER. However, their accuracy is reported to be low, particularly in mountainous areas with closed canopy or steep slopes. Recently, space-borne LiDAR missions, such as the Global Ecosystem Dynamics Investigation (GEDI), have started to provide sparse but accurate ground elevation and canopy height estimates. Several studies have reported a high degree of accuracy of these elevation products on their exact footprints, while it is not clear how this sparse information can be used for wider areas. In this study, we developed a digital terrain model correction algorithm by spatially interpolating the difference between existing DEMs and GEDI elevation products using a Gaussian Process (GP) regression model. The result shows that our GP-based methodology can reduce the mean bias of the elevation data from 3.7 m to 0.3 m when we use airborne LiDAR-derived elevation information as ground truth. Our algorithm is also capable of quantifying the elevation data uncertainty, which is a critical requirement for biomass inventories. Upcoming satellite LiDAR missions, like MOLI (Multi-footprint Observation Lidar and Imager), are expected to contribute to more accurate digital terrain model generation.
Keywords: digital terrain model, satellite LiDAR, Gaussian processes, uncertainty quantification
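A minimal sketch of the correction idea described: interpolate the difference between an existing DEM and sparse, accurate ground elevations at LiDAR footprints with a Gaussian Process, then subtract the predicted bias from the DEM. It uses scikit-learn's GaussianProcessRegressor; the kernel choice, coordinates and data are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)

# Sparse footprint locations (x, y in metres) where the space-borne LiDAR gives ground elevation
footprints = rng.uniform(0, 1000, size=(80, 2))
true_bias = 4.0 * np.sin(footprints[:, 0] / 300) + 2.0 * np.cos(footprints[:, 1] / 250)
dem_minus_lidar = true_bias + rng.normal(scale=0.5, size=80)   # observed DEM error at footprints

# Fit a GP to the DEM-minus-LiDAR differences
kernel = 1.0 * RBF(length_scale=200.0) + WhiteKernel(noise_level=0.25)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(footprints, dem_minus_lidar)

# Predict the bias (and its uncertainty) on a regular grid covering the whole DEM
gx, gy = np.meshgrid(np.linspace(0, 1000, 50), np.linspace(0, 1000, 50))
grid = np.column_stack([gx.ravel(), gy.ravel()])
bias_hat, bias_std = gp.predict(grid, return_std=True)

# corrected_dem = original_dem - bias_hat.reshape(gx.shape); bias_std quantifies the uncertainty
print("mean predicted bias:", round(bias_hat.mean(), 2),
      " mean 1-sigma uncertainty:", round(bias_std.mean(), 2))
```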
Procedia PDF Downloads 183
18606 Estimating Bridge Deterioration for Small Data Sets Using Regression and Markov Models
Authors: Yina F. Muñoz, Alexander Paz, Hanns De La Fuente-Mella, Joaquin V. Fariña, Guilherme M. Sales
Abstract:
The primary approach for estimating bridge deterioration uses Markov-chain models and regression analysis. Traditional Markov models have problems in estimating the required transition probabilities when a small sample size is used. Often, reliable bridge data have not been collected over long periods, so large data sets may not be available. This study presents an important change to the traditional approach by using the Small Data Method to estimate transition probabilities. The results illustrate that the Small Data Method and the traditional approach both provide similar estimates; however, the former provides results that are more conservative. That is, the Small Data Method provided slightly lower bridge condition ratings than the traditional approach. Considering that bridges are critical infrastructures, the Small Data Method, which uses more information and provides more conservative estimates, may be more appropriate when the available sample size is small. In addition, regression analysis was used to calculate bridge deterioration. Condition ratings were determined for bridge groups, and the best regression model was selected for each group. The results obtained were very similar to those obtained when using Markov chains; however, it is desirable to use more data for better results.
Keywords: concrete bridges, deterioration, Markov chains, probability matrix
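To make the conventional Markov-chain side of the comparison concrete, the sketch below estimates a transition probability matrix from year-to-year condition ratings by simple frequency counting and then projects future condition distributions; the rating scale and inspection histories are fabricated for illustration, and the Small Data Method itself is not shown.

```python
import numpy as np

def estimate_transition_matrix(sequences, n_states):
    """Count observed transitions between consecutive inspections and normalise each row."""
    counts = np.zeros((n_states, n_states))
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1.0            # avoid division by zero for states never visited
    return counts / rows

# Toy inspection histories; state 0 = best condition, 3 = worst
histories = [
    [0, 0, 1, 1, 2],
    [0, 1, 1, 2, 2],
    [0, 0, 0, 1, 1],
    [1, 1, 2, 2, 3],
]
P = estimate_transition_matrix(histories, n_states=4)

# Project the condition distribution of a bridge that starts in the best state
state = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(10):
    state = state @ P
print("transition matrix:\n", np.round(P, 3))
print("condition distribution after 10 inspection cycles:", np.round(state, 3))
```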
Procedia PDF Downloads 336
18605 Reminiscence Therapy for Alzheimer’s Disease Restrained on Logistic Regression Based Linear Bootstrap Aggregating
Authors: P. S. Jagadeesh Kumar, Mingmin Pan, Xianpei Li, Yanmin Yuan, Tracy Lin Huan
Abstract:
Researchers are conducting intensive research into the inherited features of Alzheimer’s disease and probable consistent therapies. In Alzheimer’s disease, memories are lost in reverse order; memories formed recently are more transitory than those formed earlier. Reminiscence therapy involves the discussion of past actions, events and experiences with another individual or group of people, frequently with the help of tangible reminders such as photos, household and other familiar items from the past, music and collections of recordings. In this manuscript, the competence of reminiscence therapy for Alzheimer’s disease is measured using logistic regression based linear bootstrap aggregating. Logistic regression is used to predict the experiential features of the patient’s memory through various therapies. Linear bootstrap aggregating shows better stability and accuracy of reminiscence therapy when used in the statistical classification and regression of memories related to validation therapy, supportive psychotherapy, sensory integration and simulated presence therapy.
Keywords: Alzheimer’s disease, linear bootstrap aggregating, logistic regression, reminiscence therapy
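As a generic illustration of logistic regression based bootstrap aggregating (not the study's model), the sketch below bags logistic-regression classifiers on bootstrap resamples with scikit-learn and compares cross-validated accuracy against a single classifier; the synthetic data standing in for patient/therapy features is an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for patient features recorded across therapy sessions
X, y = make_classification(n_samples=300, n_features=10, n_informative=5, random_state=0)

single = LogisticRegression(max_iter=1000)
# Note: older scikit-learn versions name this parameter base_estimator instead of estimator
bagged = BaggingClassifier(estimator=LogisticRegression(max_iter=1000),
                           n_estimators=50, bootstrap=True, random_state=0)

for name, model in [("single logistic regression", single),
                    ("bagged logistic regression", bagged)]:
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy = {acc.mean():.3f} (+/- {acc.std():.3f})")
```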
Procedia PDF Downloads 309
18604 SVM-Based Modeling of Mass Transfer Potential of Multiple Plunging Jets
Authors: Surinder Deswal, Mahesh Pal
Abstract:
The paper investigates the potential of a support vector machine based regression approach to model the mass transfer capacity of multiple plunging jets, both vertical (θ = 90°) and inclined (θ = 60°). The data set used in this study consists of four input parameters with a total of eighty-eight cases. For testing, tenfold cross-validation was used. Correlation coefficient values of 0.971 and 0.981 (root mean square error values of 0.0025 and 0.0020) were achieved by using polynomial and radial basis kernel function based support vector regression, respectively. The results suggest an improved performance by the radial basis function in comparison to polynomial kernel based support vector machines. The estimated overall mass transfer coefficient, by both kernel functions, is in good agreement with the actual experimental values (within a scatter of ±15%), thereby suggesting the utility of the support vector machines based regression approach.
Keywords: mass transfer, multiple plunging jets, support vector machines, ecological sciences
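A minimal sketch of the modelling setup described (support vector regression with polynomial and radial basis kernels, evaluated by tenfold cross-validation); the four synthetic inputs stand in for the jet parameters and the hyper-parameter values are assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(4)
# Four synthetic input parameters (stand-ins for, e.g., jet angle, velocity, diameter, number of jets)
X = rng.uniform(0, 1, size=(88, 4))
y = (0.02 * X[:, 0] + 0.01 * X[:, 1] ** 2 + 0.005 * X[:, 2] * X[:, 3]
     + rng.normal(scale=0.001, size=88))

for label, kernel_kwargs in [("polynomial", {"kernel": "poly", "degree": 2}),
                             ("radial basis", {"kernel": "rbf"})]:
    model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.001, **kernel_kwargs))
    pred = cross_val_predict(model, X, y, cv=10)          # tenfold cross-validation
    cc = np.corrcoef(y, pred)[0, 1]
    rmse = np.sqrt(np.mean((y - pred) ** 2))
    print(f"{label} kernel: correlation coefficient = {cc:.3f}, RMSE = {rmse:.4f}")
```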
Procedia PDF Downloads 464
18603 Extension-Torsion-Inflation Coupling in Compressible Magnetoelastomeric Tubes with Helical Magnetic Anisotropy
Authors: Darius Diogo Barreto, Ajeet Kumar, Sushma Santapuri
Abstract:
We present an axisymmetric variational formulation for coupled extension-torsion-inflation deformation in magnetoelastomeric thin tubes when both azimuthal and axial magnetic fields are applied. The tube's material is assumed to have a preferred magnetization direction, which imparts helical magnetic anisotropy to the tube. We have also derived expressions for the first derivative of the free energy per unit undeformed length of the tube with respect to the various imposed strain parameters. On applying the thin-tube limit, the two nonlinear ordinary differential equations for the in-plane radial displacement and the radial component of the Lagrangian magnetic field reduce to a set of three simple algebraic equations. This allows us to obtain simple analytical expressions in terms of the applied magnetic field, magnetization direction, and magnetoelastic constants, which tell us how these parameters can be tuned to generate a positive/negative Poisson's effect in such tubes. We consider both torsionally constrained and torsionally relaxed stretching of the tube. The study can be useful in designing magnetoelastic tubular actuators.
Keywords: nonlinear magnetoelasticity, extension-torsion coupling, negative Poisson's effect, helical anisotropy, thin tube
Procedia PDF Downloads 120
18602 On Improving Breast Cancer Prediction Using GRNN-CP
Authors: Kefaya Qaddoum
Abstract:
The aim of this study is to predict breast cancer and to construct a supportive model that will provide a more reliable prediction, a factor that is fundamental for public health. In this study, we utilize general regression neural networks (GRNN) to replace point predictions with prediction intervals in order to achieve a reasonable level of confidence. The mechanism employed here utilises a machine learning framework called conformal prediction (CP) in order to assign consistent confidence measures to predictions, which are combined with GRNN. We apply the resulting algorithm to the problem of breast cancer diagnosis. The results show that the prediction constructed by this method is reasonable and could be useful in practice.
Keywords: neural network, conformal prediction, cancer classification, regression
Procedia PDF Downloads 291
18601 Development of Computational Approach for Calculation of Hydrogen Solubility in Hydrocarbons for Treatment of Petroleum
Authors: Abdulrahman Sumayli, Saad M. AlShahrani
Abstract:
For the hydrogenation process, knowing the solubility of hydrogen (H2) in hydrocarbons is critical to improving the efficiency of the process. We investigated the computation of H2 solubility in four heavy crude oil feedstocks using machine learning techniques. Temperature, pressure, and feedstock type were considered as the inputs to the models, while the hydrogen solubility was the sole response. Specifically, we employed three different models: Support Vector Regression (SVR), Gaussian process regression (GPR), and Bayesian ridge regression (BRR). To achieve the best performance, the hyper-parameters of these models are optimized using the whale optimization algorithm (WOA). We evaluated the models using a dataset of solubility measurements in various feedstocks, and we compared their performance based on several metrics. Our results show that the SVR model tuned with WOA achieves the best performance overall, with an RMSE of 1.38 × 10⁻² and an R-squared of 0.991. These findings suggest that machine learning techniques can provide accurate predictions of hydrogen solubility in different feedstocks, which could be useful in the development of hydrogen-related technologies. In addition, the solubility of hydrogen in the four heavy oil fractions is estimated over temperature and pressure ranges of 150 °C-350 °C and 1.2 MPa-10.8 MPa, respectively.
Keywords: temperature, pressure variations, machine learning, oil treatment
Procedia PDF Downloads 69
18600 Machine Vision System for Measuring the Quality of Bulk Sun-Dried Organic Raisins
Authors: Navab Karimi, Tohid Alizadeh
Abstract:
An intelligent vision-based system was designed to measure the quality and purity of raisins. A machine vision setup was utilized to capture images of bulk raisins with 5-50% mixtures of pure and impure berries. The textural features of the bulk raisins were extracted using grey-level histograms, the co-occurrence matrix, and the Local Binary Pattern (a total of 108 features). A genetic algorithm and neural network regression were used for selecting and ranking the best features (21 features). As a result, the GLCM feature set was found to have the highest accuracy (92.4%) among the other sets. Subsequently, multiple feature combinations from the previous stage were fed into a second regression (linear regression) to increase accuracy, wherein a combination of 16 features was found to be optimal. Finally, a Support Vector Machine (SVM) classifier was used to differentiate the mixtures, producing the best efficiency and accuracy of 96.2% and 97.35%, respectively.
Keywords: sun-dried organic raisin, genetic algorithm, feature extraction, ANN regression, linear regression, support vector machine, South Azerbaijan
Procedia PDF Downloads 73
18599 Bayesian Local Approach for Spatial Modeling of Visceral Leishmaniasis Infection in Northern and Central Tunisia
Authors: Kais Ben-Ahmed, Mhamed Ali-El-Aroui
Abstract:
This paper develops a Local Generalized Linear Spatial Model (LGLSM) to describe the spatial variation of Visceral Leishmaniasis (VL) infection risk in northern and central Tunisia. The response for each region is the number of affected children less than five years of age recorded from 1996 through 2006 by Tunisian pediatric departments, treated as Poisson county-level data. The model includes climatic factors, namely the average annual rainfall and the extreme values of low temperatures in winter and high temperatures in summer, to characterize the climate of each region through a continentality index, the pluviometric quotient of Emberger (Q2) to characterize bioclimatic regions, and a component for residual extra-Poisson variation. The statistical results show a progressive increase in the number of affected children in regions with a high continentality index and low mean yearly rainfall. On the other hand, an increase in the pluviometric quotient of Emberger contributed to a significant increase in the VL incidence rate. When compared with the original GLSM, the Bayesian local model is an improvement and gives a better approximation of the Tunisian VL risk. For Bayesian inference, we use vague priors for all model parameters and a Markov chain Monte Carlo method.
Keywords: generalized linear spatial model, local model, extra-Poisson variation, continentality index, visceral leishmaniasis, Tunisia
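To make the county-level Poisson regression structure concrete, here is a minimal sketch of fitting a (non-spatial, non-Bayesian) Poisson GLM of case counts on climatic covariates with a log link and a population-at-risk offset using statsmodels; the covariate values and counts are fabricated, and the local/spatial and MCMC extensions of the paper are not reproduced.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)
n_regions = 40

df = pd.DataFrame({
    "rainfall": rng.uniform(200, 900, n_regions),        # mean annual rainfall (mm)
    "continentality": rng.uniform(10, 40, n_regions),    # continentality index
    "q2": rng.uniform(20, 120, n_regions),                # Emberger pluviometric quotient
    "children_under5": rng.integers(5000, 60000, n_regions),
})
# Simulated VL case counts, qualitatively consistent with the abstract's findings
rate = np.exp(-9.5 - 0.002 * df.rainfall + 0.04 * df.continentality + 0.01 * df.q2)
df["cases"] = rng.poisson(rate * df.children_under5)

# Poisson GLM with log link; log(population at risk) enters as an offset
X = sm.add_constant(df[["rainfall", "continentality", "q2"]])
model = sm.GLM(df["cases"], X, family=sm.families.Poisson(),
               offset=np.log(df["children_under5"]))
result = model.fit()
print(result.summary())
```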
Procedia PDF Downloads 397
18598 Local Interpretable Model-Agnostic Explanations (LIME) Approach to Email Spam Detection
Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew
Abstract:
The task of detecting email spam is a very important one in the era of digital technology, requiring effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique provides interpretable explanations for specific classifications of emails to help users understand the decision-making process of the model. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing the opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve users' understanding of categorization decisions. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.
Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression
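A minimal sketch of the LIME step described, using the lime package's text explainer on top of a TF-IDF plus logistic regression pipeline; the tiny toy corpus, class names and num_features value are illustrative assumptions, not the paper's dataset or settings.

```python
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus (a real study would use a full labelled email dataset)
texts = ["win a free prize now", "claim your free money", "meeting agenda attached",
         "lunch tomorrow?", "free offer click now", "project report draft"]
labels = [1, 1, 0, 0, 1, 0]   # 1 = spam, 0 = ham

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
pipeline.fit(texts, labels)

explainer = LimeTextExplainer(class_names=["ham", "spam"])
explanation = explainer.explain_instance("click now to claim your free prize",
                                         pipeline.predict_proba, num_features=5)
# Each pair is (word, weight): positive weights push the prediction towards "spam"
print(explanation.as_list())
```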
Procedia PDF Downloads 47
18597 Comprehensive Machine Learning-Based Glucose Sensing from Near-Infrared Spectra
Authors: Bitewulign Mekonnen
Abstract:
Context: This scientific paper focuses on the use of near-infrared (NIR) spectroscopy to determine glucose concentration in aqueous solutions accurately and rapidly. The study compares six different machine learning methods for predicting glucose concentration and also explores the development of a deep learning model for classifying NIR spectra. The objective is to optimize the detection model and improve the accuracy of glucose prediction. This research is important because it provides a comprehensive analysis of various machine-learning techniques for estimating aqueous glucose concentrations. Research Aim: The aim of this study is to compare and evaluate different machine-learning methods for predicting glucose concentration from NIR spectra. Additionally, the study aims to develop and assess a deep-learning model for classifying NIR spectra. Methodology: The research methodology involves the use of machine learning and deep learning techniques. Six machine learning regression models, including support vector machine regression, partial least squares regression, extra tree regression, random forest regression, extreme gradient boosting, and principal component analysis-neural network, are employed to predict glucose concentration. The NIR spectra data is randomly divided into train and test sets, and the process is repeated ten times to increase generalization ability. In addition, a convolutional neural network is developed for classifying NIR spectra. Findings: The study reveals that the SVMR, ETR, and PCA-NN models exhibit excellent performance in predicting glucose concentration, with correlation coefficients (R) > 0.99 and determination coefficients (R²)> 0.985. The deep learning model achieves high macro-averaging scores for precision, recall, and F1-measure. These findings demonstrate the effectiveness of machine learning and deep learning methods in optimizing the detection model and improving glucose prediction accuracy. Theoretical Importance: This research contributes to the field by providing a comprehensive analysis of various machine-learning techniques for estimating glucose concentrations from NIR spectra. It also explores the use of deep learning for the classification of indistinguishable NIR spectra. The findings highlight the potential of machine learning and deep learning in enhancing the prediction accuracy of glucose-relevant features. Data Collection and Analysis Procedures: The NIR spectra and corresponding references for glucose concentration are measured in increments of 20 mg/dl. The data is randomly divided into train and test sets, and the models are evaluated using regression analysis and classification metrics. The performance of each model is assessed based on correlation coefficients, determination coefficients, precision, recall, and F1-measure. Question Addressed: The study addresses the question of whether machine learning and deep learning methods can optimize the detection model and improve the accuracy of glucose prediction from NIR spectra. Conclusion: The research demonstrates that machine learning and deep learning methods can effectively predict glucose concentration from NIR spectra. The SVMR, ETR, and PCA-NN models exhibit superior performance, while the deep learning model achieves high classification scores. These findings suggest that machine learning and deep learning techniques can be used to improve the prediction accuracy of glucose-relevant features. 
Further research is needed to explore their clinical utility in analyzing complex matrices, such as blood glucose levels.
Keywords: machine learning, signal processing, near-infrared spectroscopy, support vector machine, neural network
Procedia PDF Downloads 94
18596 Comparison of Applicability of Time Series Forecasting Models VAR, ARCH and ARMA in Management Science: Study Based on Empirical Analysis of Time Series Techniques
Authors: Muhammad Tariq, Hammad Tahir, Fawwad Mahmood Butt
Abstract:
Purpose: This study attempts to examine the best forecasting methodologies in time series. The time series forecasting models VAR, ARCH and ARMA are considered for the analysis. Methodology: The benchmarks or parameters such as adjusted R-square, F-statistics, Durbin-Watson, and the direction of the roots have been critically and empirically analyzed. The empirical analysis consists of time series data of the Consumer Price Index and the closing stock price. Findings: The results show that the VAR model performed better in comparison to the other models. Both the reliability and the significance of the VAR model are highly appreciable. In contrast, the ARCH model showed very poor results for forecasting. However, the results of the ARMA model appeared ambiguous, i.e. the AR roots showed that the model is stationary while the MA roots showed that the model is invertible. Therefore, the forecasting would remain doubtful if made on the basis of the ARMA model. It has been concluded that the VAR model provides the best forecasting results. Practical Implications: This paper provides empirical evidence for the application of time series forecasting models and therefore provides a basis for the application of the best time series forecasting model.
Keywords: forecasting, time series, auto regression, ARCH, ARMA
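For illustration, the sketch below fits a small VAR to two simulated series (stand-ins for the CPI and a closing stock price) with statsmodels, reports the information-criterion lag selection, and produces a short forecast; the simulated data and the fixed lag order of two are assumptions, not the study's dataset or settings.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(6)
n = 200
cpi = np.cumsum(rng.normal(0.2, 0.5, n)) + 100          # trending CPI-like series
stock = 0.5 * cpi + np.cumsum(rng.normal(0, 1.0, n))    # stock price partly driven by CPI
data = pd.DataFrame({"cpi": cpi, "close": stock})

# VAR is usually fitted on stationary data, so work with first differences
diffed = data.diff().dropna()
model = VAR(diffed)
print(model.select_order(maxlags=8).summary())           # AIC/BIC/HQIC lag comparison

results = model.fit(2)                                    # fixed lag order for the example
print(results.summary())
print("5-step-ahead forecast of the differenced series:")
print(results.forecast(diffed.values[-results.k_ar:], steps=5))
```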
Procedia PDF Downloads 348
18595 Transportation Accidents Mortality Modeling in Thailand
Authors: W. Sriwattanapongse, S. Prasitwattanaseree, S. Wongtrangan
Abstract:
Transportation accident mortality is a major problem that leads to the loss of human lives and of economic output. The objective was to identify patterns of statistical modeling for estimating mortality rates due to transportation accidents in Thailand by using data from 2000 to 2009. The data were taken from death certificates in the vital registration database. The numbers of deaths and mortality rates were computed and classified by gender, age, year and region. There were 114,790 transportation accident deaths. The highest average age-specific transport accident mortality rate is 3.11 per 100,000 per year for males in the Southern region, and the lowest average age-specific transport accident mortality rate is 1.79 per 100,000 per year for females in the North-East region. Linear, Poisson and negative binomial models were chosen for fitting the statistical model. Among the models fitted, the best was chosen based on the analysis of deviance and AIC. The negative binomial model was clearly the most appropriate fit.
Keywords: transportation accidents, mortality, modeling, analysis of deviance
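A minimal sketch of comparing Poisson and negative binomial count models for mortality rates with statsmodels, using the population as an exposure offset and deviance/AIC to compare fits as in the abstract; the toy records by gender and region are fabricated, not the Thai vital registration data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
regions = ["North", "North-East", "Central", "South"]
rows = []
for year in range(2000, 2010):
    for region in regions:
        for sex in ["male", "female"]:
            pop = int(rng.integers(500_000, 2_000_000))
            base = 3.0e-5 if sex == "male" else 1.8e-5      # males assumed to have higher rates
            rows.append({"year": year, "region": region, "sex": sex, "population": pop,
                         "deaths": rng.poisson(base * pop * rng.uniform(0.7, 1.6))})
df = pd.DataFrame(rows)

poisson = smf.glm("deaths ~ sex + region + year", data=df,
                  family=sm.families.Poisson(),
                  offset=np.log(df["population"])).fit()
negbin = smf.glm("deaths ~ sex + region + year", data=df,
                 family=sm.families.NegativeBinomial(),
                 offset=np.log(df["population"])).fit()

print("Poisson  AIC:", round(poisson.aic, 1), " deviance:", round(poisson.deviance, 1))
print("NegBin   AIC:", round(negbin.aic, 1), " deviance:", round(negbin.deviance, 1))
```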
Procedia PDF Downloads 244
18594 A Comparison of Methods for Estimating Dichotomous Treatment Effects: A Simulation Study
Authors: Jacqueline Y. Thompson, Sam Watson, Lee Middleton, Karla Hemming
Abstract:
Introduction: The odds ratio (estimated via logistic regression) is a well-established and common approach for estimating covariate-adjusted binary treatment effects when comparing a treatment and a control group with dichotomous outcomes. Its popularity is primarily because of its stability and robustness to model misspecification. However, the situation is different for the relative risk and the risk difference, which are arguably easier to interpret and better suited to specific designs such as non-inferiority studies. So far, there is no equivalent, widely acceptable approach to estimate an adjusted relative risk and risk difference when conducting clinical trials. This is partly due to the lack of a comprehensive evaluation of the available candidate methods. Methods/Approach: A simulation study is designed to evaluate the performance of relevant candidate methods for estimating relative risks, representing conditional and marginal estimation approaches. We consider the log-binomial generalised linear model (GLM) with iteratively weighted least squares (IWLS) and model-based standard errors (SEs); the log-binomial GLM with convex optimisation and model-based SEs; the log-binomial GLM with convex optimisation and permutation tests; the modified-Poisson GLM with IWLS and robust SEs; the log-binomial generalised estimating equations (GEE) with robust SEs; marginal standardisation with delta-method SEs; and marginal standardisation with permutation-test SEs. Independent and identically distributed datasets are simulated from a randomised controlled trial to evaluate these candidate methods. Simulations are replicated 10000 times for each scenario across all possible combinations of sample sizes (200, 1000, and 5000), outcomes (10%, 50%, and 80%), and covariates (ranging from -0.05 to 0.7) representing weak, moderate or strong relationships. Treatment effects (ranging over 0, -0.5, and 1 on the log scale) will consider null (H0) and alternative (H1) hypotheses to evaluate coverage and power in realistic scenarios. Performance measures (bias, mean square error (MSE), relative efficiency, and convergence rates) are evaluated across scenarios covering a range of sample sizes, event rates, covariate prognostic strength, and model misspecifications. Potential Results, Relevance & Impact: There are several methods for estimating unadjusted and adjusted relative risks. However, it is unclear which method(s) is the most efficient, preserves the type-I error rate, is robust to model misspecification, or is the most powerful when adjusting for non-prognostic and prognostic covariates. GEE estimations may be biased when the outcome distributions are not from marginal binary data. Also, it seems that marginal standardisation and convex optimisation may perform better than the IWLS log-binomial GLM.
Keywords: binary outcomes, statistical methods, clinical trials, simulation study
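One of the candidate estimators, the modified-Poisson approach to an adjusted relative risk (a Poisson GLM with a log link fitted to binary outcomes, with robust sandwich standard errors), can be sketched as follows with statsmodels; the simulated trial data and covariate are assumptions, and the exponentiated treatment coefficient is the estimated relative risk.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
n = 1000
treatment = rng.integers(0, 2, n)
covariate = rng.normal(size=n)
# Binary outcome generated on the log-risk scale (true log relative risk = -0.4)
risk = np.exp(np.log(0.3) - 0.4 * treatment + 0.2 * covariate)
outcome = rng.binomial(1, np.clip(risk, 0, 1))
df = pd.DataFrame({"y": outcome, "treat": treatment, "x": covariate})

# Modified Poisson: Poisson GLM on binary data with robust (sandwich) standard errors
fit = smf.glm("y ~ treat + x", data=df, family=sm.families.Poisson()).fit(cov_type="HC1")
rr = np.exp(fit.params["treat"])
ci = np.exp(fit.conf_int().loc["treat"])
print(f"adjusted relative risk = {rr:.3f} (95% CI {ci.iloc[0]:.3f} to {ci.iloc[1]:.3f})")
```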
Procedia PDF Downloads 115
18593 Loan Repayment Prediction Using Machine Learning: Model Development, Django Web Integration and Cloud Deployment
Authors: Seun Mayowa Sunday
Abstract:
Loan prediction is one of the most significant and recognised fields of research in the banking, insurance, and financial security industries. Some prediction systems on the market involve the construction of static software. However, because static software only operates with strictly regulated rules, it cannot aid customers beyond these limitations. The application of several machine learning (ML) techniques is required for loan prediction. Four separate machine learning models, random forest (RF), decision tree (DT), k-nearest neighbour (KNN), and logistic regression, are used to create the loan prediction model. Using Anaconda Navigator and the required machine learning (ML) libraries, the models are created and evaluated using the appropriate measuring metrics. From the findings, the random forest performs with the highest accuracy of 80.17%, and it was later implemented within the Django framework. For real-time testing, the web application is deployed on Alibaba Cloud, which is among the four biggest cloud computing providers. Hence, to the best of our knowledge, this research serves as the first academic paper that combines the model development and the Django framework with deployment to the Alibaba cloud computing platform.
Keywords: k-nearest neighbor, random forest, logistic regression, decision tree, Django, cloud computing, Alibaba Cloud
Procedia PDF Downloads 136
18592 Determining the Causality Variables in Female Genital Mutilation: A Factor Screening Approach
Authors: Ekele Alih, Enejo Jalija
Abstract:
Female Genital Mutilation (FGM) comprises three types, namely clitoridectomy, excision and infibulation. In this study, we examine the factors responsible for FGM in order to identify the causality variables in a logistic regression approach. Using the results of the survey conducted by the Public Health Division, Nigeria Institute of Medical Research, Yaba, Lagos State, the tau statistic, τ, was used to screen nine factors that cause FGM in order to select a few of the predictors before the multiple regression equation is obtained. The need for this may be that the sample size cannot sustain a regression with all the predictors, or to avoid multi-collinearity. A total of 300 respondents, comprising 150 adult males and 150 adult females, were selected for the household survey based on a multi-stage sampling procedure. The tau statistic,
Keywords: female genital mutilation, logistic regression, tau statistic, African society
Procedia PDF Downloads 261
18591 Predictors of School Dropout among High School Students
Authors: Osman Zorbaz, Selen Demirtas-Zorbaz, Ozlem Ulas
Abstract:
The factors that cause adolescents to drop out of school are several. One framework for school dropout focuses on the contextual factors around adolescents, whereas the other focuses on individual factors. It can be said that both sets of factors are equally important. In this study, both the adolescents' individual factors (anti-social behaviors, academic success) and contextual factors (parent academic involvement, parent academic support, number of siblings, living with parents) were examined in terms of school dropout. The study sample consisted of 346 high school students in public schools in Ankara who continued their education in the 2015-2016 academic year. One hundred eighty-five of the students (53.5%) were girls and 161 (46.5%) were boys. In addition, 118 of them were in ninth grade, 122 in tenth grade and 106 in eleventh grade. Multiple regression and one-way ANOVA statistical methods were used. First, it was examined whether the data met the assumptions and conditions required for regression analysis. After checking the assumptions, regression analysis was conducted. The parent academic involvement, parent academic support, number of siblings, anti-social behaviors, and academic success variables were entered into the regression model, and it was seen that parent academic involvement (t=-3.023, p < .01), anti-social behaviors (t=7.038, p < .001), and academic success (t=-3.718, p < .001) predicted school dropout, whereas parent academic support (t=-1.403, p > .05) and number of siblings (t=-1.908, p > .05) did not. The model explained 30% of the variance (R=.557, R²=.300, F(5,345)=30.626, p < .001). In addition, the results showed no significant difference in high school students' dropout levels according to whether or not they lived with their parents (F(2,345)=1.183, p > .05). The results are discussed in the light of the literature, and suggestions are made. In conclusion, academic involvement, academic success and anti-social behaviors should be considered as important factors for preventing school dropout.
Keywords: adolescents, anti-social behavior, parent academic involvement, parent academic support, school dropout
Procedia PDF Downloads 284
18590 Prediction of Marijuana Use among Iranian Early Youth: An Application of Integrative Model of Behavioral Prediction
Authors: Mehdi Mirzaei Alavijeh, Farzad Jalilian
Abstract:
Background: Marijuana is the most widely used illicit drug worldwide, especially among adolescents and young adults, and it can cause numerous complications. The aim of this study was to determine the pattern and motivation of use and the factors related to marijuana use among Iranian youths, based on the integrative model of behavioral prediction. Methods: A cross-sectional study was conducted among 174 young marijuana users in Kermanshah County and Isfahan County during summer 2014, selected by convenience sampling for participation in this study. A self-report questionnaire was applied for collecting data. Data were analyzed with SPSS version 21 using bivariate correlations and linear regression statistical tests. Results: The mean marijuana use of respondents was 4.60 times per week [95% CI: 4.06, 5.15]. Linear regression showed that the constructs of the integrative model of behavioral prediction accounted for 36% of the variation in the outcome measure of weekly marijuana use (R² = 36%, P < 0.001); among them, attitude, marijuana refusal self-efficacy, and subjective norms were the stronger predictors. Conclusion: Comprehensive health education and prevention programs need to emphasize the cognitive factors that predict youths' health-related behaviors. Based on our findings, it seems that designing educational and behavioral interventions for reducing positive beliefs about marijuana, promoting marijuana refusal self-efficacy, and reducing subjective norms that encourage marijuana use has effective potential to protect youths from marijuana use.
Keywords: marijuana, youth, integrative model of behavioral prediction, Iran
Procedia PDF Downloads 554
18589 A Study on Characteristics of Hedonic Price Models in Korea Based on Meta-Regression Analysis
Authors: Minseo Jo
Abstract:
The purpose of this paper is to examine the factors in hedonic price models that have a significant impact on determining the price of apartments. Many variables are employed in hedonic price models, and their effectiveness varies according to the researchers and the regions being analysed. In order to consider various conditions, meta-regression analysis has been selected for the study. In this paper, four meta-independent variables are drawn from 65 hedonic price models for analysis: the factors that influence the prices of apartments, the regions where the research was performed (divided into two groups), the years in which the research was performed, and the coefficients of the functions employed. The covariance between the four meta-variables and the p-values of the coefficients, and between the four meta-variables and the number of data points used in the 65 hedonic price models, has been analyzed in this study. The six factors that are most important in deciding the prices of apartments are the positioning of the apartments, the noise of the apartments, the points of the compass and views from the apartments, proximity to public transportation, the companies that constructed the apartments, and the social environment (such as schools, etc.).
Keywords: hedonic price model, housing price, meta-regression analysis, characteristics
Procedia PDF Downloads 402
18588 Thermal Buckling of Functionally Graded Panel Based on Mori-Tanaka Scheme
Authors: Seok-In Bae, Young-Hoon Lee, Ji-Hwan Kim
Abstract:
Due to the asymmetry of the material properties of Functionally Graded Materials (FGMs) in the thickness direction, the neutral surface of the model is not the same as the mid-plane of a symmetric structure. In order to investigate the thermal buckling behavior of FGMs, the neutral surface is chosen as the reference plane. In the model, the material properties are assumed to be temperature-dependent and to vary continuously in the thickness direction of the plate. Further, the effective material properties, such as Young's modulus and Poisson's ratio, are homogenized using the Mori-Tanaka scheme, which considers the interaction among adjacent inclusions. In this work, the finite element method is used, and the first-order shear deformation theory of plates is adopted. The thermal loads are assumed to have uniform, linear and non-linear distributions through the thickness direction, respectively. Also, the effects of various parameters on the thermal buckling behavior of the FGM panel are discussed in detail.
Keywords: functionally graded plate, thermal buckling analysis, neutral surface
Procedia PDF Downloads 401
18587 Forecast of the Small Wind Turbines Sales with Replacement Purchases and with or without Account of Price Changes
Authors: V. Churkin, M. Lopatin
Abstract:
The purpose of the paper is to estimate the US small wind turbine market potential and forecast small wind turbine sales in the US. The forecasting method is based on the application of the Bass model and the generalized Bass model of innovation diffusion under replacement purchases. In this work, an exponential distribution is used for modeling replacement purchases; the single parameter of this distribution is determined by the average lifetime of small wind turbines. The identification of the model parameters is based on nonlinear regression analysis of the annual sales statistics published by the American Wind Energy Association (AWEA) from 2001 to 2012. The estimate of the US average market potential of small wind turbines (for adoption purchases) without account of price changes is 57080 (confidence interval from 49294 to 64866 at P = 0.95) under an average wind turbine lifetime of 15 years, and 62402 (confidence interval from 54154 to 70648 at P = 0.95) under an average wind turbine lifetime of 20 years. In the first case the explained variance is 90.7%, while in the second it is 91.8%. The effect of wind turbine price changes on sales was estimated using the generalized Bass model. This required a price forecast, for which a polynomial regression function based on the Berkeley Lab statistics was used. The estimate of the US average market potential of small wind turbines (for adoption purchases) in that case is 42542 (confidence interval from 32863 to 52221 at P = 0.95) under an average wind turbine lifetime of 15 years, and 47426 (confidence interval from 36092 to 58760 at P = 0.95) under an average wind turbine lifetime of 20 years. In the first case the explained variance is 95.3%, while in the second it is 95.3%.
Keywords: Bass model, generalized Bass model, replacement purchases, sales forecasting of innovations, statistics of sales of small wind turbines in the United States
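A minimal sketch (not the authors' code) of estimating the basic Bass diffusion parameters by non-linear regression on annual adoption sales: m is the market potential, p the coefficient of innovation, and q the coefficient of imitation. The toy sales figures, starting values and bounds are illustrative assumptions rather than the AWEA statistics, and the replacement-purchase extension is not modelled.

```python
import numpy as np
from scipy.optimize import curve_fit

def bass_annual_sales(t, m, p, q):
    """Annual adoptions implied by the Bass model, m * (F(t) - F(t-1)),
    where F(t) = (1 - exp(-(p+q)t)) / (1 + (q/p) exp(-(p+q)t)).
    Assumes t contains consecutive yearly time points starting at 1."""
    expo = np.exp(-(p + q) * t)
    cdf = (1 - expo) / (1 + (q / p) * expo)
    cdf_prev = np.concatenate(([0.0], cdf[:-1]))
    return m * (cdf - cdf_prev)

# Illustrative yearly small wind turbine sales (units), years since market introduction
t = np.arange(1, 13, dtype=float)           # e.g. 2001-2012
sales = np.array([900, 1200, 1700, 2300, 3100, 4000, 4800, 5200,
                  5000, 4500, 3800, 3100], dtype=float)

(m, p, q), _ = curve_fit(bass_annual_sales, t, sales, p0=[60000, 0.01, 0.4],
                         bounds=([1000, 1e-4, 1e-3], [1e6, 0.5, 1.0]))
fitted = bass_annual_sales(t, m, p, q)
r2 = 1 - np.sum((sales - fitted) ** 2) / np.sum((sales - sales.mean()) ** 2)
print(f"market potential m = {m:.0f} units, p = {p:.4f}, q = {q:.3f}, explained variance = {r2:.3f}")
```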
Procedia PDF Downloads 348