Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 5746

Search results for: step-wise linear regression

5626 A More Powerful Test Procedure for Multiple Hypothesis Testing

Abstract:

We propose a new multiple test called the minPOP test for testing multiple hypotheses simultaneously. Under the assumption that the test statistics are independent, we show that the minPOP test has higher global power than the existing multiple testing methods. We further propose a stepwise multiple-testing procedure based on the minPOP test and two of its modified versions (the Double Truncated and Left Truncated minPOP tests). We show that these multiple tests have strong control of the family-wise error rate (FWER). A method for finding the p-values of the proposed tests after adjusting for multiplicity is also developed. Simulation results show that the Double Truncated and Left Truncated minPOP tests, in general, have a higher number of rejections than the existing multiple testing procedures.

Keywords: multiple test, single-step procedure, stepwise procedure, p-value for multiple testing

Procedia PDF Downloads 24

5625 Measuring Multi-Class Linear Classifier for Image Classification

Authors: Fatma Susilawati Mohamad, Azizah Abdul Manaf, Fadhillah Ahmad, Zarina Mohamad, Wan Suryani Wan Awang

Abstract:

A simple and robust multi-class linear classifier is proposed and implemented. For a pair of classes of the linear boundary, a collection of segments of hyper planes created as perpendicular bisectors of line segments linking centroids of the classes or part of classes. Nearest Neighbor and Linear Discriminant Analysis are compared in the experiments to see the performances of each classifier in discriminating ripeness of oil palm. This paper proposes a multi-class linear classifier using Linear Discriminant Analysis (LDA) for image identification. Result proves that LDA is well capable in separating multi-class features for ripeness identification.

Keywords: multi-class, linear classifier, nearest neighbor, linear discriminant analysis

Procedia PDF Downloads 500

5624 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models

Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti

Abstract:

In the realm of cricket, particularly within the context of the Indian Premier League (IPL), the ability to predict team scores accurately holds significant importance for both cricket enthusiasts and stakeholders alike. This paper presents a comprehensive study on IPL score prediction utilizing various machine learning algorithms, including Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through meticulous data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Leveraging this data, we employed state-of-the-art machine learning techniques to train and evaluate the performance of each model. Notably, Multiple Regression emerged as the top-performing algorithm, achieving an impressive accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to the advancement of sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of advanced predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research endeavors aimed at enhancing the accuracy and interpretability of IPL score prediction models.

Keywords: indian premier league (IPL), cricket, score prediction, machine learning, support vector machines (SVM), xgboost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics

Procedia PDF Downloads 9

5623 Sensitivity Analysis in Fuzzy Linear Programming Problems

Authors: S. H. Nasseri, A. Ebrahimnejad

Abstract:

Fuzzy set theory has been applied to many fields, such as operations research, control theory, and management sciences. In this paper, we consider two classes of fuzzy linear programming (FLP) problems: Fuzzy number linear programming and linear programming with trapezoidal fuzzy variables problems. We state our recently established results and develop fuzzy primal simplex algorithms for solving these problems. Finally, we give illustrative examples.

Keywords: fuzzy linear programming, fuzzy numbers, duality, sensitivity analysis

Procedia PDF Downloads 525

5622 Factors Related with Self-Care Behaviors among Iranian Type 2 Diabetic Patients: An Application of Health Belief Model

Authors: Ali Soroush, Mehdi Mirzaei Alavijeh, Touraj Ahmadi Jouybari, Fazel Zinat-Motlagh, Abbas Aghaei, Mari Ataee

Abstract:

Diabetes is a disease with long cardiovascular, renal, ophthalmic and neural complications. It is prevalent all around the world including Iran, and its prevalence is increasing. The aim of this study was to determine the factors related to self-care behavior based on health belief model among sample of Iranian diabetic patients. This cross-sectional study was conducted among 301 type 2 diabetic patients in Gachsaran, Iran. Data collection was based on an interview and the data were analyzed by SPSS version 20 using ANOVA, t-tests, Pearson correlation, and linear regression statistical tests at 95% significant level. Linear regression analyses showed the health belief model variables accounted for 29% of the variation in self-care behavior; and perceived severity and perceived self-efficacy are more influential predictors on self-care behavior among diabetic patients.

Keywords: diabetes, patients, self-care behaviors, health belief model

Procedia PDF Downloads 437

5621 Prediction of Energy Storage Areas for Static Photovoltaic System Using Irradiation and Regression Modelling

Authors: Kisan Sarda, Bhavika Shingote

Abstract:

This paper aims to evaluate regression modelling for prediction of Energy storage of solar photovoltaic (PV) system using Semi parametric regression techniques because there are some parameters which are known while there are some unknown parameters like humidity, dust etc. Here irradiation of solar energy is different for different places on the basis of Latitudes, so by finding out areas which give more storage we can implement PV systems at those places and our need of energy will be fulfilled. This regression modelling is done for daily, monthly and seasonal prediction of solar energy storage. In this, we have used R modules for designing the algorithm. This algorithm will give the best comparative results than other regression models for the solar PV cell energy storage.

Keywords: semi parametric regression, photovoltaic (PV) system, regression modelling, irradiation

Procedia PDF Downloads 348

5620 Robust Inference with a Skew T Distribution

Authors: M. Qamarul Islam, Ergun Dogan, Mehmet Yazici

Abstract:

There is a growing body of evidence that non-normal data is more prevalent in nature than the normal one. Examples can be quoted from, but not restricted to, the areas of Economics, Finance and Actuarial Science. The non-normality considered here is expressed in terms of fat-tailedness and asymmetry of the relevant distribution. In this study a skew t distribution that can be used to model a data that exhibit inherent non-normal behavior is considered. This distribution has tails fatter than a normal distribution and it also exhibits skewness. Although maximum likelihood estimates can be obtained by solving iteratively the likelihood equations that are non-linear in form, this can be problematic in terms of convergence and in many other respects as well. Therefore, it is preferred to use the method of modified maximum likelihood in which the likelihood estimates are derived by expressing the intractable non-linear likelihood equations in terms of standardized ordered variates and replacing the intractable terms by their linear approximations obtained from the first two terms of a Taylor series expansion about the quantiles of the distribution. These estimates, called modified maximum likelihood estimates, are obtained in closed form. Hence, they are easy to compute and to manipulate analytically. In fact the modified maximum likelihood estimates are equivalent to maximum likelihood estimates, asymptotically. Even in small samples the modified maximum likelihood estimates are found to be approximately the same as maximum likelihood estimates that are obtained iteratively. It is shown in this study that the modified maximum likelihood estimates are not only unbiased but substantially more efficient than the commonly used moment estimates or the least square estimates that are known to be biased and inefficient in such cases. Furthermore, in conventional regression analysis, it is assumed that the error terms are distributed normally and, hence, the well-known least square method is considered to be a suitable and preferred method for making the relevant statistical inferences. However, a number of empirical researches have shown that non-normal errors are more prevalent. Even transforming and/or filtering techniques may not produce normally distributed residuals. Here, a study is done for multiple linear regression models with random error having non-normal pattern. Through an extensive simulation it is shown that the modified maximum likelihood estimates of regression parameters are plausibly robust to the distributional assumptions and to various data anomalies as compared to the widely used least square estimates. Relevant tests of hypothesis are developed and are explored for desirable properties in terms of their size and power. The tests based upon modified maximum likelihood estimates are found to be substantially more powerful than the tests based upon least square estimates. Several examples are provided from the areas of Economics and Finance where such distributions are interpretable in terms of efficient market hypothesis with respect to asset pricing, portfolio selection, risk measurement and capital allocation, etc.

Keywords: least square estimates, linear regression, maximum likelihood estimates, modified maximum likelihood method, non-normality, robustness

Procedia PDF Downloads 377

5619 Student Loan Debt among Students with Disabilities

Authors: Kaycee Bills

Abstract:

This study will determine if students with disabilities have higher student loan debt payments than other student populations. The hypothesis was that students with disabilities would have significantly higher student loan debt payments than other students due to the length of time they spend in school. Using the Bachelorette and Beyond Study Wave 2015/017 dataset, quantitative methods were employed. These data analysis methods included linear regression and a correlation matrix. Due to the exploratory nature of the study, the significance levels for the overall model and each variable were set at .05. The correlation matrix demonstrated that students with certain types of disabilities are more likely to fall under higher student loan payment brackets than students without disabilities. These results also varied among the different types of disabilities. The result of the overall linear regression model was statistically significant (p = .04). Despite the overall model being statistically significant, the majority of the significance values for the different types of disabilities were null. However, several other variables had statistically significant results, such as veterans, people of minority races, and people who attended private schools. Implications for how this impacts the economy, capitalism, and financial wellbeing of various students are discussed.

Keywords: disability, student loan debt, higher education, social work

Procedia PDF Downloads 145

5618 Analysis of Risk Factors Affecting the Motor Insurance Pricing with Generalized Linear Models

Authors: Puttharapong Sakulwaropas, Uraiwan Jaroengeratikun

Abstract:

Casualty insurance business, the optimal premium pricing and adequate cost for an insurance company are important in risk management. Normally, the insurance pure premium can be determined by multiplying the claim frequency with the claim cost. The aim of this research was to study in the application of generalized linear models to select the risk factor for model of claim frequency and claim cost for estimating a pure premium. In this study, the data set was the claim of comprehensive motor insurance, which was provided by one of the insurance company in Thailand. The results of this study found that the risk factors significantly related to pure premium at the 0.05 level consisted of no claim bonus (NCB) and used of the car (Car code).

Keywords: generalized linear models, risk factor, pure premium, regression model

Procedia PDF Downloads 437

5617 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race

Authors: Joonas Pääkkönen

Abstract:

In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.

Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling

Procedia PDF Downloads 98

5616 On the Performance of Improvised Generalized M-Estimator in the Presence of High Leverage Collinearity Enhancing Observations

Authors: Habshah Midi, Mohammed A. Mohammed, Sohel Rana

Abstract:

Multicollinearity occurs when two or more independent variables in a multiple linear regression model are highly correlated. The ridge regression is the commonly used method to rectify this problem. However, the ridge regression cannot handle the problem of multicollinearity which is caused by high leverage collinearity enhancing observation (HLCEO). Since high leverage points (HLPs) are responsible for inducing multicollinearity, the effect of HLPs needs to be reduced by using Generalized M estimator. The existing GM6 estimator is based on the Minimum Volume Ellipsoid (MVE) which tends to swamp some low leverage points. Hence an improvised GM (MGM) estimator is presented to improve the precision of the GM6 estimator. Numerical example and simulation study are presented to show how HLPs can cause multicollinearity. The numerical results show that our MGM estimator is the most efficient method compared to some existing methods.

Keywords: identification, high leverage points, multicollinearity, GM-estimator, DRGP, DFFITS

Procedia PDF Downloads 219

5615 Analyzing the Influence of Hydrometeorlogical Extremes, Geological Setting, and Social Demographic on Public Health

Authors: Irfan Ahmad Afip

Abstract:

This main research objective is to accurately identify the possibility for a Leptospirosis outbreak severity of a certain area based on its input features into a multivariate regression model. The research question is the possibility of an outbreak in a specific area being influenced by this feature, such as social demographics and hydrometeorological extremes. If the occurrence of an outbreak is being subjected to these features, then the epidemic severity for an area will be different depending on its environmental setting because the features will influence the possibility and severity of an outbreak. Specifically, this research objective was three-fold, namely: (a) to identify the relevant multivariate features and visualize the patterns data, (b) to develop a multivariate regression model based from the selected features and determine the possibility for Leptospirosis outbreak in an area, and (c) to compare the predictive ability of multivariate regression model and machine learning algorithms. Several secondary data features were collected locations in the state of Negeri Sembilan, Malaysia, based on the possibility it would be relevant to determine the outbreak severity in the area. The relevant features then will become an input in a multivariate regression model; a linear regression model is a simple and quick solution for creating prognostic capabilities. A multivariate regression model has proven more precise prognostic capabilities than univariate models. The expected outcome from this research is to establish a correlation between the features of social demographic and hydrometeorological with Leptospirosis bacteria; it will also become a contributor for understanding the underlying relationship between the pathogen and the ecosystem. The relationship established can be beneficial for the health department or urban planner to inspect and prepare for future outcomes in event detection and system health monitoring.

Keywords: geographical information system, hydrometeorological, leptospirosis, multivariate regression

Procedia PDF Downloads 80

5614 Application Difference between Cox and Logistic Regression Models

Authors: Idrissa Kayijuka

Abstract:

The logistic regression and Cox regression models (proportional hazard model) at present are being employed in the analysis of prospective epidemiologic research looking into risk factors in their application on chronic diseases. However, a theoretical relationship between the two models has been studied. By definition, Cox regression model also called Cox proportional hazard model is a procedure that is used in modeling data regarding time leading up to an event where censored cases exist. Whereas the Logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values while the resultant variable is binary (dichotomous). Arguments and findings of many researchers focused on the overview of Cox and Logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data whose source is SPSS exercise data on BREAST CANCER with a sample size of 1121 women where the main objective is to show the application difference between Cox regression model and logistic regression model based on factors that cause women to die due to breast cancer. Thus we did some analysis manually i.e. on lymph nodes status, and SPSS software helped to analyze the mentioned data. This study found out that there is an application difference between Cox and Logistic regression models which is Cox regression model is used if one wishes to analyze data which also include the follow-up time whereas Logistic regression model analyzes data without follow-up-time. Also, they have measurements of association which is different: hazard ratio and odds ratio for Cox and logistic regression models respectively. A similarity between the two models is that they are both applicable in the prediction of the upshot of a categorical variable i.e. a variable that can accommodate only a restricted number of categories. In conclusion, Cox regression model differs from logistic regression by assessing a rate instead of proportion. The two models can be applied in many other researches since they are suitable methods for analyzing data but the more recommended is the Cox, regression model.

Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio

Procedia PDF Downloads 421

5613 Optimizing Nitrogen Fertilizer Application in Rice Cultivation: A Decision Model for Top and Ear Dressing Dosages

Authors: Ya-Li Tsai

Abstract:

Nitrogen is a vital element crucial for crop growth, significantly influencing crop yield. In rice cultivation, farmers often apply substantial nitrogen fertilizer to maximize yields. However, excessive nitrogen application increases the risk of lodging and pest infestation, leading to yield losses. Additionally, conventional flooded irrigation methods consume significant water resources, necessitating precise agricultural and intelligent water management systems. In this study, it leveraged physiological data and field images captured by unmanned aerial vehicles, considering fertilizer treatment and irrigation as key factors. Statistical models incorporating rice physiological data, yield, and vegetation indices from image data were developed. Missing physiological data were addressed using multiple imputation and regression methods, and regression models were established using principal component analysis and stepwise regression. Target nitrogen accumulation at key growth stages was identified to optimize fertilizer application, with the difference between actual and target nitrogen accumulation guiding recommendations for ear dressing dosage. Field experiments conducted in 2022 validated the recommended ear dressing dosage, demonstrating no significant difference in final yield compared to traditional fertilizer levels under alternate wetting and drying irrigation. These findings highlight the efficacy of applying recommended dosages based on fertilizer decision models, offering the potential for reduced fertilizer use while maintaining yield in rice cultivation.

Keywords: intelligent fertilizer management, nitrogen top and ear dressing fertilizer, rice, yield optimization

Procedia PDF Downloads 22

5612 Exploration and Evaluation of the Effect of Multiple Countermeasures on Road Safety

Authors: Atheer Al-Nuaimi, Harry Evdorides

Abstract:

Every day many people die or get disabled or injured on roads around the world, which necessitates more specific treatments for transportation safety issues. International road assessment program (iRAP) model is one of the comprehensive road safety models which accounting for many factors that affect road safety in a cost-effective way in low and middle income countries. In iRAP model road safety has been divided into five star ratings from 1 star (the lowest level) to 5 star (the highest level). These star ratings are based on star rating score which is calculated by iRAP methodology depending on road attributes, traffic volumes and operating speeds. The outcome of iRAP methodology are the treatments that can be used to improve road safety and reduce fatalities and serious injuries (FSI) numbers. These countermeasures can be used separately as a single countermeasure or mix as multiple countermeasures for a location. There is general agreement that the adequacy of a countermeasure is liable to consistent losses when it is utilized as a part of mix with different countermeasures. That is, accident diminishment appraisals of individual countermeasures cannot be easily added together. The iRAP model philosophy makes utilization of a multiple countermeasure adjustment factors to predict diminishments in the effectiveness of road safety countermeasures when more than one countermeasure is chosen. A multiple countermeasure correction factors are figured for every 100-meter segment and for every accident type. However, restrictions of this methodology incorporate a presumable over-estimation in the predicted crash reduction. This study aims to adjust this correction factor by developing new models to calculate the effect of using multiple countermeasures on the number of fatalities for a location or an entire road. Regression models have been used to establish relationships between crash frequencies and the factors that affect their rates. Multiple linear regression, negative binomial regression, and Poisson regression techniques were used to develop models that can address the effectiveness of using multiple countermeasures. Analyses are conducted using The R Project for Statistical Computing showed that a model developed by negative binomial regression technique could give more reliable results of the predicted number of fatalities after the implementation of road safety multiple countermeasures than the results from iRAP model. The results also showed that the negative binomial regression approach gives more precise results in comparison with multiple linear and Poisson regression techniques because of the overdispersion and standard error issues.

Keywords: international road assessment program, negative binomial, road multiple countermeasures, road safety

Procedia PDF Downloads 210

5611 Stock Market Prediction by Regression Model with Social Moods

Authors: Masahiro Ohmura, Koh Kakusho, Takeshi Okadome

Abstract:

This paper presents a regression model with autocorrelated errors in which the inputs are social moods obtained by analyzing the adjectives in Twitter posts using a document topic model. The regression model predicts Dow Jones Industrial Average (DJIA) more precisely than autoregressive moving-average models.

Keywords: stock market prediction, social moods, regression model, DJIA

Procedia PDF Downloads 518

5610 Survival Analysis Based Delivery Time Estimates for Display FAB

Authors: Paul Han, Jun-Geol Baek

Abstract:

In the flat panel display industry, the scheduler and dispatching system to meet production target quantities and the deadline of production are the major production management system which controls each facility production order and distribution of WIP (Work in Process). In dispatching system, delivery time is a key factor for the time when a lot can be supplied to the facility. In this paper, we use survival analysis methods to identify main factors and a forecasting model of delivery time. Of survival analysis techniques to select important explanatory variables, the cox proportional hazard model is used to. To make a prediction model, the Accelerated Failure Time (AFT) model was used. Performance comparisons were conducted with two other models, which are the technical statistics model based on transfer history and the linear regression model using same explanatory variables with AFT model. As a result, the Mean Square Error (MSE) criteria, the AFT model decreased by 33.8% compared to the existing prediction model, decreased by 5.3% compared to the linear regression model. This survival analysis approach is applicable to implementing a delivery time estimator in display manufacturing. And it can contribute to improve the productivity and reliability of production management system.

Keywords: delivery time, survival analysis, Cox PH model, accelerated failure time model

Procedia PDF Downloads 502

5609 Fed-Batch Mixotrophic Cultivation of Microalgae Scenedesmus sp., Using Airlift Photobioreactor

Authors: Lakshmidevi Rajendran, Bharathidasan Kanniappan, Gopi Raja, Muthukumar Karuppan

Abstract:

This study investigates the feasibility of fed-batch mixotrophic cultivation of microalgae Scenedesmus sp. in a 3-litre airlift photobioreactor under standard operating conditions. The results of this study suggest the algae species may serve as an excellent feed for aquatic species using organic byproducts. Microalgae Scenedesmus sp., was cultured using a synthetic wastewater by stepwise addition of crude glycerol concentration ranging from 2-10g/l under fed-batch mixotrophic mode for a period of 15 days. The attempts were made with the stepwise addition of crude glycerol as a carbon source in the initial growth phase to evade the inhibitory nature of high glycerol concentration on the growth of Scenedesmus sp. Crude glycerol was chosen since it is readily accessible as byproduct from biodiesel production sectors. Highest biomass concentration was achieved to be 2.43 g/l at the crude glycerol concentration of 6g/l after 10 days which is 3 fold times the increase in the biomass concentration compared with the control medium without the addition of glycerol. Biomass growth data obtained for the microalgae Scenedesmus sp. was fitted well with the modified Logistic equation. Substrate utilization kinetics was also employed to model the biomass productivity with respect to the various crude glycerol concentration. The results indicated that the supplement of crude glycerol to the mixotrophic culture of Scenedesmus sp., enhances the biomass concentration, chlorophyll and lutein productivity. Thus the application of fed-batch mixotrophic cultivation with stepwise addition of crude glycerol to Scenedesmus sp., provides a subtle way to reduce the production cost and improvisation in the large-scale cultivation along with biochemical compound synthesis.

Keywords: airlift photobioreactor, crude glycerol, microalgae Scenedesmus sp., mixotrophic cultivation, lutein production

Procedia PDF Downloads 148

5608 Comparing Skill, Employment, and Productivity of Industrial City Case Study: Bekasi Industrial Area and Special Economic Zone Sei Mangkei

Authors: Auliya Adzillatin Uzhma, M. Adrian Rizky, Puri Diah Santyarini

Abstract:

Bekasi Industrial Area in Kab. Bekasi and SEZ (Special Economic Zone) Sei Mangkei in Kab. Simalungun are two areas whose have the same main economic activity that are manufacturing industrial. Manufacturing industry in Bekasi Industrial Area contributes more than 70% of Kab. Bekasi’s GDP, while manufacturing industry in SEZ Sei Mangkei contributes less than 20% of Kab. Simalungun’s GDP. The dependent variable in the research is labor productivity, while the independent variable is the amount of labor, the level of labor education, the length of work and salary. This research used linear regression method to find the model for represent actual condition of productivity in two industrial area, then the equalization using dummy variable on labor education level variable. The initial hypothesis (Ho) in this research is that labor productivity in Bekasi Industrial Area will be higher than the productivity of labor in SEZ Sei Mangkei. The variable that supporting the accepted hypothesis are more labor, higher education, longer work and higher salary in Bekasi Industrial Area.

Keywords: labor, industrial city, linear regression, productivity

Procedia PDF Downloads 150

5607 Artificial Neural Network and Statistical Method

Authors: Tomas Berhanu Bekele

Abstract:

Traffic congestion is one of the main problems related to transportation in developed as well as developing countries. Traffic control systems are based on the idea of avoiding traffic instabilities and homogenizing traffic flow in such a way that the risk of accidents is minimized and traffic flow is maximized. Lately, Intelligent Transport Systems (ITS) has become an important area of research to solve such road traffic-related issues for making smart decisions. It links people, roads and vehicles together using communication technologies to increase safety and mobility. Moreover, accurate prediction of road traffic is important to manage traffic congestion. The aim of this study is to develop an ANN model for the prediction of traffic flow and to compare the ANN model with the linear regression model of traffic flow predictions. Data extraction was carried out in intervals of 15 minutes from the video player. Video of mixed traffic flow was taken and then counted during office work in order to determine the traffic volume. Vehicles were classified into six categories, namely Car, Motorcycle, Minibus, mid-bus, Bus, and Truck vehicles. The average time taken by each vehicle type to travel the trap length was measured by time displayed on a video screen.

Keywords: intelligent transport system (ITS), traffic flow prediction, artificial neural network (ANN), linear regression

Procedia PDF Downloads 17

5606 Evaluating Traffic Congestion Using the Bayesian Dirichlet Process Mixture of Generalized Linear Models

Authors: Ren Moses, Emmanuel Kidando, Eren Ozguven, Yassir Abdelrazig

Abstract:

This study applied traffic speed and occupancy to develop clustering models that identify different traffic conditions. Particularly, these models are based on the Dirichlet Process Mixture of Generalized Linear regression (DML) and change-point regression (CR). The model frameworks were implemented using 2015 historical traffic data aggregated at a 15-minute interval from an Interstate 295 freeway in Jacksonville, Florida. Using the deviance information criterion (DIC) to identify the appropriate number of mixture components, three traffic states were identified as free-flow, transitional, and congested condition. Results of the DML revealed that traffic occupancy is statistically significant in influencing the reduction of traffic speed in each of the identified states. Influence on the free-flow and the congested state was estimated to be higher than the transitional flow condition in both evening and morning peak periods. Estimation of the critical speed threshold using CR revealed that 47 mph and 48 mph are speed thresholds for congested and transitional traffic condition during the morning peak hours and evening peak hours, respectively. Free-flow speed thresholds for morning and evening peak hours were estimated at 64 mph and 66 mph, respectively. The proposed approaches will facilitate accurate detection and prediction of traffic congestion for developing effective countermeasures.

Keywords: traffic congestion, multistate speed distribution, traffic occupancy, Dirichlet process mixtures of generalized linear model, Bayesian change-point detection

Procedia PDF Downloads 262

5605 A Spectral Decomposition Method for Ordinary Differential Equation Systems with Constant or Linear Right Hand Sides

Authors: R. B. Ogunrinde, C. C. Jibunoh

Abstract:

In this paper, a spectral decomposition method is developed for the direct integration of stiff and nonstiff homogeneous linear (ODE) systems with linear, constant, or zero right hand sides (RHSs). The method does not require iteration but obtains solutions at any random points of t, by direct evaluation, in the interval of integration. All the numerical solutions obtained for the class of systems coincide with the exact theoretical solutions. In particular, solutions of homogeneous linear systems, i.e. with zero RHS, conform to the exact analytical solutions of the systems in terms of t.

Keywords: spectral decomposition, linear RHS, homogeneous linear systems, eigenvalues of the Jacobian

Procedia PDF Downloads 303

5604 Model-Based Software Regression Test Suite Reduction

Authors: Shiwei Deng, Yang Bao

Abstract:

In this paper, we present a model-based regression test suite reducing approach that uses EFSM model dependence analysis and probability-driven greedy algorithm to reduce software regression test suites. The approach automatically identifies the difference between the original model and the modified model as a set of elementary model modifications. The EFSM dependence analysis is performed for each elementary modification to reduce the regression test suite, and then the probability-driven greedy algorithm is adopted to select the minimum set of test cases from the reduced regression test suite that cover all interaction patterns. Our initial experience shows that the approach may significantly reduce the size of regression test suites.

Keywords: dependence analysis, EFSM model, greedy algorithm, regression test

Procedia PDF Downloads 397

5603 Growth Curves Genetic Analysis of Native South Caspian Sea Poultry Using Bayesian Statistics

Authors: Jamal Fayazi, Farhad Anoosheh, Mohammad R. Ghorbani, Ali R. Paydar

Abstract:

In this study, to determine the best non-linear regression model describing the growth curve of native poultry, 9657 chicks of generations 18, 19, and 20 raised in Mazandaran breeding center were used. Fowls and roosters of this center distributed in south of Caspian Sea region. To estimate the genetic variability of none linear regression parameter of growth traits, a Gibbs sampling of Bayesian analysis was used. The average body weight traits in the first day (BW1), eighth week (BW8) and twelfth week (BW12) were respectively estimated as 36.05, 763.03, and 1194.98 grams. Based on the coefficient of determination, mean squares of error and Akaike information criteria, Gompertz model was selected as the best growth descriptive function. In Gompertz model, the estimated values for the parameters of maturity weight (A), integration constant (B) and maturity rate (K) were estimated to be 1734.4, 3.986, and 0.282, respectively. The direct heritability of BW1, BW8 and BW12 were respectively reported to be as 0.378, 0.3709, 0.316, 0.389, 0.43, 0.09 and 0.07. With regard to estimated parameters, the results of this study indicated that there is a possibility to improve some property of growth curve using appropriate selection programs.

Keywords: direct heritability, Gompertz, growth traits, maturity weight, native poultry

Procedia PDF Downloads 226

5602 Single Imputation for Audiograms

Authors: Sarah Beaver, Renee Bryce

Abstract:

Audiograms detect hearing impairment, but missing values pose problems. This work explores imputations in an attempt to improve accuracy. This work implements Linear Regression, Lasso, Linear Support Vector Regression, Bayesian Ridge, K Nearest Neighbors (KNN), and Random Forest machine learning techniques to impute audiogram frequencies ranging from 125Hz to 8000Hz. The data contains patients who had or were candidates for cochlear implants. Accuracy is compared across two different Nested Cross-Validation k values. Over 4000 audiograms were used from 800 unique patients. Additionally, training on data combines and compares left and right ear audiograms versus single ear side audiograms. The accuracy achieved using Root Mean Square Error (RMSE) values for the best models for Random Forest ranges from 4.74 to 6.37. The R\textsuperscript{2} values for the best models for Random Forest ranges from .91 to .96. The accuracy achieved using RMSE values for the best models for KNN ranges from 5.00 to 7.72. The R\textsuperscript{2} values for the best models for KNN ranges from .89 to .95. The best imputation models received R\textsuperscript{2} between .89 to .96 and RMSE values less than 8dB. We also show that the accuracy of classification predictive models performed better with our best imputation models versus constant imputations by a two percent increase.

Keywords: machine learning, audiograms, data imputations, single imputations

Procedia PDF Downloads 51

5601 Segmentation of Piecewise Polynomial Regression Model by Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

Piecewise polynomial regression model is very flexible model for modeling the data. If the piecewise polynomial regression model is matched against the data, its parameters are not generally known. This paper studies the parameter estimation problem of piecewise polynomial regression model. The method which is used to estimate the parameters of the piecewise polynomial regression model is Bayesian method. Unfortunately, the Bayes estimator cannot be found analytically. Reversible jump MCMC algorithm is proposed to solve this problem. Reversible jump MCMC algorithm generates the Markov chain that converges to the limit distribution of the posterior distribution of piecewise polynomial regression model parameter. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of piecewise polynomial regression model.

Keywords: piecewise regression, bayesian, reversible jump MCMC, segmentation

Procedia PDF Downloads 338

5600 Age Estimation from Upper Anterior Teeth by Pulp/Tooth Ratio Using Peri-Apical X-Rays among Egyptians

Authors: Fatma Mohamed Magdy Badr El Dine, Amr Mohamed Abd Allah

Abstract:

Introduction: Age estimation of individuals is one of the crucial steps in forensic practice. Different traditional methods rely on the length of the diaphysis of long bones of limbs, epiphyseal-diaphyseal union, fusion of the primary ossification centers as well as dental eruption. However, there is a growing need for the development of precise and reliable methods to estimate age, especially in cases where dismembered corpses, burnt bodies, purified or fragmented parts are recovered. Teeth are the hardest and indestructible structure in the human body. In recent years, assessment of pulp/tooth area ratio, as an indirect quantification of secondary dentine deposition has received a considerable attention. However, scanty work has been done in Egypt in terms of applicability of pulp/tooth ratio for age estimation. Aim of the Work: The present work was designed to assess the Cameriere’s method for age estimation from pulp/tooth ratio of maxillary canines, central and lateral incisors among a sample from Egyptian population. In addition, to formulate regression equations to be used as population-based standards for age determination. Material and Methods: The present study was conducted on 270 peri-apical X-rays of maxillary canines, central and lateral incisors (collected from 131 males and 139 females aged between 19 and 52 years). The pulp and tooth areas were measured using the Adobe Photoshop software program and the pulp/tooth area ratio was computed. Linear regression equations were determined separately for canines, central and lateral incisors. Results: A significant correlation was recorded between the pulp/tooth area ratio and the chronological age. The linear regression analysis revealed a coefficient of determination (R² = 0.824 for canine, 0.588 for central incisor and 0.737 for lateral incisor teeth). Three regression equations were derived. Conclusion: As a conclusion, the pulp/tooth ratio is a useful technique for estimating age among Egyptians. Additionally, the regression equation derived from canines gave better result than the incisors.

Keywords: age determination, canines, central incisors, Egypt, lateral incisors, pulp/tooth ratio

Procedia PDF Downloads 154

5599 A Stepwise Approach to Automate the Search for Optimal Parameters in Seasonal ARIMA Models

Authors: Manisha Mukherjee, Diptarka Saha

Abstract:

Reliable forecasts of univariate time series data are often necessary for several contexts. ARIMA models are quite popular among practitioners in this regard. Hence, choosing correct parameter values for ARIMA is a challenging yet imperative task. Thus, a stepwise algorithm is introduced to provide automatic and robust estimates for parameters (p; d; q)(P; D; Q) used in seasonal ARIMA models. This process is focused on improvising the overall quality of the estimates, and it alleviates the problems induced due to the unidimensional nature of the methods that are currently used such as auto.arima. The fast and automated search of parameter space also ensures reliable estimates of the parameters that possess several desirable qualities, consequently, resulting in higher test accuracy especially in the cases of noisy data. After vigorous testing on real as well as simulated data, the algorithm doesn’t only perform better than current state-of-the-art methods, it also completely obviates the need for human intervention due to its automated nature.

Keywords: time series, ARIMA, auto.arima, ARIMA parameters, forecast, R function

Procedia PDF Downloads 129

5598 Assessing Relationships between Glandularity and Gray Level by Using Breast Phantoms

Authors: Yun-Xuan Tang, Pei-Yuan Liu, Kun-Mu Lu, Min-Tsung Tseng, Liang-Kuang Chen, Yuh-Feng Tsai, Ching-Wen Lee, Jay Wu

Abstract:

Breast cancer is predominant of malignant tumors in females. The increase in the glandular density increases the risk of breast cancer. BI-RADS is a frequently used density indicator in mammography; however, it significantly overestimates the glandularity. Therefore, it is very important to accurately and quantitatively assess the glandularity by mammography. In this study, 20%, 30% and 50% glandularity phantoms were exposed using a mammography machine at 28, 30 and 31 kVp, and 30, 55, 80 and 105 mAs, respectively. The regions of interest (ROIs) were drawn to assess the gray level. The relationship between the glandularity and gray level under various compression thicknesses, kVp, and mAs was established by the multivariable linear regression. A phantom verification was performed with automatic exposure control (AEC). The regression equation was obtained with an R-square value of 0.928. The average gray levels of the verification phantom were 8708, 8660 and 8434 for 0.952, 0.963 and 0.985 g/cm3, respectively. The percent differences of glandularity to the regression equation were 3.24%, 2.75% and 13.7%. We concluded that the proposed method could be clinically applied in mammography to improve the glandularity estimation and further increase the importance of breast cancer screening.

Keywords: mammography, glandularity, gray value, BI-RADS

Procedia PDF Downloads 463

5597 The Relationship Between Hourly Compensation and Unemployment Rate Using the Panel Data Regression Analysis

Authors: S. K. Ashiquer Rahman

Abstract:

the paper concentrations on the importance of hourly compensation, emphasizing the significance of the unemployment rate. There are the two most important factors of a nation these are its unemployment rate and hourly compensation. These are not merely statistics but they have profound effects on individual, families, and the economy. They are inversely related to one another. When we consider the unemployment rate that will probably decline as hourly compensations in manufacturing rise. But when we reduced the unemployment rates and increased job prospects could result from higher compensation. That’s why, the increased hourly compensation in the manufacturing sector that could have a favorable effect on job changing issues. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, we use panel data regression models to evaluate the expected link between hourly compensation and unemployment rate in order to determine the effect of hourly compensation on unemployment rate. We estimate the fixed effects model, evaluate the error components, and determine which model (the FEM or ECM) is better by pooling all 60 observations. We then analysis and review the data by comparing 3 several countries (United States, Canada and the United Kingdom) using panel data regression models. Finally, we provide result, analysis and a summary of the extensive research on how the hourly compensation effects on the unemployment rate. Additionally, this paper offers relevant and useful informational to help the government and academic community use an econometrics and social approach to lessen on the effect of the hourly compensation on Unemployment rate to eliminate the problem.

Keywords: hourly compensation, Unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, the linear regression model

Procedia PDF Downloads 34