Search results for: multilevel regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3280

Search results for: multilevel regression

3100 Developing Variable Repetitive Group Sampling Control Chart Using Regression Estimator

Authors: Liaquat Ahmad, Muhammad Aslam, Muhammad Azam

Abstract:

In this article, we propose a control chart based on repetitive group sampling scheme for the location parameter. This charting scheme is based on the regression estimator; an estimator that capitalize the relationship between the variables of interest to provide more sensitive control than the commonly used individual variables. The control limit coefficients have been estimated for different sample sizes for less and highly correlated variables. The monitoring of the production process is constructed by adopting the procedure of the Shewhart’s x-bar control chart. Its performance is verified by the average run length calculations when the shift occurs in the average value of the estimator. It has been observed that the less correlated variables have rapid false alarm rate.

Keywords: average run length, control charts, process shift, regression estimators, repetitive group sampling

Procedia PDF Downloads 549
3099 BART Matching Method: Using Bayesian Additive Regression Tree for Data Matching

Authors: Gianna Zou

Abstract:

Propensity score matching (PSM), introduced by Paul R. Rosenbaum and Donald Rubin in 1983, is a popular statistical matching technique which tries to estimate the treatment effects by taking into account covariates that could impact the efficacy of study medication in clinical trials. PSM can be used to reduce the bias due to confounding variables. However, PSM assumes that the response values are normally distributed. In some cases, this assumption may not be held. In this paper, a machine learning method - Bayesian Additive Regression Tree (BART), is used as a more robust method of matching. BART can work well when models are misspecified since it can be used to model heterogeneous treatment effects. Moreover, it has the capability to handle non-linear main effects and multiway interactions. In this research, a BART Matching Method (BMM) is proposed to provide a more reliable matching method over PSM. By comparing the analysis results from PSM and BMM, BMM can perform well and has better prediction capability when the response values are not normally distributed.

Keywords: BART, Bayesian, matching, regression

Procedia PDF Downloads 135
3098 The Relationship Between Hourly Compensation and Unemployment Rate Using the Panel Data Regression Analysis

Authors: S. K. Ashiquer Rahman

Abstract:

the paper concentrations on the importance of hourly compensation, emphasizing the significance of the unemployment rate. There are the two most important factors of a nation these are its unemployment rate and hourly compensation. These are not merely statistics but they have profound effects on individual, families, and the economy. They are inversely related to one another. When we consider the unemployment rate that will probably decline as hourly compensations in manufacturing rise. But when we reduced the unemployment rates and increased job prospects could result from higher compensation. That’s why, the increased hourly compensation in the manufacturing sector that could have a favorable effect on job changing issues. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, we use panel data regression models to evaluate the expected link between hourly compensation and unemployment rate in order to determine the effect of hourly compensation on unemployment rate. We estimate the fixed effects model, evaluate the error components, and determine which model (the FEM or ECM) is better by pooling all 60 observations. We then analysis and review the data by comparing 3 several countries (United States, Canada and the United Kingdom) using panel data regression models. Finally, we provide result, analysis and a summary of the extensive research on how the hourly compensation effects on the unemployment rate. Additionally, this paper offers relevant and useful informational to help the government and academic community use an econometrics and social approach to lessen on the effect of the hourly compensation on Unemployment rate to eliminate the problem.

Keywords: hourly compensation, Unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, the linear regression model

Procedia PDF Downloads 61
3097 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously; there is a large number of kinetic parameters involved and sometimes the chemical and physical phenomena for mixtures involving polymers are poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of machine learning field. They have the ability to learn the trends of a process without any knowledge about its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which can select more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.

Keywords: batch bulk methyl methacrylate polymerization, adaptive sampling, machine learning, large margin nearest neighbor regression

Procedia PDF Downloads 294
3096 Investigating Role of Traumatic Events in a Pakistani Sample

Authors: Khadeeja Munawar, Shamsul Haque

Abstract:

The claim that traumatic events influence the recalled memories and mental health has received mixed empirical support. This study examines the memories of a sample drawn from Pakistan, a country that has witnessed many life-changing socio-political events, wars, and natural disasters in 72 years of its history. A sample of 210 senior citizens (Mage = 64.35, SD = 6.33) was recruited from Pakistan. The aim was to investigate if participants retrieved more memories related to past traumatic events using a word-cueing technique. Each participant reported ten memories to ten neutral cue words. The results revealed that past traumatic events were not adversely affecting the memories and mental health of participants. When memories were plotted with respect to the ages at which the events happened, a pronounced bump at 11-20 years of age was seen. Memories within as well as outside of the bump were mostly positive. The multilevel logistic regression modelling showed that the memories recalled were personally important and played a role in enhancing resilience. The findings revealed that despite facing an array of ethnic, religious, political, economic, and social conflicts, the participants were resilient, recalled predominantly positive memories, and had intact mental health. The findings have clinical implications in Cognitive Behavioral Therapy (CBT). The patients can be made aware of their negative emotions, troublesome/traumatic memories, and the distorted thinking patterns and their memories can be restructured. The findings can also be used to teach Memory Specificity Training (MEST) by psycho-educating the patients around changes in memory functioning and enhancing the recall of memories, which are more specific, vivid, and filled with sensory details.

Keywords: cognitive behavioral therapy, memories, mental health, resilience, trauma

Procedia PDF Downloads 138
3095 Contribution to Improving the DFIG Control Using a Multi-Level Inverter

Authors: Imane El Karaoui, Mohammed Maaroufi, Hamid Chaikhy

Abstract:

Doubly Fed Induction Generator (DFIG) is one of the most reliable wind generator. Major problem in wind power generation is to generate Sinusoidal signal with very low THD on variable speed caused by inverter two levels used. This paper presents a multi-level inverter whose objective is to reduce the THD and the dimensions of the output filter. This work proposes a three-level NPC-type inverter, the results simulation are presented demonstrating the efficiency of the proposed inverter.

Keywords: DFIG, multilevel inverter, NPC inverter, THD, induction machine

Procedia PDF Downloads 235
3094 Chemometric QSRR Evaluation of Behavior of s-Triazine Pesticides in Liquid Chromatography

Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević

Abstract:

This study considers the selection of the most suitable in silico molecular descriptors that could be used for s-triazine pesticides characterization. Suitable descriptors among topological, geometrical and physicochemical are used for quantitative structure-retention relationships (QSRR) model establishment. Established models were obtained using linear regression (LR) and multiple linear regression (MLR) analysis. In this paper, MLR models were established avoiding multicollinearity among the selected molecular descriptors. Statistical quality of established models was evaluated by standard and cross-validation statistical parameters. For detection of similarity or dissimilarity among investigated s-triazine pesticides and their classification, principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used and gave similar grouping. This study is financially supported by COST action TD1305.

Keywords: chemometrics, classification analysis, molecular descriptors, pesticides, regression analysis

Procedia PDF Downloads 381
3093 Support Vector Regression Combined with Different Optimization Algorithms to Predict Global Solar Radiation on Horizontal Surfaces in Algeria

Authors: Laidi Maamar, Achwak Madani, Abdellah El Ahdj Abdellah

Abstract:

The aim of this work is to use Support Vector regression (SVR) combined with dragonfly, firefly, Bee Colony and particle swarm Optimization algorithm to predict global solar radiation on horizontal surfaces in some cities in Algeria. Combining these optimization algorithms with SVR aims principally to enhance accuracy by fine-tuning the parameters, speeding up the convergence of the SVR model, and exploring a larger search space efficiently; these parameters are the regularization parameter (C), kernel parameters, and epsilon parameter. By doing so, the aim is to improve the generalization and predictive accuracy of the SVR model. Overall, the aim is to leverage the strengths of both SVR and optimization algorithms to create a more powerful and effective regression model for various cities and under different climate conditions. Results demonstrate close agreement between predicted and measured data in terms of different metrics. In summary, SVM has proven to be a valuable tool in modeling global solar radiation, offering accurate predictions and demonstrating versatility when combined with other algorithms or used in hybrid forecasting models.

Keywords: support vector regression (SVR), optimization algorithms, global solar radiation prediction, hybrid forecasting models

Procedia PDF Downloads 18
3092 Non-Linear Regression Modeling for Composite Distributions

Authors: Mostafa Aminzadeh, Min Deng

Abstract:

Modeling loss data is an important part of actuarial science. Actuaries use models to predict future losses and manage financial risk, which can be beneficial for marketing purposes. In the insurance industry, small claims happen frequently while large claims are rare. Traditional distributions such as Normal, Exponential, and inverse-Gaussian are not suitable for describing insurance data, which often show skewness and fat tails. Several authors have studied classical and Bayesian inference for parameters of composite distributions, such as Exponential-Pareto, Weibull-Pareto, and Inverse Gamma-Pareto. These models separate small to moderate losses from large losses using a threshold parameter. This research introduces a computational approach using a nonlinear regression model for loss data that relies on multiple predictors. Simulation studies were conducted to assess the accuracy of the proposed estimation method. The simulations confirmed that the proposed method provides precise estimates for regression parameters. It's important to note that this approach can be applied to datasets if goodness-of-fit tests confirm that the composite distribution under study fits the data well. To demonstrate the computations, a real data set from the insurance industry is analyzed. A Mathematica code uses the Fisher information algorithm as an iteration method to obtain the maximum likelihood estimation (MLE) of regression parameters.

Keywords: maximum likelihood estimation, fisher scoring method, non-linear regression models, composite distributions

Procedia PDF Downloads 4
3091 Statistic Regression and Open Data Approach for Identifying Economic Indicators That Influence e-Commerce

Authors: Apollinaire Barme, Simon Tamayo, Arthur Gaudron

Abstract:

This paper presents a statistical approach to identify explanatory variables linearly related to e-commerce sales. The proposed methodology allows specifying a regression model in order to quantify the relevance between openly available data (economic and demographic) and national e-commerce sales. The proposed methodology consists in collecting data, preselecting input variables, performing regressions for choosing variables and models, testing and validating. The usefulness of the proposed approach is twofold: on the one hand, it allows identifying the variables that influence e- commerce sales with an accessible approach. And on the other hand, it can be used to model future sales from the input variables. Results show that e-commerce is linearly dependent on 11 economic and demographic indicators.

Keywords: e-commerce, statistical modeling, regression, empirical research

Procedia PDF Downloads 211
3090 Teaching Method for a Classroom of Students at Different Language Proficiency Levels: Content and Language Integrated Learning in a Japanese Culture Classroom

Authors: Yukiko Fujiwara

Abstract:

As a language learning methodology, Content and Language Integrated Learning (CLIL) has become increasingly prevalent in Japan. Most CLIL classroom practice and its research are conducted in EFL fields. However, much less research has been done in the Japanese language learning setting. Therefore, there are still many issues to work out using CLIL in the Japanese language teaching (JLT) setting. it is expected that more research will be conducted on both authentically and academically. Under such circumstances, this is one of the few classroom-based CLIL researches experiments in JLT and aims to find an effective course design for a class with students at different proficiency levels. The class was called ‘Japanese culture A’. This class was offered as one of the elective classes for International exchange students at a Japanese university. The Japanese proficiency level of the class was above the Japanese Language Proficiency Test Level N3. Since the CLIL approach places importance on ‘authenticity’, the class was designed with materials and activities; such as books, magazines, a film and TV show and a field trip to Kyoto. On the field trip, students experienced making traditional Japanese desserts, by receiving guidance directly from a Japanese artisan. Through the course, designated task sheets were used so the teacher could get feedback from each student to grasp what the class proficiency gap was. After reading an article on Japanese culture, students were asked to write down the words they did not understand and what they thought they needed to learn. It helped both students and teachers to set learning goals and work together for it. Using questionnaires and interviews with students, this research examined whether the attempt was effective or not. Essays they wrote in class were also analyzed. The results from the students were positive. They were motivated by learning authentic, natural Japanese, and they thrived setting their own personal goals. Some students were motivated to learn Japanese by studying the language and others were motivated by studying the cultural context. Most of them said they learned better this way; by setting their own Japanese language and culture goals. These results will provide teachers with new insight towards designing class materials and activities that support students in a multilevel CLIL class.

Keywords: authenticity, CLIL, Japanese language and culture, multilevel class

Procedia PDF Downloads 240
3089 Support Vector Regression for Retrieval of Soil Moisture Using Bistatic Scatterometer Data at X-Band

Authors: Dileep Kumar Gupta, Rajendra Prasad, Pradeep Kumar, Varun Narayan Mishra, Ajeet Kumar Vishwakarma, Prashant K. Srivastava

Abstract:

An approach was evaluated for the retrieval of soil moisture of bare soil surface using bistatic scatterometer data in the angular range of 200 to 700 at VV- and HH- polarization. The microwave data was acquired by specially designed X-band (10 GHz) bistatic scatterometer. The linear regression analysis was done between scattering coefficients and soil moisture content to select the suitable incidence angle for retrieval of soil moisture content. The 250 incidence angle was found more suitable. The support vector regression analysis was used to approximate the function described by the input-output relationship between the scattering coefficient and corresponding measured values of the soil moisture content. The performance of support vector regression algorithm was evaluated by comparing the observed and the estimated soil moisture content by statistical performance indices %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE). The values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 2.9451, 1.0986, and 0.9214, respectively at HH-polarization. At VV- polarization, the values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 3.6186, 0.9373, and 0.9428, respectively.

Keywords: bistatic scatterometer, soil moisture, support vector regression, RMSE, %Bias, NSE

Procedia PDF Downloads 413
3088 A Comparative Analysis of Machine Learning Techniques for PM10 Forecasting in Vilnius

Authors: Mina Adel Shokry Fahim, Jūratė Sužiedelytė Visockienė

Abstract:

With the growing concern over air pollution (AP), it is clear that this has gained more prominence than ever before. The level of consciousness has increased and a sense of knowledge now has to be forwarded as a duty by those enlightened enough to disseminate it to others. This realisation often comes after an understanding of how poor air quality indices (AQI) damage human health. The study focuses on assessing air pollution prediction models specifically for Lithuania, addressing a substantial need for empirical research within the region. Concentrating on Vilnius, it specifically examines particulate matter concentrations 10 micrometers or less in diameter (PM10). Utilizing Gaussian Process Regression (GPR) and Regression Tree Ensemble, and Regression Tree methodologies, predictive forecasting models are validated and tested using hourly data from January 2020 to December 2022. The study explores the classification of AP data into anthropogenic and natural sources, the impact of AP on human health, and its connection to cardiovascular diseases. The study revealed varying levels of accuracy among the models, with GPR achieving the highest accuracy, indicated by an RMSE of 4.14 in validation and 3.89 in testing.

Keywords: air pollution, anthropogenic and natural sources, machine learning, Gaussian process regression, tree ensemble, forecasting models, particulate matter

Procedia PDF Downloads 41
3087 Forecasting Equity Premium Out-of-Sample with Sophisticated Regression Training Techniques

Authors: Jonathan Iworiso

Abstract:

Forecasting the equity premium out-of-sample is a major concern to researchers in finance and emerging markets. The quest for a superior model that can forecast the equity premium with significant economic gains has resulted in several controversies on the choice of variables and suitable techniques among scholars. This research focuses mainly on the application of Regression Training (RT) techniques to forecast monthly equity premium out-of-sample recursively with an expanding window method. A broad category of sophisticated regression models involving model complexity was employed. The RT models include Ridge, Forward-Backward (FOBA) Ridge, Least Absolute Shrinkage and Selection Operator (LASSO), Relaxed LASSO, Elastic Net, and Least Angle Regression were trained and used to forecast the equity premium out-of-sample. In this study, the empirical investigation of the RT models demonstrates significant evidence of equity premium predictability both statistically and economically relative to the benchmark historical average, delivering significant utility gains. They seek to provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk. Thus, the forecasting models appeared to guarantee an investor in a market setting who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts at minimal risk.

Keywords: regression training, out-of-sample forecasts, expanding window, statistical predictability, economic significance, utility gains

Procedia PDF Downloads 91
3086 Self-Image of Police Officers

Authors: Leo Carlo B. Rondina

Abstract:

Self-image is an important factor to improve the self-esteem of the personnel. The purpose of the study is to determine the self-image of the police. The respondents were the 503 policemen assigned in different Police Station in Davao City, and they were chosen with the used of random sampling. With the used of Exploratory Factor Analysis (EFA), latent construct variables of police image were identified as follows; professionalism, obedience, morality and justice and fairness. Further, ordinal regression indicates statistical characteristics on ages 21-40 which means the age of the respondent statistically improves self-image.

Keywords: police image, exploratory factor analysis, ordinal regression, Galatea effect

Procedia PDF Downloads 272
3085 Regression Analysis of Travel Indicators and Public Transport Usage in Urban Areas

Authors: Mehdi Moeinaddini, Zohreh Asadi-Shekari, Muhammad Zaly Shah, Amran Hamzah

Abstract:

Currently, planners try to have more green travel options to decrease economic, social and environmental problems. Therefore, this study tries to find significant urban travel factors to be used to increase the usage of alternative urban travel modes. This paper attempts to identify the relationship between prominent urban mobility indicators and daily trips by public transport in 30 cities from various parts of the world. Different travel modes, infrastructures and cost indicators were evaluated in this research as mobility indicators. The results of multi-linear regression analysis indicate that there is a significant relationship between mobility indicators and the daily usage of public transport.

Keywords: green travel modes, urban travel indicators, daily trips by public transport, multi-linear regression analysis

Procedia PDF Downloads 534
3084 Development of Generalized Correlation for Liquid Thermal Conductivity of N-Alkane and Olefin

Authors: A. Ishag Mohamed, A. A. Rabah

Abstract:

The objective of this research is to develop a generalized correlation for the prediction of thermal conductivity of n-Alkanes and Alkenes. There is a minority of research and lack of correlation for thermal conductivity of liquids in the open literature. The available experimental data are collected covering the groups of n-Alkanes and Alkenes.The data were assumed to correlate to temperature using Filippov correlation. Nonparametric regression of Grace Algorithm was used to develop the generalized correlation model. A spread sheet program based on Microsoft Excel was used to plot and calculate the value of the coefficients. The results obtained were compared with the data that found in Perry's Chemical Engineering Hand Book. The experimental data correlated to the temperature ranged "between" 273.15 to 673.15 K, with R2 = 0.99.The developed correlation reproduced experimental data that which were not included in regression with absolute average percent deviation (AAPD) of less than 7 %. Thus the spread sheet was quite accurate which produces reliable data.

Keywords: N-Alkanes, N-Alkenes, nonparametric, regression

Procedia PDF Downloads 646
3083 Response Surface Methodology for the Optimization of Paddy Husker by Medium Brown Rice Peeling Machine 6 Rubber Type

Authors: S. Bangphan, P. Bangphan, C. Ketsombun, T. Sammana

Abstract:

Optimization of response surface methodology (RSM) was employed to study the effects of three factor (rubber of clearance, spindle of speed, and rice of moisture) in brown rice peeling machine of the optimal good rice yield (99.67, average of three repeats). The optimized composition derived from RSM regression was analyzed using Regression analysis and Analysis of Variance (ANOVA). At a significant level α=0.05, the values of Regression coefficient, R2 adjust were 96.55% and standard deviation were 1.05056. The independent variables are initial rubber of clearance, spindle of speed and rice of moisture parameters namely. The investigating responses are final rubber clearance, spindle of speed and moisture of rice.

Keywords: brown rice, response surface methodology (RSM), peeling machine, optimization, paddy husker

Procedia PDF Downloads 560
3082 Hypergraph for System of Systems modeling

Authors: Haffaf Hafid

Abstract:

Hypergraphs, after being used to model the structural organization of System of Sytems (SoS) at macroscopic level, has recent trends towards generalizing this powerful representation at different stages of complex system modelling. In this paper, we first describe different applications of hypergraph theory, and step by step, introduce multilevel modeling of SoS by means of integrating Constraint Programming Langages (CSP) dealing with engineering system reconfiguration strategy. As an application, we give an A.C.T Terminal controlled by a set of Intelligent Automated Vehicle.

Keywords: hypergraph model, structural analysis, bipartite graph, monitoring, system of systems, reconfiguration analysis, hypernetwork

Procedia PDF Downloads 475
3081 The Effect of Slum Neighborhoods on Pregnancy Outcomes in Tanzania: Secondary Analysis of the 2015-2016 Tanzania Demographic and Health Survey Data

Authors: Luisa Windhagen, Atsumi Hirose, Alex Bottle

Abstract:

Global urbanization has resulted in the expansion of slums, leaving over 10 million Tanzanians in urban poverty and at risk of poor health. Whilst rural residence has historically been associated with an increased risk of adverse pregnancy outcomes, recent studies found higher perinatal mortality rates in urban Tanzania. This study aims to understand to what extent slum neighborhoods may account for the spatial disparities seen in Tanzania. We generated a slum indicator based on UN-HABITAT criteria to identify slum clusters within the 2015-2016 Tanzania Demographic and Health Survey. Descriptive statistics, disaggregated by urban slum, urban non-slum, and rural areas, were produced. Simple and multivariable logistic regression examined the association between cluster residence type and neonatal mortality and stillbirth. For neonatal mortality, we additionally built a multilevel logistic regression model, adjusting for confounding and clustering. The neonatal mortality ratio was highest in slums (38.3 deaths per 1000 live births); the stillbirth rate was three times higher in slums (32.4 deaths per 1000 births) than in urban non-slums. Neonatal death was more likely to occur in slums than in urban non-slums (aOR=2.15, 95% CI=1.02-4.56) and rural areas (aOR=1.78, 95% CI=1.15-2.77). Odds of stillbirth were over five times higher among rural than urban non-slum residents (aOR=5.25, 95% CI=1.31-20.96). The results suggest that slums contribute to the urban disadvantage in Tanzanian neonatal health. Higher neonatal mortality in slums may be attributable to lack of education, lower socioeconomic status, poor healthcare access, and environmental factors, including indoor and outdoor air pollution and unsanitary conditions from inadequate housing. However, further research is required to ascertain specific causalities as well as significant associations between residence type and other pregnancy outcomes. The high neonatal mortality, stillbirth, and slum formation rates in Tanzania signify that considerable change is necessary to achieve international goals for health and human settlements. Disparities in access to adequate housing, safe water and sanitation, high standard antenatal, intrapartum, and neonatal care, and maternal education need to urgently be addressed. This study highlights the spatial neonatal mortality shift from rural settings to urban informal settlements in Tanzania. Importantly, other low- and middle-income countries experiencing overwhelming urbanization and slum expansion may also be at risk of a reversing trend in residential neonatal health differences.

Keywords: urban health, slum residence, neonatal mortality, stillbirth, global urbanisation

Procedia PDF Downloads 52
3080 On the Performance of Improvised Generalized M-Estimator in the Presence of High Leverage Collinearity Enhancing Observations

Authors: Habshah Midi, Mohammed A. Mohammed, Sohel Rana

Abstract:

Multicollinearity occurs when two or more independent variables in a multiple linear regression model are highly correlated. The ridge regression is the commonly used method to rectify this problem. However, the ridge regression cannot handle the problem of multicollinearity which is caused by high leverage collinearity enhancing observation (HLCEO). Since high leverage points (HLPs) are responsible for inducing multicollinearity, the effect of HLPs needs to be reduced by using Generalized M estimator. The existing GM6 estimator is based on the Minimum Volume Ellipsoid (MVE) which tends to swamp some low leverage points. Hence an improvised GM (MGM) estimator is presented to improve the precision of the GM6 estimator. Numerical example and simulation study are presented to show how HLPs can cause multicollinearity. The numerical results show that our MGM estimator is the most efficient method compared to some existing methods.

Keywords: identification, high leverage points, multicollinearity, GM-estimator, DRGP, DFFITS

Procedia PDF Downloads 243
3079 Neural Network Modelling for Turkey Railway Load Carrying Demand

Authors: Humeyra Bolakar Tosun

Abstract:

The transport sector has an undisputed place in human life. People need transport access to continuous increase day by day with growing population. The number of rail network, urban transport planning, infrastructure improvements, transportation management and other related areas is a key factor affecting our country made it quite necessary to improve the work of transportation. In this context, it plays an important role in domestic rail freight demand planning. Alternatives that the increase in the transportation field and has made it mandatory requirements such as the demand for improving transport quality. In this study generally is known and used in studies by the definition, rail freight transport, railway line length, population, energy consumption. In this study, Iron Road Load Net Demand was modeled by multiple regression and ANN methods. In this study, model dependent variable (Output) is Iron Road Load Net demand and 6 entries variable was determined. These outcome values extracted from the model using ANN and regression model results. In the regression model, some parameters are considered as determinative parameters, and the coefficients of the determinants give meaningful results. As a result, ANN model has been shown to be more successful than traditional regression model.

Keywords: railway load carrying, neural network, modelling transport, transportation

Procedia PDF Downloads 135
3078 Using the Bootstrap for Problems Statistics

Authors: Brahim Boukabcha, Amar Rebbouh

Abstract:

The bootstrap method based on the idea of exploiting all the information provided by the initial sample, allows us to study the properties of estimators. In this article we will present a theoretical study on the different methods of bootstrapping and using the technique of re-sampling in statistics inference to calculate the standard error of means of an estimator and determining a confidence interval for an estimated parameter. We apply these methods tested in the regression models and Pareto model, giving the best approximations.

Keywords: bootstrap, error standard, bias, jackknife, mean, median, variance, confidence interval, regression models

Procedia PDF Downloads 371
3077 Enhancing Predictive Accuracy in Pharmaceutical Sales through an Ensemble Kernel Gaussian Process Regression Approach

Authors: Shahin Mirshekari, Mohammadreza Moradi, Hossein Jafari, Mehdi Jafari, Mohammad Ensaf

Abstract:

This research employs Gaussian Process Regression (GPR) with an ensemble kernel, integrating Exponential Squared, Revised Matern, and Rational Quadratic kernels to analyze pharmaceutical sales data. Bayesian optimization was used to identify optimal kernel weights: 0.76 for Exponential Squared, 0.21 for Revised Matern, and 0.13 for Rational Quadratic. The ensemble kernel demonstrated superior performance in predictive accuracy, achieving an R² score near 1.0, and significantly lower values in MSE, MAE, and RMSE. These findings highlight the efficacy of ensemble kernels in GPR for predictive analytics in complex pharmaceutical sales datasets.

Keywords: Gaussian process regression, ensemble kernels, bayesian optimization, pharmaceutical sales analysis, time series forecasting, data analysis

Procedia PDF Downloads 57
3076 A Meta Regression Analysis to Detect Price Premium Threshold for Eco-Labeled Seafood

Authors: Cristina Giosuè, Federica Biondo, Sergio Vitale

Abstract:

In the last years, the consumers' awareness for environmental concerns has been increasing, and seafood eco-labels are considered as a possible instrument to improve both seafood markets and sustainable fishing management. In this direction, the aim of this study was to carry out a meta-analysis on consumers’ willingness to pay (WTP) for eco-labeled wild seafood, by a meta-regression. Therefore, only papers published on ISI journals were searched on “Web of Knowledge” and “SciVerse Scopus” platforms, using the combinations of the following key words: seafood, ecolabel, eco-label, willingness, WTP and premium. The dataset was built considering: paper’s and survey’s codes, year of publication, first author’s nationality, species’ taxa and family, sample size, survey’s continent and country, data collection (where and how), gender and age of consumers, brand and ΔWTP. From analysis the interest on eco labeled seafood emerged clearly, in particular in developed countries. In general, consumers declared greater willingness to pay than that actually applied for eco-label products, with difference related to taxa and brand.

Keywords: eco label, meta regression, seafood, willingness to pay

Procedia PDF Downloads 112
3075 Estimating Bridge Deterioration for Small Data Sets Using Regression and Markov Models

Authors: Yina F. Muñoz, Alexander Paz, Hanns De La Fuente-Mella, Joaquin V. Fariña, Guilherme M. Sales

Abstract:

The primary approach for estimating bridge deterioration uses Markov-chain models and regression analysis. Traditional Markov models have problems in estimating the required transition probabilities when a small sample size is used. Often, reliable bridge data have not been taken over large periods, thus large data sets may not be available. This study presents an important change to the traditional approach by using the Small Data Method to estimate transition probabilities. The results illustrate that the Small Data Method and traditional approach both provide similar estimates; however, the former method provides results that are more conservative. That is, Small Data Method provided slightly lower than expected bridge condition ratings compared with the traditional approach. Considering that bridges are critical infrastructures, the Small Data Method, which uses more information and provides more conservative estimates, may be more appropriate when the available sample size is small. In addition, regression analysis was used to calculate bridge deterioration. Condition ratings were determined for bridge groups, and the best regression model was selected for each group. The results obtained were very similar to those obtained when using Markov chains; however, it is desirable to use more data for better results.

Keywords: concrete bridges, deterioration, Markov chains, probability matrix

Procedia PDF Downloads 329
3074 A New Method to Estimate the Low Income Proportion: Monte Carlo Simulations

Authors: Encarnación Álvarez, Rosa M. García-Fernández, Juan F. Muñoz

Abstract:

Estimation of a proportion has many applications in economics and social studies. A common application is the estimation of the low income proportion, which gives the proportion of people classified as poor into a population. In this paper, we present this poverty indicator and propose to use the logistic regression estimator for the problem of estimating the low income proportion. Various sampling designs are presented. Assuming a real data set obtained from the European Survey on Income and Living Conditions, Monte Carlo simulation studies are carried out to analyze the empirical performance of the logistic regression estimator under the various sampling designs considered in this paper. Results derived from Monte Carlo simulation studies indicate that the logistic regression estimator can be more accurate than the customary estimator under the various sampling designs considered in this paper. The stratified sampling design can also provide more accurate results.

Keywords: poverty line, risk of poverty, auxiliary variable, ratio method

Procedia PDF Downloads 444
3073 An Overbooking Model for Car Rental Service with Different Types of Cars

Authors: Naragain Phumchusri, Kittitach Pongpairoj

Abstract:

Overbooking is a very useful revenue management technique that could help reduce costs caused by either undersales or oversales. In this paper, we propose an overbooking model for two types of cars that can minimize the total cost for car rental service. With two types of cars, there is an upgrade possibility for lower type to upper type. This makes the model more complex than one type of cars scenario. We have found that convexity can be proved in this case. Sensitivity analysis of the parameters is conducted to observe the effects of relevant parameters on the optimal solution. Model simplification is proposed using multiple linear regression analysis, which can help estimate the optimal overbooking level using appropriate independent variables. The results show that the overbooking level from multiple linear regression model is relatively close to the optimal solution (with the adjusted R-squared value of at least 72.8%). To evaluate the performance of the proposed model, the total cost was compared with the case where the decision maker uses a naïve method for the overbooking level. It was found that the total cost from optimal solution is only 0.5 to 1 percent (on average) lower than the cost from regression model, while it is approximately 67% lower than the cost obtained by the naïve method. It indicates that our proposed simplification method using regression analysis can effectively perform in estimating the overbooking level.

Keywords: overbooking, car rental industry, revenue management, stochastic model

Procedia PDF Downloads 158
3072 Coverage Probability Analysis of WiMAX Network under Additive White Gaussian Noise and Predicted Empirical Path Loss Model

Authors: Chaudhuri Manoj Kumar Swain, Susmita Das

Abstract:

This paper explores a detailed procedure of predicting a path loss (PL) model and its application in estimating the coverage probability in a WiMAX network. For this a hybrid approach is followed in predicting an empirical PL model of a 2.65 GHz WiMAX network deployed in a suburban environment. Data collection, statistical analysis, and regression analysis are the phases of operations incorporated in this approach and the importance of each of these phases has been discussed properly. The procedure of collecting data such as received signal strength indicator (RSSI) through experimental set up is demonstrated. From the collected data set, empirical PL and RSSI models are predicted with regression technique. Furthermore, with the aid of the predicted PL model, essential parameters such as PL exponent as well as the coverage probability of the network are evaluated. This research work may assist in the process of deployment and optimisation of any cellular network significantly.

Keywords: WiMAX, RSSI, path loss, coverage probability, regression analysis

Procedia PDF Downloads 157
3071 Comprehensive Multilevel Practical Condition Monitoring Guidelines for Power Cables in Industries: Case Study of Mobarakeh Steel Company in Iran

Authors: S. Mani, M. Kafil, E. Asadi

Abstract:

Condition Monitoring (CM) of electrical equipment has gained remarkable importance during the recent years; due to huge production losses, substantial imposed costs and increases in vulnerability, risk and uncertainty levels. Power cables feed numerous electrical equipment such as transformers, motors, and electric furnaces; thus their condition assessment is of a very great importance. This paper investigates electrical, structural and environmental failure sources, all of which influence cables' performances and limit their uptimes; and provides a comprehensive framework entailing practical CM guidelines for maintenance of cables in industries. The multilevel CM framework presented in this study covers performance indicative features of power cables; with a focus on both online and offline diagnosis and test scenarios, and covers short-term and long-term threats to the operation and longevity of power cables. The study, after concisely overviewing the concept of CM, thoroughly investigates five major areas of power quality, Insulation Quality features of partial discharges, tan delta and voltage withstand capabilities, together with sheath faults, shield currents and environmental features of temperature and humidity; and elaborates interconnections and mutual impacts between those areas; using mathematical formulation and practical guidelines. Detection, location, and severity identification methods for every threat or fault source are also elaborated. Finally, the comprehensive, practical guidelines presented in the study are presented for the specific case of Electric Arc Furnace (EAF) feeder MV power cables in Mobarakeh Steel Company (MSC), the largest steel company in MENA region, in Iran. Specific technical and industrial characteristics and limitations of a harsh industrial environment like MSC EAF feeder cable tunnels are imposed on the presented framework; making the suggested package more practical and tangible.

Keywords: condition monitoring, diagnostics, insulation, maintenance, partial discharge, power cables, power quality

Procedia PDF Downloads 216