Search results for: auto regression
3341 Agile Software Effort Estimation Using Regression Techniques
Authors: Mikiyas Adugna
Abstract:
Effort estimation is among the activities carried out in software development processes. An accurate estimation model leads to project success. Agile effort estimation is a complex task because of the dynamic nature of software development, and researchers are still conducting studies on it to enhance prediction accuracy. For these reasons, we investigated and propose a model based on LASSO and Elastic Net regression to enhance estimation accuracy. The proposed model has major components: preprocessing, train-test split, training with default parameters, and cross-validation. During the preprocessing phase, the entire dataset is normalized. After normalization, a train-test split is performed on the dataset, with 80% used for training and 20% for testing. Following the train-test split, the two algorithms (Elastic Net and LASSO regression) are trained in two phases. In the first phase, the two algorithms are trained using their default parameters and evaluated on the testing data. In the second phase, the grid search technique (used to tune and select optimum parameters) and 5-fold cross-validation are applied to obtain the final trained model. Finally, the final trained model is evaluated using the testing set. The experimental work is applied to the agile story point dataset of 21 software projects collected from six firms. The results show that both Elastic Net and LASSO regression outperformed the compared approaches. Of the two proposed algorithms, LASSO regression achieved the better predictive performance, with PRED(8%) and PRED(25%) results of 100.0, MMRE of 0.0491, MMER of 0.0551, MdMRE of 0.0593, MdMER of 0.063, and MSE of 0.0007. These results imply that the trained LASSO regression model is the most acceptable and offers higher estimation performance than that reported in the literature.
Keywords: agile software development, effort estimation, elastic net regression, LASSO
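As a rough illustration of the two training phases described above, the sketch below pairs scikit-learn's GridSearchCV with 5-fold cross-validation on an 80/20 split; the synthetic data and the parameter grids are stand-in assumptions, not the authors' story-point dataset or actual grid.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, ElasticNet
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

# Stand-in for the story-point dataset (the real data covers 21 projects).
X, y = make_regression(n_samples=21, n_features=5, noise=0.1, random_state=0)

# Preprocessing: normalize, then an 80/20 train-test split.
X = MinMaxScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Phase 1: default parameters, evaluated on the testing data.
for name, model in [("LASSO", Lasso()), ("ElasticNet", ElasticNet())]:
    model.fit(X_tr, y_tr)
    print(name, "default MSE:", mean_squared_error(y_te, model.predict(X_te)))

# Phase 2: grid search with 5-fold cross-validation (illustrative grids).
grids = {
    "LASSO": (Lasso(max_iter=10000), {"alpha": [0.001, 0.01, 0.1, 1.0]}),
    "ElasticNet": (ElasticNet(max_iter=10000),
                   {"alpha": [0.001, 0.01, 0.1, 1.0],
                    "l1_ratio": [0.2, 0.5, 0.8]}),
}
for name, (model, grid) in grids.items():
    search = GridSearchCV(model, grid, cv=5, scoring="neg_mean_squared_error")
    search.fit(X_tr, y_tr)
    best = search.best_estimator_
    print(name, search.best_params_,
          "tuned MSE:", mean_squared_error(y_te, best.predict(X_te)))
```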
Procedia PDF Downloads 71
3340 Robustified Asymmetric Logistic Regression Model for Global Fish Stock Assessment
Authors: Osamu Komori, Shinto Eguchi, Hiroshi Okamura, Momoko Ichinokawa
Abstract:
The long time-series data on population assessments are essential for global ecosystem assessment because the temporal change of biomass in such a database properly reflects the status of the global ecosystem. However, the available assessment data usually have limited sample sizes, and the ratio of populations with low abundance of biomass (collapsed) to those with high abundance (non-collapsed) is highly imbalanced. To allow for the imbalance and uncertainty involved in the ecological data, we propose a binary regression model with mixed effects for inferring ecosystem status through an asymmetric logistic model. In the estimation equation, we observe that the weights for the non-collapsed populations are relatively reduced, which in turn puts more importance on the small number of observations of collapsed populations. Moreover, we extend the asymmetric logistic regression model using propensity scores to allow for the sample biases observed in the labeled and unlabeled datasets. This robustifies the estimation procedure and improves the model fit.
Keywords: double robust estimation, ecological binary data, mixed effect logistic regression model, propensity score
Procedia PDF Downloads 266
3339 Urban-Rural Inequality in Mexico after NAFTA: A Quantile Regression Analysis
Authors: Rene Valdiviezo-Issa
Abstract:
In this paper, we use Mexico’s Households Income and Expenditures (ENIGH) survey to explain the behaviour of the urban-rural expenditure gap since Mexico’s incorporation into the North American Free Trade Agreement (NAFTA) in 1994, comparing it with the latest available survey, which took place in 2014. We use real quarterly expenditure per capita (RTEPC) as the measure of welfare. We use quantile regressions and a quantile regression decomposition to describe the gap between the urban and rural distributions of log RTEPC. We discover that the decrease in the difference between the urban and rural distributions of log RTEPC, or inequality, is driven by a deprivation of the urban areas in very specific characteristics, rather than an improvement of the rural areas. When using the decomposition, we observe that the gap is primarily brought about by differences in returns to covariates between the urban and rural areas.
Keywords: quantile regression, urban-rural inequality, inequality in Mexico, income decomposition
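The urban-rural gap at different points of the distribution can be probed with quantile regressions along these lines; the sketch below uses statsmodels' quantreg on synthetic data with a hypothetical urban indicator and schooling covariate, not the ENIGH survey itself.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for ENIGH: log real quarterly expenditure per capita
# (log_rtepc) with an urban indicator and one covariate (years of schooling).
rng = np.random.default_rng(0)
n = 2000
urban = rng.integers(0, 2, n)
school = rng.normal(8 + 2 * urban, 2, n)
log_rtepc = 7.0 + 0.3 * urban + 0.08 * school + rng.normal(0, 0.5, n)
df = pd.DataFrame({"log_rtepc": log_rtepc, "urban": urban, "school": school})

# Estimate the urban-rural gap at several quantiles of the distribution.
for q in (0.10, 0.25, 0.50, 0.75, 0.90):
    fit = smf.quantreg("log_rtepc ~ urban + school", df).fit(q=q)
    print(f"q={q:.2f}  urban gap = {fit.params['urban']:.3f}")
```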
Procedia PDF Downloads 281
3338 Investigating the Factors Affecting the Innovation of Firms in Metropolitan Regions: The Case of Mashhad Metropolitan Region, Iran
Authors: Hashem Dadashpoor, Sadegh Saeidi Shirvan
Abstract:
While, with the evolution of the economy towards a knowledge-based economy, innovation is a requirement for metropolitan regions, the adoption of an open innovation strategy is both an option and a requirement for many industrial firms in these regions. Studies show that investing in research and development units cannot alone increase innovation. Within the framework of the theory of learning regions, this gap, which scholars call the ‘innovation gap’, is filled by regional features of firms. This paper attempts to investigate the factors affecting the open innovation of firms in metropolitan regions, searching for them in territorial innovation models and, in particular, the theory of learning regions. In the next step, the effect of the identified factors, considered in this research as regional learning factors, on the innovation of the sample firms is analyzed with SPSS software using multiple linear regression. The case study of this research comprises industrial enterprises from two groups, the food industry and auto parts, in the Toos industrial town in the Mashhad metropolitan region. For data gathering, interviews were conducted with managers of the industrial firms using structured questionnaires. Based on this study, the effects of factors such as firm size, inter-firm competition, the use of the local labor force, and institutional infrastructure were significant in the innovation of the firms studied, and 44% of the changes in the firms’ innovation occurred as a result of changes in these factors.
Keywords: regional knowledge networks, learning regions, interactive learning, innovation
Procedia PDF Downloads 179
3337 Developing Variable Repetitive Group Sampling Control Chart Using Regression Estimator
Authors: Liaquat Ahmad, Muhammad Aslam, Muhammad Azam
Abstract:
In this article, we propose a control chart based on a repetitive group sampling scheme for the location parameter. The charting scheme is based on the regression estimator, an estimator that capitalizes on the relationship between the variables of interest to provide more sensitive control than the commonly used individual variables. The control limit coefficients have been estimated for different sample sizes for less and highly correlated variables. Monitoring of the production process is constructed by adopting the procedure of the Shewhart x-bar control chart. Performance is verified by average run length (ARL) calculations when a shift occurs in the average value of the estimator. It has been observed that less correlated variables produce false alarms more rapidly.
Keywords: average run length, control charts, process shift, regression estimators, repetitive group sampling
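A Monte Carlo sketch of the core idea, assuming a Shewhart-style chart built on the regression estimator of the Y mean with a known auxiliary mean (the repetitive group sampling rules of the paper are omitted); the ARL is estimated by simulation for an in-control and a shifted process.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, n, L = 0.8, 5, 3.0            # correlation, subgroup size, limit width
mu_x, mu_y = 0.0, 0.0              # in-control means of (X, Y)
cov = [[1.0, rho], [rho, 1.0]]

def reg_estimate(x, y, beta=rho):  # regression estimator of the Y mean
    return y.mean() + beta * (mu_x - x.mean())

# Simulate in-control subgroup statistics to set the chart limits.
stats0 = [reg_estimate(*rng.multivariate_normal([mu_x, mu_y], cov, n).T)
          for _ in range(5000)]
center, sd = np.mean(stats0), np.std(stats0)
ucl, lcl = center + L * sd, center - L * sd

def run_length(delta):             # draw subgroups until the chart signals
    t = 0
    while True:
        t += 1
        s = reg_estimate(*rng.multivariate_normal([mu_x, mu_y + delta],
                                                  cov, n).T)
        if s > ucl or s < lcl:
            return t

for delta in (0.0, 0.5):           # ARL0 (in control) and ARL1 (shifted mean)
    arl = np.mean([run_length(delta) for _ in range(500)])
    print(f"shift={delta}: estimated ARL = {arl:.1f}")
```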
Procedia PDF Downloads 565
3336 BART Matching Method: Using Bayesian Additive Regression Tree for Data Matching
Authors: Gianna Zou
Abstract:
Propensity score matching (PSM), introduced by Paul R. Rosenbaum and Donald Rubin in 1983, is a popular statistical matching technique which tries to estimate treatment effects by taking into account covariates that could impact the efficacy of study medication in clinical trials. PSM can be used to reduce the bias due to confounding variables. However, PSM assumes that the response values are normally distributed, and in some cases this assumption may not hold. In this paper, a machine learning method, Bayesian Additive Regression Trees (BART), is used as a more robust method of matching. BART can work well when models are misspecified, since it can be used to model heterogeneous treatment effects; moreover, it has the capability to handle non-linear main effects and multiway interactions. In this research, a BART Matching Method (BMM) is proposed to provide a more reliable matching method than PSM. Comparing the analysis results from PSM and BMM shows that BMM performs well and has better prediction capability when the response values are not normally distributed.
Keywords: BART, Bayesian, matching, regression
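Since the paper's BART implementation is not given, the sketch below illustrates the matching workflow with scikit-learn's gradient boosting as a stand-in score model and 1-nearest-neighbor matching; the data-generating process and the effect size of 1.0 are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.neighbors import NearestNeighbors

# Synthetic observational data: covariates X, treatment T, skewed outcome Y.
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 3))
p = 1 / (1 + np.exp(-(X[:, 0] - 0.5 * X[:, 1])))      # true propensity
T = rng.binomial(1, p)
Y = np.exp(0.5 * X[:, 0]) + 1.0 * T + rng.exponential(1.0, n)  # non-normal

# Score model: gradient boosting as a stand-in for BART (the paper's method).
scores = GradientBoostingClassifier(random_state=0).fit(X, T)\
    .predict_proba(X)[:, 1]

# 1-nearest-neighbor matching of treated units to controls on the score.
treated, control = np.where(T == 1)[0], np.where(T == 0)[0]
nn = NearestNeighbors(n_neighbors=1).fit(scores[control, None])
_, idx = nn.kneighbors(scores[treated, None])
matched = control[idx.ravel()]

att = np.mean(Y[treated] - Y[matched])  # average treatment effect on treated
print(f"matched ATT estimate: {att:.3f} (true effect 1.0)")
```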
Procedia PDF Downloads 147
3335 Deep Learning Based on Image Decomposition for Restoration of Intrinsic Representation
Authors: Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Kensuke Nakamura, Dongeun Choi, Byung-Woo Hong
Abstract:
Artefacts are commonly encountered in the imaging process of clinical computed tomography (CT), where an artefact refers to any systematic discrepancy between the reconstructed observation and the true attenuation coefficient of the object. It is known that CT images are inherently prone to artefacts due to their image formation process, in which a large number of independent detectors are involved and assumed to yield consistent measurements. There are a number of different artefact types, including noise, beam hardening, scatter, pseudo-enhancement, motion, helical, ring, and metal artefacts, which cause serious difficulties in reading images. Thus, it is desirable to remove nuisance factors from the degraded image, leaving the fundamental intrinsic information that can provide better interpretation of the anatomical and pathological characteristics. However, this is considered a difficult task due to the high dimensionality and variability of the data to be recovered, which naturally motivates the use of machine learning techniques. We propose an image restoration algorithm based on the deep neural network framework, in which denoising auto-encoders are stacked to build multiple layers. The denoising auto-encoder is a variant of the classical auto-encoder that takes input data and maps it to a hidden representation through a deterministic mapping using a non-linear activation function. The latent representation is then mapped back into a reconstruction of the same size as the input data. The reconstruction error can be measured by the traditional squared error, assuming the residual follows a normal distribution. In addition to the designed loss function, an effective regularization scheme is applied, using residual-driven dropout determined from the gradient at each layer. The optimal weights are computed by the classical stochastic gradient descent algorithm combined with the back-propagation algorithm. In our algorithm, we initially decompose an input image into its intrinsic representation and the nuisance factors, including artefacts, based on the classical Total Variation problem, which can be efficiently optimized by a convex optimization algorithm such as the primal-dual method. The intrinsic forms of the input images are provided to the deep denoising auto-encoders together with their original forms in the training phase. In the testing phase, a given image is first decomposed into its intrinsic form and then provided to the trained network to obtain its reconstruction. We apply our algorithm to the restoration of CT images corrupted by artefacts. It is shown that our algorithm improves readability and enhances the anatomical and pathological properties of the object. The quantitative evaluation is performed in terms of PSNR, and the qualitative evaluation shows significant improvement in reading images despite degrading artefacts. The experimental results indicate the potential of our algorithm as a prior solution to image interpretation tasks in a variety of medical imaging applications. This work was supported by the MISP (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by the IITP (Institute for Information and Communications Technology Promotion).
Keywords: auto-encoder neural network, CT image artefact, deep learning, intrinsic image representation, noise reduction, total variation
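A minimal sketch of the stacked denoising auto-encoder component in PyTorch, assuming plain Gaussian corruption and SGD with a squared-error loss; the residual-driven dropout scheme and the Total Variation decomposition described above are omitted, and the patch data are random stand-ins.

```python
import torch
import torch.nn as nn

# One denoising auto-encoder layer: corrupt the input, reconstruct the clean
# signal through a non-linear hidden representation (squared-error loss).
class DAE(nn.Module):
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.Sigmoid())
        self.dec = nn.Linear(hid_dim, in_dim)

    def forward(self, x):
        return self.dec(self.enc(x))

def train_dae(dae, data, noise_std=0.1, epochs=50, lr=1e-3):
    opt = torch.optim.SGD(dae.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        noisy = data + noise_std * torch.randn_like(data)  # corruption
        opt.zero_grad()
        loss = loss_fn(dae(noisy), data)   # reconstruct the clean input
        loss.backward()
        opt.step()
    return dae

# Greedy layer-wise stacking: each DAE is trained on the codes of the last.
torch.manual_seed(0)
patches = torch.rand(256, 64)            # stand-in for CT image patches
dims, codes, stack = [64, 32, 16], patches, []
for in_dim, hid_dim in zip(dims[:-1], dims[1:]):
    dae = train_dae(DAE(in_dim, hid_dim), codes)
    stack.append(dae)
    with torch.no_grad():
        codes = dae.enc(codes)           # feed hidden codes to the next layer
print("final code shape:", tuple(codes.shape))
```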
Procedia PDF Downloads 190
3334 The Relationship Between Hourly Compensation and Unemployment Rate Using the Panel Data Regression Analysis
Authors: S. K. Ashiquer Rahman
Abstract:
The paper concentrates on the importance of hourly compensation, emphasizing the significance of the unemployment rate. Two of the most important indicators of a nation are its unemployment rate and hourly compensation. These are not merely statistics; they have profound effects on individuals, families, and the economy, and they are inversely related to one another. The unemployment rate will probably decline as hourly compensation in manufacturing rises, since higher compensation can reduce unemployment and increase job prospects. Increased hourly compensation in the manufacturing sector could therefore have a favorable effect on job mobility. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, we use panel data regression models to evaluate the expected link between hourly compensation and the unemployment rate, in order to determine the effect of hourly compensation on the unemployment rate. We estimate the fixed effects model, evaluate the error components, and determine which model (the FEM or ECM) is better by pooling all 60 observations. We then analyze and review the data by comparing three countries (the United States, Canada, and the United Kingdom) using panel data regression models. Finally, we provide results, analysis, and a summary of the extensive research on how hourly compensation affects the unemployment rate. Additionally, this paper offers relevant and useful information to help the government and the academic community use an econometric and social approach to lessen the effect of hourly compensation on the unemployment rate and address the problem.
Keywords: hourly compensation, unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, linear regression model
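A toy version of the FEM-versus-pooled comparison, assuming statsmodels and country dummy variables for the fixed effects; the paper's actual 60-observation dataset is replaced here by synthetic series for the three countries.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel: 3 countries x 20 years of hourly compensation (comp)
# and unemployment rate (unemp) — a stand-in for the 60 observations.
rng = np.random.default_rng(0)
rows = []
for i, country in enumerate(["US", "CA", "UK"]):
    comp = np.linspace(10, 30, 20) + rng.normal(0, 1, 20)
    unemp = 8 - 0.15 * comp + i * 1.5 + rng.normal(0, 0.5, 20)  # inverse link
    rows += [{"country": country, "comp": c, "unemp": u}
             for c, u in zip(comp, unemp)]
df = pd.DataFrame(rows)

# Pooled OLS vs. a fixed effects model using country dummy variables.
pooled = smf.ols("unemp ~ comp", df).fit()
fem = smf.ols("unemp ~ comp + C(country)", df).fit()
print("pooled slope:", round(pooled.params["comp"], 3))
print("FEM slope:   ", round(fem.params["comp"], 3))
# F-test on the dummies: does the FEM improve on pooling all observations?
print(fem.compare_f_test(pooled))
```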
Procedia PDF Downloads 81
3333 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling
Authors: Florin Leon, Silvia Curteanu
Abstract:
Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously, a large number of kinetic parameters are involved, and the chemical and physical phenomena for mixtures involving polymers are sometimes poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of the machine learning field. They have the ability to learn the trends of a process without any knowledge of its particular physical and chemical laws, and are therefore useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, number average molecular weight, and weight average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which can select more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.
Keywords: batch bulk methyl methacrylate polymerization, adaptive sampling, machine learning, large margin nearest neighbor regression
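One simple way to realize "more samples where the values vary most" is to repeatedly bisect the interval with the largest output variation; the sketch below does this on a synthetic sigmoid stand-in for the sharp gel-effect region, not the actual polymerization model, and the iteration counts are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Process stand-in: a curve with a sharp "gel effect"-like region whose
# steep part should attract more samples than the flat parts.
def process(x):
    return 1.0 / (1.0 + np.exp(-20 * (x - 0.6)))

x = list(np.linspace(0.0, 1.0, 5))        # small initial uniform design
for _ in range(15):                        # adaptive sampling iterations
    y = [process(v) for v in x]
    order = np.argsort(x)
    xs, ys = np.array(x)[order], np.array(y)[order]
    # Sample the midpoint of the interval with the largest output variation.
    k = int(np.argmax(np.abs(np.diff(ys))))
    x.append(0.5 * (xs[k] + xs[k + 1]))

xs = np.sort(np.array(x))
model = RandomForestRegressor(random_state=0).fit(xs[:, None], process(xs))
print(len(xs), "samples;", int(np.sum((xs > 0.5) & (xs < 0.7))),
      "fall in the steep region")
```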
Procedia PDF Downloads 304
3332 Confluence of Relations: An Auto-Ethnographic Account of Field Recording in the Anthropocene Age
Authors: Freya Zinovieff
Abstract:
In the age of the Anthropocene, all ecosystems, no matter how remote, are influenced by the relations between humans and technology. These influences are evidenced by current extinction rates, changes in species diversity, and species adaptation to pollution. Field recording is a tool through which we are able to document the extent to which the life forms associated with a place are entangled with human-technology relationships. This paper documents the convergence of interaction between technologies, species, and landscape via an auto-ethnographic account of a field recording taken from a cell phone tower in Bali, Indonesia. In the recording, we hear a confluence of relations where critter and technology meet: the electrical hum of the tower merges with frogs and the amaranthine throb of crickets in such a way that it is hard to tell where technology begins and the voice of creatures ends. The outcome of this venture is a framework for evaluating the sensorial relations within field recording. The framework calls for the soundscape to be understood as a multilayered ontology through which there is a convergence of multispecies relationships, or entanglements, across time and geographic location. These entanglements are not necessarily obvious: sometimes quiet, sometimes elusive, sometimes only audible through the mediated conduit of digital technology. The paper argues that to be aware of these entanglements is to open ourselves to a type of beauty that is firmly rooted in the present paradigm of extinction and loss. By virtue of this understanding, we are given an opportunity to embrace the grave reality of the current sixth mass extinction and move forward with what activist Joanna Macy calls compassionate action.
Keywords: anthropocene, human-technology relationships, multispecies ethnography, field recording
Procedia PDF Downloads 150
3331 Particle Size Dependent Magnetic Properties of CuFe2O4 Spinel Ferrite Nanoparticles Synthesized by Starch-Assisted Sol-Gel Auto-Combustion Method
Authors: R. S. Yadav, J. Havlica, I. Kuřitka, Z. Kozakova, J. Masilko, L. Kalina, M. Hajdúchová, V. Enev, J. Wasserbauer
Abstract:
In this work, copper ferrite (CuFe2O4) spinel ferrite nanoparticles with different particle sizes were synthesized at different annealing temperatures using the starch-assisted sol-gel auto-combustion method. The synthesized nanoparticles were characterized by conventional powder X-ray diffraction (XRD), Raman spectroscopy, Fourier transform infrared spectroscopy, field-emission scanning electron microscopy, X-ray photoelectron spectroscopy, and vibrating sample magnetometry. The XRD patterns confirmed the formation of CuFe2O4 spinel ferrite nanoparticles. Field-emission scanning electron microscopy revealed that the particles are of spherical morphology, with particle sizes of 5-20 nm at the lower annealing temperature. The infrared spectroscopy study showed the presence of two principal absorption bands around 530 cm⁻¹ (ν1) and around 360 cm⁻¹ (ν2), which indicate the presence of tetrahedral and octahedral group complexes, respectively, within the spinel ferrite nanoparticles. The Raman spectroscopy study also indicated a change in the octahedral- and tetrahedral-site-related Raman modes of the copper ferrite nanoparticles with the change of particle size. A corresponding change in magnetic behavior with the particle size of the CuFe2O4 nanoparticles was also observed. The change in magnetic properties with particle size is due to cation redistribution, which was confirmed by the X-ray photoelectron study.
Keywords: copper ferrite, nanoparticles, magnetic property, CuFe2O4
Procedia PDF Downloads 460
3330 Chemometric QSRR Evaluation of Behavior of s-Triazine Pesticides in Liquid Chromatography
Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević
Abstract:
This study considers the selection of the most suitable in silico molecular descriptors for the characterization of s-triazine pesticides. Suitable descriptors among topological, geometrical, and physicochemical descriptors are used to establish quantitative structure-retention relationship (QSRR) models. The models were obtained using linear regression (LR) and multiple linear regression (MLR) analysis. In this paper, the MLR models were established while avoiding multicollinearity among the selected molecular descriptors. The statistical quality of the established models was evaluated by standard and cross-validation statistical parameters. For the detection of similarity or dissimilarity among the investigated s-triazine pesticides and their classification, principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used and gave similar groupings. This study is financially supported by COST action TD1305.
Keywords: chemometrics, classification analysis, molecular descriptors, pesticides, regression analysis
Procedia PDF Downloads 392
3329 Support Vector Regression Combined with Different Optimization Algorithms to Predict Global Solar Radiation on Horizontal Surfaces in Algeria
Authors: Laidi Maamar, Achwak Madani, Abdellah El Ahdj Abdellah
Abstract:
The aim of this work is to use support vector regression (SVR) combined with the dragonfly, firefly, bee colony, and particle swarm optimization algorithms to predict global solar radiation on horizontal surfaces in several cities in Algeria. Combining these optimization algorithms with SVR aims principally to enhance accuracy by fine-tuning the parameters, speeding up the convergence of the SVR model, and exploring a larger search space efficiently; these parameters are the regularization parameter (C), the kernel parameters, and the epsilon parameter. In doing so, the aim is to improve the generalization and predictive accuracy of the SVR model. Overall, the goal is to leverage the strengths of both SVR and the optimization algorithms to create a more powerful and effective regression model for various cities and under different climate conditions. The results demonstrate close agreement between predicted and measured data in terms of different metrics. In summary, SVR has proven to be a valuable tool in modeling global solar radiation, offering accurate predictions and demonstrating versatility when combined with other algorithms or used in hybrid forecasting models.
Keywords: support vector regression (SVR), optimization algorithms, global solar radiation prediction, hybrid forecasting models
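The swarm-type optimizers named above are not sketched here; as a stand-in global optimizer, SciPy's differential evolution can tune (C, gamma, epsilon) against a cross-validated score in the same spirit — the data and the log-scale search bounds below are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Stand-in data for daily solar-radiation records of one city.
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

# Objective: negative cross-validated R^2 as a function of (log10 C,
# log10 gamma, log10 epsilon) — the parameters the swarm algorithms tune.
def objective(params):
    C, gamma, eps = (10.0 ** p for p in params)
    svr = SVR(C=C, gamma=gamma, epsilon=eps)
    return -cross_val_score(svr, X, y, cv=5, scoring="r2").mean()

bounds = [(-1, 3), (-4, 0), (-3, 1)]      # search space on a log scale
result = differential_evolution(objective, bounds, seed=0, maxiter=20)
C, gamma, eps = (10.0 ** p for p in result.x)
print(f"best C={C:.3g}, gamma={gamma:.3g}, epsilon={eps:.3g}, "
      f"CV R^2={-result.fun:.3f}")
```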
Procedia PDF Downloads 35
3328 Non-Linear Regression Modeling for Composite Distributions
Authors: Mostafa Aminzadeh, Min Deng
Abstract:
Modeling loss data is an important part of actuarial science. Actuaries use models to predict future losses and manage financial risk, which can also be beneficial for marketing purposes. In the insurance industry, small claims happen frequently, while large claims are rare. Traditional distributions such as the Normal, Exponential, and inverse Gaussian are not suitable for describing insurance data, which often show skewness and fat tails. Several authors have studied classical and Bayesian inference for the parameters of composite distributions, such as Exponential-Pareto, Weibull-Pareto, and Inverse Gamma-Pareto. These models separate small to moderate losses from large losses using a threshold parameter. This research introduces a computational approach using a non-linear regression model for loss data that relies on multiple predictors. Simulation studies were conducted to assess the accuracy of the proposed estimation method and confirmed that it provides precise estimates of the regression parameters. It is important to note that this approach can be applied to a dataset if goodness-of-fit tests confirm that the composite distribution under study fits the data well. To demonstrate the computations, a real data set from the insurance industry is analyzed. A Mathematica code uses the Fisher scoring algorithm as an iterative method to obtain the maximum likelihood estimates (MLE) of the regression parameters.
Keywords: maximum likelihood estimation, fisher scoring method, non-linear regression models, composite distributions
Procedia PDF Downloads 32
3327 Statistic Regression and Open Data Approach for Identifying Economic Indicators That Influence e-Commerce
Authors: Apollinaire Barme, Simon Tamayo, Arthur Gaudron
Abstract:
This paper presents a statistical approach to identify explanatory variables linearly related to e-commerce sales. The proposed methodology allows specifying a regression model in order to quantify the relevance between openly available data (economic and demographic) and national e-commerce sales. It consists of collecting data, preselecting input variables, performing regressions for choosing variables and models, and testing and validating. The usefulness of the proposed approach is twofold: on the one hand, it allows identifying the variables that influence e-commerce sales with an accessible approach; on the other hand, it can be used to model future sales from the input variables. Results show that e-commerce is linearly dependent on 11 economic and demographic indicators.
Keywords: e-commerce, statistical modeling, regression, empirical research
Procedia PDF Downloads 226
3326 Bioremediation Potentials of Some Indigenous Microorganisms Isolated from Auto Mechanic Workshops on Irrigation Water Used in Lokoja Kogi State of Nigeria
Authors: Emmanuel Ekpa, Adaji Andrew, Queen Opaluwa, Isreal Daraobong
Abstract:
Three indigenous bacteria species (Bacillus spp., Acinetobacter spp., and Moraxella spp.) previously isolated from contaminated soil at some auto mechanic workshops were used for bioremediation studies on irrigation water used at the Sarkin-noma Fadama farms located in Lokoja, Kogi State, Nigeria. This was done in order to investigate their bioremediation potential using a simple pour plate method. The physicochemical parameters and heavy metal analysis (using an AAS iCE 3000) of the irrigation water were performed before and after inoculation of the isolated organisms. Nitrate and phosphate concentrations were found to be 10.56 mg/L and 12.63 mg/L prior to inoculation, while iron and zinc were 0.9569 mg/L and 0.2245 mg/L, respectively. Other physicochemical parameters were also observed to be high prior to inoculation. After the bioremediation test (inoculation with the isolated organisms), nitrate and phosphate contents of 2.53 mg/L and 2.61 mg/L were recorded, respectively; iron and zinc gave 0.1694 mg/L and 0.0174 mg/L, while the other physicochemical parameters measured were also found to have lower values. The implication of this study is that a number of carefully isolated indigenous bacteria species are capable of reducing heavy metal concentrations in water, and that non-metallic contaminants like nitrate and phosphate are also susceptible to bioremediation in the presence of such an efficient system.
Keywords: bioremediation, heavy metals, physicochemical parameters, Bacillus spp., Acinetobacter spp. and Moraxella spp., AAS spectrometer iCE 3000
Procedia PDF Downloads 336
3325 Support Vector Regression for Retrieval of Soil Moisture Using Bistatic Scatterometer Data at X-Band
Authors: Dileep Kumar Gupta, Rajendra Prasad, Pradeep Kumar, Varun Narayan Mishra, Ajeet Kumar Vishwakarma, Prashant K. Srivastava
Abstract:
An approach was evaluated for the retrieval of the soil moisture of a bare soil surface using bistatic scatterometer data in the angular range of 20° to 70° at VV- and HH-polarization. The microwave data were acquired by a specially designed X-band (10 GHz) bistatic scatterometer. Linear regression analysis was done between the scattering coefficients and soil moisture content to select the most suitable incidence angle for retrieval of soil moisture content; the 25° incidence angle was found most suitable. Support vector regression analysis was used to approximate the function described by the input-output relationship between the scattering coefficient and the corresponding measured values of soil moisture content. The performance of the support vector regression algorithm was evaluated by comparing the observed and estimated soil moisture content using the statistical performance indices %Bias, root mean squared error (RMSE), and Nash-Sutcliffe Efficiency (NSE). The values of %Bias, RMSE, and NSE were found to be 2.9451, 1.0986, and 0.9214, respectively, at HH-polarization, and 3.6186, 0.9373, and 0.9428, respectively, at VV-polarization.
Keywords: bistatic scatterometer, soil moisture, support vector regression, RMSE, %Bias, NSE
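The three performance indices can be computed as below; note that %Bias has several definitions in the literature, and the one used here (100·Σ(sim−obs)/Σ(obs)) is an assumption, as are the toy soil-moisture values.

```python
import numpy as np

def pct_bias(obs, sim):
    """Percent bias: 100 * sum(sim - obs) / sum(obs) (one common definition)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 100.0 * np.sum(sim - obs) / np.sum(obs)

def rmse(obs, sim):
    """Root mean squared error between observed and estimated values."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.sqrt(np.mean((sim - obs) ** 2))

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Toy soil-moisture values (volumetric %): observed vs. SVR estimates.
obs = [12.1, 15.4, 18.0, 22.3, 25.7]
sim = [11.8, 16.0, 17.5, 23.0, 25.1]
print(f"%Bias={pct_bias(obs, sim):.4f}  RMSE={rmse(obs, sim):.4f}  "
      f"NSE={nse(obs, sim):.4f}")
```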
Procedia PDF Downloads 428
3324 A Comparative Analysis of Machine Learning Techniques for PM10 Forecasting in Vilnius
Authors: Mina Adel Shokry Fahim, Jūratė Sužiedelytė Visockienė
Abstract:
With growing concern over air pollution (AP), the topic has gained more prominence than ever before: the level of consciousness has increased, and those who understand how poor air quality indices (AQI) damage human health have a duty to disseminate that knowledge to others. The study focuses on assessing air pollution prediction models specifically for Lithuania, addressing a substantial need for empirical research within the region. Concentrating on Vilnius, it specifically examines concentrations of particulate matter 10 micrometers or less in diameter (PM10). Utilizing Gaussian Process Regression (GPR), Regression Tree Ensemble, and Regression Tree methodologies, predictive forecasting models are validated and tested using hourly data from January 2020 to December 2022. The study explores the classification of AP data into anthropogenic and natural sources, the impact of AP on human health, and its connection to cardiovascular diseases. The study revealed varying levels of accuracy among the models, with GPR achieving the highest accuracy, indicated by an RMSE of 4.14 in validation and 3.89 in testing.
Keywords: air pollution, anthropogenic and natural sources, machine learning, Gaussian process regression, tree ensemble, forecasting models, particulate matter
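A minimal GPR forecasting sketch with scikit-learn on a synthetic hourly PM10-like series (the Vilnius records are not reproduced here); the kernel choice and the daily-cycle data-generating process are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import train_test_split

# Synthetic hourly PM10 series with a daily cycle plus noise (stand-in
# for the 2020-2022 Vilnius records).
rng = np.random.default_rng(0)
hours = np.arange(0, 24 * 14, dtype=float)          # two weeks of hours
pm10 = 20 + 8 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 2, hours.size)

X_tr, X_te, y_tr, y_te = train_test_split(hours[:, None], pm10,
                                          test_size=0.3, random_state=0)
kernel = RBF(length_scale=5.0) + WhiteKernel(noise_level=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_tr, y_tr)

pred, std = gpr.predict(X_te, return_std=True)       # mean and uncertainty
print("test RMSE:", np.sqrt(np.mean((pred - y_te) ** 2)).round(2))
```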
Procedia PDF Downloads 52
3323 Forecasting Equity Premium Out-of-Sample with Sophisticated Regression Training Techniques
Authors: Jonathan Iworiso
Abstract:
Forecasting the equity premium out-of-sample is a major concern to researchers in finance and emerging markets. The quest for a superior model that can forecast the equity premium with significant economic gains has resulted in several controversies among scholars over the choice of variables and suitable techniques. This research focuses on the application of Regression Training (RT) techniques to forecast the monthly equity premium out-of-sample recursively with an expanding window method. A broad category of sophisticated regression models of varying complexity was employed. The RT models, which include Ridge, Forward-Backward (FOBA) Ridge, Least Absolute Shrinkage and Selection Operator (LASSO), Relaxed LASSO, Elastic Net, and Least Angle Regression, were trained and used to forecast the equity premium out-of-sample. The empirical investigation of the RT models demonstrates significant evidence of equity premium predictability, both statistically and economically, relative to the benchmark historical average, delivering significant utility gains. The models provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk. Thus, the forecasting models appear to benefit an investor who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts, at minimal risk.
Keywords: regression training, out-of-sample forecasts, expanding window, statistical predictability, economic significance, utility gains
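The recursive expanding-window protocol can be sketched as below, with Ridge standing in for any of the RT models and an out-of-sample R² measured against the historical-average benchmark (the Campbell-Thompson statistic); the data are synthetic and the window sizes are assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic monthly equity premium with weakly predictive regressors
# (stand-ins for the usual valuation and rate variables).
rng = np.random.default_rng(0)
T = 240
X = rng.normal(size=(T, 6))
y = 0.05 * X[:, 0] - 0.03 * X[:, 1] + rng.normal(0, 0.5, T)

start = 120                      # initial training window, then expand
preds, bench = [], []
model = Ridge(alpha=1.0)         # swap in Lasso(), ElasticNet(), etc.
for t in range(start, T):
    model.fit(X[:t], y[:t])              # recursive re-estimation
    preds.append(model.predict(X[t:t + 1])[0])
    bench.append(y[:t].mean())           # historical-average benchmark

preds, bench, actual = map(np.array, (preds, bench, y[start:]))
# Out-of-sample R^2 relative to the historical average.
r2_os = 1 - np.sum((actual - preds) ** 2) / np.sum((actual - bench) ** 2)
print(f"OOS R^2 vs historical average: {r2_os:.4f}")
```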
Procedia PDF Downloads 107
3322 Self-Image of Police Officers
Authors: Leo Carlo B. Rondina
Abstract:
Self-image is an important factor in improving the self-esteem of personnel. The purpose of the study is to determine the self-image of the police. The respondents were 503 policemen assigned to different police stations in Davao City, chosen using random sampling. With the use of Exploratory Factor Analysis (EFA), the latent construct variables of police image were identified as follows: professionalism, obedience, morality, and justice and fairness. Further, ordinal regression indicates statistical significance for ages 21-40, which means that the age of the respondent statistically improves self-image.
Keywords: police image, exploratory factor analysis, ordinal regression, Galatea effect
Procedia PDF Downloads 287
3321 Regression Analysis of Travel Indicators and Public Transport Usage in Urban Areas
Authors: Mehdi Moeinaddini, Zohreh Asadi-Shekari, Muhammad Zaly Shah, Amran Hamzah
Abstract:
Currently, planners try to provide more green travel options in order to decrease economic, social, and environmental problems. This study therefore tries to find significant urban travel factors that can be used to increase the usage of alternative urban travel modes. The paper attempts to identify the relationship between prominent urban mobility indicators and daily trips by public transport in 30 cities from various parts of the world. Different travel modes, infrastructures, and cost indicators were evaluated in this research as mobility indicators. The results of multi-linear regression analysis indicate that there is a significant relationship between the mobility indicators and the daily usage of public transport.
Keywords: green travel modes, urban travel indicators, daily trips by public transport, multi-linear regression analysis
Procedia PDF Downloads 548
3320 Development of Generalized Correlation for Liquid Thermal Conductivity of N-Alkane and Olefin
Authors: A. Ishag Mohamed, A. A. Rabah
Abstract:
The objective of this research is to develop a generalized correlation for the prediction of the thermal conductivity of n-alkanes and alkenes. There is little research on, and a lack of correlations for, the thermal conductivity of liquids in the open literature. The available experimental data were collected covering the groups of n-alkanes and alkenes. The data were assumed to correlate with temperature using the Filippov correlation. Nonparametric regression with the Grace algorithm was used to develop the generalized correlation model, and a spreadsheet program based on Microsoft Excel was used to plot and calculate the values of the coefficients. The results obtained were compared with the data found in Perry's Chemical Engineers' Handbook. The experimental data correlated with temperature over the range 273.15 to 673.15 K, with R² = 0.99. The developed correlation reproduced experimental data that were not included in the regression with an absolute average percent deviation (AAPD) of less than 7%. Thus, the spreadsheet is quite accurate and produces reliable data.
Keywords: n-alkanes, n-alkenes, nonparametric, regression
Procedia PDF Downloads 654
3319 Nano-Filled Matrix Reinforced by Woven Carbon Fibers Used as a Sensor
Authors: K. Hamdi, Z. Aboura, W. Harizi, K. Khellil
Abstract:
Improving the electrical properties of organic matrix composites has been investigated in several studies, since one of the current barriers to extending the use of composites to more varied applications is their poor electrical conductivity. In the case of carbon fiber composites, the organic matrix is responsible for the insulating properties of the resulting composite; however, the properties of nano-filled continuous carbon fiber composites are less investigated. This work characterizes the effect of carbon black nano-fillers on the properties of woven carbon fiber composites. First of all, SEM observations were performed to localize the nano-particles, showing that the particles penetrated into the fiber zone (Figure 1). By reaching the fiber zone, the carbon black nano-fillers created network connectivity between fibers, that is, an easy pathway for the current, which explains the observed improvement of the electrical conductivity of the composites with the addition of carbon black. This was measured with a four-point electrical circuit: the electrical conductivity of the 'neat' matrix composite increased from 80 S/cm to 150 S/cm with the addition of 9 wt% carbon black, and to 250 S/cm with 17 wt% of the same nano-filler. Thanks to these results, the use of this composite as a strain gauge might be possible: the influence of a mechanical excitation (flexure, tension) on the electrical properties of the composite can be studied by recording the variation of an electrical current passing through the material during mechanical testing. Three different configurations were tested, depending on the rate of carbon black used as nano-filler. These investigations could lead to the development of an auto-instrumented material.
Keywords: carbon fibers composites, nano-fillers, strain-sensors, auto-instrumented
Procedia PDF Downloads 411
3318 Response Surface Methodology for the Optimization of Paddy Husker by Medium Brown Rice Peeling Machine 6 Rubber Type
Authors: S. Bangphan, P. Bangphan, C. Ketsombun, T. Sammana
Abstract:
Response surface methodology (RSM) was employed to study the effects of three factors (rubber clearance, spindle speed, and rice moisture) on the good-rice yield of a brown rice peeling machine, with an optimum yield of 99.67 (average of three repeats). The optimized composition derived from the RSM regression was analyzed using regression analysis and analysis of variance (ANOVA). At a significance level of α = 0.05, the adjusted R² was 96.55% and the standard deviation was 1.05056. The independent variables are the initial rubber clearance, spindle speed, and rice moisture; the investigated responses are the final rubber clearance, spindle speed, and rice moisture.
Keywords: brown rice, response surface methodology (RSM), peeling machine, optimization, paddy husker
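A hedged sketch of a second-order response surface fit for three factors, using a full quadratic OLS model and a grid search for the predicted optimum; the factor levels, units, and the synthetic yield function are invented for illustration, not the machine's actual settings.

```python
import numpy as np
from itertools import product
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic trials: good-rice yield (%) as a function of rubber clearance
# (mm), spindle speed (rpm), and rice moisture (%) — illustrative only.
rng = np.random.default_rng(0)
Xf = np.array(list(product([0.6, 0.8, 1.0],
                           [800, 1000, 1200],
                           [12, 14, 16])), float)
yld = (99.7 - 8 * (Xf[:, 0] - 0.8) ** 2 - 1e-5 * (Xf[:, 1] - 1000) ** 2
       - 0.2 * (Xf[:, 2] - 14) ** 2 + rng.normal(0, 0.1, len(Xf)))

# Second-order response surface: full quadratic model fitted by OLS.
poly = PolynomialFeatures(degree=2, include_bias=False)
rsm = LinearRegression().fit(poly.fit_transform(Xf), yld)
print("R^2 of the quadratic surface:",
      round(rsm.score(poly.transform(Xf), yld), 4))

# Locate the optimum by evaluating the surface on a fine factor grid.
grid = np.array(list(product(np.linspace(0.6, 1.0, 21),
                             np.linspace(800, 1200, 21),
                             np.linspace(12, 16, 21))))
best = grid[np.argmax(rsm.predict(poly.transform(grid)))]
print("predicted optimum (clearance, speed, moisture):", best)
```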
Procedia PDF Downloads 574
3317 On the Performance of Improvised Generalized M-Estimator in the Presence of High Leverage Collinearity Enhancing Observations
Authors: Habshah Midi, Mohammed A. Mohammed, Sohel Rana
Abstract:
Multicollinearity occurs when two or more independent variables in a multiple linear regression model are highly correlated. Ridge regression is the method commonly used to rectify this problem. However, ridge regression cannot handle multicollinearity that is caused by high leverage collinearity enhancing observations (HLCEOs). Since high leverage points (HLPs) are responsible for inducing multicollinearity, the effect of HLPs needs to be reduced by using a generalized M-estimator. The existing GM6 estimator is based on the Minimum Volume Ellipsoid (MVE), which tends to swamp some low leverage points; hence, an improvised GM (MGM) estimator is presented to improve the precision of the GM6 estimator. A numerical example and a simulation study are presented to show how HLPs can cause multicollinearity. The numerical results show that our MGM estimator is the most efficient method compared to some existing methods.
Keywords: identification, high leverage points, multicollinearity, GM-estimator, DRGP, DFFITS
Procedia PDF Downloads 262
3316 Neural Network Modelling for Turkey Railway Load Carrying Demand
Authors: Humeyra Bolakar Tosun
Abstract:
The transport sector has an undisputed place in human life, and the need for transport access continues to increase day by day with the growing population. Rail networks, urban transport planning, infrastructure improvements, transportation management, and other related areas are key factors affecting our country, making it quite necessary to improve transportation; in this context, domestic rail freight demand planning plays an important role. Growth in the transportation field has also made requirements such as improving transport quality mandatory. This study uses variables that are generally known and used in the literature: rail freight transport, railway line length, population, and energy consumption. In this study, railway net load carrying demand was modeled by multiple regression and ANN methods. The model's dependent variable (output) is railway net load carrying demand, and six input variables were determined. The outcome values were extracted from the model using both the ANN and the regression model. In the regression model, some parameters are considered determinative, and the coefficients of these determinants give meaningful results. As a result, the ANN model has been shown to be more successful than the traditional regression model.
Keywords: railway load carrying, neural network, modelling transport, transportation
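A small comparison in the spirit of the study, using scikit-learn's MLPRegressor as the ANN against ordinary linear regression on synthetic data with six inputs; the variable names, network size, and the (deliberately non-linear) data-generating process are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic yearly records with six inputs (e.g. population, line length,
# energy consumption, ...) and railway net load demand as the output.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (3 * X[:, 0] + np.sin(2 * X[:, 1]) + 0.5 * X[:, 2] ** 2
     + rng.normal(0, 0.3, 200))

X = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ols = LinearRegression().fit(X_tr, y_tr)
ann = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=5000,
                   random_state=0).fit(X_tr, y_tr)
print("regression R^2:", round(ols.score(X_te, y_te), 3))
print("ANN R^2:       ", round(ann.score(X_te, y_te), 3))
```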
Procedia PDF Downloads 143
3315 Using the Bootstrap for Problems Statistics
Authors: Brahim Boukabcha, Amar Rebbouh
Abstract:
The bootstrap method, based on the idea of exploiting all the information provided by the initial sample, allows us to study the properties of estimators. In this article, we present a theoretical study of the different methods of bootstrapping and the use of the re-sampling technique in statistical inference to calculate the standard error of the mean of an estimator and to determine a confidence interval for an estimated parameter. We apply these methods to regression models and the Pareto model, giving the best approximations.
Keywords: bootstrap, error standard, bias, jackknife, mean, median, variance, confidence interval, regression models
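A minimal sketch of the nonparametric bootstrap for the standard error and a percentile confidence interval, here for the mean and median of a skewed Pareto-like sample; the sample itself and the number of replicates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.pareto(3.0, size=100) + 1.0   # skewed data, e.g. a Pareto sample

def bootstrap(stat, data, B=5000):
    """Re-sample with replacement B times; return the replicate statistics."""
    n = len(data)
    return np.array([stat(rng.choice(data, n, replace=True))
                     for _ in range(B)])

for name, stat in [("mean", np.mean), ("median", np.median)]:
    reps = bootstrap(stat, sample)
    se = reps.std(ddof=1)                          # bootstrap standard error
    lo, hi = np.percentile(reps, [2.5, 97.5])      # 95% percentile interval
    print(f"{name}: estimate={stat(sample):.3f}  SE={se:.3f}  "
          f"CI=({lo:.3f}, {hi:.3f})")
```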
Procedia PDF Downloads 380
3314 Enhancing Predictive Accuracy in Pharmaceutical Sales through an Ensemble Kernel Gaussian Process Regression Approach
Authors: Shahin Mirshekari, Mohammadreza Moradi, Hossein Jafari, Mehdi Jafari, Mohammad Ensaf
Abstract:
This research employs Gaussian Process Regression (GPR) with an ensemble kernel, integrating Exponential Squared, Revised Matern, and Rational Quadratic kernels to analyze pharmaceutical sales data. Bayesian optimization was used to identify optimal kernel weights: 0.76 for Exponential Squared, 0.21 for Revised Matern, and 0.13 for Rational Quadratic. The ensemble kernel demonstrated superior performance in predictive accuracy, achieving an R² score near 1.0 and significantly lower values of MSE, MAE, and RMSE. These findings highlight the efficacy of ensemble kernels in GPR for predictive analytics in complex pharmaceutical sales datasets.
Keywords: Gaussian process regression, ensemble kernels, bayesian optimization, pharmaceutical sales analysis, time series forecasting, data analysis
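In scikit-learn, a weighted kernel sum of this kind can be written directly, with RBF standing in for the "Exponential Squared" kernel and Matern for the "Revised Matern" kernel; the weight constants below are the reported values used as initial values (the marginal-likelihood optimizer may adjust them), and the sales series is synthetic.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic

# Weighted kernel sum with the reported weights as initial constants.
kernel = (0.76 * RBF(length_scale=1.0)
          + 0.21 * Matern(length_scale=1.0, nu=1.5)
          + 0.13 * RationalQuadratic(length_scale=1.0, alpha=1.0))

# Toy monthly sales series with trend and yearly seasonality (synthetic).
rng = np.random.default_rng(0)
t = np.arange(48, dtype=float)[:, None]
sales = (100 + 0.8 * t.ravel()
         + 10 * np.sin(2 * np.pi * t.ravel() / 12)
         + rng.normal(0, 2, 48))

gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-2,
                               normalize_y=True).fit(t[:36], sales[:36])
pred = gpr.predict(t[36:])
print("holdout RMSE:", np.sqrt(np.mean((pred - sales[36:]) ** 2)).round(2))
```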
Procedia PDF Downloads 71
3313 A Meta Regression Analysis to Detect Price Premium Threshold for Eco-Labeled Seafood
Authors: Cristina Giosuè, Federica Biondo, Sergio Vitale
Abstract:
In recent years, consumers' awareness of environmental concerns has been increasing, and seafood eco-labels are considered a possible instrument to improve both seafood markets and sustainable fisheries management. In this direction, the aim of this study was to carry out a meta-analysis of consumers' willingness to pay (WTP) for eco-labeled wild seafood by means of a meta-regression. Only papers published in ISI journals were searched, on the "Web of Knowledge" and "SciVerse Scopus" platforms, using combinations of the following key words: seafood, ecolabel, eco-label, willingness, WTP, and premium. The dataset was built considering: paper and survey codes, year of publication, first author's nationality, species taxa and family, sample size, survey continent and country, data collection (where and how), gender and age of consumers, brand, and ΔWTP. The analysis clearly showed interest in eco-labeled seafood, particularly in developed countries. In general, consumers declared a greater willingness to pay than that actually applied for eco-labeled products, with differences related to taxa and brand.
Keywords: eco label, meta regression, seafood, willingness to pay
Procedia PDF Downloads 122
3312 A Multi-Stage Learning Framework for Reliable and Cost-Effective Estimation of Vehicle Yaw Angle
Authors: Zhiyong Zheng, Xu Li, Liang Huang, Zhengliang Sun, Jianhua Xu
Abstract:
The yaw angle plays a significant role in many vehicle safety applications, such as collision avoidance and lane-keeping systems. Although the estimation of the yaw angle has been extensively studied in the existing literature, it remains a major challenge to achieve a solution that is simultaneously reliable and cost-effective in complex urban environments. This paper proposes a multi-stage learning framework to estimate the yaw angle with a monocular camera, which can deal with the challenge in a more reliable manner. In the first stage, an efficient road detection network is designed to extract the road region, providing a highly reliable reference for the estimation. In the second stage, a variational auto-encoder (VAE) is proposed to learn the distribution patterns of road regions, which is particularly suitable for modeling the changing patterns of the yaw angle under different driving maneuvers and can inherently enhance the generalization ability. In the last stage, a gated recurrent unit (GRU) network is used to capture the temporal correlations of the learned patterns, which can further improve the estimation accuracy, since changes in the deflection angle are relatively easy to recognize across consecutive frames. Afterward, the yaw angle can be obtained by combining the estimated deflection angle and the road direction stored in a roadway map. Through effective multi-stage learning, the proposed framework achieves high reliability while maintaining good accuracy. Road-test experiments with different driving maneuvers were performed in complex urban environments, and the results validate the effectiveness of the proposed framework.
Keywords: gated recurrent unit, multi-stage learning, reliable estimation, variational auto-encoder, yaw angle
Procedia PDF Downloads 142