Search results for: panel data regression models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29297

29207 Analysis on Prediction Models of TBM Performance and Selection of Optimal Input Parameters

Authors: Hang Lo Lee, Ki Il Song, Hee Hwan Ryu

Abstract:

Accurate prediction of TBM (Tunnel Boring Machine) performance is needed for reliable estimation of the construction period and cost in the preconstruction stage, but it is very difficult. For this purpose, this study analyzes the evaluation process of various TBM performance prediction models published since 2000 and selects the optimal input parameters for such models. A classification system for TBM performance prediction models and the applied methodology are proposed. The input and output parameters applied in the prediction models are also presented. Based on these results, a statistical analysis is performed using data collected from a shield TBM tunnel in South Korea. The optimal input parameters are selected by performing a simple regression and residual analysis using the statistical program R. These results are expected to be used for the development of TBM performance prediction models.

Keywords: TBM performance prediction model, classification system, simple regression analysis, residual analysis, optimal input parameters

Procedia PDF Downloads 280
29206 Count Data Regression Modeling: An Application to Spontaneous Abortion in India

Authors: Prashant Verma, Prafulla K. Swain, K. K. Singh, Mukti Khetan

Abstract:

Objective: In India, around 20,000 women die every year due to abortion-related complications. In the modelling of count variables, there is sometimes a preponderance of zero counts. This article concerns the estimation of various count regression models to predict the average number of spontaneous abortion among women in the Punjab state of India. It also assesses the factors associated with the number of spontaneous abortions. Materials and methods: The study included 27,173 married women of Punjab obtained from the DLHS-4 survey (2012-13). Poisson regression (PR), Negative binomial (NB) regression, zero hurdle negative binomial (ZHNB), and zero-inflated negative binomial (ZINB) models were employed to predict the average number of spontaneous abortions and to identify the determinants affecting the number of spontaneous abortions. Results: Statistical comparisons among four estimation methods revealed that the ZINB model provides the best prediction for the number of spontaneous abortions. Antenatal care (ANC) place, place of residence, total children born to a woman, woman's education and economic status were found to be the most significant factors affecting the occurrence of spontaneous abortion. Conclusions: The study offers a practical demonstration of techniques designed to handle count variables. Statistical comparisons among four estimation models revealed that the ZINB model provided the best prediction for the number of spontaneous abortions and is recommended to be used to predict the number of spontaneous abortions. The study suggests that women receive institutional Antenatal care to attain limited parity. It also advocates promoting higher education among women in Punjab, India.
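
As a rough illustration of why zero-inflated models help when zero counts are over-represented, the sketch below compares an ordinary Poisson probability mass function with a zero-inflated one. It swaps in a Poisson count component for the negative binomial used in the study, purely to keep the example short. The function names and the parameter values (`lam`, `pi_zero`) are illustrative assumptions, not estimates from the DLHS-4 data.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for a Poisson count with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, lam: float, pi_zero: float) -> float:
    """Zero-inflated Poisson: with probability pi_zero the count is a
    structural zero, otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


if __name__ == "__main__":
    lam, pi_zero = 0.8, 0.3  # hypothetical values, for illustration only
    for k in range(4):
        print(k, round(poisson_pmf(k, lam), 4), round(zip_pmf(k, lam, pi_zero), 4))
```

With `pi_zero = 0` the two functions coincide, which is one way to see the plain count model as a special case of the zero-inflated one.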

Keywords: count data, spontaneous abortion, Poisson model, negative binomial model, zero hurdle negative binomial, zero-inflated negative binomial, regression

Procedia PDF Downloads 125
29205 Modelling Conceptual Quantities Using Support Vector Machines

Authors: Ka C. Lam, Oluwafunmibi S. Idowu

Abstract:

Uncertainty in cost is a major factor affecting performance of construction projects. To our knowledge, several conceptual cost models have been developed with varying degrees of accuracy. Incorporating conceptual quantities into conceptual cost models could improve the accuracy of early predesign cost estimates. Hence, the development of quantity models for estimating conceptual quantities of framed reinforced concrete structures using supervised machine learning is the aim of the current research. Using measured quantities of structural elements and design variables such as live loads and soil bearing pressures, response and predictor variables were defined and used for constructing conceptual quantities models. Twenty-four models were developed for comparison using a combination of non-parametric support vector regression, linear regression, and bootstrap resampling techniques. R programming language was used for data analysis and model implementation. Gross soil bearing pressure and gross floor loading were discovered to have a major influence on the quantities of concrete and reinforcement used for foundations. Building footprint and gross floor loading had a similar influence on beams and slabs. Future research could explore the modelling of other conceptual quantities for walls, finishes, and services using machine learning techniques. Estimation of conceptual quantities would assist construction planners in early resource planning and enable detailed performance evaluation of early cost predictions.

Keywords: bootstrapping, conceptual quantities, modelling, reinforced concrete, support vector regression

Procedia PDF Downloads 189
29204 Optimizing Nitrogen Fertilizer Application in Rice Cultivation: A Decision Model for Top and Ear Dressing Dosages

Authors: Ya-Li Tsai

Abstract:

Nitrogen is a vital element crucial for crop growth, significantly influencing crop yield. In rice cultivation, farmers often apply substantial nitrogen fertilizer to maximize yields. However, excessive nitrogen application increases the risk of lodging and pest infestation, leading to yield losses. Additionally, conventional flooded irrigation methods consume significant water resources, necessitating precise agricultural and intelligent water management systems. This study leveraged physiological data and field images captured by unmanned aerial vehicles, considering fertilizer treatment and irrigation as key factors. Statistical models incorporating rice physiological data, yield, and vegetation indices from image data were developed. Missing physiological data were addressed using multiple imputation and regression methods, and regression models were established using principal component analysis and stepwise regression. Target nitrogen accumulation at key growth stages was identified to optimize fertilizer application, with the difference between actual and target nitrogen accumulation guiding recommendations for ear dressing dosage. Field experiments conducted in 2022 validated the recommended ear dressing dosage, demonstrating no significant difference in final yield compared to traditional fertilizer levels under alternate wetting and drying irrigation. These findings highlight the efficacy of applying recommended dosages based on fertilizer decision models, offering the potential for reduced fertilizer use while maintaining yield in rice cultivation.

Keywords: intelligent fertilizer management, nitrogen top and ear dressing fertilizer, rice, yield optimization

Procedia PDF Downloads 31
29203 Development of a Turbulent Boundary Layer Wall-pressure Fluctuations Power Spectrum Model Using a Stepwise Regression Algorithm

Authors: Zachary Huffman, Joana Rocha

Abstract:

Wall-pressure fluctuations induced by the turbulent boundary layer (TBL) developed over aircraft are a significant source of aircraft cabin noise. Since the power spectral density (PSD) of these pressure fluctuations is directly correlated with the amount of sound radiated into the cabin, the development of accurate empirical models that predict the PSD has been an important ongoing research topic. The sound emitted can be represented from the pressure fluctuations term in the Reynolds-averaged Navier-Stokes equations (RANS). Therefore, early TBL empirical models (including those from Lowson, Robertson, Chase, and Howe) were primarily derived by simplifying and solving the RANS for pressure fluctuation and adding appropriate scales. Most subsequent models (including Goody, Efimtsov, Laganelli, Smol’yakov, and Rackl and Weston models) were derived by making modifications to these early models or by physical principles. Overall, these models have had varying levels of accuracy, but, in general, they are most accurate under the specific Reynolds and Mach numbers they were developed for, while being less accurate under other flow conditions. Despite this, recent research into the possibility of using alternative methods for deriving the models has been rather limited. More recent studies have demonstrated that an artificial neural network model was more accurate than traditional models and could be applied more generally, but the accuracy of other machine learning techniques has not been explored. In the current study, an original model is derived using a stepwise regression algorithm in the statistical programming language R, and TBL wall-pressure fluctuations PSD data gathered at the Carleton University wind tunnel. The theoretical advantage of a stepwise regression approach is that it will automatically filter out redundant or uncorrelated input variables (through the process of feature selection), and it is computationally faster than machine learning. The main disadvantage is the potential risk of overfitting. The accuracy of the developed model is assessed by comparing it to independently sourced datasets.

Keywords: aircraft noise, machine learning, power spectral density models, regression models, turbulent boundary layer wall-pressure fluctuations

Procedia PDF Downloads 113
29202 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.
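
The per-grade ROC-AUC scores mentioned above can be understood through a one-vs-rest calculation: for each grade, that grade is treated as the positive class and all other grades as negative. The sketch below computes AUC as the fraction of positive/negative pairs that the score ranks correctly. It is a minimal illustration, not the PyCaret routine used in the paper, and the function names are hypothetical.

```python
from typing import Sequence


def binary_auc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AUC as the probability that a random positive example is scored
    higher than a random negative one (ties count one half)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(labels, class_scores, target):
    """AUC for a single grade: `target` is positive, every other grade negative.
    class_scores[i] is the model's score for `target` on example i."""
    return binary_auc(class_scores, [lab == target for lab in labels])
```

Averaging `one_vs_rest_auc` over all grade values gives a mean AUC of the kind reported in the abstract; a value near 0.5 indicates ranking no better than chance.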

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 19
29201 High Speed Rail vs. Other Factors Affecting the Tourism Market in Italy

Authors: F. Pagliara, F. Mauriello

Abstract:

The objective of this paper is to investigate the relationship between the increase of accessibility brought by high speed rail (HSR) systems and the tourism market in Italy. The impacts of HSR projects on tourism can be quantified in different ways. In this manuscript, an empirical analysis has been carried out with the aid of a dataset containing information both on tourism and transport for 99 Italian provinces during the 2006-2016 period. Panel data regression models have been considered, since they allow modelling a wide variety of correlation patterns. Results show that HSR has an impact on the choice of a given destination for Italian tourists while the presence of a second level hub mainly affects foreign tourists. Attraction variables are also significant for both categories and the variables concerning security, such as number of crimes registered in a given destination, have a negative impact on the choice of a destination.
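
The abstract does not say which panel specification was estimated. As one common example of a panel data regression, the sketch below implements the fixed-effects ("within") estimator for a single regressor: each variable is demeaned within its unit (here, a province), which removes any time-invariant unit effect before the slope is estimated. The function name and the toy data are assumptions for illustration only.

```python
from collections import defaultdict


def within_estimator(units, x, y):
    """Fixed-effects slope for y_it = a_i + b * x_it + e_it with one regressor.

    Demeaning x and y within each unit removes the unit intercept a_i;
    b is then ordinary least squares on the demeaned data.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # unit -> [sum_x, sum_y, count]
    for u, xi, yi in zip(units, x, y):
        s = sums[u]
        s[0] += xi
        s[1] += yi
        s[2] += 1
    num = den = 0.0
    for u, xi, yi in zip(units, x, y):
        sx, sy, n = sums[u]
        dx, dy = xi - sx / n, yi - sy / n
        num += dx * dy
        den += dx * dx
    if den == 0.0:
        raise ValueError("x has no within-unit variation")
    return num / den


if __name__ == "__main__":
    # Toy panel: two units with very different intercepts, common slope 2.
    units = ["A", "A", "A", "B", "B", "B"]
    x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
    y = [12.0, 14.0, 16.0, 52.0, 54.0, 56.0]
    print(within_estimator(units, x, y))
```

In the toy data, pooling the two units without demeaning would still give slope 2 only because the design is balanced; with unit effects correlated with `x`, the within transformation is what keeps the slope estimate from absorbing them.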

Keywords: tourists, overnights, high speed rail, attractions, security

Procedia PDF Downloads 130
29200 Estimating the Power Influence of an Off-Grid Photovoltaic Panel on the Indicting Rate of a Storage System (Batteries)

Authors: Osamede Asowata

Abstract:

The current resurgence of interest in the use of renewable energy is driven by the need to reduce the high environmental impact of fossil-based energy. The aim of this paper is to evaluate the effect of a stationary PV panel on the charging rate of deep-cycle valve regulated lead-acid (DCVRLA) batteries. Stationary PV panels are set to a fixed tilt and orientation angle, which plays a major role in dictating the output power of a PV panel and subsequently on the charging time of a DCVRLA battery. In a basic PV system, an energy storage device that stores the power from the PV panel is necessary due to the fluctuating nature of the PV voltage caused by climatic conditions. The charging and discharging times of a DCVRLA battery were determined for a twelve month period from January through December 2012. Preliminary results, which include regression analysis (R2), conversion-time per week and work-time per day, indicate that a 36 degrees tilt angle produces a good charging rate for a latitude of 26 degrees south throughout the year.

Keywords: tilt and orientation angles, solar chargers, PV panels, storage devices, direct solar radiation.

Procedia PDF Downloads 221
29199 Coverage Probability Analysis of WiMAX Network under Additive White Gaussian Noise and Predicted Empirical Path Loss Model

Authors: Chaudhuri Manoj Kumar Swain, Susmita Das

Abstract:

This paper explores a detailed procedure of predicting a path loss (PL) model and its application in estimating the coverage probability in a WiMAX network. For this a hybrid approach is followed in predicting an empirical PL model of a 2.65 GHz WiMAX network deployed in a suburban environment. Data collection, statistical analysis, and regression analysis are the phases of operations incorporated in this approach and the importance of each of these phases has been discussed properly. The procedure of collecting data such as received signal strength indicator (RSSI) through experimental set up is demonstrated. From the collected data set, empirical PL and RSSI models are predicted with regression technique. Furthermore, with the aid of the predicted PL model, essential parameters such as PL exponent as well as the coverage probability of the network are evaluated. This research work may assist in the process of deployment and optimisation of any cellular network significantly.
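
To make the regression step concrete, the sketch below fits the widely used log-distance path loss model, PL(d) = PL0 + 10·n·log10(d/d0), by ordinary least squares and returns the path loss exponent n. The abstract does not state the exact model form used, so the log-distance form, the function name, and the reference distance `d0` are assumptions.

```python
import math


def fit_path_loss(distances_m, path_loss_db, d0=1.0):
    """Least-squares fit of PL(d) = PL0 + 10 * n * log10(d / d0).

    Returns (PL0, n): the loss in dB at the reference distance d0 and the
    path loss exponent.
    """
    xs = [10.0 * math.log10(d / d0) for d in distances_m]
    n_obs = len(xs)
    mean_x = sum(xs) / n_obs
    mean_y = sum(path_loss_db) / n_obs
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, path_loss_db))
    exponent = sxy / sxx  # slope of PL against 10*log10(d/d0)
    pl0 = mean_y - exponent * mean_x  # intercept
    return pl0, exponent
```

Measured RSSI can be converted to path loss before fitting if the transmit power and antenna gains are known; that conversion is left out here.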

Keywords: WiMAX, RSSI, path loss, coverage probability, regression analysis

Procedia PDF Downloads 143
29198 Nonparametric Truncated Spline Regression Model on the Data of Human Development Index in Indonesia

Authors: Kornelius Ronald Demu, Dewi Retno Sari Saputro, Purnami Widyaningsih

Abstract:

Human Development Index (HDI) is a standard measure of a country's human development. Several factors may influence it, such as life expectancy, gross domestic product (GDP) based on the province's annual expenditure, the number of poor people, and the percentage of illiterate people. The scatter plots of HDI against these factors do not follow a specific pattern or form. Therefore, the HDI data in Indonesia can be modelled with a nonparametric regression model. The estimation of the regression curve in a nonparametric regression model is flexible because it follows the shape of the data pattern. One nonparametric regression method is the truncated spline, which is a modification of segmented polynomial functions. The estimator of a truncated spline regression model is affected by the selection of the optimal knot points. A knot point is a focus point of the truncated spline functions. The optimal knot points are determined by the minimum value of generalized cross validation (GCV). In this article, a truncated spline nonparametric regression model is applied to the Human Development Index data. The best truncated spline regression model for the HDI data in Indonesia is obtained with the combination of optimal knot points 5-5-5-4. Life expectancy and the percentage of illiterate people are the significant factors affecting the HDI in Indonesia. The coefficient of determination is 94.54%, which means the regression model is good enough to be applied to the HDI data in Indonesia.
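
The two ingredients named above, the truncated basis and the GCV criterion, can be sketched briefly. The code builds one row of a truncated power basis for a single predictor and computes the GCV score of a least-squares fit from its residual sum of squares. It is a simplified, single-predictor sketch (the study uses several predictors), and the function names are hypothetical.

```python
def truncated_power_row(x, knots, degree=1):
    """Basis row [1, x, ..., x^degree, (x-k1)_+^degree, ..., (x-kr)_+^degree],
    where (u)_+ = max(u, 0) is the truncation at each knot."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    row = [x ** p for p in range(degree + 1)]
    row += [max(x - k, 0.0) ** degree for k in knots]
    return row


def gcv(rss, n_obs, n_params):
    """GCV = (RSS / n) / (1 - p / n)^2 for a least-squares fit with p basis
    columns (the trace of the hat matrix equals p at full column rank)."""
    if n_params >= n_obs:
        raise ValueError("need more observations than parameters")
    return (rss / n_obs) / (1.0 - n_params / n_obs) ** 2
```

Knot selection then amounts to fitting the model for each candidate set of knots and keeping the set with the smallest `gcv` value; adding knots lowers the residual sum of squares but raises the penalty in the denominator.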

Keywords: generalized cross validation (GCV), Human Development Index (HDI), knots point, nonparametric regression, truncated spline

Procedia PDF Downloads 305
29197 A Panel Cointegration Analysis for Macroeconomic Determinants of International Housing Market

Authors: Mei-Se Chien, Chien-Chiang Lee, Sin-Jie Cai

Abstract:

The main purpose of this paper is to investigate the long-run equilibrium and short-run dynamics of international housing prices when macroeconomic variables change. We apply Pedroni's panel cointegration test to an unbalanced panel of 33 countries over the period from 1980Q1 to 2013Q1 to examine the relationships among house prices and macroeconomic variables. The empirical results of the panel cointegration tests support the existence of cointegration among these macroeconomic variables and house prices. In addition, the panel DOLS results show that a 1% increase in economic activity, long-term interest rates, and construction costs causes house prices to change by 2.16%, -0.04%, and 0.22%, respectively, in the long run. Furthermore, increasing economic activity and construction costs have stronger impacts on house prices in lower income countries than in higher income countries. The results lead to the conclusion that a policy of house price growth can be regarded as economic growth for lower income countries. Finally, in the America region, the coefficient of economic activity is the highest, which indicates that increasing economic activity causes a faster rise in house prices there than in other regions. There are some special cases in which the coefficients of interest rates are significantly positive, namely in the America and Asia regions.

Keywords: house prices, macroeconomic variables, panel cointegration, dynamic OLS

Procedia PDF Downloads 357
29196 Prediction of Malawi Rainfall from Global Sea Surface Temperature Using a Simple Multiple Regression Model

Authors: Chisomo Patrick Kumbuyo, Katsuyuki Shimizu, Hiroshi Yasuda, Yoshinobu Kitamura

Abstract:

This study deals with a way of predicting Malawi rainfall from global sea surface temperature (SST) using a simple multiple regression model. Monthly rainfall data from nine stations in Malawi, grouped into two zones on the basis of inter-station rainfall correlations, were used in the study. Zone 1 consisted of Karonga and Nkhatabay stations, located in northern Malawi; Zone 2 consisted of Bolero, located in northern Malawi; Kasungu, Dedza, and Salima, located in central Malawi; and Mangochi, Makoka, and Ngabu stations, located in southern Malawi. Links between Malawi rainfall and SST based on statistical correlations were evaluated, and significant results were selected as predictors for the regression models. The predictors for the Zone 1 model were identified from the Atlantic, Indian and Pacific oceans, while those for Zone 2 were identified from the Pacific Ocean. The correlations between predicted and observed rainfall values of the models were satisfactory, with r=0.81 and 0.54 for Zones 1 and 2, respectively (significant at less than 99.99%). The results of the models are in agreement with other findings that suggest that SST anomalies in the Atlantic, Indian and Pacific oceans have an influence on the rainfall patterns of Southern Africa.

Keywords: Malawi rainfall, forecast model, predictors, SST

Procedia PDF Downloads 355
29195 Using the Bootstrap for Problems Statistics

Authors: Brahim Boukabcha, Amar Rebbouh

Abstract:

The bootstrap method, based on the idea of exploiting all the information provided by the initial sample, allows us to study the properties of estimators. In this article, we present a theoretical study of the different bootstrap methods and use the resampling technique in statistical inference to calculate the standard error of an estimator and to determine a confidence interval for an estimated parameter. We apply these methods to regression models and the Pareto model, giving the best approximations.
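
The resampling idea can be sketched in a few lines: draw many samples with replacement from the original data, recompute the estimator on each, and use the spread of the recomputed values. The example below does this for the sample mean, returning a bootstrap standard error and a percentile confidence interval. The function name, the default number of resamples, and the fixed seed are assumptions chosen for illustration.

```python
import random
import statistics


def bootstrap_mean(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Bootstrap standard error and percentile confidence interval of the mean."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    n = len(sample)
    # Recompute the mean on n_resamples samples drawn with replacement.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    std_error = statistics.stdev(means)
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return std_error, (lo, hi)
```

The same pattern applies to other estimators (median, variance, a regression coefficient) by replacing `statistics.fmean` with the statistic of interest.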

Keywords: bootstrap, error standard, bias, jackknife, mean, median, variance, confidence interval, regression models

Procedia PDF Downloads 356
29194 Spatial Spillovers in Forecasting Market Diffusion of Electric Mobility

Authors: Reinhold Kosfeld, Andreas Gohs

Abstract:

In the reduction of CO₂ emissions, the transition to environmentally friendly transport modes has a high significance. In Germany, the climate protection programme 2030 includes various measures for promoting electromobility. Although electric cars at present hold a market share of just over one percent, their stock more than doubled in the past two years. Special measures like tax incentives and a buyer’s premium have been put in place to promote the shift towards electric cars and boost their diffusion. Knowledge of the future expansion of electric cars is required for planning purposes and adaptation measures. With a view to these objectives, we particularly investigate the effect of spatial spillovers on forecasting performance. For this purpose, time series econometric and panel econometric models are designed for pure electric cars and hybrid cars for Germany. Regional forecasting models with spatial interactions are consistently estimated by using spatial econometric techniques. Regional data on the stocks of electric cars and their determinants at the district level (NUTS 3 regions) are available from the Federal Motor Transport Authority (Kraftfahrt-Bundesamt) for the period 2017 - 2019. A comparative examination of aggregated regional and national predictions provides quantitative information on accuracy gains by allowing for spatial spillovers in forecasting electric mobility.

Keywords: electric mobility, forecasting market diffusion, regional panel data model, spatial interaction

Procedia PDF Downloads 131
29193 A Heteroskedasticity Robust Test for Contemporaneous Correlation in Dynamic Panel Data Models

Authors: Andreea Halunga, Chris D. Orme, Takashi Yamagata

Abstract:

This paper proposes a heteroskedasticity-robust Breusch-Pagan test of the null hypothesis of zero cross-section (or contemporaneous) correlation in linear panel-data models, without necessarily assuming independence of the cross-sections. The procedure allows for either fixed, strictly exogenous and/or lagged dependent regressor variables, as well as quite general forms of both non-normality and heteroskedasticity in the error distribution. The asymptotic validity of the test procedure is predicated on the number of time series observations, T, being large relative to the number of cross-section units, N, in that: (i) either N is fixed as T→∞; or, (ii) N²/T→0, as both T and N diverge, jointly, to infinity. Given this, it is not expected that asymptotic theory would provide an adequate guide to finite sample performance when T/N is "small". Because of this, we also propose and establish asymptotic validity of, a number of wild bootstrap schemes designed to provide improved inference when T/N is small. Across a variety of experimental designs, a Monte Carlo study suggests that the predictions from asymptotic theory do, in fact, provide a good guide to the finite sample behaviour of the test when T is large relative to N. However, when T and N are of similar orders of magnitude, discrepancies between the nominal and empirical significance levels occur as predicted by the first-order asymptotic analysis. On the other hand, for all the experimental designs, the proposed wild bootstrap approximations do improve agreement between nominal and empirical significance levels, when T/N is small, with a recursive-design wild bootstrap scheme performing best, in general, and providing quite close agreement between the nominal and empirical significance levels of the test even when T and N are of similar size. 
Moreover, in comparison with the wild bootstrap "version" of the original Breusch-Pagan test our experiments indicate that the corresponding version of the heteroskedasticity-robust Breusch-Pagan test appears reliable. As an illustration, the proposed tests are applied to a dynamic growth model for a panel of 20 OECD countries.
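
For orientation, the sketch below computes the classical Breusch-Pagan LM statistic for cross-section correlation, LM = T · Σ_{i<j} ρ̂_ij², which is compared with a chi-square distribution with N(N-1)/2 degrees of freedom when N is fixed. This is the original, non-robust statistic, not the heteroskedasticity-robust version or the wild bootstrap schemes proposed in the paper. The function names are hypothetical, and the residuals are assumed to have mean zero within each unit.

```python
import math


def residual_correlation(u, v):
    """Correlation of two residual series, assuming each has mean zero."""
    suv = sum(a * b for a, b in zip(u, v))
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    return suv / math.sqrt(suu * svv)


def breusch_pagan_lm(residuals):
    """residuals[i] is the length-T residual series of cross-section unit i.

    Returns (LM, degrees_of_freedom) with LM = T * sum over pairs i < j of
    the squared residual correlations.
    """
    n_units = len(residuals)
    t_obs = len(residuals[0])
    total = 0.0
    for i in range(n_units - 1):
        for j in range(i + 1, n_units):
            total += residual_correlation(residuals[i], residuals[j]) ** 2
    return t_obs * total, n_units * (n_units - 1) // 2
```

Because the number of pairwise correlations grows with N², the statistic accumulates estimation error when N is not small relative to T, which is the setting where the abstract reports size distortions.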

Keywords: cross-section correlation, time-series heteroskedasticity, dynamic panel data, heteroskedasticity robust Breusch-Pagan test

Procedia PDF Downloads 408
29192 Segmentation of Piecewise Polynomial Regression Model by Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

The piecewise polynomial regression model is a very flexible model for describing data. When a piecewise polynomial regression model is fitted to data, its parameters are generally unknown. This paper studies the parameter estimation problem for the piecewise polynomial regression model. The parameters are estimated with a Bayesian method. Unfortunately, the Bayes estimator cannot be found analytically. A reversible jump MCMC algorithm is proposed to solve this problem. The reversible jump MCMC algorithm generates a Markov chain that converges to the limit distribution, which is the posterior distribution of the piecewise polynomial regression model parameters. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of the piecewise polynomial regression model.

Keywords: piecewise regression, bayesian, reversible jump MCMC, segmentation

Procedia PDF Downloads 342
29191 Automatic and High Precise Modeling for System Optimization

Authors: Stephanie Chen, Mitja Echim, Christof Büskens

Abstract:

To describe and propagate the behavior of a system, mathematical models are formulated. Parameter identification is used to adapt the coefficients of the underlying laws of science. For complex systems this approach can be incomplete and hence imprecise, and moreover too slow to be computed efficiently. Therefore, these models might not be applicable for the numerical optimization of real systems, since these techniques require numerous evaluations of the models. Moreover, not all quantities necessary for the identification might be available, and hence the system must be adapted manually. Therefore, an approach is described that generates models that overcome the aforementioned limitations by focusing not on physical laws but on measured (sensor) data of real systems. The approach is more general since it generates models for every system detached from the scientific background. Additionally, this approach can be used in a more general sense, since it is able to automatically identify correlations in the data. The method can be classified as a multivariate data regression analysis. In contrast to many other data regression methods, this variant is also able to identify correlations of products of variables and not only of single variables. This enables a far more precise and better representation of causal correlations. The basis and the explanation of this method come from an analytical background: the series expansion. Another advantage of this technique is the possibility of real-time adaptation of the generated models during operation. In this way, system changes due to aging, wear or perturbations from the environment can be taken into account, which is indispensable for realistic scenarios. Since these data-driven models can be evaluated very efficiently and with high precision, they can be used in mathematical optimization algorithms that minimize a cost function, e.g. time, energy consumption, operational costs or a mixture of them, subject to additional constraints. The proposed method has successfully been tested in several complex applications and with strong industrial requirements. The generated models were able to simulate the given systems with an error in precision of less than one percent. Moreover, the automatic identification of the correlations was able to discover previously unknown relationships. To summarize, the above-mentioned approach is able to efficiently compute highly precise and real-time-adaptive data-based models in different fields of industry. Combined with an effective mathematical optimization algorithm like WORHP (We Optimize Really Huge Problems), several complex systems can now be represented by a high-precision model to be optimized according to the user's wishes. The proposed methods will be illustrated with different examples.

Keywords: adaptive modeling, automatic identification of correlations, data based modeling, optimization

Procedia PDF Downloads 376
29190 Detecting Earnings Management via Statistical and Neural Networks Techniques

Authors: Mohammad Namazi, Mohammad Sadeghzadeh Maharluie

Abstract:

Predicting earnings management is vital for capital market participants, financial analysts and managers. This research addresses the following question: is there a significant difference between the regression model and neural network models in predicting earnings management, and which one leads to a superior prediction? To approach this question, a Linear Regression (LR) model was compared with two neural networks: a Multi-Layer Perceptron (MLP) and a Generalized Regression Neural Network (GRNN). The population of this study includes 94 companies listed on the Tehran Stock Exchange (TSE) from 2003 to 2011. After the results of all models were acquired, ANOVA was applied to test the hypotheses. In general, the summary of statistical results showed that the precision of GRNN did not exhibit a significant difference in comparison with MLP. In addition, the mean square error of the MLP and GRNN showed a significant difference from that of the multivariable LR model. These findings support the notion of nonlinear behavior of earnings management. Therefore, it is more appropriate for capital market participants to analyze earnings management with neural network techniques rather than linear regression models.

Keywords: earnings management, generalized linear regression, neural networks multi-layer perceptron, Tehran stock exchange

Procedia PDF Downloads 398
29189 Wind Fragility of Window Glass in 10-Story Apartment with Two Different Window Models

Authors: Viriyavudh Sim, WooYoung Jung

Abstract:

Damage due to high wind is not limited to load-resisting components such as beams and columns. The majority of damage is due to breaches in the building envelope, such as broken roofs, windows, and doors. In this paper, the wind fragility of window glass in a residential apartment was determined in order to compare two window configuration models. The Monte Carlo simulation method was used to derive damage data, and analytical fragilities were constructed. The fragility of the window system showed that windows located in the leeward wall had a higher probability of failure, especially those close to the edge of the structure. Between the two window models, Model 2 had the higher probability of failure, due to the number of panels in this configuration.

Keywords: wind fragility, glass window, high rise building, wind disaster

Procedia PDF Downloads 235
29188 Machine Learning Approach for Predicting Students’ Academic Performance and Study Strategies Based on Their Motivation

Authors: Fidelia A. Orji, Julita Vassileva

Abstract:

This research aims to develop machine learning models for predicting students' academic performance and study strategies, which could be generalized to all courses in higher education. Key learning attributes (intrinsic, extrinsic, autonomy, relatedness, competence, and self-esteem) used in building the models were chosen based on prior studies, which revealed that the attributes are essential in students’ learning process. Previous studies revealed the individual effects of each of these attributes on students’ learning progress. However, few studies have investigated the combined effect of the attributes in predicting student study strategy and academic performance to reduce the dropout rate. To bridge this gap, we used scikit-learn in Python to build five machine learning models (Decision Tree, K-Nearest Neighbour, Random Forest, Linear/Logistic Regression, and Support Vector Machine) for both regression and classification tasks. The models were trained, evaluated, and tested for accuracy using data from 924 university dentistry students, collected by Chilean authors through a quantitative research design. A comparative analysis of the models revealed that tree-based models such as the random forest (with a prediction accuracy of 94.9%) and the decision tree show the best results compared to the linear, support vector, and k-nearest neighbour models. The models built in this research can be used to predict student performance and study strategy so that appropriate interventions can be implemented to improve student learning progress. Thus, incorporating strategies that could improve diverse student learning attributes in the design of online educational systems may increase the likelihood of students continuing with their learning tasks as required. Moreover, the results show that the attributes could be modelled together and used to adapt/personalize the learning process.
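A comparison of this kind can be sketched with scikit-learn as follows. This is a minimal illustration rather than the authors' pipeline: the data are synthetic (six generated features standing in for the six motivation attributes), and the hyperparameters, scaling step, and 5-fold cross-validation are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(n_samples=300, seed=0):
    """Mean 5-fold cross-validated accuracy for five classifier families."""
    # Synthetic stand-in data: six features, binary label. NOT the study data.
    X, y = make_classification(
        n_samples=n_samples,
        n_features=6,
        n_informative=4,
        n_redundant=0,
        random_state=seed,
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "k_nearest_neighbours": make_pipeline(
            StandardScaler(), KNeighborsClassifier(n_neighbors=5)
        ),
        "random_forest": RandomForestClassifier(
            n_estimators=50, random_state=seed
        ),
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "support_vector_machine": make_pipeline(StandardScaler(), SVC()),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=5, scoring="accuracy").mean())
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, accuracy in compare_classifiers().items():
        print(f"{name}: {accuracy:.3f}")
```

The distance- and margin-based models are wrapped in a scaling pipeline because they are sensitive to feature scale, whereas the tree-based models are not.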

Keywords: classification models, learning strategy, predictive modeling, regression models, student academic performance, student motivation, supervised machine learning

Procedia PDF Downloads 97
29187 The Impact of the Composite Expanded Graphite PCM on the PV Panel Whole Year Electric Output: Case Study Milan

Authors: Hasan A Al-Asadi, Ali Samir, Afrah Turki Awad, Ali Basem

Abstract:

Integrating phase change material (PCM) with photovoltaic (PV) panels is one of the effective techniques to reduce the PV panel temperature and increase the electric output. In order to investigate the impact of the PCM on the electric output of the PV panels for a whole year, a lumped-distributed parameter model for the PV-PCM module has been developed. This development considers the impact of the PCM density variation between the solid phase and the liquid phase, which increases the accuracy of the estimated electric output of the PV-PCM module. The second contribution is to assess the impact of the composite expanded graphite-PCM on the PV electric output in Milan for a whole year. The novel one-dimensional model has been solved using MATLAB software. The results of this model have been validated against experimental work from the literature. Weather and solar radiation data have been collected, and the impact of expanded graphite-PCM on the electric output of the PV panel for a whole year has been investigated. The results indicate that the electric output of the PV panel in Milan is enhanced by 2.39% over a whole year.

Keywords: PV panel efficiency, PCM, numerical model, solar energy

Procedia PDF Downloads 140
29186 The Predictors of Student Engagement: Instructional Support vs Emotional Support

Authors: Tahani Salman Alangari

Abstract:

Student success can be impacted by internal factors such as emotional well-being and external factors such as organizational support and instructional support in the classroom. This study aims to identify at least one factor that predicts student engagement. It is a cross-sectional study conducted on 6206 teachers, encompassing three years of data collection and observations of math instruction in approximately 50 schools and 300 classrooms. A multiple linear regression revealed that a model predicting student engagement from emotional support, classroom organization, and instructional support was significant. Four linear regression models were tested using hierarchical regression to examine the effects of the independent variables: emotional support was the strongest predictor of student engagement, while instructional support was the weakest.

Keywords: student engagement, emotional support, organizational support, instructional support, well-being

Procedia PDF Downloads 53
29185 Modeling Standpipe Pressure Using Multivariable Regression Analysis by Combining Drilling Parameters and a Herschel-Bulkley Model

Authors: Seydou Sinde

Abstract:

The aim of this paper is to formulate mathematical expressions that can be used to estimate the standpipe pressure (SPP). The developed formulas take into account the main factors that directly or indirectly affect the behavior of SPP values. Fluid rheology and well hydraulics are among these essential factors. Mud plastic viscosity, yield point, flow power, consistency index, flow rate, and drillstring and annular geometries are represented by the frictional pressure (Pf), which is one of the input independent parameters and is calculated, in this paper, using the Herschel-Bulkley rheological model. Other input independent parameters include the rate of penetration (ROP), applied load or weight on the bit (WOB), bit revolutions per minute (RPM), bit torque (TRQ), and hole inclination and direction coupled in the hole curvature or dogleg (DL). The technique of repeating parameters and the Buckingham Pi theorem are used to reduce the number of input independent parameters to the dimensionless revolutions per minute (RPMd), the dimensionless torque (TRQd), and the dogleg, which is already in the dimensionless form of radians. Multivariable linear and polynomial regression using PTC Mathcad Prime 4.0 is used to analyze and determine the relationships between the dependent parameter, SPP, and the remaining three dimensionless groups. Three models proved sufficiently satisfactory to estimate the standpipe pressure: multivariable linear regression model 1, containing three regression coefficients, for vertical wells; multivariable linear regression model 2, containing four regression coefficients, for deviated wells; and a multivariable polynomial quadratic regression model, containing six regression coefficients, for both vertical and deviated wells.
Although linear regression model 2 (with four coefficients) is relatively more complex and contains an additional term compared with linear regression model 1 (with three coefficients), the former did not add significant improvements to the latter except for some minor values. Thus, the effect of the hole curvature or dogleg is insignificant and can be omitted from the input independent parameters without significant loss of accuracy. The polynomial quadratic regression model is considered the most accurate model because of its relatively higher accuracy in most cases. Data from nine wells in the Middle East were used to run the developed models; all of them provided satisfactory results, with the multivariable polynomial quadratic regression model giving the most accurate results. These models are useful not only for monitoring and accurately predicting SPP values but also for early control and checking of the integrity of the well hydraulics, and for taking corrective action should any unexpected problems appear, such as pipe washouts, jet plugging, excessive mud losses, fluid gains, kicks, etc.
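For orientation, the Herschel-Bulkley law and the general shape of the three regression forms can be written as below. Only the Herschel-Bulkley equation is standard; the regression equations are one plausible reading of the coefficient counts given in the abstract, not the paper's stated formulas, and the regressors x_1, x_2, x_3 are placeholders for the dimensionless groups.

```latex
% Herschel-Bulkley rheological model: shear stress \tau, yield stress \tau_y,
% consistency index K, shear rate \dot{\gamma}, flow behavior index n.
\begin{align*}
\tau &= \tau_y + K\,\dot{\gamma}^{\,n} \\[4pt]
% Assumed generic forms (placeholders, not taken from the paper):
\text{linear, 3 coefficients:}\quad
  \hat{y} &= a_0 + a_1 x_1 + a_2 x_2 \\
\text{linear, 4 coefficients:}\quad
  \hat{y} &= a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 \\
\text{quadratic, 6 coefficients:}\quad
  \hat{y} &= a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_1 x_2 + a_5 x_2^2
\end{align*}
```

Under this reading, the fourth coefficient a_3 would multiply the dogleg term, which is the term the abstract reports as adding little accuracy.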

Keywords: standpipe, pressure, hydraulics, nondimensionalization, parameters, regression

Procedia PDF Downloads 58
29184 Electrical Load Estimation Using Estimated Fuzzy Linear Parameters

Authors: Bader Alkandari, Jamal Y. Madouh, Ahmad M. Alkandari, Anwar A. Alnaqi

Abstract:

A new formulation of the fuzzy linear estimation problem is presented. It is formulated as a linear programming problem. The objective is to minimize the spread of the data points, taking into consideration the type of membership function of the fuzzy parameters, to satisfy the constraints on each measurement point, and to ensure that the original membership is included in the estimated membership. Different models are developed for a fuzzy triangular membership. The proposed models are applied to different examples from the area of fuzzy linear regression and finally to different examples for estimating the electrical load on a busbar. The proposed technique was found to be well suited to electrical load estimation, since the load is characterized by uncertainty and vagueness.

Keywords: fuzzy regression, load estimation, fuzzy linear parameters, electrical load estimation

Procedia PDF Downloads 510
29183 Forecasting Equity Premium Out-of-Sample with Sophisticated Regression Training Techniques

Authors: Jonathan Iworiso

Abstract:

Forecasting the equity premium out-of-sample is a major concern to researchers in finance and emerging markets. The quest for a superior model that can forecast the equity premium with significant economic gains has resulted in several controversies among scholars on the choice of variables and suitable techniques. This research focuses mainly on the application of Regression Training (RT) techniques to forecast the monthly equity premium out-of-sample recursively with an expanding window method. A broad category of sophisticated regression models involving model complexity was employed. The RT models, which include Ridge, Forward-Backward (FOBA) Ridge, Least Absolute Shrinkage and Selection Operator (LASSO), Relaxed LASSO, Elastic Net, and Least Angle Regression, were trained and used to forecast the equity premium out-of-sample. In this study, the empirical investigation of the RT models demonstrates significant evidence of equity premium predictability, both statistically and economically, relative to the benchmark historical average, delivering significant utility gains. The models seek to provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk. Thus, the forecasting models appeared to guarantee an investor in a market setting who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts at minimal risk.
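The recursive expanding-window scheme can be sketched as follows. This is a simplified illustration, not the study's implementation: it uses a single lagged predictor, a closed-form ridge fit with an unpenalized intercept, and a hypothetical penalty value; the benchmark is the historical average of the premium observed so far.

```python
def fit_ridge(x, y, penalty):
    """Single-predictor ridge regression with an unpenalized intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / (sxx + penalty)  # the penalty shrinks the slope toward zero
    return mean_y - slope * mean_x, slope


def expanding_window(x, y, first_window, penalty):
    """Recursive out-of-sample forecasts.

    x[i] is the predictor known before period i's premium y[i] is realized.
    At each step t the model is refit on observations 0..t-1 only, then used
    to forecast y[t]; the window grows by one observation each step.
    """
    model, benchmark, actual = [], [], []
    for t in range(first_window, len(y)):
        intercept, slope = fit_ridge(x[:t], y[:t], penalty)
        model.append(intercept + slope * x[t])
        benchmark.append(sum(y[:t]) / t)  # historical-average benchmark
        actual.append(y[t])
    return model, benchmark, actual


def oos_r2(model, benchmark, actual):
    """Out-of-sample R^2 relative to the historical-average benchmark."""
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, model))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    return 1.0 - sse_model / sse_bench


if __name__ == "__main__":
    # Synthetic, noise-free toy series purely for illustration.
    x = [0.1 * i for i in range(30)]
    y = [0.5 + 2.0 * v for v in x]
    forecasts = expanding_window(x, y, first_window=10, penalty=0.1)
    print(oos_r2(*forecasts))
```

A positive out-of-sample R² means the model's squared forecast errors are smaller than those of the historical average; a very large penalty shrinks the slope to zero and makes the two forecasts coincide.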

Keywords: regression training, out-of-sample forecasts, expanding window, statistical predictability, economic significance, utility gains

Procedia PDF Downloads 73
29182 Driving Forces of Bank Liquidity: Evidence from Selected Ethiopian Private Commercial Banks

Authors: Tadele Tesfay Teame, Tsegaye Abrehame, Hágen István Zsombor

Abstract:

Liquidity is one of the main concerns for banks, and thus achieving the optimum level of liquidity is critical. The main objective of this study is to identify the driving forces of liquidity in selected private commercial banks. To achieve this objective, an explanatory research design and a quantitative research approach were used. Data were collected from secondary sources: the financial statements of the sampled Ethiopian private commercial banks, the National Bank of Ethiopia, and the Ministry of Finance, covering the period from 2011 to 2022. Bank-specific and macroeconomic variables were analyzed using a balanced panel fixed-effects regression model. The bank’s liquidity ratio is measured as total liquid assets to total deposits. The findings revealed that bank size, capital adequacy, loan growth rate, and non-performing loans had a statistically significant impact on private commercial banks’ liquidity, and that the annual inflation rate and interest rate margin also had a statistically significant impact on the liquidity of Ethiopian private commercial banks as measured by L1 (bank liquidity). Thus, banks in Ethiopia should not only be concerned with internal structures and policies/procedures; they must consider the internal and macroeconomic environments together when developing strategies to manage their liquidity position efficiently. To maintain their financial proficiency, private commercial banks should have a liquidity management policy that incorporates both bank-specific and macroeconomic variables.
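The fixed-effects (within) estimator named here can be sketched for a single regressor as follows. This is an illustration on synthetic data, not the study's model or results: the bank labels, the regressor, and all numbers are hypothetical. Demeaning each bank's observations removes the bank-specific intercept before the slope is estimated.

```python
from collections import defaultdict


def liquidity_ratio(liquid_assets, total_deposits):
    """Liquidity ratio as described in the abstract: liquid assets / deposits."""
    return liquid_assets / total_deposits


def within_slope(rows):
    """Fixed-effects slope from rows of (bank, x, y) via within-bank demeaning."""
    by_bank = defaultdict(list)
    for bank, x, y in rows:
        by_bank[bank].append((x, y))
    sxy = sxx = 0.0
    for obs in by_bank.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx


def pooled_slope(rows):
    """Pooled OLS slope that ignores bank-specific intercepts."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def synthetic_panel(true_slope=-0.05):
    """Balanced toy panel: three hypothetical banks, three periods each."""
    bank_effects = {"bank_a": 0.2, "bank_b": 0.6, "bank_c": 1.0}
    regressor = {"bank_a": [1, 2, 3], "bank_b": [4, 5, 6], "bank_c": [7, 8, 9]}
    return [
        (bank, x, bank_effects[bank] + true_slope * x)
        for bank in bank_effects
        for x in regressor[bank]
    ]


if __name__ == "__main__":
    panel = synthetic_panel()
    print("within:", within_slope(panel), "pooled:", pooled_slope(panel))
```

In this toy panel the bank effects are correlated with the regressor, so the pooled slope even has the wrong sign, while the within estimator recovers the slope used to generate the data.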

Keywords: liquidity, Ethiopian private commercial banks, liquidity ratio, panel data regression analysis

Procedia PDF Downloads 54
29181 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis when applied to large amounts of data. In fact, because of the volume, nature (semi-structured or unstructured), and variety of big data, it is impossible to analyze it efficiently with classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm Classification And Regression Trees (CART) in order to adapt it for big data analysis. The major changes to this algorithm are presented, and then a version of the extended algorithm is defined in order to make it applicable to a huge quantity of data.
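One generic way a CART split search can be distributed is to count class labels per data partition and merge the counts before scoring candidate thresholds. The sketch below illustrates that idea only; it is an assumption about how such an extension might look, not the algorithm proposed in the paper, and the function names and data are hypothetical.

```python
from collections import Counter


def chunk_counts(rows, thresholds):
    """Map step: count (threshold, side, label) triples within one partition."""
    counts = Counter()
    for value, label in rows:
        for t in thresholds:
            side = "left" if value <= t else "right"
            counts[(t, side, label)] += 1
    return counts


def gini(label_counts):
    """Gini impurity of a node given its per-class counts."""
    total = sum(label_counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in label_counts.values())


def best_split(chunks, thresholds):
    """Reduce step: merge partition counts, then pick the lowest weighted Gini."""
    merged = Counter()
    for chunk in chunks:
        # Each chunk could be counted on a different worker; here it is serial.
        merged.update(chunk_counts(chunk, thresholds))
    best = None
    for t in thresholds:
        left, right = Counter(), Counter()
        for (threshold, side, label), count in merged.items():
            if threshold == t:
                (left if side == "left" else right)[label] += count
        n_left, n_right = sum(left.values()), sum(right.values())
        weighted = (n_left * gini(left) + n_right * gini(right)) / (n_left + n_right)
        if best is None or weighted < best[1]:
            best = (t, weighted)
    return best


if __name__ == "__main__":
    # Two hypothetical partitions of (feature value, class label) rows.
    partitions = [[(1, "a"), (2, "a")], [(3, "b"), (4, "b")]]
    print(best_split(partitions, thresholds=[1.5, 2.5, 3.5]))
```

Because the counts are additive, partitions can be processed independently and merged in any order, which is the property a distributed implementation would rely on.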

Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm

Procedia PDF Downloads 118
29180 Fiscal Size and Composition Effects on Growth: Empirical Evidence from Asian Economies

Authors: Jeeban Amgain

Abstract:

This paper investigates the impact of the size and composition of government expenditure and tax on GDP per capita growth in 36 Asian economies over the period 1991-2012. The research employs panel regression techniques, namely Fixed Effects and the Generalized Method of Moments (GMM), as well as other statistical and descriptive approaches. The findings show that government expenditure and tax revenue are generally low in this region. GDP per capita growth responds strongly negatively to government expenditure; however, no significant relationship was found for the size of taxation, although it is positively correlated with economic growth. Panel regression of decomposed fiscal components also shows that the allocation pattern of expenditure and taxation matters for growth. Taxes on international trade and property have a significant positive impact on growth. In contrast, major categories of expenditure, i.e., expenditure on general public services, health, and education, are found to have a significant negative impact on growth, implying that government expenditure is not productive in the Asian region. A comparatively smaller and more efficient government would enhance growth.

Keywords: government expenditure, tax, GDP per capita growth, composition

Procedia PDF Downloads 445
29179 Corporate Governance and Bank Performance: A Study of Selected Deposit Money Banks in Nigeria

Authors: Ayodele Ajayi, John Ajayi

Abstract:

This paper investigates the effect of corporate governance with a view to determining the relationship between board size and bank performance. Data for the study were obtained from the audited financial statements of five sampled banks listed on the Nigerian Stock Exchange. A panel data technique was adopted, and the analysis was carried out using multiple regression and pooled ordinary least squares. Results from the study show that the larger the board size, the greater the profit, implying that corporate governance is positively correlated with bank performance.

Keywords: corporate governance, banks performance, board size, pooled data

Procedia PDF Downloads 317
29178 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously, a large number of kinetic parameters are involved, and the chemical and physical phenomena for mixtures involving polymers are sometimes poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of the machine learning field. They can learn the trends of a process without any knowledge of its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight, and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which selects more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.
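The idea of placing more samples where the output varies most can be sketched with a simple interval-splitting rule. This is a generic illustration, not the authors' adaptive sampling technique: the refinement rule (split the interval with the largest output change) and the steep test curve are assumptions chosen to mimic an abrupt transition such as a gel effect.

```python
import math


def steep_curve(x):
    """Hypothetical test function with an abrupt transition near x = 0.5."""
    return 1.0 / (1.0 + math.exp(-50.0 * (x - 0.5)))


def adaptive_sample(func, initial_points, n_new):
    """Add n_new samples, each at the midpoint of the interval whose
    endpoints currently differ most in function value."""
    xs = sorted(initial_points)
    ys = [func(x) for x in xs]
    for _ in range(n_new):
        # Locate the interval with the largest change in output.
        widest = max(range(len(xs) - 1), key=lambda i: abs(ys[i + 1] - ys[i]))
        midpoint = 0.5 * (xs[widest] + xs[widest + 1])
        xs.insert(widest + 1, midpoint)
        ys.insert(widest + 1, func(midpoint))
    return xs, ys


if __name__ == "__main__":
    points, values = adaptive_sample(steep_curve, [0.0, 0.25, 0.5, 0.75, 1.0], 20)
    print([round(p, 4) for p in points])
```

With this rule the flat regions keep their coarse initial spacing while the transition region is refined repeatedly, so a regression model trained on the samples sees more data where the response changes fastest.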

Keywords: batch bulk methyl methacrylate polymerization, adaptive sampling, machine learning, large margin nearest neighbor regression

Procedia PDF Downloads 271