Search results for: polynomial regression model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18666

Search results for: polynomial regression model

18456 Economic Analysis of Cowpea (Unguiculata spp) Production in Northern Nigeria: A Case Study of Kano Katsina and Jigawa States

Authors: Yakubu Suleiman, S. A. Musa

Abstract:

Nigeria is the largest cowpea producer in the world, accounting for about 45%, followed by Brazil with about 17%. Cowpea is grown in Kano, Bauchi, Katsina, Borno in the north, Oyo in the west, and to the lesser extent in Enugu in the east. This study was conducted to determine the input–output relationship of Cowpea production in Kano, Katsina, and Jigawa states of Nigeria. The data were collected with the aid of 1000 structured questionnaires that were randomly distributed to Cowpea farmers in the three states mentioned above of the study area. The data collected were analyzed using regression analysis (Cobb–Douglass production function model). The result of the regression analysis revealed the coefficient of multiple determinations, R2, to be 72.5% and the F ration to be 106.20 and was found to be significant (P < 0.01). The regression coefficient of constant is 0.5382 and is significant (P < 0.01). The regression coefficient with respect to labor and seeds were 0.65554 and 0.4336, respectively, and they are highly significant (P < 0.01). The regression coefficient with respect to fertilizer is 0.26341 which is significant (P < 0.05). This implies that a unit increase of any one of the variable inputs used while holding all other variables inputs constants, will significantly increase the total Cowpea output by their corresponding coefficient. This indicated that farmers in the study area are operating in stage II of the production function. The result revealed that Cowpea farmer in Kano, Jigawa and Katsina States realized a profit of N15,997, N34,016 and N19,788 per hectare respectively. It is hereby recommended that more attention should be given to Cowpea production by government and research institutions.

Keywords: coefficient, constant, inputs, regression

Procedia PDF Downloads 404
18455 Climate Changes in Albania and Their Effect on Cereal Yield

Authors: Lule Basha, Eralda Gjika

Abstract:

This study is focused on analyzing climate change in Albania and its potential effects on cereal yields. Initially, monthly temperature and rainfalls in Albania were studied for the period 1960-2021. Climacteric variables are important variables when trying to model cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. The multiple linear regression analysis and lasso regression method are applied to the data between cereal yield and each independent variable: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, data follow a normal distribution, and there is a low correlation between factors, so we do not have the problem of multicollinearity. Machine-learning methods, such as random forest, are used to predict cereal yield responses to climacteric and other variables. Random Forest showed high accuracy compared to the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield. The coefficients of fertilizer consumption, arable land, and land under cereal production are positively affecting production. Our results show that the Random Forest method is an effective and versatile machine-learning method for cereal yield prediction compared to the other two methods.

Keywords: cereal yield, climate change, machine learning, multiple regression model, random forest

Procedia PDF Downloads 84
18454 A Model for Diagnosis and Prediction of Coronavirus Using Neural Network

Authors: Sajjad Baghernezhad

Abstract:

Meta-heuristic and hybrid algorithms have high adeer in modeling medical problems. In this study, a neural network was used to predict covid-19 among high-risk and low-risk patients. This study was conducted to collect the applied method and its target population consisting of 550 high-risk and low-risk patients from the Kerman University of medical sciences medical center to predict the coronavirus. In this study, the memetic algorithm, which is a combination of a genetic algorithm and a local search algorithm, has been used to update the weights of the neural network and develop the accuracy of the neural network. The initial study showed that the accuracy of the neural network was 88%. After updating the weights, the memetic algorithm increased by 93%. For the proposed model, sensitivity, specificity, positive predictivity value, value/accuracy to 97.4, 92.3, 95.8, 96.2, and 0.918, respectively; for the genetic algorithm model, 87.05, 9.20 7, 89.45, 97.30 and 0.967 and for logistic regression model were 87.40, 95.20, 93.79, 0.87 and 0.916. Based on the findings of this study, neural network models have a lower error rate in the diagnosis of patients based on individual variables and vital signs compared to the regression model. The findings of this study can help planners and health care providers in signing programs and early diagnosis of COVID-19 or Corona.

Keywords: COVID-19, decision support technique, neural network, genetic algorithm, memetic algorithm

Procedia PDF Downloads 63
18453 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are a unit of nice connection as a result of the supply a straightforward and fast management of all aspects relating to a patient, not essentially medical. What is more, there are unit additional and additional cases of pathologies during which diagnosing and treatment may be solely allotted by victimization medical imaging techniques. With associate ever-increasing prevalence, medical pictures area unit directly acquired in or regenerate into digital type, for his or her storage additionally as sequent retrieval and process. Data Mining is the process of extracting information from large data sets through using algorithms and Techniques drawn from the field of Statistics, Machine Learning and Data Base Management Systems. Forecasting may be a prediction of what's going to occur within the future, associated it's an unsure method. Owing to the uncertainty, the accuracy of a forecast is as vital because the outcome foretold by foretelling the freelance variables. A forecast management should be wont to establish if the accuracy of the forecast is within satisfactory limits. Fuzzy regression strategies have normally been wont to develop shopper preferences models that correlate the engineering characteristics with shopper preferences relating to a replacement product; the patron preference models offer a platform, wherever by product developers will decide the engineering characteristics so as to satisfy shopper preferences before developing the merchandise. Recent analysis shows that these fuzzy regression strategies area units normally will not to model client preferences. We tend to propose a Testing the strength of Exponential Regression Model over regression toward the mean Model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 273
18452 Response Surface Methodology to Obtain Disopyramide Phosphate Loaded Controlled Release Ethyl Cellulose Microspheres

Authors: Krutika K. Sawant, Anil Solanki

Abstract:

The present study deals with the preparation and optimization of ethyl cellulose-containing disopyramide phosphate loaded microspheres using solvent evaporation technique. A central composite design consisting of a two-level full factorial design superimposed on a star design was employed for optimizing the preparation microspheres. The drug:polymer ratio (X1) and speed of the stirrer (X2) were chosen as the independent variables. The cumulative release of the drug at a different time (2, 6, 10, 14, and 18 hr) was selected as the dependent variable. An optimum polynomial equation was generated for the prediction of the response variable at time 10 hr. Based on the results of multiple linear regression analysis and F statistics, it was concluded that sustained action can be obtained when X1 and X2 are kept at high levels. The X1X2 interaction was found to be statistically significant. The drug release pattern fitted the Higuchi model well. The data of a selected batch were subjected to an optimization study using Box-Behnken design, and an optimal formulation was fabricated. Good agreement was observed between the predicted and the observed dissolution profiles of the optimal formulation.

Keywords: disopyramide phosphate, ethyl cellulose, microspheres, controlled release, Box-Behnken design, factorial design

Procedia PDF Downloads 448
18451 Analysis of Spatial Heterogeneity of Residential Prices in Guangzhou: An Actual Study Based on Point of Interest Geographically Weighted Regression Model

Authors: Zichun Guo

Abstract:

Guangzhou's house price has long been lower than the other three major cities; with the gradual increase in Guangzhou's house price, the influencing factors of house price have gradually been paid attention to; this paper tries to use house price data and POI (Point of Interest) data, and explores the distribution of house price and influencing factors by applying the Kriging spatial interpolation method and geographically weighted regression model in ArcGIS. The results show that the interpolation result of house price has a significant relationship with the economic development and development potential of the region and that different POI types have different impacts on the growth of house prices in different regions.

Keywords: POI, house price, spatial heterogeneity, Guangzhou

Procedia PDF Downloads 43
18450 Minimizing the Impact of Covariate Detection Limit in Logistic Regression

Authors: Shahadut Hossain, Jacek Wesolowski, Zahirul Hoque

Abstract:

In many epidemiological and environmental studies covariate measurements are subject to the detection limit. In most applications, covariate measurements are usually truncated from below which is known as left-truncation. Because the measuring device, which we use to measure the covariate, fails to detect values falling below the certain threshold. In regression analyses, it causes inflated bias and inaccurate mean squared error (MSE) to the estimators. This paper suggests a response-based regression calibration method to correct the deleterious impact introduced by the covariate detection limit in the estimators of the parameters of simple logistic regression model. Compared to the maximum likelihood method, the proposed method is computationally simpler, and hence easier to implement. It is robust to the violation of distributional assumption about the covariate of interest. In producing correct inference, the performance of the proposed method compared to the other competing methods has been investigated through extensive simulations. A real-life application of the method is also shown using data from a population-based case-control study of non-Hodgkin lymphoma.

Keywords: environmental exposure, detection limit, left truncation, bias, ad-hoc substitution

Procedia PDF Downloads 229
18449 Free Fatty Acid Assessment of Crude Palm Oil Using a Non-Destructive Approach

Authors: Siti Nurhidayah Naqiah Abdull Rani, Herlina Abdul Rahim, Rashidah Ghazali, Noramli Abdul Razak

Abstract:

Near infrared (NIR) spectroscopy has always been of great interest in the food and agriculture industries. The development of prediction models has facilitated the estimation process in recent years. In this study, 110 crude palm oil (CPO) samples were used to build a free fatty acid (FFA) prediction model. 60% of the collected data were used for training purposes and the remaining 40% used for testing. The visible peaks on the NIR spectrum were at 1725 nm and 1760 nm, indicating the existence of the first overtone of C-H bands. Principal component regression (PCR) was applied to the data in order to build this mathematical prediction model. The optimal number of principal components was 10. The results showed R2=0.7147 for the training set and R2=0.6404 for the testing set.

Keywords: palm oil, fatty acid, NIRS, regression

Procedia PDF Downloads 499
18448 Non-Methane Hydrocarbons Emission during the Photocopying Process

Authors: Kiurski S. Jelena, Aksentijević M. Snežana, Kecić S. Vesna, Oros B. Ivana

Abstract:

The prosperity of electronic equipment in photocopying environment not only has improved work efficiency, but also has changed indoor air quality. Considering the number of photocopying employed, indoor air quality might be worse than in general office environments. Determining the contribution from any type of equipment to indoor air pollution is a complex matter. Non-methane hydrocarbons are known to have an important role of air quality due to their high reactivity. The presence of hazardous pollutants in indoor air has been detected in one photocopying shop in Novi Sad, Serbia. Air samples were collected and analyzed for five days, during 8-hr working time in three-time intervals, whereas three different sampling points were determined. Using multiple linear regression model and software package STATISTICA 10 the concentrations of occupational hazards and micro-climates parameters were mutually correlated. Based on the obtained multiple coefficients of determination (0.3751, 0.2389, and 0.1975), a weak positive correlation between the observed variables was determined. Small values of parameter F indicated that there was no statistically significant difference between the concentration levels of non-methane hydrocarbons and micro-climates parameters. The results showed that variable could be presented by the general regression model: y = b0 + b1xi1+ b2xi2. Obtained regression equations allow to measure the quantitative agreement between the variation of variables and thus obtain more accurate knowledge of their mutual relations.

Keywords: non-methane hydrocarbons, photocopying process, multiple regression analysis, indoor air quality, pollutant emission

Procedia PDF Downloads 374
18447 An Epsilon Hierarchical Fuzzy Twin Support Vector Regression

Authors: Arindam Chaudhuri

Abstract:

The research presents epsilon- hierarchical fuzzy twin support vector regression (epsilon-HFTSVR) based on epsilon-fuzzy twin support vector regression (epsilon-FTSVR) and epsilon-twin support vector regression (epsilon-TSVR). Epsilon-FTSVR is achieved by incorporating trapezoidal fuzzy numbers to epsilon-TSVR which takes care of uncertainty existing in forecasting problems. Epsilon-FTSVR determines a pair of epsilon-insensitive proximal functions by solving two related quadratic programming problems. The structural risk minimization principle is implemented by introducing regularization term in primal problems of epsilon-FTSVR. This yields dual stable positive definite problems which improves regression performance. Epsilon-FTSVR is then reformulated as epsilon-HFTSVR consisting of a set of hierarchical layers each containing epsilon-FTSVR. Experimental results on both synthetic and real datasets reveal that epsilon-HFTSVR has remarkable generalization performance with minimum training time.

Keywords: regression, epsilon-TSVR, epsilon-FTSVR, epsilon-HFTSVR

Procedia PDF Downloads 365
18446 Cryptographic Attack on Lucas Based Cryptosystems Using Chinese Remainder Theorem

Authors: Tze Jin Wong, Lee Feng Koo, Pang Hung Yiu

Abstract:

Lenstra’s attack uses Chinese remainder theorem as a tool and requires a faulty signature to be successful. This paper reports on the security responses of fourth and sixth order Lucas based (LUC4,6) cryptosystem under the Lenstra’s attack as compared to the other two Lucas based cryptosystems such as LUC and LUC3 cryptosystems. All the Lucas based cryptosystems were exposed mathematically to the Lenstra’s attack using Chinese Remainder Theorem and Dickson polynomial. Result shows that the possibility for successful Lenstra’s attack is less against LUC4,6 cryptosystem than LUC3 and LUC cryptosystems. Current study concludes that LUC4,6 cryptosystem is more secure than LUC and LUC3 cryptosystems in sustaining against Lenstra’s attack.

Keywords: Lucas sequence, Dickson polynomial, faulty signature, corresponding signature, congruence

Procedia PDF Downloads 159
18445 Multicollinearity and MRA in Sustainability: Application of the Raise Regression

Authors: Claudia García-García, Catalina B. García-García, Román Salmerón-Gómez

Abstract:

Much economic-environmental research includes the analysis of possible interactions by using Moderated Regression Analysis (MRA), which is a specific application of multiple linear regression analysis. This methodology allows analyzing how the effect of one of the independent variables is moderated by a second independent variable by adding a cross-product term between them as an additional explanatory variable. Due to the very specification of the methodology, the moderated factor is often highly correlated with the constitutive terms. Thus, great multicollinearity problems arise. The appearance of strong multicollinearity in a model has important consequences. Inflated variances of the estimators may appear, there is a tendency to consider non-significant regressors that they probably are together with a very high coefficient of determination, incorrect signs of our coefficients may appear and also the high sensibility of the results to small changes in the dataset. Finally, the high relationship among explanatory variables implies difficulties in fixing the individual effects of each one on the model under study. These consequences shifted to the moderated analysis may imply that it is not worth including an interaction term that may be distorting the model. Thus, it is important to manage the problem with some methodology that allows for obtaining reliable results. After a review of those works that applied the MRA among the ten top journals of the field, it is clear that multicollinearity is mostly disregarded. Less than 15% of the reviewed works take into account potential multicollinearity problems. To overcome the issue, this work studies the possible application of recent methodologies to MRA. Particularly, the raised regression is analyzed. This methodology mitigates collinearity from a geometrical point of view: the collinearity problem arises because the variables under study are very close geometrically, so by separating both variables, the problem can be mitigated. Raise regression maintains the available information and modifies the problematic variables instead of deleting variables, for example. Furthermore, the global characteristics of the initial model are also maintained (sum of squared residuals, estimated variance, coefficient of determination, global significance test and prediction). The proposal is implemented to data from countries of the European Union during the last year available regarding greenhouse gas emissions, per capita GDP and a dummy variable that represents the topography of the country. The use of a dummy variable as the moderator is a special variant of MRA, sometimes called “subgroup regression analysis.” The main conclusion of this work is that applying new techniques to the field can improve in a substantial way the results of the analysis. Particularly, the use of raised regression mitigates great multicollinearity problems, so the researcher is able to rely on the interaction term when interpreting the results of a particular study.

Keywords: multicollinearity, MRA, interaction, raise

Procedia PDF Downloads 99
18444 Model Averaging in a Multiplicative Heteroscedastic Model

Authors: Alan Wan

Abstract:

In recent years, the body of literature on frequentist model averaging in statistics has grown significantly. Most of this work focuses on models with different mean structures but leaves out the variance consideration. In this paper, we consider a regression model with multiplicative heteroscedasticity and develop a model averaging method that combines maximum likelihood estimators of unknown parameters in both the mean and variance functions of the model. Our weight choice criterion is based on a minimisation of a plug-in estimator of the model average estimator's squared prediction risk. We prove that the new estimator possesses an asymptotic optimality property. Our investigation of finite-sample performance by simulations demonstrates that the new estimator frequently exhibits very favourable properties compared to some existing heteroscedasticity-robust model average estimators. The model averaging method hedges against the selection of very bad models and serves as a remedy to variance function misspecification, which often discourages practitioners from modeling heteroscedasticity altogether. The proposed model average estimator is applied to the analysis of two real data sets.

Keywords: heteroscedasticity-robust, model averaging, multiplicative heteroscedasticity, plug-in, squared prediction risk

Procedia PDF Downloads 372
18443 A Machine Learning Model for Predicting Students’ Academic Performance in Higher Institutions

Authors: Emmanuel Osaze Oshoiribhor, Adetokunbo MacGregor John-Otumu

Abstract:

There has been a need in recent years to predict student academic achievement prior to graduation. This is to assist them in improving their grades, especially for those who have struggled in the past. The purpose of this research is to use supervised learning techniques to create a model that predicts student academic progress. Many scholars have developed models that predict student academic achievement based on characteristics including smoking, demography, culture, social media, parent educational background, parent finances, and family background, to mention a few. This element, as well as the model used, could have misclassified the kids in terms of their academic achievement. As a prerequisite to predicting if the student will perform well in the future on related courses, this model is built using a logistic regression classifier with basic features such as the previous semester's course score, attendance to class, class participation, and the total number of course materials or resources the student is able to cover per semester. With a 96.7 percent accuracy, the model outperformed other classifiers such as Naive bayes, Support vector machine (SVM), Decision Tree, Random forest, and Adaboost. This model is offered as a desktop application with user-friendly interfaces for forecasting student academic progress for both teachers and students. As a result, both students and professors are encouraged to use this technique to predict outcomes better.

Keywords: artificial intelligence, ML, logistic regression, performance, prediction

Procedia PDF Downloads 102
18442 Using the Bootstrap for Problems Statistics

Authors: Brahim Boukabcha, Amar Rebbouh

Abstract:

The bootstrap method based on the idea of exploiting all the information provided by the initial sample, allows us to study the properties of estimators. In this article we will present a theoretical study on the different methods of bootstrapping and using the technique of re-sampling in statistics inference to calculate the standard error of means of an estimator and determining a confidence interval for an estimated parameter. We apply these methods tested in the regression models and Pareto model, giving the best approximations.

Keywords: bootstrap, error standard, bias, jackknife, mean, median, variance, confidence interval, regression models

Procedia PDF Downloads 376
18441 Poverty Dynamics in Thailand: Evidence from Household Panel Data

Authors: Nattabhorn Leamcharaskul

Abstract:

This study aims to examine determining factors of the dynamics of poverty in Thailand by using panel data of 3,567 households in 2007-2017. Four techniques of estimation are employed to analyze the situation of poverty across households and time periods: the multinomial logit model, the sequential logit model, the quantile regression model, and the difference in difference model. Households are categorized based on their experiences into 5 groups, namely chronically poor, falling into poverty, re-entering into poverty, exiting from poverty and never poor households. Estimation results emphasize the effects of demographic and socioeconomic factors as well as unexpected events on the economic status of a household. It is found that remittances have positive impact on household’s economic status in that they are likely to lower the probability of falling into poverty or trapping in poverty while they tend to increase the probability of exiting from poverty. In addition, not only receiving a secondary source of household income can raise the probability of being a never poor household, but it also significantly increases household income per capita of the chronically poor and falling into poverty households. Public work programs are recommended as an important tool to relieve household financial burden and uncertainty and thus consequently increase a chance for households to escape from poverty.

Keywords: difference in difference, dynamic, multinomial logit model, panel data, poverty, quantile regression, remittance, sequential logit model, Thailand, transfer

Procedia PDF Downloads 109
18440 Binary Logistic Regression Model in Predicting the Employability of Senior High School Graduates

Authors: Cromwell F. Gopo, Joy L. Picar

Abstract:

This study aimed to predict the employability of senior high school graduates for S.Y. 2018- 2019 in the Davao del Norte Division through quantitative research design using the descriptive status and predictive approaches among the indicated parameters, namely gender, school type, academics, academic award recipient, skills, values, and strand. The respondents of the study were the 33 secondary schools offering senior high school programs identified through simple random sampling, which resulted in 1,530 cases of graduates’ secondary data, which were analyzed using frequency, percentage, mean, standard deviation, and binary logistic regression. Results showed that the majority of the senior high school graduates who come from large schools were females. Further, less than half of these graduates received any academic award in any semester. In general, the graduates’ performance in academics, skills, and values were proficient. Moreover, less than half of the graduates were not employed. Then, those who were employed were either contractual, casual, or part-time workers dominated by GAS graduates. Further, the predictors of employability were gender and the Information and Communications Technology (ICT) strand, while the remaining variables did not add significantly to the model. The null hypothesis had been rejected as the coefficients of the predictors in the binary logistic regression equation did not take the value of 0. After utilizing the model, it was concluded that Technical-Vocational-Livelihood (TVL) graduates except ICT had greater estimates of employability.

Keywords: employability, senior high school graduates, Davao del Norte, Philippines

Procedia PDF Downloads 144
18439 Determining the Factors Affecting Social Media Addiction (Virtual Tolerance, Virtual Communication), Phubbing, and Perception of Addiction in Nurses

Authors: Fatima Zehra Allahverdi, Nukhet Bayer

Abstract:

Objective: Three questions were formulated to examine stressful working units (intensive care units, emergency unit nurses) utilizing the self-perception theory and social support theory. This study provides a distinctive input by inspecting the combination of variables regarding stressful working environments. Method: The descriptive research was conducted with the participation of 400 nurses working at Ankara City Hospital. The study used Multivariate Analysis of Variance (MANOVA), regression analysis, and a mediation model. Hypothesis one used MANOVA followed by a Scheffe post hoc test. Hypothesis two utilized regression analysis using a hierarchical linear regression model. Hypothesis three used a mediation model. Result: The study utilized mediation analyses. Findings supported the hypotheses that intensive care units have significantly high scores in virtual communication and virtual tolerance. The number of years on the job, virtual communication, virtual tolerance, and phubbing significantly predicted 51% of the variance of perception of addiction. Interestingly, the number of years on the job, while significant, was negatively related to perception of addiction. Conclusion: The reasoning behind these findings and the lack of significance in the emergency unit is discussed. Around 7% of the variance of phubbing was accounted for through working in intensive care units. The model accounted for 26.80 % of the differences in the perception of addiction.

Keywords: phubbing, social media, working units, years on the job, stress

Procedia PDF Downloads 48
18438 Friction Behavior of Wood-Plastic Composites against Uncoated Cemented Carbide

Authors: Almontas Vilutis, Vytenis Jankauskas

Abstract:

The paper presents the results of the investigation of the dry sliding friction of wood-plastic composites (WPCs) against WC-Co cemented carbide. The dependence of the dynamic coefficient of friction on the main influencing factors (vertical load, temperature, and sliding distance) was investigated by evaluating their mutual interaction. Multiple regression analysis showed a high polynomial dependence (adjusted R2 > 0.98). The resistance of the composite to thermo-mechanical effects determines how temperature and force factors affect the magnitude of the coefficient of friction. WPC-B composite has the lowest friction and highest resistance compared to WPC-A, while composite and cemented carbide materials wear the least. Energy dispersive spectroscopy (EDS), based on elemental composition, provided important insights into the friction process.

Keywords: friction, composite, carbide, factors

Procedia PDF Downloads 77
18437 Ground Motion Modeling Using the Least Absolute Shrinkage and Selection Operator

Authors: Yildiz Stella Dak, Jale Tezcan

Abstract:

Ground motion models that relate a strong motion parameter of interest to a set of predictive seismological variables describing the earthquake source, the propagation path of the seismic wave, and the local site conditions constitute a critical component of seismic hazard analyses. When a sufficient number of strong motion records are available, ground motion relations are developed using statistical analysis of the recorded ground motion data. In regions lacking a sufficient number of recordings, a synthetic database is developed using stochastic, theoretical or hybrid approaches. Regardless of the manner the database was developed, ground motion relations are developed using regression analysis. Development of a ground motion relation is a challenging process which inevitably requires the modeler to make subjective decisions regarding the inclusion criteria of the recordings, the functional form of the model and the set of seismological variables to be included in the model. Because these decisions are critically important to the validity and the applicability of the model, there is a continuous interest on procedures that will facilitate the development of ground motion models. This paper proposes the use of the Least Absolute Shrinkage and Selection Operator (LASSO) in selecting the set predictive seismological variables to be used in developing a ground motion relation. The LASSO can be described as a penalized regression technique with a built-in capability of variable selection. Similar to the ridge regression, the LASSO is based on the idea of shrinking the regression coefficients to reduce the variance of the model. Unlike ridge regression, where the coefficients are shrunk but never set equal to zero, the LASSO sets some of the coefficients exactly to zero, effectively performing variable selection. Given a set of candidate input variables and the output variable of interest, LASSO allows ranking the input variables in terms of their relative importance, thereby facilitating the selection of the set of variables to be included in the model. Because the risk of overfitting increases as the ratio of the number of predictors to the number of recordings increases, selection of a compact set of variables is important in cases where a small number of recordings are available. In addition, identification of a small set of variables can improve the interpretability of the resulting model, especially when there is a large number of candidate predictors. A practical application of the proposed approach is presented, using more than 600 recordings from the National Geospatial-Intelligence Agency (NGA) database, where the effect of a set of seismological predictors on the 5% damped maximum direction spectral acceleration is investigated. The set of candidate predictors considered are Magnitude, Rrup, Vs30. Using LASSO, the relative importance of the candidate predictors has been ranked. Regression models with increasing levels of complexity were constructed using one, two, three, and four best predictors, and the models’ ability to explain the observed variance in the target variable have been compared. The bias-variance trade-off in the context of model selection is discussed.

Keywords: ground motion modeling, least absolute shrinkage and selection operator, penalized regression, variable selection

Procedia PDF Downloads 325
18436 BART Matching Method: Using Bayesian Additive Regression Tree for Data Matching

Authors: Gianna Zou

Abstract:

Propensity score matching (PSM), introduced by Paul R. Rosenbaum and Donald Rubin in 1983, is a popular statistical matching technique which tries to estimate the treatment effects by taking into account covariates that could impact the efficacy of study medication in clinical trials. PSM can be used to reduce the bias due to confounding variables. However, PSM assumes that the response values are normally distributed. In some cases, this assumption may not be held. In this paper, a machine learning method - Bayesian Additive Regression Tree (BART), is used as a more robust method of matching. BART can work well when models are misspecified since it can be used to model heterogeneous treatment effects. Moreover, it has the capability to handle non-linear main effects and multiway interactions. In this research, a BART Matching Method (BMM) is proposed to provide a more reliable matching method over PSM. By comparing the analysis results from PSM and BMM, BMM can perform well and has better prediction capability when the response values are not normally distributed.

Keywords: BART, Bayesian, matching, regression

Procedia PDF Downloads 140
18435 A Survey on Quasi-Likelihood Estimation Approaches for Longitudinal Set-ups

Authors: Naushad Mamode Khan

Abstract:

The Com-Poisson (CMP) model is one of the most popular discrete generalized linear models (GLMS) that handles both equi-, over- and under-dispersed data. In longitudinal context, an integer-valued autoregressive (INAR(1)) process that incorporates covariate specification has been developed to model longitudinal CMP counts. However, the joint likelihood CMP function is difficult to specify and thus restricts the likelihood based estimating methodology. The joint generalized quasilikelihood approach (GQL-I) was instead considered but is rather computationally intensive and may not even estimate the regression effects due to a complex and frequently ill conditioned covariance structure. This paper proposes a new GQL approach for estimating the regression parameters (GQLIII) that are based on a single score vector representation. The performance of GQL-III is compared with GQL-I and separate marginal GQLs (GQL-II) through some simulation experiments and is proved to yield equally efficient estimates as GQL-I and is far more computationally stable.

Keywords: longitudinal, com-Poisson, ill-conditioned, INAR(1), GLMS, GQL

Procedia PDF Downloads 350
18434 The Relationship between Coping Styles and Internet Addiction among High School Students

Authors: Adil Kaval, Digdem Muge Siyez

Abstract:

With the negative effects of internet use in a person's life, the use of the Internet has become an issue. This subject was mostly considered as internet addiction, and it was investigated. In literature, it is noteworthy that some theoretical models have been proposed to explain the reasons for internet addiction. In addition to these theoretical models, it may be thought that the coping style for stressing events can be a predictor of internet addiction. It was aimed to test with logistic regression the effect of high school students' coping styles on internet addiction levels. Sample of the study consisted of 770 Turkish adolescents (471 girls, 299 boys) selected from high schools in the 2017-2018 academic year in İzmir province. Internet Addiction Test, Coping Scale for Child and Adolescents and a demographic information form were used in this study. The results of the logistic regression analysis indicated that the model of coping styles predicted internet addiction provides a statistically significant prediction of internet addiction. Gender does not predict whether or not to be addicted to the internet. The active coping style is not effective on internet addiction levels, while the avoiding and negative coping style are effective on internet addiction levels. With this model, % 79.1 of internet addiction in high school is estimated. The Negelkerke pseudo R2 indicated that the model accounted for %35 of the total variance. The results of this study on Turkish adolescents are similar to the results of other studies in the literature. It can be argued that avoiding and negative coping styles are important risk factors in the development of internet addiction.

Keywords: adolescents, coping, internet addiction, regression analysis

Procedia PDF Downloads 172
18433 Efficient Model Order Reduction of Descriptor Systems Using Iterative Rational Krylov Algorithm

Authors: Muhammad Anwar, Ameen Ullah, Intakhab Alam Qadri

Abstract:

This study presents a technique utilizing the Iterative Rational Krylov Algorithm (IRKA) to reduce the order of large-scale descriptor systems. Descriptor systems, which incorporate differential and algebraic components, pose unique challenges in Model Order Reduction (MOR). The proposed method partitions the descriptor system into polynomial and strictly proper parts to minimize approximation errors, applying IRKA exclusively to the strictly adequate component. This approach circumvents the unbounded errors that arise when IRKA is directly applied to the entire system. A comparative analysis demonstrates the high accuracy of the reduced model and a significant reduction in computational burden. The reduced model enables more efficient simulations and streamlined controller designs. The study highlights IRKA-based MOR’s effectiveness in optimizing complex systems’ performance across various engineering applications. The proposed methodology offers a promising solution for reducing the complexity of large-scale descriptor systems while maintaining their essential characteristics and facilitating their analysis, simulation, and control design.

Keywords: model order reduction, descriptor systems, iterative rational Krylov algorithm, interpolatory model reduction, computational efficiency, projection methods, H₂-optimal model reduction

Procedia PDF Downloads 18
18432 Modeling Default Probabilities of the Chosen Czech Banks in the Time of the Financial Crisis

Authors: Petr Gurný

Abstract:

One of the most important tasks in the risk management is the correct determination of probability of default (PD) of particular financial subjects. In this paper a possibility of determination of financial institution’s PD according to the credit-scoring models is discussed. The paper is divided into the two parts. The first part is devoted to the estimation of the three different models (based on the linear discriminant analysis, logit regression and probit regression) from the sample of almost three hundred US commercial banks. Afterwards these models are compared and verified on the control sample with the view to choose the best one. The second part of the paper is aimed at the application of the chosen model on the portfolio of three key Czech banks to estimate their present financial stability. However, it is not less important to be able to estimate the evolution of PD in the future. For this reason, the second task in this paper is to estimate the probability distribution of the future PD for the Czech banks. So, there are sampled randomly the values of particular indicators and estimated the PDs’ distribution, while it’s assumed that the indicators are distributed according to the multidimensional subordinated Lévy model (Variance Gamma model and Normal Inverse Gaussian model, particularly). Although the obtained results show that all banks are relatively healthy, there is still high chance that “a financial crisis” will occur, at least in terms of probability. This is indicated by estimation of the various quantiles in the estimated distributions. Finally, it should be noted that the applicability of the estimated model (with respect to the used data) is limited to the recessionary phase of the financial market.

Keywords: credit-scoring models, multidimensional subordinated Lévy model, probability of default

Procedia PDF Downloads 449
18431 Mathematical Modeling of Drip Emitter Discharge of Trapezoidal Labyrinth Channel

Authors: N. Philipova

Abstract:

The influence of the geometric parameters of trapezoidal labyrinth channel on the emitter discharge is investigated in this work. The impact of the dentate angle, the dentate spacing, and the dentate height are studied among the geometric parameters of the labyrinth channel. Numerical simulations of the water flow movement are performed according to central cubic composite design using Commercial codes GAMBIT and FLUENT. Inlet pressure of the dripper is set up to be 1 bar. The objective of this paper is to derive a mathematical model of the emitter discharge depending on the dentate angle, the dentate spacing, the dentate height of the labyrinth channel. As a result, the obtained mathematical model is a second-order polynomial reporting 2-way interactions among the geometric parameters. The dentate spacing has the most important and positive influence on the emitter discharge, followed by the simultaneous impact of the dentate spacing and the dentate height. The dentate angle in the observed interval has no significant effect on the emitter discharge. The obtained model can be used as a basis for a future emitter design.

Keywords: drip irrigation, labyrinth channel hydrodynamics, numerical simulations, Reynolds stress model.

Procedia PDF Downloads 180
18430 Chemometric QSRR Evaluation of Behavior of s-Triazine Pesticides in Liquid Chromatography

Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević

Abstract:

This study considers the selection of the most suitable in silico molecular descriptors that could be used for s-triazine pesticides characterization. Suitable descriptors among topological, geometrical and physicochemical are used for quantitative structure-retention relationships (QSRR) model establishment. Established models were obtained using linear regression (LR) and multiple linear regression (MLR) analysis. In this paper, MLR models were established avoiding multicollinearity among the selected molecular descriptors. Statistical quality of established models was evaluated by standard and cross-validation statistical parameters. For detection of similarity or dissimilarity among investigated s-triazine pesticides and their classification, principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used and gave similar grouping. This study is financially supported by COST action TD1305.

Keywords: chemometrics, classification analysis, molecular descriptors, pesticides, regression analysis

Procedia PDF Downloads 387
18429 Project Time Prediction Model: A Case Study of Construction Projects in Sindh, Pakistan

Authors: Tauha Hussain Ali, Shabir Hussain Khahro, Nafees Ahmed Memon

Abstract:

Accurate prediction of project time for planning and bid preparation stage should contain realistic dates. Constructors use their experience to estimate the project duration for the new projects, which is based on intuitions. It has been a constant concern to both researchers and constructors to analyze the accurate prediction of project duration for bid preparation stage. In Pakistan, such study for time cost relationship has been lacked to predict duration performance for the construction projects. This study is an attempt to explore the time cost relationship that would conclude with a mathematical model to predict the time for the drainage rehabilitation projects in the province of Sindh, Pakistan. The data has been collected from National Engineering Services (NESPAK), Pakistan and regression analysis has been carried out for the analysis of results. Significant relationship has been found between time and cost of the construction projects in Sindh and the generated mathematical model can be used by the constructors to predict the project duration for the upcoming projects of same nature. This study also provides the professionals with a requisite knowledge to make decisions regarding project duration, which is significantly important to win the projects at the bid stage.

Keywords: BTC Model, project time, relationship of time cost, regression

Procedia PDF Downloads 377
18428 A Contribution to the Polynomial Eigen Problem

Authors: Malika Yaici, Kamel Hariche, Tim Clarke

Abstract:

The relationship between eigenstructure (eigenvalues and eigenvectors) and latent structure (latent roots and latent vectors) is established. In control theory eigenstructure is associated with the state space description of a dynamic multi-variable system and a latent structure is associated with its matrix fraction description. Beginning with block controller and block observer state space forms and moving on to any general state space form, we develop the identities that relate eigenvectors and latent vectors in either direction. Numerical examples illustrate this result. A brief discussion of the potential of these identities in linear control system design follows. Additionally, we present a consequent result: a quick and easy method to solve the polynomial eigenvalue problem for regular matrix polynomials.

Keywords: eigenvalues/eigenvectors, latent values/vectors, matrix fraction description, state space description

Procedia PDF Downloads 464
18427 Modeling of Traffic Turning Movement

Authors: Michael Tilahun Mulugeta

Abstract:

Pedestrians are the most vulnerable road users as they are more exposed to the risk of collusion. Pedestrian safety at road intersections still remains the most vital and yet unsolved issue in Addis Ababa, Ethiopia. One of the critical points in pedestrian safety is the occurrence of conflict between turning vehicle and pedestrians at un-signalized intersection. However, a better understanding of the factors that affect the likelihood of the conflicts would help provide direction for countermeasures aimed at reducing the number of crashes. This paper has sorted to explore a model to describe the relation between traffic conflicts and influencing factors using Multiple Linear regression methodology. In this research the main focus is to study the interaction of turning (left & right) vehicle with pedestrian at unsignalized intersections. The specific objectives also to determine factors that affect the number of potential conflicts and develop a model of potential conflict.

Keywords: potential, regression analysis, pedestrian, conflicts

Procedia PDF Downloads 54