Search results for: multiple regression
7061 Predicting Bridge Pier Scour Depth with SVM
Authors: Arun Goel
Abstract:
Prediction of maximum local scour is necessary for the safety and economical design of the bridges. A number of equations have been developed over the years to predict local scour depth using laboratory data and a few pier equations have also been proposed using field data. Most of these equations are empirical in nature as indicated by the past publications. In this paper, attempts have been made to compute local depth of scour around bridge pier in dimensional and non-dimensional form by using linear regression, simple regression and SVM (Poly and Rbf) techniques along with few conventional empirical equations. The outcome of this study suggests that the SVM (Poly and Rbf) based modeling can be employed as an alternate to linear regression, simple regression and the conventional empirical equations in predicting scour depth of bridge piers. The results of present study on the basis of non-dimensional form of bridge pier scour indicates the improvement in the performance of SVM (Poly and Rbf) in comparison to dimensional form of scour.Keywords: modeling, pier scour, regression, prediction, SVM (Poly and Rbf kernels)
Procedia PDF Downloads 4507060 Arabic Character Recognition Using Regression Curves with the Expectation Maximization Algorithm
Authors: Abdullah A. AlShaher
Abstract:
In this paper, we demonstrate how regression curves can be used to recognize 2D non-rigid handwritten shapes. Each shape is represented by a set of non-overlapping uniformly distributed landmarks. The underlying models utilize 2nd order of polynomials to model shapes within a training set. To estimate the regression models, we need to extract the required coefficients which describe the variations for a set of shape class. Hence, a least square method is used to estimate such modes. We then proceed by training these coefficients using the apparatus Expectation Maximization algorithm. Recognition is carried out by finding the least error landmarks displacement with respect to the model curves. Handwritten isolated Arabic characters are used to evaluate our approach.Keywords: character recognition, regression curves, handwritten Arabic letters, expectation maximization algorithm
Procedia PDF Downloads 1437059 Reminiscence Therapy for Alzheimer’s Disease Restrained on Logistic Regression Based Linear Bootstrap Aggregating
Authors: P. S. Jagadeesh Kumar, Mingmin Pan, Xianpei Li, Yanmin Yuan, Tracy Lin Huan
Abstract:
Researchers are doing enchanting research into the inherited features of Alzheimer’s disease and probable consistent therapies. In Alzheimer’s, memories are extinct in reverse order; memories formed lately are more transitory than those from formerly. Reminiscence therapy includes the conversation of past actions, trials and knowledges with another individual or set of people, frequently with the help of perceptible reminders such as photos, household and other acquainted matters from the past, music and collection of tapes. In this manuscript, the competence of reminiscence therapy for Alzheimer’s disease is measured using logistic regression based linear bootstrap aggregating. Logistic regression is used to envisage the experiential features of the patient’s memory through various therapies. Linear bootstrap aggregating shows better stability and accuracy of reminiscence therapy used in statistical classification and regression of memories related to validation therapy, supportive psychotherapy, sensory integration and simulated presence therapy.Keywords: Alzheimer’s disease, linear bootstrap aggregating, logistic regression, reminiscence therapy
Procedia PDF Downloads 3077058 Predicting Survival in Cancer: How Cox Regression Model Compares to Artifial Neural Networks?
Authors: Dalia Rimawi, Walid Salameh, Amal Al-Omari, Hadeel AbdelKhaleq
Abstract:
Predication of Survival time of patients with cancer, is a core factor that influences oncologist decisions in different aspects; such as offered treatment plans, patients’ quality of life and medications development. For a long time proportional hazards Cox regression (ph. Cox) was and still the most well-known statistical method to predict survival outcome. But due to the revolution of data sciences; new predication models were employed and proved to be more flexible and provided higher accuracy in that type of studies. Artificial neural network is one of those models that is suitable to handle time to event predication. In this study we aim to compare ph Cox regression with artificial neural network method according to data handling and Accuracy of each model.Keywords: Cox regression, neural networks, survival, cancer.
Procedia PDF Downloads 1987057 Survival and Hazard Maximum Likelihood Estimator with Covariate Based on Right Censored Data of Weibull Distribution
Authors: Al Omari Mohammed Ahmed
Abstract:
This paper focuses on Maximum Likelihood Estimator with Covariate. Covariates are incorporated into the Weibull model. Under this regression model with regards to maximum likelihood estimator, the parameters of the covariate, shape parameter, survival function and hazard rate of the Weibull regression distribution with right censored data are estimated. The mean square error (MSE) and absolute bias are used to compare the performance of Weibull regression distribution. For the simulation comparison, the study used various sample sizes and several specific values of the Weibull shape parameter.Keywords: weibull regression distribution, maximum likelihood estimator, survival function, hazard rate, right censoring
Procedia PDF Downloads 4397056 Improved FP-Growth Algorithm with Multiple Minimum Supports Using Maximum Constraints
Authors: Elsayeda M. Elgaml, Dina M. Ibrahim, Elsayed A. Sallam
Abstract:
Association rule mining is one of the most important fields of data mining and knowledge discovery. In this paper, we propose an efficient multiple support frequent pattern growth algorithm which we called “MSFP-growth” that enhancing the FP-growth algorithm by making infrequent child node pruning step with multiple minimum support using maximum constrains. The algorithm is implemented, and it is compared with other common algorithms: Apriori-multiple minimum supports using maximum constraints and FP-growth. The experimental results show that the rule mining from the proposed algorithm are interesting and our algorithm achieved better performance than other algorithms without scarifying the accuracy.Keywords: association rules, FP-growth, multiple minimum supports, Weka tool
Procedia PDF Downloads 4837055 Analysis of Factors Affecting the Number of Infant and Maternal Mortality in East Java with Geographically Weighted Bivariate Generalized Poisson Regression Method
Authors: Luh Eka Suryani, Purhadi
Abstract:
Poisson regression is a non-linear regression model with response variable in the form of count data that follows Poisson distribution. Modeling for a pair of count data that show high correlation can be analyzed by Poisson Bivariate Regression. Data, the number of infant mortality and maternal mortality, are count data that can be analyzed by Poisson Bivariate Regression. The Poisson regression assumption is an equidispersion where the mean and variance values are equal. However, the actual count data has a variance value which can be greater or less than the mean value (overdispersion and underdispersion). Violations of this assumption can be overcome by applying Generalized Poisson Regression. Characteristics of each regency can affect the number of cases occurred. This issue can be overcome by spatial analysis called geographically weighted regression. This study analyzes the number of infant mortality and maternal mortality based on conditions in East Java in 2016 using Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR) method. Modeling is done with adaptive bisquare Kernel weighting which produces 3 regency groups based on infant mortality rate and 5 regency groups based on maternal mortality rate. Variables that significantly influence the number of infant and maternal mortality are the percentages of pregnant women visit health workers at least 4 times during pregnancy, pregnant women get Fe3 tablets, obstetric complication handled, clean household and healthy behavior, and married women with the first marriage age under 18 years.Keywords: adaptive bisquare kernel, GWBGPR, infant mortality, maternal mortality, overdispersion
Procedia PDF Downloads 1587054 Variable Selection in a Data Envelopment Analysis Model by Multiple Proportions Comparison
Authors: Jirawan Jitthavech, Vichit Lorchirachoonkul
Abstract:
A statistical procedure using multiple comparisons test for proportions is proposed for variable selection in a data envelopment analysis (DEA) model. The test statistic in the multiple comparisons is the proportion of efficient decision making units (DMUs) in a DEA model. Three methods of multiple comparisons test for proportions: multiple Z tests with Bonferroni correction, multiple tests in 2Xc crosstabulation and the Marascuilo procedure, are used in the proposed statistical procedure of iteratively eliminating the variables in a backward manner. Two simulation populations of moderately and lowly correlated variables are used to compare the results of the statistical procedure using three methods of multiple comparisons test for proportions with the hypothesis testing of the efficiency contribution measure. From the simulation results, it can be concluded that the proposed statistical procedure using multiple Z tests for proportions with Bonferroni correction clearly outperforms the proposed statistical procedure using the remaining two methods of multiple comparisons and the hypothesis testing of the efficiency contribution measure.Keywords: Bonferroni correction, efficient DMUs, Marascuilo procedure, Pastor et al. method, 2xc crosstabulation
Procedia PDF Downloads 3097053 Transport Related Air Pollution Modeling Using Artificial Neural Network
Authors: K. D. Sharma, M. Parida, S. S. Jain, Anju Saini, V. K. Katiyar
Abstract:
Air quality models form one of the most important components of an urban air quality management plan. Various statistical modeling techniques (regression, multiple regression and time series analysis) have been used to predict air pollution concentrations in the urban environment. These models calculate pollution concentrations due to observed traffic, meteorological and pollution data after an appropriate relationship has been obtained empirically between these parameters. Artificial neural network (ANN) is increasingly used as an alternative tool for modeling the pollutants from vehicular traffic particularly in urban areas. In the present paper, an attempt has been made to model traffic air pollution, specifically CO concentration using neural networks. In case of CO concentration, two scenarios were considered. First, with only classified traffic volume input and the second with both classified traffic volume and meteorological variables. The results showed that CO concentration can be predicted with good accuracy using artificial neural network (ANN).Keywords: air quality management, artificial neural network, meteorological variables, statistical modeling
Procedia PDF Downloads 5237052 Efficient Credit Card Fraud Detection Based on Multiple ML Algorithms
Authors: Neha Ahirwar
Abstract:
In the contemporary digital era, the rise of credit card fraud poses a significant threat to both financial institutions and consumers. As fraudulent activities become more sophisticated, there is an escalating demand for robust and effective fraud detection mechanisms. Advanced machine learning algorithms have become crucial tools in addressing this challenge. This paper conducts a thorough examination of the design and evaluation of a credit card fraud detection system, utilizing four prominent machine learning algorithms: random forest, logistic regression, decision tree, and XGBoost. The surge in digital transactions has opened avenues for fraudsters to exploit vulnerabilities within payment systems. Consequently, there is an urgent need for proactive and adaptable fraud detection systems. This study addresses this imperative by exploring the efficacy of machine learning algorithms in identifying fraudulent credit card transactions. The selection of random forest, logistic regression, decision tree, and XGBoost for scrutiny in this study is based on their documented effectiveness in diverse domains, particularly in credit card fraud detection. These algorithms are renowned for their capability to model intricate patterns and provide accurate predictions. Each algorithm is implemented and evaluated for its performance in a controlled environment, utilizing a diverse dataset comprising both genuine and fraudulent credit card transactions.Keywords: efficient credit card fraud detection, random forest, logistic regression, XGBoost, decision tree
Procedia PDF Downloads 647051 A Monte Carlo Fuzzy Logistic Regression Framework against Imbalance and Separation
Authors: Georgios Charizanos, Haydar Demirhan, Duygu Icen
Abstract:
Two of the most impactful issues in classical logistic regression are class imbalance and complete separation. These can result in model predictions heavily leaning towards the imbalanced class on the binary response variable or over-fitting issues. Fuzzy methodology offers key solutions for handling these problems. However, most studies propose the transformation of the binary responses into a continuous format limited within [0,1]. This is called the possibilistic approach within fuzzy logistic regression. Following this approach is more aligned with straightforward regression since a logit-link function is not utilized, and fuzzy probabilities are not generated. In contrast, we propose a method of fuzzifying binary response variables that allows for the use of the logit-link function; hence, a probabilistic fuzzy logistic regression model with the Monte Carlo method. The fuzzy probabilities are then classified by selecting a fuzzy threshold. Different combinations of fuzzy and crisp input, output, and coefficients are explored, aiming to understand which of these perform better under different conditions of imbalance and separation. We conduct numerical experiments using both synthetic and real datasets to demonstrate the performance of the fuzzy logistic regression framework against seven crisp machine learning methods. The proposed framework shows better performance irrespective of the degree of imbalance and presence of separation in the data, while the considered machine learning methods are significantly impacted.Keywords: fuzzy logistic regression, fuzzy, logistic, machine learning
Procedia PDF Downloads 717050 Landslide Susceptibility Mapping: A Comparison between Logistic Regression and Multivariate Adaptive Regression Spline Models in the Municipality of Oudka, Northern of Morocco
Authors: S. Benchelha, H. C. Aoudjehane, M. Hakdaoui, R. El Hamdouni, H. Mansouri, T. Benchelha, M. Layelmam, M. Alaoui
Abstract:
The logistic regression (LR) and multivariate adaptive regression spline (MarSpline) are applied and verified for analysis of landslide susceptibility map in Oudka, Morocco, using geographical information system. From spatial database containing data such as landslide mapping, topography, soil, hydrology and lithology, the eight factors related to landslides such as elevation, slope, aspect, distance to streams, distance to road, distance to faults, lithology map and Normalized Difference Vegetation Index (NDVI) were calculated or extracted. Using these factors, landslide susceptibility indexes were calculated by the two mentioned methods. Before the calculation, this database was divided into two parts, the first for the formation of the model and the second for the validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. The result of this verification was that the MarSpline model is the best model with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than the LR model (success rate AUC = 0.918, rate prediction AUC = 0.901).Keywords: landslide susceptibility mapping, regression logistic, multivariate adaptive regression spline, Oudka, Taounate
Procedia PDF Downloads 1867049 Sum Capacity with Regularized Channel Inversion in Multi-Antenna Downlink Systems under Equal Power Constraint
Authors: Attaullah Khawaja, Amna Shabbir
Abstract:
Channel inversion is one of the simplest techniques for multiuser downlink systems with single-antenna users. In this paper regularized channel inversion under equal power constraint in the multiuser multiple input multiple output (MU-MIMO) broadcast channels has been considered. Sum capacity with plain channel inversion also known as Zero Forcing Beam Forming (ZFBF) and optimum sum capacity using Dirty Paper Coding (DPC) has also been investigated. Analysis and simulations show that regularization enhances the system performance and empower linear growth in Sum Capacity and specially work well at low signal to noise ratio (SNRs) regime.Keywords: broadcast channel, channel inversion, multiple antenna multiple-user wireless, multiple-input multiple-output (MIMO), regularization, dirty paper coding (DPC), sum capacity
Procedia PDF Downloads 5267048 Multicollinearity and MRA in Sustainability: Application of the Raise Regression
Authors: Claudia García-García, Catalina B. García-García, Román Salmerón-Gómez
Abstract:
Much economic-environmental research includes the analysis of possible interactions by using Moderated Regression Analysis (MRA), which is a specific application of multiple linear regression analysis. This methodology allows analyzing how the effect of one of the independent variables is moderated by a second independent variable by adding a cross-product term between them as an additional explanatory variable. Due to the very specification of the methodology, the moderated factor is often highly correlated with the constitutive terms. Thus, great multicollinearity problems arise. The appearance of strong multicollinearity in a model has important consequences. Inflated variances of the estimators may appear, there is a tendency to consider non-significant regressors that they probably are together with a very high coefficient of determination, incorrect signs of our coefficients may appear and also the high sensibility of the results to small changes in the dataset. Finally, the high relationship among explanatory variables implies difficulties in fixing the individual effects of each one on the model under study. These consequences shifted to the moderated analysis may imply that it is not worth including an interaction term that may be distorting the model. Thus, it is important to manage the problem with some methodology that allows for obtaining reliable results. After a review of those works that applied the MRA among the ten top journals of the field, it is clear that multicollinearity is mostly disregarded. Less than 15% of the reviewed works take into account potential multicollinearity problems. To overcome the issue, this work studies the possible application of recent methodologies to MRA. Particularly, the raised regression is analyzed. This methodology mitigates collinearity from a geometrical point of view: the collinearity problem arises because the variables under study are very close geometrically, so by separating both variables, the problem can be mitigated. Raise regression maintains the available information and modifies the problematic variables instead of deleting variables, for example. Furthermore, the global characteristics of the initial model are also maintained (sum of squared residuals, estimated variance, coefficient of determination, global significance test and prediction). The proposal is implemented to data from countries of the European Union during the last year available regarding greenhouse gas emissions, per capita GDP and a dummy variable that represents the topography of the country. The use of a dummy variable as the moderator is a special variant of MRA, sometimes called “subgroup regression analysis.” The main conclusion of this work is that applying new techniques to the field can improve in a substantial way the results of the analysis. Particularly, the use of raised regression mitigates great multicollinearity problems, so the researcher is able to rely on the interaction term when interpreting the results of a particular study.Keywords: multicollinearity, MRA, interaction, raise
Procedia PDF Downloads 1027047 Understanding the Linkages of Human Development and Fertility Change in Districts of Uttar Pradesh
Authors: Mamta Rajbhar, Sanjay K. Mohanty
Abstract:
India's progress in achieving replacement level of fertility is largely contingent on fertility reduction in the state of Uttar Pradesh as it accounts 17% of India's population with a low level of development. Though the TFR in the state has declined from 5.1 in 1991 to 3.4 by 2011, it conceals large differences in fertility level across districts. Using data from multiple sources this paper tests the hypothesis that the improvement in human development significantly reduces the fertility levels in districts of Uttar Pradesh. The unit of analyses is district, and fertility estimates are derived using the reverse survival method(RSM) while human development indices(HDI) are are estimated using uniform methodology adopted by UNDP for three period. The correlation and linear regression models are used to examine the relationship of fertility change and human development indices across districts. Result show the large variation and significant change in fertility level among the districts of Uttar Pradesh. During 1991-2011, eight districts had experienced a decline of TFR by 10-20%, 30 districts by 20-30% and 32 districts had experienced decline of more than 30%. On human development aspect, 17 districts recorded increase of more than 0.170 in HDI, 18 districts in the range of 0.150-0.170, 29 districts between 0.125-0.150 and six districts in the range of 0.1-0.125 during 1991-2011. Study shows significant negative relationship between HDI and TFR. HDI alone explains 70% variation in TFR. Also, the regression coefficient of TFR and HDI has become stronger over time; from -0.524 in 1991, -0.7477 by 2001 and -0.7181 by 2010. The regression analyses indicate that 0.1 point increase in HDI value will lead to 0.78 point decline in TFR. The HDI alone explains 70% variation in TFR. Improving the HDI will certainly reduce the fertility level in the districts.Keywords: Fertility, HDI, Uttar Pradesh
Procedia PDF Downloads 2487046 An Algebraic Geometric Imaging Approach for Automatic Dairy Cow Body Condition Scoring System
Authors: Thi Thi Zin, Pyke Tin, Ikuo Kobayashi, Yoichiro Horii
Abstract:
Today dairy farm experts and farmers have well recognized the importance of dairy cow Body Condition Score (BCS) since these scores can be used to optimize milk production, managing feeding system and as an indicator for abnormality in health even can be utilized to manage for having healthy calving times and process. In tradition, BCS measures are done by animal experts or trained technicians based on visual observations focusing on pin bones, pin, thurl and hook area, tail heads shapes, hook angles and short and long ribs. Since the traditional technique is very manual and subjective, the results can lead to different scores as well as not cost effective. Thus this paper proposes an algebraic geometric imaging approach for an automatic dairy cow BCS system. The proposed system consists of three functional modules. In the first module, significant landmarks or anatomical points from the cow image region are automatically extracted by using image processing techniques. To be specific, there are 23 anatomical points in the regions of ribs, hook bones, pin bone, thurl and tail head. These points are extracted by using block region based vertical and horizontal histogram methods. According to animal experts, the body condition scores depend mainly on the shape structure these regions. Therefore the second module will investigate some algebraic and geometric properties of the extracted anatomical points. Specifically, the second order polynomial regression is employed to a subset of anatomical points to produce the regression coefficients which are to be utilized as a part of feature vector in scoring process. In addition, the angles at thurl, pin, tail head and hook bone area are computed to extend the feature vector. Finally, in the third module, the extracted feature vectors are trained by using Markov Classification process to assign BCS for individual cows. Then the assigned BCS are revised by using multiple regression method to produce the final BCS score for dairy cows. In order to confirm the validity of proposed method, a monitoring video camera is set up at the milk rotary parlor to take top view images of cows. The proposed method extracts the key anatomical points and the corresponding feature vectors for each individual cows. Then the multiple regression calculator and Markov Chain Classification process are utilized to produce the estimated body condition score for each cow. The experimental results tested on 100 dairy cows from self-collected dataset and public bench mark dataset show very promising with accuracy of 98%.Keywords: algebraic geometric imaging approach, body condition score, Markov classification, polynomial regression
Procedia PDF Downloads 1567045 Robust Variable Selection Based on Schwarz Information Criterion for Linear Regression Models
Authors: Shokrya Saleh A. Alshqaq, Abdullah Ali H. Ahmadini
Abstract:
The Schwarz information criterion (SIC) is a popular tool for selecting the best variables in regression datasets. However, SIC is defined using an unbounded estimator, namely, the least-squares (LS), which is highly sensitive to outlying observations, especially bad leverage points. A method for robust variable selection based on SIC for linear regression models is thus needed. This study investigates the robustness properties of SIC by deriving its influence function and proposes a robust SIC based on the MM-estimation scale. The aim of this study is to produce a criterion that can effectively select accurate models in the presence of vertical outliers and high leverage points. The advantages of the proposed robust SIC is demonstrated through a simulation study and an analysis of a real dataset.Keywords: influence function, robust variable selection, robust regression, Schwarz information criterion
Procedia PDF Downloads 1387044 Information Communication Technology (ICT) Using Management in Nursing College under the Praboromarajchanok Institute
Authors: Suphaphon Udomluck, Pannathorn Chachvarat
Abstract:
Information Communication Technology (ICT) using management is essential for effective decision making in organization. The Concerns Based Adoption Model (CBAM) was employed as the conceptual framework. The purposes of the study were to assess the situation of Information Communication Technology (ICT) using management in College of Nursing under the Praboromarajchanok Institute. The samples were multi – stage sampling of 10 colleges of nursing that participated include directors, vice directors, head of learning groups, teachers, system administrator and responsible for ICT. The total participants were 280; the instrument used were questionnaires that include 4 parts, general information, Information Communication Technology (ICT) using management, the Stage of concern Questionnaires (SoC), and the Levels of Use (LoU) ICT Questionnaires respectively. Reliability coefficients were tested; alpha coefficients were 0.967for Information Communication Technology (ICT) using management, 0.884 for SoC and 0.945 for LoU. The data were analyzed by frequency, percentage, mean, standard deviation, Pearson Product Moment Correlation and Multiple Regression. They were founded as follows: The high level overall score of Information Communication Technology (ICT) using management and issue were administration, hardware, software, and people. The overall score of the Stage of concern (SoC)ICTis at high level and the overall score of the Levels of Use (LoU) ICTis at moderate. The Information Communication Technology (ICT) using management had the positive relationship with the Stage of concern (SoC)ICTand the Levels of Use (LoU) ICT(p < .01). The results of Multiple Regression revealed that administration hardwear, software and people ware could predict SoC of ICT (18.5%) and LoU of ICT (20.8%).The factors that were significantly influenced by SoCs were people ware. The factors that were significantly influenced by LoU of ICT were administration hardware and people ware.Keywords: information communication technology (ICT), management, the concerns-based adoption model (CBAM), stage of concern(SoC), the levels of use(LoU)
Procedia PDF Downloads 3177043 Predictors of School Drop out among High School Students
Authors: Osman Zorbaz, Selen Demirtas-Zorbaz, Ozlem Ulas
Abstract:
The factors that cause adolescents to drop out school were several. One of the frameworks about school dropout focuses on the contextual factors around the adolescents whereas the other one focuses on individual factors. It can be said that both factors are important equally. In this study, both adolescent’s individual factors (anti-social behaviors, academic success) and contextual factors (parent academic involvement, parent academic support, number of siblings, living with parent) were examined in the term of school dropout. The study sample consisted of 346 high school students in the public schools in Ankara who continued their education in 2015-2016 academic year. One hundred eighty-five the students (53.5%) were girls and 161 (46.5%) were boys. In addition to this 118 of them were in ninth grade, 122 of them in tenth grade and 106 of them were in eleventh grade. Multiple regression and one-way ANOVA statistical methods were used. First, it was examined if the data meet the assumptions and conditions that are required for regression analysis. After controlling the assumptions, regression analysis was conducted. Parent academic involvement, parent academic support, number of siblings, anti-social behaviors, academic success variables were taken into the regression model and it was seen that parent academic involvement (t=-3.023, p < .01), anti-social behaviors (t=7.038, p < .001), and academic success (t=-3.718, p < .001) predicted school dropout whereas parent academic support (t=-1.403, p > .05) and number of siblings (t=-1.908, p > .05) didn’t. The model explained 30% of the variance (R=.557, R2=.300, F5,345=30.626, p < .001). In addition to this the variance, results showed there was no significant difference on high school students school dropout levels according to living with parents or not (F2;345=1.183, p > .05). Results discussed in the light of the literature and suggestion were made. As a result, academic involvement, academic success and anti-social behaviors will be considered as an important factors for preventing school drop-out.Keywords: adolescents, anti-social behavior, parent academic involvement, parent academic support, school dropout
Procedia PDF Downloads 2837042 Generalized Additive Model for Estimating Propensity Score
Authors: Tahmidul Islam
Abstract:
Propensity Score Matching (PSM) technique has been widely used for estimating causal effect of treatment in observational studies. One major step of implementing PSM is estimating the propensity score (PS). Logistic regression model with additive linear terms of covariates is most used technique in many studies. Logistics regression model is also used with cubic splines for retaining flexibility in the model. However, choosing the functional form of the logistic regression model has been a question since the effectiveness of PSM depends on how accurately the PS been estimated. In many situations, the linearity assumption of linear logistic regression may not hold and non-linear relation between the logit and the covariates may be appropriate. One can estimate PS using machine learning techniques such as random forest, neural network etc for more accuracy in non-linear situation. In this study, an attempt has been made to compare the efficacy of Generalized Additive Model (GAM) in various linear and non-linear settings and compare its performance with usual logistic regression. GAM is a non-parametric technique where functional form of the covariates can be unspecified and a flexible regression model can be fitted. In this study various simple and complex models have been considered for treatment under several situations (small/large sample, low/high number of treatment units) and examined which method leads to more covariate balance in the matched dataset. It is found that logistic regression model is impressively robust against inclusion quadratic and interaction terms and reduces mean difference in treatment and control set equally efficiently as GAM does. GAM provided no significantly better covariate balance than logistic regression in both simple and complex models. The analysis also suggests that larger proportion of controls than treatment units leads to better balance for both of the methods.Keywords: accuracy, covariate balances, generalized additive model, logistic regression, non-linearity, propensity score matching
Procedia PDF Downloads 3657041 The Role of Psychological Factors in Prediction Academic Performance of Students
Authors: Hadi Molaei, Yasavoli Davoud, Keshavarz, Mozhde Poordana
Abstract:
The present study aimed was to prediction the academic performance based on academic motivation, self-efficacy and Resiliency in the students. The present study was descriptive and correlational. Population of the study consisted of all students in Arak schools in year 1393-94. For this purpose, the number of 304 schools students in Arak was selected using multi-stage cluster sampling. They all questionnaires, self-efficacy, Resiliency and academic motivation Questionnaire completed. Data were analyzed using Pearson correlation and multiple regressions. Pearson correlation showed academic motivation, self-efficacy, and Resiliency with academic performance had a positive and significant relationship. In addition, multiple regression analysis showed that the academic motivation, self-efficacy and Resiliency were predicted academic performance. Based on the findings could be conclude that in order to increase the academic performance and further progress of students must provide the ground to strengthen academic motivation, self-efficacy and Resiliency act on them.Keywords: academic motivation, self-efficacy, resiliency, academic performance
Procedia PDF Downloads 4947040 Education for Sustainable Development and Primary Education in China: A Case Study
Authors: Ronghui (Kevin) Zhou
Abstract:
This research intends to explore the enactment of Education for Sustainable Development (ESD), in term of the ESD concept, in primary schools in China, and investigate the factors that have positively or negatively impacted the outcome of ESD in urban primary schools in China. The proposed research question is: how is the ESD concept perceived and enacted by the local education stakeholders. This research is conducted in multiple primary schools in China and has questionnaired and interviewed multiple education stakeholders, including school principals, school teachers, and bureau from the municipal Ministry of Education. Factor analysis, regression analysis, and critical discourse analysis are adopted to interpret and analyze the data. The preliminary findings suggest that contested ESD definition, education system pressures, education policy enforcement, and power dynamics between stakeholders are the key factors that have determined to what degree is ESD enacted, and to what extent is ESD practiced in primary schools in China.Keywords: education for sustainable development, China, primary education, case study
Procedia PDF Downloads 1637039 A Comparison of Neural Network and DOE-Regression Analysis for Predicting Resource Consumption of Manufacturing Processes
Authors: Frank Kuebler, Rolf Steinhilper
Abstract:
Artificial neural networks (ANN) as well as Design of Experiments (DOE) based regression analysis (RA) are mainly used for modeling of complex systems. Both methodologies are commonly applied in process and quality control of manufacturing processes. Due to the fact that resource efficiency has become a critical concern for manufacturing companies, these models needs to be extended to predict resource-consumption of manufacturing processes. This paper describes an approach to use neural networks as well as DOE based regression analysis for predicting resource consumption of manufacturing processes and gives a comparison of the achievable results based on an industrial case study of a turning process.Keywords: artificial neural network, design of experiments, regression analysis, resource efficiency, manufacturing process
Procedia PDF Downloads 5237038 Logistic Regression Model versus Additive Model for Recurrent Event Data
Authors: Entisar A. Elgmati
Abstract:
Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model using the same covariates that were used in the analysis with the additive model. The model gives reasonably similar results to that using additive regression model. In addition, the problem with the estimated conditional probabilities not being constrained between zero and one in additive model is solved here. Also martingale residuals that have been used to judge the goodness of fit for the additive model are shown to be useful for judging the goodness of fit of the logistic model.Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event
Procedia PDF Downloads 6337037 Solving Dimensionality Problem and Finding Statistical Constructs on Latent Regression Models: A Novel Methodology with Real Data Application
Authors: Sergio Paez Moncaleano, Alvaro Mauricio Montenegro
Abstract:
This paper presents a novel statistical methodology for measuring and founding constructs in Latent Regression Analysis. This approach uses the qualities of Factor Analysis in binary data with interpretations on Item Response Theory (IRT). In addition, based on the fundamentals of submodel theory and with a convergence of many ideas of IRT, we propose an algorithm not just to solve the dimensionality problem (nowadays an open discussion) but a new research field that promises more fear and realistic qualifications for examiners and a revolution on IRT and educational research. In the end, the methodology is applied to a set of real data set presenting impressive results for the coherence, speed and precision. Acknowledgments: This research was financed by Colciencias through the project: 'Multidimensional Item Response Theory Models for Practical Application in Large Test Designed to Measure Multiple Constructs' and both authors belong to SICS Research Group from Universidad Nacional de Colombia.Keywords: item response theory, dimensionality, submodel theory, factorial analysis
Procedia PDF Downloads 3717036 The Relationships among Learning Emotion, Major Satisfaction, Learning Flow, and Academic Achievement in Medical School Students
Authors: S. J. Yune, S. Y. Lee, S. J. Im, B. S. Kam, S. Y. Baek
Abstract:
This study explored whether academic emotion, major satisfaction, and learning flow are associated with academic achievement in medical school. We know that emotion and affective factors are important factors in students' learning and performance. Emotion has taken the stage in much of contemporary educational psychology literature, no longer relegated to secondary status behind traditionally studied cognitive constructs. Medical school students (n=164) completed academic emotion, major satisfaction, and learning flow online survey. Academic performance was operationalized as students' average grade on two semester exams. For data analysis, correlation analysis, multiple regression analysis, hierarchical multiple regression analyses and ANOVA were conducted. The results largely confirmed the hypothesized relations among academic emotion, major satisfaction, learning flow and academic achievement. Positive academic emotion had a correlation with academic achievement (β=.191). Positive emotion had 8.5% explanatory power for academic achievement. Especially, sense of accomplishment had a significant impact on learning performance (β=.265). On the other hand, negative emotion, major satisfaction, and learning flow did not affect academic performance. Also, there were differences in sense of great (F=5.446, p=.001) and interest (F=2.78, p=.043) among positive emotion, boredom (F=3.55, p=.016), anger (F=4.346, p=.006), and petulance (F=3.779, p=.012) among negative emotion by grade. This study suggested that medical students' positive emotion was an important contributor to their academic achievement. At the same time, it is important to consider that some negative emotions can act to increase one’s motivation. Of particular importance is the notion that instructors can and should create learning environment that foster positive emotion for students. In doing so, instructors improve their chances of positively impacting students’ achievement emotions, as well as their subsequent motivation, learning, and performance. This result had an implication for medical educators striving to understand the personal emotional factors that influence learning and performance in medical training.Keywords: academic achievement, learning emotion, learning flow, major satisfaction
Procedia PDF Downloads 2677035 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data
Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone
Abstract:
This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify environmental and economic elements that are contributing to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear poisson, and regularization regression methods such as a ridge, lasso, and elastic net penalty were employed. Cross-validation was performed to acquire tuning parameters. The methods proposed can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease data set, the study successfully identified key factors, and the results were consistent with previous studies.Keywords: lyme disease, Poisson generalized linear model, ridge regression, lasso regression, elastic net regression
Procedia PDF Downloads 1357034 An Analysis of the Effect of Sharia Financing and Work Relation Founding towards Non-Performing Financing in Islamic Banks in Indonesia
Authors: Muhammad Bahrul Ilmi
Abstract:
The purpose of this research is to analyze the influence of Islamic financing and work relation founding simultaneously and partially towards non-performing financing in Islamic banks. This research was regression quantitative field research, and had been done in Muammalat Indonesia Bank and Islamic Danamon Bank in 3 months. The populations of this research were 15 account officers of Muammalat Indonesia Bank and Islamic Danamon Bank in Surakarta, Indonesia. The techniques of collecting data used in this research were documentation, questionnaire, literary study and interview. Regression analysis result shows that Islamic financing and work relation founding simultaneously has positive and significant effect towards non performing financing of two Islamic Banks. It is obtained with probability value 0.003 which is less than 0.05 and F value 9.584. The analysis result of Islamic financing regression towards non performing financing shows the significant effect. It is supported by double linear regression analysis with probability value 0.001 which is less than 0.05. The regression analysis of work relation founding effect towards non-performing financing shows insignificant effect. This is shown in the double linear regression analysis with probability value 0.161 which is bigger than 0.05.Keywords: Syariah financing, work relation founding, non-performing financing (NPF), Islamic Bank
Procedia PDF Downloads 4307033 A Study on Inference from Distance Variables in Hedonic Regression
Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro
Abstract:
In urban area, several landmarks may affect housing price and rents, hedonic analysis should employ distance variables corresponding to each landmarks. Unfortunately, the effects of distances to landmarks on housing prices are generally not consistent with the true price. These distance variables may cause magnitude error in regression, pointing a problem of spatial multicollinearity. In this paper, we provided some approaches for getting the samples with less bias and method on locating the specific sampling area to avoid the multicollinerity problem in two specific landmarks case.Keywords: landmarks, hedonic regression, distance variables, collinearity, multicollinerity
Procedia PDF Downloads 4517032 Forecasting of Grape Juice Flavor by Using Support Vector Regression
Authors: Ren-Jieh Kuo, Chun-Shou Huang
Abstract:
The research of juice flavor forecasting has become more important in China. Due to the fast economic growth in China, many different kinds of juices have been introduced to the market. If a beverage company can understand their customers’ preference well, the juice can be served more attractively. Thus, this study intends to introduce the basic theory and computing process of grapes juice flavor forecasting based on support vector regression (SVR). Applying SVR, BPN and LR to forecast the flavor of grapes juice in real data, the result shows that SVR is more suitable and effective at predicting performance.Keywords: flavor forecasting, artificial neural networks, Support Vector Regression, China
Procedia PDF Downloads 492