Search results for: logistic regression model
18904 Application Difference between Cox and Logistic Regression Models
Authors: Idrissa Kayijuka
Abstract:
The logistic regression and Cox regression models (proportional hazard model) at present are being employed in the analysis of prospective epidemiologic research looking into risk factors in their application on chronic diseases. However, a theoretical relationship between the two models has been studied. By definition, Cox regression model also called Cox proportional hazard model is a procedure that is used in modeling data regarding time leading up to an event where censored cases exist. Whereas the Logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values while the resultant variable is binary (dichotomous). Arguments and findings of many researchers focused on the overview of Cox and Logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data whose source is SPSS exercise data on BREAST CANCER with a sample size of 1121 women where the main objective is to show the application difference between Cox regression model and logistic regression model based on factors that cause women to die due to breast cancer. Thus we did some analysis manually i.e. on lymph nodes status, and SPSS software helped to analyze the mentioned data. This study found out that there is an application difference between Cox and Logistic regression models which is Cox regression model is used if one wishes to analyze data which also include the follow-up time whereas Logistic regression model analyzes data without follow-up-time. Also, they have measurements of association which is different: hazard ratio and odds ratio for Cox and logistic regression models respectively. A similarity between the two models is that they are both applicable in the prediction of the upshot of a categorical variable i.e. a variable that can accommodate only a restricted number of categories. In conclusion, Cox regression model differs from logistic regression by assessing a rate instead of proportion. The two models can be applied in many other researches since they are suitable methods for analyzing data but the more recommended is the Cox, regression model.Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio
Procedia PDF Downloads 45518903 Logistic Regression Model versus Additive Model for Recurrent Event Data
Authors: Entisar A. Elgmati
Abstract:
Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model using the same covariates that were used in the analysis with the additive model. The model gives reasonably similar results to that using additive regression model. In addition, the problem with the estimated conditional probabilities not being constrained between zero and one in additive model is solved here. Also martingale residuals that have been used to judge the goodness of fit for the additive model are shown to be useful for judging the goodness of fit of the logistic model.Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event
Procedia PDF Downloads 63518902 The Theory behind Logistic Regression
Authors: Jan Henrik Wosnitza
Abstract:
The logistic regression has developed into a standard approach for estimating conditional probabilities in a wide range of applications including credit risk prediction. The article at hand contributes to the current literature on logistic regression fourfold: First, it is demonstrated that the binary logistic regression automatically meets its model assumptions under very general conditions. This result explains, at least in part, the logistic regression's popularity. Second, the requirement of homoscedasticity in the context of binary logistic regression is theoretically substantiated. The variances among the groups of defaulted and non-defaulted obligors have to be the same across the level of the aggregated default indicators in order to achieve linear logits. Third, this article sheds some light on the question why nonlinear logits might be superior to linear logits in case of a small amount of data. Fourth, an innovative methodology for estimating correlations between obligor-specific log-odds is proposed. In order to crystallize the key ideas, this paper focuses on the example of credit risk prediction. However, the results presented in this paper can easily be transferred to any other field of application.Keywords: correlation, credit risk estimation, default correlation, homoscedasticity, logistic regression, nonlinear logistic regression
Procedia PDF Downloads 42618901 Generalized Extreme Value Regression with Binary Dependent Variable: An Application for Predicting Meteorological Drought Probabilities
Authors: Retius Chifurira
Abstract:
Logistic regression model is the most used regression model to predict meteorological drought probabilities. When the dependent variable is extreme, the logistic model fails to adequately capture drought probabilities. In order to adequately predict drought probabilities, we use the generalized linear model (GLM) with the quantile function of the generalized extreme value distribution (GEVD) as the link function. The method maximum likelihood estimation is used to estimate the parameters of the generalized extreme value (GEV) regression model. We compare the performance of the logistic and the GEV regression models in predicting drought probabilities for Zimbabwe. The performance of the regression models are assessed using the goodness-of-fit tests, namely; relative root mean square error (RRMSE) and relative mean absolute error (RMAE). Results show that the GEV regression model performs better than the logistic model, thereby providing a good alternative candidate for predicting drought probabilities. This paper provides the first application of GLM derived from extreme value theory to predict drought probabilities for a drought-prone country such as Zimbabwe.Keywords: generalized extreme value distribution, general linear model, mean annual rainfall, meteorological drought probabilities
Procedia PDF Downloads 20018900 A Kolmogorov-Smirnov Type Goodness-Of-Fit Test of Multinomial Logistic Regression Model in Case-Control Studies
Authors: Chen Li-Ching
Abstract:
The multinomial logistic regression model is used popularly for inferring the relationship of risk factors and disease with multiple categories. This study based on the discrepancy between the nonparametric maximum likelihood estimator and semiparametric maximum likelihood estimator of the cumulative distribution function to propose a Kolmogorov-Smirnov type test statistic to assess adequacy of the multinomial logistic regression model for case-control data. A bootstrap procedure is presented to calculate the critical value of the proposed test statistic. Empirical type I error rates and powers of the test are performed by simulation studies. Some examples will be illustrated the implementation of the test.Keywords: case-control studies, goodness-of-fit test, Kolmogorov-Smirnov test, multinomial logistic regression
Procedia PDF Downloads 45618899 Generalized Additive Model for Estimating Propensity Score
Authors: Tahmidul Islam
Abstract:
Propensity Score Matching (PSM) technique has been widely used for estimating causal effect of treatment in observational studies. One major step of implementing PSM is estimating the propensity score (PS). Logistic regression model with additive linear terms of covariates is most used technique in many studies. Logistics regression model is also used with cubic splines for retaining flexibility in the model. However, choosing the functional form of the logistic regression model has been a question since the effectiveness of PSM depends on how accurately the PS been estimated. In many situations, the linearity assumption of linear logistic regression may not hold and non-linear relation between the logit and the covariates may be appropriate. One can estimate PS using machine learning techniques such as random forest, neural network etc for more accuracy in non-linear situation. In this study, an attempt has been made to compare the efficacy of Generalized Additive Model (GAM) in various linear and non-linear settings and compare its performance with usual logistic regression. GAM is a non-parametric technique where functional form of the covariates can be unspecified and a flexible regression model can be fitted. In this study various simple and complex models have been considered for treatment under several situations (small/large sample, low/high number of treatment units) and examined which method leads to more covariate balance in the matched dataset. It is found that logistic regression model is impressively robust against inclusion quadratic and interaction terms and reduces mean difference in treatment and control set equally efficiently as GAM does. GAM provided no significantly better covariate balance than logistic regression in both simple and complex models. The analysis also suggests that larger proportion of controls than treatment units leads to better balance for both of the methods.Keywords: accuracy, covariate balances, generalized additive model, logistic regression, non-linearity, propensity score matching
Procedia PDF Downloads 36718898 A Hybrid Model Tree and Logistic Regression Model for Prediction of Soil Shear Strength in Clay
Authors: Ehsan Mehryaar, Seyed Armin Motahari Tabari
Abstract:
Without a doubt, soil shear strength is the most important property of the soil. The majority of fatal and catastrophic geological accidents are related to shear strength failure of the soil. Therefore, its prediction is a matter of high importance. However, acquiring the shear strength is usually a cumbersome task that might need complicated laboratory testing. Therefore, prediction of it based on common and easy to get soil properties can simplify the projects substantially. In this paper, A hybrid model based on the classification and regression tree algorithm and logistic regression is proposed where each leaf of the tree is an independent regression model. A database of 189 points for clay soil, including Moisture content, liquid limit, plastic limit, clay content, and shear strength, is collected. The performance of the developed model compared to the existing models and equations using root mean squared error and coefficient of correlation.Keywords: model tree, CART, logistic regression, soil shear strength
Procedia PDF Downloads 19718897 Use of Multistage Transition Regression Models for Credit Card Income Prediction
Authors: Denys Osipenko, Jonathan Crook
Abstract:
Because of the variety of the card holders’ behaviour types and income sources each consumer account can be transferred to a variety of states. Each consumer account can be inactive, transactor, revolver, delinquent, defaulted and requires an individual model for the income prediction. The estimation of transition probabilities between statuses at the account level helps to avoid the memorylessness of the Markov Chains approach. This paper investigates the transition probabilities estimation approaches to credit cards income prediction at the account level. The key question of empirical research is which approach gives more accurate results: multinomial logistic regression or multistage conditional logistic regression with binary target. Both models have shown moderate predictive power. Prediction accuracy for conditional logistic regression depends on the order of stages for the conditional binary logistic regression. On the other hand, multinomial logistic regression is easier for usage and gives integrate estimations for all states without priorities. Thus further investigations can be concentrated on alternative modeling approaches such as discrete choice models.Keywords: multinomial regression, conditional logistic regression, credit account state, transition probability
Procedia PDF Downloads 48718896 Credit Risk Prediction Based on Bayesian Estimation of Logistic Regression Model with Random Effects
Authors: Sami Mestiri, Abdeljelil Farhat
Abstract:
The aim of this current paper is to predict the credit risk of banks in Tunisia, over the period (2000-2005). For this purpose, two methods for the estimation of the logistic regression model with random effects: Penalized Quasi Likelihood (PQL) method and Gibbs Sampler algorithm are applied. By using the information on a sample of 528 Tunisian firms and 26 financial ratios, we show that Bayesian approach improves the quality of model predictions in terms of good classification as well as by the ROC curve result.Keywords: forecasting, credit risk, Penalized Quasi Likelihood, Gibbs Sampler, logistic regression with random effects, curve ROC
Procedia PDF Downloads 54218895 A Monte Carlo Fuzzy Logistic Regression Framework against Imbalance and Separation
Authors: Georgios Charizanos, Haydar Demirhan, Duygu Icen
Abstract:
Two of the most impactful issues in classical logistic regression are class imbalance and complete separation. These can result in model predictions heavily leaning towards the imbalanced class on the binary response variable or over-fitting issues. Fuzzy methodology offers key solutions for handling these problems. However, most studies propose the transformation of the binary responses into a continuous format limited within [0,1]. This is called the possibilistic approach within fuzzy logistic regression. Following this approach is more aligned with straightforward regression since a logit-link function is not utilized, and fuzzy probabilities are not generated. In contrast, we propose a method of fuzzifying binary response variables that allows for the use of the logit-link function; hence, a probabilistic fuzzy logistic regression model with the Monte Carlo method. The fuzzy probabilities are then classified by selecting a fuzzy threshold. Different combinations of fuzzy and crisp input, output, and coefficients are explored, aiming to understand which of these perform better under different conditions of imbalance and separation. We conduct numerical experiments using both synthetic and real datasets to demonstrate the performance of the fuzzy logistic regression framework against seven crisp machine learning methods. The proposed framework shows better performance irrespective of the degree of imbalance and presence of separation in the data, while the considered machine learning methods are significantly impacted.Keywords: fuzzy logistic regression, fuzzy, logistic, machine learning
Procedia PDF Downloads 7418894 Robustified Asymmetric Logistic Regression Model for Global Fish Stock Assessment
Authors: Osamu Komori, Shinto Eguchi, Hiroshi Okamura, Momoko Ichinokawa
Abstract:
The long time-series data on population assessments are essential for global ecosystem assessment because the temporal change of biomass in such a database reflects the status of global ecosystem properly. However, the available assessment data usually have limited sample sizes and the ratio of populations with low abundance of biomass (collapsed) to those with high abundance (non-collapsed) is highly imbalanced. To allow for the imbalance and uncertainty involved in the ecological data, we propose a binary regression model with mixed effects for inferring ecosystem status through an asymmetric logistic model. In the estimation equation, we observe that the weights for the non-collapsed populations are relatively reduced, which in turn puts more importance on the small number of observations of collapsed populations. Moreover, we extend the asymmetric logistic regression model using propensity score to allow for the sample biases observed in the labeled and unlabeled datasets. It robustified the estimation procedure and improved the model fitting.Keywords: double robust estimation, ecological binary data, mixed effect logistic regression model, propensity score
Procedia PDF Downloads 26618893 Instability Index Method and Logistic Regression to Assess Landslide Susceptibility in County Route 89, Taiwan
Authors: Y. H. Wu, Ji-Yuan Lin, Yu-Ming Liou
Abstract:
This study aims to set up the landslide susceptibility map of County Route 89 at Ren-Ai Township in Nantou County using the Instability Index Method and Logistic regression. Seven susceptibility factors including Slope Angle, Aspect, Elevation, Distance to fold, Distance to River, Distance to Road and Accumulated Rainfall were obtained by GIS based on the Typhoon Toraji landslide area identified by Industrial Technology Research Institute in 2001. To calculate the landslide percentage of each factor and acquire the weight and grade the grid by means of Instability Index Method. In this study, landslide susceptibility can be classified into four grades: high, medium high, medium low and low, in order to determine the advantages and disadvantages of the two models. The precision of this model is verified by classification error matrix and SRC curve. These results suggest that the logistic regression model is a preferred method than instability index in the assessment of landslide susceptibility. It is suitable for the landslide prediction and precaution in this area in the future.Keywords: instability index method, logistic regression, landslide susceptibility, SRC curve
Procedia PDF Downloads 29218892 Landslide Susceptibility Mapping: A Comparison between Logistic Regression and Multivariate Adaptive Regression Spline Models in the Municipality of Oudka, Northern of Morocco
Authors: S. Benchelha, H. C. Aoudjehane, M. Hakdaoui, R. El Hamdouni, H. Mansouri, T. Benchelha, M. Layelmam, M. Alaoui
Abstract:
The logistic regression (LR) and multivariate adaptive regression spline (MarSpline) are applied and verified for analysis of landslide susceptibility map in Oudka, Morocco, using geographical information system. From spatial database containing data such as landslide mapping, topography, soil, hydrology and lithology, the eight factors related to landslides such as elevation, slope, aspect, distance to streams, distance to road, distance to faults, lithology map and Normalized Difference Vegetation Index (NDVI) were calculated or extracted. Using these factors, landslide susceptibility indexes were calculated by the two mentioned methods. Before the calculation, this database was divided into two parts, the first for the formation of the model and the second for the validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. The result of this verification was that the MarSpline model is the best model with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than the LR model (success rate AUC = 0.918, rate prediction AUC = 0.901).Keywords: landslide susceptibility mapping, regression logistic, multivariate adaptive regression spline, Oudka, Taounate
Procedia PDF Downloads 18818891 MapReduce Logistic Regression Algorithms with RHadoop
Authors: Byung Ho Jung, Dong Hoon Lim
Abstract:
Logistic regression is a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating parameters in the logistic regression based on MapReduce framework with RHadoop that integrates R and Hadoop environment applicable to large scale data. There exist three learning algorithms for logistic regression, namely Gradient descent method, Cost minimization method and Newton-Rhapson's method. The Newton-Rhapson's method does not require a learning rate, while gradient descent and cost minimization methods need to manually pick a learning rate. The experimental results demonstrated that our learning algorithms using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Rhapson's method with gradient descent and cost minimization methods. The results showed that our newton's method appeared to be the most robust to all data tested.Keywords: big data, logistic regression, MapReduce, RHadoop
Procedia PDF Downloads 28518890 Heart Attack Prediction Using Several Machine Learning Methods
Authors: Suzan Anwar, Utkarsh Goyal
Abstract:
Heart rate (HR) is a predictor of cardiovascular, cerebrovascular, and all-cause mortality in the general population, as well as in patients with cardio and cerebrovascular diseases. Machine learning (ML) significantly improves the accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment while avoiding unnecessary treatment of others. This research examines relationship between the individual's various heart health inputs like age, sex, cp, trestbps, thalach, oldpeaketc, and the likelihood of developing heart disease. Machine learning techniques like logistic regression and decision tree, and Python are used. The results of testing and evaluating the model using the Heart Failure Prediction Dataset show the chance of a person having a heart disease with variable accuracy. Logistic regression has yielded an accuracy of 80.48% without data handling. With data handling (normalization, standardscaler), the logistic regression resulted in improved accuracy of 87.80%, decision tree 100%, random forest 100%, and SVM 100%.Keywords: heart rate, machine learning, SVM, decision tree, logistic regression, random forest
Procedia PDF Downloads 13818889 Breast Cancer Mortality and Comorbidities in Portugal: A Predictive Model Built with Real World Data
Authors: Cecília M. Antão, Paulo Jorge Nogueira
Abstract:
Breast cancer (BC) is the first cause of cancer mortality among Portuguese women. This retrospective observational study aimed at identifying comorbidities associated with BC female patients admitted to Portuguese public hospitals (2010-2018), investigating the effect of comorbidities on BC mortality rate, and building a predictive model using logistic regression. Results showed that the BC mortality in Portugal decreased in this period and reached 4.37% in 2018. Adjusted odds ratio indicated that secondary malignant neoplasms of liver, of bone and bone marrow, congestive heart failure, and diabetes were associated with an increased chance of dying from breast cancer. Although the Lisbon district (the most populated area) accounted for the largest percentage of BC patients, the logistic regression model showed that, besides patient’s age, being resident in Bragança, Castelo Branco, or Porto districts was directly associated with an increase of the mortality rate.Keywords: breast cancer, comorbidities, logistic regression, adjusted odds ratio
Procedia PDF Downloads 8718888 Comparative Study od Three Artificial Intelligence Techniques for Rain Domain in Precipitation Forecast
Authors: Nabilah Filzah Mohd Radzuan, Andi Putra, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan
Abstract:
Precipitation forecast is important to avoid natural disaster incident which can cause losses in the involved area. This paper reviews three techniques logistic regression, decision tree, and random forest which are used in making precipitation forecast. These combination techniques through the vector auto-regression (VAR) model help in finding the advantages and strengths of each technique in the forecast process. The data-set contains variables of the rain’s domain. Adaptation of artificial intelligence techniques involved in rain domain enables the forecast process to be easier and systematic for precipitation forecast.Keywords: logistic regression, decisions tree, random forest, VAR model
Procedia PDF Downloads 44618887 Comparative Analysis of Predictive Models for Customer Churn Prediction in the Telecommunication Industry
Authors: Deepika Christopher, Garima Anand
Abstract:
To determine the best model for churn prediction in the telecom industry, this paper compares 11 machine learning algorithms, namely Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, XGBoost, LightGBM, Cat Boost, AdaBoost, Extra Trees, Deep Neural Network, and Hybrid Model (MLPClassifier). It also aims to pinpoint the top three factors that lead to customer churn and conducts customer segmentation to identify vulnerable groups. According to the data, the Logistic Regression model performs the best, with an F1 score of 0.6215, 81.76% accuracy, 68.95% precision, and 56.57% recall. The top three attributes that cause churn are found to be tenure, Internet Service Fiber optic, and Internet Service DSL; conversely, the top three models in this article that perform the best are Logistic Regression, Deep Neural Network, and AdaBoost. The K means algorithm is applied to establish and analyze four different customer clusters. This study has effectively identified customers that are at risk of churn and may be utilized to develop and execute strategies that lower customer attrition.Keywords: attrition, retention, predictive modeling, customer segmentation, telecommunications
Procedia PDF Downloads 5718886 Reminiscence Therapy for Alzheimer’s Disease Restrained on Logistic Regression Based Linear Bootstrap Aggregating
Authors: P. S. Jagadeesh Kumar, Mingmin Pan, Xianpei Li, Yanmin Yuan, Tracy Lin Huan
Abstract:
Researchers are doing enchanting research into the inherited features of Alzheimer’s disease and probable consistent therapies. In Alzheimer’s, memories are extinct in reverse order; memories formed lately are more transitory than those from formerly. Reminiscence therapy includes the conversation of past actions, trials and knowledges with another individual or set of people, frequently with the help of perceptible reminders such as photos, household and other acquainted matters from the past, music and collection of tapes. In this manuscript, the competence of reminiscence therapy for Alzheimer’s disease is measured using logistic regression based linear bootstrap aggregating. Logistic regression is used to envisage the experiential features of the patient’s memory through various therapies. Linear bootstrap aggregating shows better stability and accuracy of reminiscence therapy used in statistical classification and regression of memories related to validation therapy, supportive psychotherapy, sensory integration and simulated presence therapy.Keywords: Alzheimer’s disease, linear bootstrap aggregating, logistic regression, reminiscence therapy
Procedia PDF Downloads 30918885 An Information Matrix Goodness-of-Fit Test of the Conditional Logistic Model for Matched Case-Control Studies
Authors: Li-Ching Chen
Abstract:
The case-control design has been widely applied in clinical and epidemiological studies to investigate the association between risk factors and a given disease. The retrospective design can be easily implemented and is more economical over prospective studies. To adjust effects for confounding factors, methods such as stratification at the design stage and may be adopted. When some major confounding factors are difficult to be quantified, a matching design provides an opportunity for researchers to control the confounding effects. The matching effects can be parameterized by the intercepts of logistic models and the conditional logistic regression analysis is then adopted. This study demonstrates an information-matrix-based goodness-of-fit statistic to test the validity of the logistic regression model for matched case-control data. The asymptotic null distribution of this proposed test statistic is inferred. It needs neither to employ a simulation to evaluate its critical values nor to partition covariate space. The asymptotic power of this test statistic is also derived. The performance of the proposed method is assessed through simulation studies. An example of the real data set is applied to illustrate the implementation of the proposed method as well.Keywords: conditional logistic model, goodness-of-fit, information matrix, matched case-control studies
Procedia PDF Downloads 29218884 Logistic Regression Based Model for Predicting Students’ Academic Performance in Higher Institutions
Authors: Emmanuel Osaze Oshoiribhor, Adetokunbo MacGregor John-Otumu
Abstract:
In recent years, there has been a desire to forecast student academic achievement prior to graduation. This is to help them improve their grades, particularly for individuals with poor performance. The goal of this study is to employ supervised learning techniques to construct a predictive model for student academic achievement. Many academics have already constructed models that predict student academic achievement based on factors such as smoking, demography, culture, social media, parent educational background, parent finances, and family background, to name a few. This feature and the model employed may not have correctly classified the students in terms of their academic performance. This model is built using a logistic regression classifier with basic features such as the previous semester's course score, attendance to class, class participation, and the total number of course materials or resources the student is able to cover per semester as a prerequisite to predict if the student will perform well in future on related courses. The model outperformed other classifiers such as Naive bayes, Support vector machine (SVM), Decision Tree, Random forest, and Adaboost, returning a 96.7% accuracy. This model is available as a desktop application, allowing both instructors and students to benefit from user-friendly interfaces for predicting student academic achievement. As a result, it is recommended that both students and professors use this tool to better forecast outcomes.Keywords: artificial intelligence, ML, logistic regression, performance, prediction
Procedia PDF Downloads 9718883 On Estimating the Headcount Index by Using the Logistic Regression Estimator
Authors: Encarnación Álvarez, Rosa M. García-Fernández, Juan F. Muñoz, Francisco J. Blanco-Encomienda
Abstract:
The problem of estimating a proportion has important applications in the field of economics, and in general, in many areas such as social sciences. A common application in economics is the estimation of the headcount index. In this paper, we define the general headcount index as a proportion. Furthermore, we introduce a new quantitative method for estimating the headcount index. In particular, we suggest to use the logistic regression estimator for the problem of estimating the headcount index. Assuming a real data set, results derived from Monte Carlo simulation studies indicate that the logistic regression estimator can be more accurate than the traditional estimator of the headcount index.Keywords: poverty line, poor, risk of poverty, Monte Carlo simulations, sample
Procedia PDF Downloads 42318882 Factors for Entry Timing Choices Using Principal Axis Factorial Analysis and Logistic Regression Model
Authors: C. M. Mat Isa, H. Mohd Saman, S. R. Mohd Nasir, A. Jaapar
Abstract:
International market expansion involves a strategic process of market entry decision through which a firm expands its operation from domestic to the international domain. Hence, entry timing choices require the needs to balance the early entry risks and the problems in losing opportunities as a result of late entry into a new market. Questionnaire surveys administered to 115 Malaysian construction firms operating in 51 countries worldwide have resulted in 39.1 percent response rate. Factor analysis was used to determine the most significant factors affecting entry timing choices of the firms to penetrate the international market. A logistic regression analysis used to examine the firms’ entry timing choices, indicates that the model has correctly classified 89.5 per cent of cases as late movers. The findings reveal that the most significant factor influencing the construction firms’ choices as late movers was the firm factor related to the firm’s international experience, resources, competencies and financing capacity. The study also offers valuable information to construction firms with intention to internationalize their businesses.Keywords: factors, early movers, entry timing choices, late movers, logistic regression model, principal axis factorial analysis, Malaysian construction firms
Procedia PDF Downloads 37818881 Applying the Regression Technique for Prediction of the Acute Heart Attack
Authors: Paria Soleimani, Arezoo Neshati
Abstract:
Myocardial infarction is one of the leading causes of death in the world. Some of these deaths occur even before the patient reaches the hospital. Myocardial infarction occurs as a result of impaired blood supply. Because the most of these deaths are due to coronary artery disease, hence the awareness of the warning signs of a heart attack is essential. Some heart attacks are sudden and intense, but most of them start slowly, with mild pain or discomfort, then early detection and successful treatment of these symptoms is vital to save them. Therefore, importance and usefulness of a system designing to assist physicians in the early diagnosis of the acute heart attacks is obvious. The purpose of this study is to determine how well a predictive model would perform based on the only patient-reportable clinical history factors, without using diagnostic tests or physical exams. This type of the prediction model might have application outside of the hospital setting to give accurate advice to patients to influence them to seek care in appropriate situations. For this purpose, the data were collected on 711 heart patients in Iran hospitals. 28 attributes of clinical factors can be reported by patients; were studied. Three logistic regression models were made on the basis of the 28 features to predict the risk of heart attacks. The best logistic regression model in terms of performance had a C-index of 0.955 and with an accuracy of 94.9%. The variables, severe chest pain, back pain, cold sweats, shortness of breath, nausea, and vomiting were selected as the main features.Keywords: Coronary heart disease, Acute heart attacks, Prediction, Logistic regression
Procedia PDF Downloads 44918880 Binary Logistic Regression Model in Predicting the Employability of Senior High School Graduates
Authors: Cromwell F. Gopo, Joy L. Picar
Abstract:
This study aimed to predict the employability of senior high school graduates for S.Y. 2018- 2019 in the Davao del Norte Division through quantitative research design using the descriptive status and predictive approaches among the indicated parameters, namely gender, school type, academics, academic award recipient, skills, values, and strand. The respondents of the study were the 33 secondary schools offering senior high school programs identified through simple random sampling, which resulted in 1,530 cases of graduates’ secondary data, which were analyzed using frequency, percentage, mean, standard deviation, and binary logistic regression. Results showed that the majority of the senior high school graduates who come from large schools were females. Further, less than half of these graduates received any academic award in any semester. In general, the graduates’ performance in academics, skills, and values were proficient. Moreover, less than half of the graduates were not employed. Then, those who were employed were either contractual, casual, or part-time workers dominated by GAS graduates. Further, the predictors of employability were gender and the Information and Communications Technology (ICT) strand, while the remaining variables did not add significantly to the model. The null hypothesis had been rejected as the coefficients of the predictors in the binary logistic regression equation did not take the value of 0. After utilizing the model, it was concluded that Technical-Vocational-Livelihood (TVL) graduates except ICT had greater estimates of employability.Keywords: employability, senior high school graduates, Davao del Norte, Philippines
Procedia PDF Downloads 15218879 Minimizing the Impact of Covariate Detection Limit in Logistic Regression
Authors: Shahadut Hossain, Jacek Wesolowski, Zahirul Hoque
Abstract:
In many epidemiological and environmental studies covariate measurements are subject to the detection limit. In most applications, covariate measurements are usually truncated from below which is known as left-truncation. Because the measuring device, which we use to measure the covariate, fails to detect values falling below the certain threshold. In regression analyses, it causes inflated bias and inaccurate mean squared error (MSE) to the estimators. This paper suggests a response-based regression calibration method to correct the deleterious impact introduced by the covariate detection limit in the estimators of the parameters of simple logistic regression model. Compared to the maximum likelihood method, the proposed method is computationally simpler, and hence easier to implement. It is robust to the violation of distributional assumption about the covariate of interest. In producing correct inference, the performance of the proposed method compared to the other competing methods has been investigated through extensive simulations. A real-life application of the method is also shown using data from a population-based case-control study of non-Hodgkin lymphoma.Keywords: environmental exposure, detection limit, left truncation, bias, ad-hoc substitution
Procedia PDF Downloads 23618878 A Comparison of Smoothing Spline Method and Penalized Spline Regression Method Based on Nonparametric Regression Model
Authors: Autcha Araveeporn
Abstract:
This paper presents a study about a nonparametric regression model consisting of a smoothing spline method and a penalized spline regression method. We also compare the techniques used for estimation and prediction of nonparametric regression model. We tried both methods with crude oil prices in dollars per barrel and the Stock Exchange of Thailand (SET) index. According to the results, it is concluded that smoothing spline method performs better than that of penalized spline regression method.Keywords: nonparametric regression model, penalized spline regression method, smoothing spline method, Stock Exchange of Thailand (SET)
Procedia PDF Downloads 44018877 Determining the Causality Variables in Female Genital Mutilation: A Factor Screening Approach
Authors: Ekele Alih, Enejo Jalija
Abstract:
Female Genital Mutilation (FGM) is made up of three types namely: Clitoridectomy, Excision and Infibulation. In this study, we examine the factors responsible for FGM in order to identify the causality variables in a logistic regression approach. From the result of the survey conducted by the Public Health Division, Nigeria Institute of Medical Research, Yaba, Lagos State, the tau statistic, τ was used to screen 9 factors that causes FGM in order to select few of the predictors before multiple regression equation is obtained. The need for this may be that the sample size may not be able to sustain having a regression with all the predictors or to avoid multi-collinearity. A total of 300 respondents, comprising 150 adult males and 150 adult females were selected for the household survey based on the multi-stage sampling procedure. The tau statistic,Keywords: female genital mutilation, logistic regression, tau statistic, African society
Procedia PDF Downloads 26118876 Developing a Cybernetic Model of Interdepartmental Logistic Interactions in SME
Authors: Jonas Mayer, Kai-Frederic Seitz, Thorben Kuprat
Abstract:
In today’s competitive environment production’s logistic objectives such as ‘delivery reliability’ and ‘delivery time’ and distribution’s logistic objectives such as ‘service level’ and ‘delivery delay’ are attributed great importance. Especially for small and mid-sized enterprises (SME) attaining these objectives pose a key challenge. Within this context, one of the difficulties is that interactions between departments within the enterprise and their specific objectives are insufficiently taken into account and aligned. Interdepartmental independencies along with contradicting targets set within the different departments result in enterprises having sub-optimal logistic performance capability. This paper presents a research project which will systematically describe the interactions between departments and convert them into a quantifiable form.Keywords: department-specific actuating and control variables, interdepartmental interactions, cybernetic model, logistic objectives
Procedia PDF Downloads 37218875 Model Averaging for Poisson Regression
Authors: Zhou Jianhong
Abstract:
Model averaging is a desirable approach to deal with model uncertainty, which, however, has rarely been explored for Poisson regression. In this paper, we propose a model averaging procedure based on an unbiased estimator of the expected Kullback-Leibler distance for the Poisson regression. Simulation study shows that the proposed model average estimator outperforms some other commonly used model selection and model average estimators in some situations. Our proposed methods are further applied to a real data example and the advantage of this method is demonstrated again.Keywords: model averaging, poission regression, Kullback-Leibler distance, statistics
Procedia PDF Downloads 520