Search results for: multivariate logistic regression
3799 Landslide Susceptibility Mapping: A Comparison between Logistic Regression and Multivariate Adaptive Regression Spline Models in the Municipality of Oudka, Northern of Morocco
Authors: S. Benchelha, H. C. Aoudjehane, M. Hakdaoui, R. El Hamdouni, H. Mansouri, T. Benchelha, M. Layelmam, M. Alaoui
Abstract:
The logistic regression (LR) and multivariate adaptive regression spline (MarSpline) models are applied and verified for landslide susceptibility mapping in Oudka, Morocco, using a geographical information system. From a spatial database containing data such as landslide mapping, topography, soil, hydrology and lithology, eight landslide-related factors were calculated or extracted: elevation, slope, aspect, distance to streams, distance to roads, distance to faults, lithology and Normalized Difference Vegetation Index (NDVI). Using these factors, landslide susceptibility indexes were calculated by the two methods. Before the calculation, the database was divided into two parts, the first for training the model and the second for validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. The verification showed that the MarSpline model is the better model, with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than those of the LR model (success rate AUC = 0.918, prediction rate AUC = 0.901).
Keywords: landslide susceptibility mapping, logistic regression, multivariate adaptive regression spline, Oudka, Taounate
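The success and prediction rates quoted in this abstract are areas under the ROC curve. As a minimal sketch, assuming toy scores invented purely for illustration (not the study's data) and a hypothetical helper named `auc`, the AUC can be computed as the probability that a randomly chosen landslide cell receives a higher susceptibility index than a randomly chosen stable cell, with ties counted as one half:

```python
def auc(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical susceptibility indexes for six map cells (1 = landslide observed).
toy_scores = [0.91, 0.80, 0.65, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
```

Evaluating the same function on the training part of the data would correspond to a success rate, and on the held-out part to a prediction rate.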
Procedia PDF Downloads 186
3798 The Theory behind Logistic Regression
Authors: Jan Henrik Wosnitza
Abstract:
The logistic regression has developed into a standard approach for estimating conditional probabilities in a wide range of applications including credit risk prediction. The article at hand contributes to the current literature on logistic regression fourfold: First, it is demonstrated that the binary logistic regression automatically meets its model assumptions under very general conditions. This result explains, at least in part, the logistic regression's popularity. Second, the requirement of homoscedasticity in the context of binary logistic regression is theoretically substantiated. The variances among the groups of defaulted and non-defaulted obligors have to be the same across the level of the aggregated default indicators in order to achieve linear logits. Third, this article sheds some light on the question why nonlinear logits might be superior to linear logits in case of a small amount of data. Fourth, an innovative methodology for estimating correlations between obligor-specific log-odds is proposed. In order to crystallize the key ideas, this paper focuses on the example of credit risk prediction. However, the results presented in this paper can easily be transferred to any other field of application.
Keywords: correlation, credit risk estimation, default correlation, homoscedasticity, logistic regression, nonlinear logistic regression
Procedia PDF Downloads 425
3797 Application Difference between Cox and Logistic Regression Models
Authors: Idrissa Kayijuka
Abstract:
Logistic regression and Cox regression (proportional hazards) models are currently employed in prospective epidemiologic research on risk factors for chronic diseases. However, a theoretical relationship between the two models has been studied. By definition, the Cox regression model, also called the Cox proportional hazards model, is a procedure for modeling time-to-event data where censored cases exist, whereas the logistic regression model is mostly applicable where the independent variables consist of numerical as well as nominal values while the outcome variable is binary (dichotomous). Arguments and findings of many researchers have focused on the overview of Cox and logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data, the SPSS exercise data on breast cancer with a sample size of 1121 women, where the main objective is to show the application difference between the Cox regression model and the logistic regression model based on factors that cause women to die of breast cancer. Some analysis was done manually, i.e. on lymph node status, and SPSS software was used to analyze the data. This study found that there is an application difference between the two models: the Cox regression model is used to analyze data that include follow-up time, whereas the logistic regression model analyzes data without follow-up time. Their measures of association also differ: the hazard ratio for Cox regression and the odds ratio for logistic regression. A similarity between the two models is that both are applicable to predicting the outcome of a categorical variable, i.e. a variable that can take only a restricted number of categories.
In conclusion, the Cox regression model differs from logistic regression by assessing a rate instead of a proportion. The two models can be applied in many other studies since they are suitable methods for analyzing data, but the Cox regression model is the more recommended.
Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio
Procedia PDF Downloads 452
3796 Use of Multistage Transition Regression Models for Credit Card Income Prediction
Authors: Denys Osipenko, Jonathan Crook
Abstract:
Because of the variety of card holders’ behaviour types and income sources, each consumer account can move between a variety of states. Each consumer account can be inactive, transactor, revolver, delinquent, or defaulted and requires an individual model for income prediction. The estimation of transition probabilities between statuses at the account level helps to avoid the memorylessness of the Markov chain approach. This paper investigates approaches to estimating transition probabilities for credit card income prediction at the account level. The key question of the empirical research is which approach gives more accurate results: multinomial logistic regression or multistage conditional logistic regression with a binary target. Both models have shown moderate predictive power. Prediction accuracy for conditional logistic regression depends on the order of stages for the conditional binary logistic regression. On the other hand, multinomial logistic regression is easier to use and gives integrated estimates for all states without priorities. Thus, further investigations can concentrate on alternative modeling approaches such as discrete choice models.
Keywords: multinomial regression, conditional logistic regression, credit account state, transition probability
Procedia PDF Downloads 482
3795 MapReduce Logistic Regression Algorithms with RHadoop
Authors: Byung Ho Jung, Dong Hoon Lim
Abstract:
Logistic regression is a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating parameters in logistic regression based on the MapReduce framework with RHadoop, which integrates R and the Hadoop environment and is applicable to large-scale data. There exist three learning algorithms for logistic regression, namely the gradient descent method, the cost minimization method and the Newton-Raphson method. The Newton-Raphson method does not require a learning rate, while the gradient descent and cost minimization methods require a learning rate to be picked manually. The experimental results demonstrated that our learning algorithms using RHadoop scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Raphson method with the gradient descent and cost minimization methods. The results showed that our Newton-Raphson method appeared to be the most robust on all data tested.
Keywords: big data, logistic regression, MapReduce, RHadoop
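The contrast drawn in this abstract between gradient descent (which needs a learning rate) and Newton-Raphson (which does not) can be sketched on a single machine. The following is an illustrative sketch only, not the authors' RHadoop implementation: the toy data, function names, learning rate and step counts are all assumptions, and the model has just an intercept and one slope so that the 2x2 Newton system can be solved by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient(b0, b1, xs, ys):
    """Gradient of the negative log-likelihood with respect to (b0, b1)."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        r = sigmoid(b0 + b1 * x) - y
        g0 += r
        g1 += r * x
    return g0, g1


def fit_gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Plain gradient descent; the learning rate is a hand-picked assumption."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0, g1 = gradient(b0, b1, xs, ys)
        b0 -= learning_rate * g0
        b1 -= learning_rate * g1
    return b0, b1


def fit_newton_raphson(xs, ys, steps=25):
    """Newton-Raphson; the step size comes from the Hessian, not a learning rate."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0, g1 = gradient(b0, b1, xs, ys)
        # Hessian entries: sum over observations of p(1-p) * [[1, x], [x, x^2]]
        h00 = h01 = h11 = 0.0
        for x in xs:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve H * delta = g for the 2x2 case, then step by -delta.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 -= d0
        b1 -= d1
    return b0, b1


# Hypothetical, non-separable toy data so that a finite maximum likelihood estimate exists.
toy_x = [-2, -1, -1, 0, 0, 1, 1, 2]
toy_y = [0, 0, 1, 0, 1, 0, 1, 1]
gd_fit = fit_gradient_descent(toy_x, toy_y)
nr_fit = fit_newton_raphson(toy_x, toy_y)
```

On this toy data both routines reach the same estimate; Newton-Raphson does so in a handful of iterations, whereas gradient descent depends on the chosen learning rate. In a MapReduce setting, the per-observation sums in `gradient` and in the Hessian loop are the parts that would plausibly be distributed across mappers.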
Procedia PDF Downloads 280
3794 Regression for Doubly Inflated Multivariate Poisson Distributions
Authors: Ishapathik Das, Sumen Sen, N. Rao Chaganty, Pooja Sengupta
Abstract:
Dependent multivariate count data occur in several research studies. These data can be modeled by a multivariate Poisson or negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the number of observations in some cells is much larger than in other cells, the copula-based multivariate Poisson (or negative binomial) distribution may not fit well and is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher than those of the other cells, and develop a doubly inflated multivariate Poisson distribution function using a multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. To illustrate the proposed methodologies, we present a real data set containing bivariate count observations with inflations in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation of the unknown parameters of the models.
Keywords: copula, Gaussian copula, multivariate distributions, inflated distributions
Procedia PDF Downloads 155
3793 A Monte Carlo Fuzzy Logistic Regression Framework against Imbalance and Separation
Authors: Georgios Charizanos, Haydar Demirhan, Duygu Icen
Abstract:
Two of the most impactful issues in classical logistic regression are class imbalance and complete separation. These can result in model predictions heavily leaning towards the imbalanced class on the binary response variable or over-fitting issues. Fuzzy methodology offers key solutions for handling these problems. However, most studies propose the transformation of the binary responses into a continuous format limited within [0,1]. This is called the possibilistic approach within fuzzy logistic regression. Following this approach is more aligned with straightforward regression since a logit-link function is not utilized, and fuzzy probabilities are not generated. In contrast, we propose a method of fuzzifying binary response variables that allows for the use of the logit-link function; hence, a probabilistic fuzzy logistic regression model with the Monte Carlo method. The fuzzy probabilities are then classified by selecting a fuzzy threshold. Different combinations of fuzzy and crisp input, output, and coefficients are explored, aiming to understand which of these perform better under different conditions of imbalance and separation. We conduct numerical experiments using both synthetic and real datasets to demonstrate the performance of the fuzzy logistic regression framework against seven crisp machine learning methods. The proposed framework shows better performance irrespective of the degree of imbalance and presence of separation in the data, while the considered machine learning methods are significantly impacted.
Keywords: fuzzy logistic regression, fuzzy, logistic, machine learning
Procedia PDF Downloads 71
3792 Logistic Regression Model versus Additive Model for Recurrent Event Data
Authors: Entisar A. Elgmati
Abstract:
Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model using the same covariates that were used in the analysis with the additive model. The model gives reasonably similar results to that using additive regression model. In addition, the problem with the estimated conditional probabilities not being constrained between zero and one in additive model is solved here. Also martingale residuals that have been used to judge the goodness of fit for the additive model are shown to be useful for judging the goodness of fit of the logistic model.
Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event
Procedia PDF Downloads 632
3791 A Kolmogorov-Smirnov Type Goodness-Of-Fit Test of Multinomial Logistic Regression Model in Case-Control Studies
Authors: Chen Li-Ching
Abstract:
The multinomial logistic regression model is widely used for inferring the relationship between risk factors and a disease with multiple categories. This study proposes a Kolmogorov-Smirnov type test statistic, based on the discrepancy between the nonparametric maximum likelihood estimator and the semiparametric maximum likelihood estimator of the cumulative distribution function, to assess the adequacy of the multinomial logistic regression model for case-control data. A bootstrap procedure is presented to calculate the critical value of the proposed test statistic. Empirical type I error rates and powers of the test are evaluated through simulation studies. Some examples illustrate the implementation of the test.
Keywords: case-control studies, goodness-of-fit test, Kolmogorov-Smirnov test, multinomial logistic regression
Procedia PDF Downloads 454
3790 Selection of Designs in Ordinal Regression Models under Linear Predictor Misspecification
Authors: Ishapathik Das
Abstract:
The purpose of this article is to find a method of comparing designs for ordinal regression models using quantile dispersion graphs in the presence of linear predictor misspecification. The true relationship between the response variable and the corresponding control variables is usually unknown. The experimenter assumes a certain form of the linear predictor of the ordinal regression model, and the assumed form may not always be correct. Thus, the maximum likelihood estimates (MLE) of the unknown parameters of the model may be biased due to misspecification of the linear predictor. In this article, the uncertainty in the linear predictor is represented by an unknown function. An algorithm is provided to estimate the unknown function at the design points where observations are available. The unknown function is estimated at all points in the design region using multivariate parametric kriging. The comparison of the designs is based on a scalar-valued function of the mean squared error of prediction (MSEP) matrix, which incorporates both the variance and the bias of the prediction caused by the misspecification in the linear predictor. The designs are compared using the quantile dispersion graphs approach. The graphs also visually depict the robustness of the designs to changes in the parameter values. Numerical examples are presented to illustrate the proposed methodology.
Keywords: model misspecification, multivariate kriging, multivariate logistic link, ordinal response models, quantile dispersion graphs
Procedia PDF Downloads 390
3789 Generalized Extreme Value Regression with Binary Dependent Variable: An Application for Predicting Meteorological Drought Probabilities
Authors: Retius Chifurira
Abstract:
The logistic regression model is the most widely used regression model for predicting meteorological drought probabilities. When the dependent variable is extreme, the logistic model fails to adequately capture drought probabilities. In order to adequately predict drought probabilities, we use the generalized linear model (GLM) with the quantile function of the generalized extreme value distribution (GEVD) as the link function. Maximum likelihood estimation is used to estimate the parameters of the generalized extreme value (GEV) regression model. We compare the performance of the logistic and the GEV regression models in predicting drought probabilities for Zimbabwe. The performance of the regression models is assessed using goodness-of-fit tests, namely the relative root mean square error (RRMSE) and the relative mean absolute error (RMAE). Results show that the GEV regression model performs better than the logistic model, thereby providing a good alternative candidate for predicting drought probabilities. This paper provides the first application of a GLM derived from extreme value theory to predict drought probabilities for a drought-prone country such as Zimbabwe.
Keywords: generalized extreme value distribution, general linear model, mean annual rainfall, meteorological drought probabilities
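Using the GEV quantile function as the link means that the fitted probability is the GEV distribution function evaluated at the linear predictor. The sketch below contrasts that response curve with the logistic one. It assumes the standard GEV form with location 0 and scale 1 and an illustrative shape value; the abstract does not state the exact parameterisation used in the paper, and the function names are hypothetical.

```python
import math


def logistic_response(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def gev_response(eta, xi):
    """Inverse of a GEV-quantile link: the standard GEV CDF evaluated at eta.

    xi is the shape parameter (an assumed, illustrative value below).
    """
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-eta))  # Gumbel limit as xi -> 0
    t = 1.0 + xi * eta
    if t <= 0.0:
        # Outside the support: the CDF is 0 for xi > 0 and 1 for xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# The logistic curve is symmetric about eta = 0; the GEV curve is not.
shape = 0.25  # hypothetical shape parameter
logit_pair = (logistic_response(-1.0), logistic_response(1.0))
gev_pair = (gev_response(-1.0, shape), gev_response(1.0, shape))
```

The asymmetry of the GEV response is what lets such a model approach 0 and 1 at different rates, which is the usual motivation for using it when one outcome is rare.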
Procedia PDF Downloads 198
3788 Reminiscence Therapy for Alzheimer’s Disease Restrained on Logistic Regression Based Linear Bootstrap Aggregating
Authors: P. S. Jagadeesh Kumar, Mingmin Pan, Xianpei Li, Yanmin Yuan, Tracy Lin Huan
Abstract:
Researchers are actively investigating the inherited features of Alzheimer’s disease and probable consistent therapies. In Alzheimer’s, memories are lost in reverse order; recently formed memories are more transitory than older ones. Reminiscence therapy involves the discussion of past activities, events and experiences with another individual or group of people, frequently with the help of tangible reminders such as photos, household and other familiar items from the past, music and tape recordings. In this manuscript, the effectiveness of reminiscence therapy for Alzheimer’s disease is measured using logistic regression based linear bootstrap aggregating. Logistic regression is used to predict the experiential features of the patient’s memory through various therapies. Linear bootstrap aggregating shows better stability and accuracy of reminiscence therapy used in statistical classification and regression of memories related to validation therapy, supportive psychotherapy, sensory integration and simulated presence therapy.
Keywords: Alzheimer’s disease, linear bootstrap aggregating, logistic regression, reminiscence therapy
Procedia PDF Downloads 307
3787 Generalized Additive Model for Estimating Propensity Score
Authors: Tahmidul Islam
Abstract:
The Propensity Score Matching (PSM) technique has been widely used for estimating the causal effect of treatment in observational studies. One major step in implementing PSM is estimating the propensity score (PS). A logistic regression model with additive linear terms of covariates is the most used technique in many studies. The logistic regression model is also used with cubic splines to retain flexibility in the model. However, choosing the functional form of the logistic regression model has been a question, since the effectiveness of PSM depends on how accurately the PS has been estimated. In many situations, the linearity assumption of linear logistic regression may not hold, and a non-linear relation between the logit and the covariates may be appropriate. One can estimate the PS using machine learning techniques such as random forests, neural networks, etc. for more accuracy in non-linear situations. In this study, an attempt has been made to assess the efficacy of the Generalized Additive Model (GAM) in various linear and non-linear settings and compare its performance with the usual logistic regression. GAM is a non-parametric technique in which the functional form of the covariates can be left unspecified and a flexible regression model can be fitted. Various simple and complex models have been considered for treatment under several situations (small/large sample, low/high number of treatment units), and we examined which method leads to more covariate balance in the matched dataset. It is found that the logistic regression model is impressively robust against the inclusion of quadratic and interaction terms and reduces the mean difference between treatment and control sets as efficiently as GAM does. GAM provided no significantly better covariate balance than logistic regression in either simple or complex models.
The analysis also suggests that a larger proportion of controls than treatment units leads to better balance for both methods.
Keywords: accuracy, covariate balances, generalized additive model, logistic regression, non-linearity, propensity score matching
Procedia PDF Downloads 365
3786 Analyzing the Influence of Hydrometeorlogical Extremes, Geological Setting, and Social Demographic on Public Health
Authors: Irfan Ahmad Afip
Abstract:
The main objective of this research is to accurately identify the possibility and severity of a Leptospirosis outbreak in a given area by feeding its features into a multivariate regression model. The research question is whether the possibility of an outbreak in a specific area is influenced by features such as social demographics and hydrometeorological extremes. If the occurrence of an outbreak depends on these features, then the epidemic severity for an area will differ depending on its environmental setting, because the features will influence the possibility and severity of an outbreak. Specifically, the research objective was three-fold: (a) to identify the relevant multivariate features and visualize the patterns in the data, (b) to develop a multivariate regression model based on the selected features and determine the possibility of a Leptospirosis outbreak in an area, and (c) to compare the predictive ability of the multivariate regression model and machine learning algorithms. Several secondary data features were collected from locations in the state of Negeri Sembilan, Malaysia, based on their potential relevance to determining outbreak severity in the area. The relevant features then become inputs to a multivariate regression model; a linear regression model is a simple and quick solution for creating prognostic capabilities. A multivariate regression model has proven to have more precise prognostic capabilities than univariate models. The expected outcome of this research is to establish a correlation between social demographic and hydrometeorological features and Leptospirosis bacteria; it will also contribute to understanding the underlying relationship between the pathogen and the ecosystem.
The relationship established can help health departments or urban planners to inspect and prepare for future outcomes in event detection and system health monitoring.
Keywords: geographical information system, hydrometeorological, leptospirosis, multivariate regression
Procedia PDF Downloads 113
3785 HIV Disclosure Status and Factors among Women to Their Sexual Partner in Victory plus, Yogyakarta, Indonesia
Authors: Dwi Kartika Rukmi, Miftafu Darussalam
Abstract:
Background: The disclosure of women’s HIV status to their sexual partners is an important issue that should be regarded as one of the efforts to prevent and control the spread of HIV. Research on the disclosure of seropositive HIV status and related factors among women in Indonesia, especially Yogyakarta, is scarce. Methods: This is a descriptive correlational study with a cross-sectional approach on 329 women with HIV/AIDS at the Victory Plus NGO from June to July 2016. The study used purposive sampling and a questionnaire for data collection. Bivariate analysis was undertaken using the chi-square test, and multivariate analysis using logistic regression. Result: The multivariate logistic regression analysis shows five independent variables related to the disclosure of seropositive HIV status by women with HIV/AIDS to their sexual partners, namely ethnicity (aOR = 36.859; 95% CI: 6.544-207.616), religion (aOR = 0.255; 95% CI: 0.075-0.868), discussion with partners prior to the HIV test (aOR = 0.069; 95% CI: 0.065-0.438), types of sexual partners (aOR = 0.191; 95% CI: 0.082-0.445) and knowledge of the partners’ HIV status (aOR = 0.036; 95% CI: 0.008-0.160). The most frequently cited reasons for not disclosing were fear of being rejected by their partners and the social stigma of HIV/AIDS. Conclusion: The disclosure of seropositive HIV status among women with HIV/AIDS at the Victory Plus NGO of Yogyakarta was 79.4%, classified as high, with related factors including ethnicity, religion, discussion with partners prior to the HIV test, types of partners and knowledge of the partners’ HIV status.
Keywords: women, HIV, disclosure, sexual partner
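An adjusted odds ratio from logistic regression is the exponential of a fitted coefficient, and a Wald-type 95% interval is symmetric around that coefficient on the log scale. As a hedged sketch, assuming the reported intervals are Wald intervals (the abstract does not say how they were computed) and using hypothetical function names, the coefficient and standard error behind the "types of sexual partners" result can be backed out from its reported interval and checked against the reported aOR:

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def coefficient_from_interval(lower, upper):
    """Recover the log-odds coefficient and its standard error from a
    Wald interval given on the odds-ratio scale (assumed interval type)."""
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    return beta, se


def odds_ratio_with_interval(beta, se):
    """Map a coefficient and standard error back to (aOR, lower, upper)."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


# Reported for "types of sexual partners": aOR = 0.191, 95% CI 0.082-0.445.
beta, se = coefficient_from_interval(0.082, 0.445)
aor, ci_low, ci_high = odds_ratio_with_interval(beta, se)
```

Under the Wald assumption the recovered odds ratio is the geometric mean of the interval endpoints, which here agrees with the reported 0.191 to three decimals.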
Procedia PDF Downloads 259
3784 Credit Risk Prediction Based on Bayesian Estimation of Logistic Regression Model with Random Effects
Authors: Sami Mestiri, Abdeljelil Farhat
Abstract:
The aim of this paper is to predict the credit risk of banks in Tunisia over the period 2000-2005. For this purpose, two methods for estimating the logistic regression model with random effects are applied: the Penalized Quasi Likelihood (PQL) method and the Gibbs Sampler algorithm. Using information on a sample of 528 Tunisian firms and 26 financial ratios, we show that the Bayesian approach improves the quality of model predictions in terms of good classification as well as by the ROC curve result.
Keywords: forecasting, credit risk, Penalized Quasi Likelihood, Gibbs Sampler, logistic regression with random effects, ROC curve
Procedia PDF Downloads 540
3783 On Estimating the Headcount Index by Using the Logistic Regression Estimator
Authors: Encarnación Álvarez, Rosa M. García-Fernández, Juan F. Muñoz, Francisco J. Blanco-Encomienda
Abstract:
The problem of estimating a proportion has important applications in economics and, more generally, in many areas such as the social sciences. A common application in economics is the estimation of the headcount index. In this paper, we define the general headcount index as a proportion. Furthermore, we introduce a new quantitative method for estimating the headcount index. In particular, we suggest using the logistic regression estimator for the problem of estimating the headcount index. Assuming a real data set, results derived from Monte Carlo simulation studies indicate that the logistic regression estimator can be more accurate than the traditional estimator of the headcount index.
Keywords: poverty line, poor, risk of poverty, Monte Carlo simulations, sample
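Defining the headcount index as a proportion means the traditional estimator is simply the share of sampled units whose income falls below the poverty line. A minimal sketch of that baseline follows; the incomes are invented, and the poverty line of 60% of the median income is an assumption made here for illustration, since the abstract does not specify one. The logistic regression estimator proposed in the paper is not reproduced.

```python
from statistics import median


def headcount_index(incomes, poverty_line):
    """Share of units whose income falls strictly below the poverty line."""
    if not incomes:
        raise ValueError("incomes must be non-empty")
    return sum(1 for y in incomes if y < poverty_line) / len(incomes)


# Hypothetical sample of ten incomes.
toy_incomes = [4, 7, 9, 10, 12, 15, 20, 22, 30, 50]
toy_line = 0.6 * median(toy_incomes)  # assumed relative poverty line
toy_index = headcount_index(toy_incomes, toy_line)
```

With these toy values the median is 13.5, the line is about 8.1, and two of the ten incomes fall below it.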
Procedia PDF Downloads 420
3782 A Hybrid Model Tree and Logistic Regression Model for Prediction of Soil Shear Strength in Clay
Authors: Ehsan Mehryaar, Seyed Armin Motahari Tabari
Abstract:
Without a doubt, soil shear strength is the most important property of the soil. The majority of fatal and catastrophic geological accidents are related to shear strength failure of the soil. Therefore, its prediction is a matter of high importance. However, acquiring the shear strength is usually a cumbersome task that might need complicated laboratory testing. Predicting it from common and easily obtained soil properties can therefore simplify projects substantially. In this paper, a hybrid model based on the classification and regression tree algorithm and logistic regression is proposed, where each leaf of the tree is an independent regression model. A database of 189 points for clay soil, including moisture content, liquid limit, plastic limit, clay content, and shear strength, is collected. The performance of the developed model is compared to that of existing models and equations using root mean squared error and the coefficient of correlation.
Keywords: model tree, CART, logistic regression, soil shear strength
Procedia PDF Downloads 193
3781 Heart Attack Prediction Using Several Machine Learning Methods
Authors: Suzan Anwar, Utkarsh Goyal
Abstract:
Heart rate (HR) is a predictor of cardiovascular, cerebrovascular, and all-cause mortality in the general population, as well as in patients with cardio and cerebrovascular diseases. Machine learning (ML) significantly improves the accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment while avoiding unnecessary treatment of others. This research examines the relationship between an individual's heart health inputs, such as age, sex, cp, trestbps, thalach, oldpeak, etc., and the likelihood of developing heart disease. Machine learning techniques such as logistic regression and decision trees are used, implemented in Python. The results of testing and evaluating the model using the Heart Failure Prediction Dataset show the chance of a person having heart disease with variable accuracy. Logistic regression yielded an accuracy of 80.48% without data handling. With data handling (normalization, StandardScaler), logistic regression reached an improved accuracy of 87.80%, decision tree 100%, random forest 100%, and SVM 100%.
Keywords: heart rate, machine learning, SVM, decision tree, logistic regression, random forest
Procedia PDF Downloads 136
3780 Instability Index Method and Logistic Regression to Assess Landslide Susceptibility in County Route 89, Taiwan
Authors: Y. H. Wu, Ji-Yuan Lin, Yu-Ming Liou
Abstract:
This study aims to set up the landslide susceptibility map of County Route 89 at Ren-Ai Township in Nantou County using the Instability Index Method and logistic regression. Seven susceptibility factors, including slope angle, aspect, elevation, distance to fold, distance to river, distance to road and accumulated rainfall, were obtained by GIS based on the Typhoon Toraji landslide area identified by the Industrial Technology Research Institute in 2001. The landslide percentage of each factor was calculated, and the weights and grid grades were obtained by means of the Instability Index Method. In this study, landslide susceptibility is classified into four grades (high, medium high, medium low and low) in order to determine the advantages and disadvantages of the two models. The precision of the models is verified by the classification error matrix and the SRC curve. These results suggest that the logistic regression model is preferable to the instability index method in the assessment of landslide susceptibility. It is suitable for landslide prediction and precaution in this area in the future.
Keywords: instability index method, logistic regression, landslide susceptibility, SRC curve
Procedia PDF Downloads 288
3779 Modeling and Analysis Of Occupant Behavior On Heating And Air Conditioning Systems In A Higher Education And Vocational Training Building In A Mediterranean Climate
Authors: Abderrahmane Soufi
Abstract:
The building sector is the largest consumer of energy in France, accounting for 44% of French consumption. To reduce energy consumption and improve energy efficiency, France implemented an energy transition law targeting 40% energy savings by 2030 in the tertiary building sector. Building simulation tools are used to predict the energy performance of buildings, but the reliability of these tools is hampered by discrepancies between the real and simulated energy performance of a building. This performance gap stems from simplified assumptions about certain factors, such as occupant behavior toward air conditioning and heating, which is treated as deterministic by setting a fixed operating schedule and a fixed interior comfort temperature. However, occupant behavior toward air conditioning and heating is stochastic, diverse, and complex because it can be affected by many factors. Probabilistic models are an alternative to deterministic models. These models are usually derived from statistical data and express occupant behavior by assuming a probabilistic relationship to one or more variables. In the literature, logistic regression has been used to model the behavior of occupants with regard to heating and air conditioning systems by considering univariate logistic models in residential buildings; however, few studies have developed multivariate models for higher education and vocational training buildings in a Mediterranean climate. Therefore, in this study, occupant behavior toward heating and air conditioning systems was modeled using logistic regression. Occupant behavior related to turning on the heating and air conditioning systems was studied through experimental measurements collected over a period of one year (June 2023–June 2024) in three classrooms occupied by several groups of students in engineering schools and professional training. Instrumentation was provided to collect indoor temperature and indoor relative humidity at 10-min intervals.
Furthermore, the state of the heating/air conditioning system (off or on) and the set point were determined. The outdoor air temperature, relative humidity, and wind speed were collected as weather data. The number of occupants, age, and sex were also considered. Logistic regression was used to model an occupant turning on the heating and air conditioning systems. The results yielded a proposed model that can be used in building simulation tools to predict the energy performance of teaching buildings. Based on the first months (summer and early autumn) of the investigations, the results illustrate that occupant behavior toward the air conditioning systems is affected by the indoor relative humidity and temperature in June, July, and August and by the indoor relative humidity, temperature, and number of occupants in September and October. Occupant behavior was analyzed monthly, and univariate and multivariate models were developed.
Keywords: occupant behavior, logistic regression, behavior model, Mediterranean climate, air conditioning, heating
Procedia PDF Downloads 57
3778 Robustified Asymmetric Logistic Regression Model for Global Fish Stock Assessment
Authors: Osamu Komori, Shinto Eguchi, Hiroshi Okamura, Momoko Ichinokawa
Abstract:
Long time-series data on population assessments are essential for global ecosystem assessment because the temporal change of biomass in such a database properly reflects the status of the global ecosystem. However, the available assessment data usually have limited sample sizes, and the ratio of populations with low abundance of biomass (collapsed) to those with high abundance (non-collapsed) is highly imbalanced. To allow for the imbalance and uncertainty involved in the ecological data, we propose a binary regression model with mixed effects for inferring ecosystem status through an asymmetric logistic model. In the estimation equation, we observe that the weights for the non-collapsed populations are relatively reduced, which in turn puts more importance on the small number of observations of collapsed populations. Moreover, we extend the asymmetric logistic regression model using the propensity score to allow for the sample biases observed in the labeled and unlabeled datasets. This extension robustified the estimation procedure and improved the model fit.
Keywords: double robust estimation, ecological binary data, mixed effect logistic regression model, propensity score
Procedia PDF Downloads 264
3777 Hospital Malnutrition and its Impact on 30-day Mortality in Hospitalized General Medicine Patients in a Tertiary Hospital in South India
Authors: Vineet Agrawal, Deepanjali S., Medha R., Subitha L.
Abstract:
Background. Hospital malnutrition is a highly prevalent issue and is known to increase the morbidity, mortality, length of hospital stay, and cost of care. In India, studies on hospital malnutrition have been restricted to ICU, post-surgical, and cancer patients. We designed this study to assess the impact of hospital malnutrition on 30-day post-discharge and in-hospital mortality in patients admitted to the general medicine department, irrespective of diagnosis. Methodology. All patients aged above 18 years admitted to the medicine wards, excluding medico-legal cases, were enrolled in the study. Nutritional assessment was done within 72 h of admission, using Subjective Global Assessment (SGA), which classifies patients into three categories: Severely malnourished, Mildly/moderately malnourished, and Normal/well-nourished. Anthropometric measurements like Body Mass Index (BMI), Triceps skin-fold thickness (TSF), and Mid-upper arm circumference (MUAC) were also performed. Patients were followed up during hospital stay and 30 days after discharge through telephonic interview, and their final diagnosis, comorbidities, and cause of death were noted. Multivariate logistic regression and Cox regression models were used to determine if the nutritional status at admission independently impacted mortality at one month. Results. The prevalence of malnourishment by SGA in our study was 67.3% among 395 hospitalized patients, of which 155 patients (39.2%) were moderately malnourished, and 111 (28.1%) were severely malnourished. Of 395 patients, 61 patients (15.4%) expired, of which 30 died in the hospital, and 31 died within 1 month of discharge from hospital. On univariate analysis, malnourished patients had significantly higher mortality (24.3% in 111 Cat C patients) than well-nourished patients (10.1% in 129 Cat A patients), with OR 9.17, p-value 0.007.
On multivariate logistic regression, age and higher Charlson Comorbidity Index (CCI) were independently associated with mortality. Higher CCI indicates a higher burden of comorbidities on admission, and the CCI in the expired patient group (mean=4.38) was significantly higher than that of the alive cohort (mean=2.85). Though malnutrition significantly contributed to higher mortality on univariate analysis, it was not an independent predictor of outcome on multivariate logistic regression. Length of hospitalisation was also longer in the malnourished group (mean=9.4 d) compared to the well-nourished group (mean=8.03 d), with a trend towards significance (p=0.061). None of the anthropometric measurements like BMI, MUAC, or TSF showed any association with mortality or length of hospitalisation. Inference. The results of our study highlight the issue of hospital malnutrition in medicine wards and reiterate that malnutrition contributes significantly to patient outcomes. We found that SGA performs better than anthropometric measurements in assessing under-nutrition. We are of the opinion that the heterogeneity of the study population by diagnosis was probably the primary reason why malnutrition by SGA was not found to be an independent risk factor for mortality. Strategies to identify high-risk patients at admission and treat malnutrition in the hospital and post-discharge are needed.
Keywords: hospitalization outcome, length of hospital stay, mortality, malnutrition, subjective global assessment (SGA)
Procedia PDF Downloads 148
3776 Mediterranean Diet, Duration of Admission and Mortality in Elderly, Hospitalized Patients: A Cross-Sectional Study
Authors: Christos Lampropoulos, Maria Konsta, Ifigenia Apostolou, Vicky Dradaki, Tamta Sirbilatze, Irini Dri, Christina Kordali, Vaggelis Lambas, Kostas Argyros, Georgios Mavras
Abstract:
Objectives: Mediterranean diet has been associated with lower incidence of cardiovascular disease and cancer. The purpose of our study was to examine the hypothesis that Mediterranean diet may protect against mortality and reduce admission duration in elderly, hospitalized patients. Methods: Sample population included 150 patients (78 men, 72 women, mean age 80±8.2). The following data were taken into account in analysis: anthropometric and laboratory data, dietary habits (MedDiet score), patients’ nutritional status [Mini Nutritional Assessment (MNA) score], physical activity (International Physical Activity Questionnaires, IPAQ), smoking status, cause and duration of current admission, medical history (co-morbidities, previous admissions). Primary endpoints were mortality (from admission until 6 months afterwards) and duration of admission, compared to national guidelines for closed consolidated medical expenses. Logistic regression and linear regression analyses were performed in order to identify independent predictors for mortality and admission duration difference respectively. Results: According to MNA, nutrition was normal in 54/150 (36%) of patients, 46/150 (30.7%) of them were at risk of malnutrition and the remaining 50/150 (33.3%) were malnourished. After performing multivariate logistic regression analysis we found that the odds of death decreased 30% for each unit increase of MedDiet score (OR=0.7, 95% CI:0.6-0.8, p < 0.0001). Patients with cancer-related admission were 37.7 times more likely to die, compared to those with infection (OR=37.7, 95% CI:4.4-325, p=0.001). According to multivariate linear regression analysis, admission duration was inversely related to Mediterranean diet, decreasing by 0.18 days on average for each unit increase of MedDiet score (b:-0.18, 95% CI:-0.33 to -0.035, p=0.02). Additionally, the duration of current admission increased on average 0.83 days for each previous hospital admission (b:0.83, 95% CI:0.5-1.16, p<0.0001).
The admission duration of patients with cancer was on average 4.5 days longer than that of patients admitted due to infection (b:4.5, 95% CI:0.9-8, p=0.015). Conclusion: Mediterranean diet adequately protects elderly, hospitalized patients against mortality and reduces the duration of hospitalization.
Keywords: Mediterranean diet, malnutrition, nutritional status, prognostic factors for mortality
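The percentages quoted in the abstract above follow from the reported odds ratios: an odds ratio OR per unit of a predictor corresponds to a (OR - 1) x 100% change in the odds, and a change of k units multiplies the odds by OR^k. A minimal Python sketch of that arithmetic follows; the function names are illustrative assumptions and are not taken from the study.

```python
import math


def percent_change_in_odds(odds_ratio: float, units: float = 1.0) -> float:
    """Percent change in the odds for a `units`-step increase in a predictor."""
    return (odds_ratio ** units - 1.0) * 100.0


def odds_ratio_from_coefficient(beta: float) -> float:
    """A logistic regression coefficient is a log-odds; exponentiate to get the OR."""
    return math.exp(beta)


# OR = 0.7 per unit of MedDiet score, as reported in the abstract.
one_unit = percent_change_in_odds(0.7)
# Odds ratios compound multiplicatively over several units (0.7 * 0.7 = 0.49).
two_units = percent_change_in_odds(0.7, units=2)
```

Because odds ratios multiply, a two-unit increase corresponds to about 51% lower odds rather than 60%.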
Procedia PDF Downloads 310
3775 Nuclear Fuel Safety Threshold Determined by Logistic Regression Plus Uncertainty
Authors: D. S. Gomes, A. T. Silva
Abstract:
Uncertainty quantification of the safety margins applied to nuclear reactors is important for preventing future radioactive accidents. Nuclear fuel performance codes may rely on tolerance levels determined by traditional deterministic models, which produce acceptable results at burn cycles under 62 GWd/MTU. The behavior of nuclear fuel can be simulated by applying a series of material properties under irradiation and physics models to calculate the safety limits. In this study, theoretical predictions of nuclear fuel failure under transient conditions are used to investigate extended radiation cycles at 75 GWd/MTU, considering the behavior of fuel rods in light-water reactors under reactivity accident conditions. The fuel pellet can melt due to the quick increase of reactivity during a transient. Large power excursions in the reactor are the subject of interest, leading to a treatment known as the Fuchs-Hansen model. The point kinetic neutron equations show characteristics similar to those of non-linear differential equations. In this investigation, multivariate logistic regression is employed for a probabilistic forecast of fuel failure. The agreement between computational simulation and experimental results was acceptable. The experiments used pre-irradiated fuel rods subjected to a rapid energy pulse, reproducing the behavior seen during a nuclear accident. The propagation of uncertainty uses Wilks' formulation. The variables chosen as essential to failure prediction were the fuel burnup, the applied peak power, the pulse width, the oxidation layer thickness, and the cladding type.
Keywords: logistic regression, reactivity-initiated accident, safety margins, uncertainty propagation
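The abstract does not state which order or which coverage and confidence levels were used with Wilks' formula. As an illustration only, the first-order one-sided case asks for the smallest number of code runs n such that 1 - γ^n ≥ β, where γ is the coverage and β the confidence level. The 95%/95% values below are an assumption, not the authors' setting, and the function name is hypothetical.

```python
def wilks_one_sided_sample_size(coverage: float, confidence: float) -> int:
    """Smallest n for which the sample maximum bounds the `coverage` quantile
    with probability at least `confidence` (first-order, one-sided Wilks)."""
    if not (0.0 < coverage < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("coverage and confidence must lie strictly between 0 and 1")
    n = 1
    # Increase n until the condition 1 - coverage**n >= confidence is met.
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n


# Assumed 95% coverage with 95% confidence (not stated in the abstract).
n_runs = wilks_one_sided_sample_size(0.95, 0.95)
```

With these assumed levels the function returns 59, the familiar minimum number of runs for a one-sided 95%/95% tolerance limit.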
Procedia PDF Downloads 289
3774 Determining the Causality Variables in Female Genital Mutilation: A Factor Screening Approach
Authors: Ekele Alih, Enejo Jalija
Abstract:
Female Genital Mutilation (FGM) comprises three types, namely Clitoridectomy, Excision and Infibulation. In this study, we examine the factors responsible for FGM in order to identify the causality variables in a logistic regression approach. From the results of the survey conducted by the Public Health Division, Nigeria Institute of Medical Research, Yaba, Lagos State, the tau statistic, τ, was used to screen 9 factors that cause FGM in order to select a few of the predictors before the multiple regression equation is obtained. This screening may be needed because the sample size may not be able to sustain a regression with all the predictors, or to avoid multi-collinearity. A total of 300 respondents, comprising 150 adult males and 150 adult females, were selected for the household survey based on the multi-stage sampling procedure. The tau statistic,
Keywords: female genital mutilation, logistic regression, tau statistic, African society
Procedia PDF Downloads 260
3773 Dietary Patterns and Hearing Loss in Older People
Authors: N. E. Gallagher, C. E. Neville, N. Lyner, J. Yarnell, C. C. Patterson, J. E. Gallacher, Y. Ben-Shlomo, A. Fehily, J. V. Woodside
Abstract:
Hearing loss is highly prevalent in older people and can reduce quality of life substantially. Emerging research suggests that potentially modifiable risk factors, including risk factors previously related to cardiovascular disease risk, may be associated with a decreased or increased incidence of hearing loss. This has prompted investigation into the possibility that certain nutrients, foods or dietary patterns may also be associated with incidence of hearing loss. The aim of this study was to determine any associations between dietary patterns and hearing loss in men enrolled in the Caerphilly study. The Caerphilly prospective cohort study began in 1979-1983 with recruitment of 2512 men aged 45-59 years. Dietary data were collected using a self-administered, semi-quantitative, 56-item food frequency questionnaire (FFQ) at baseline (1979-1983), and 7-day weighed food intake (WI) in a 30% sub-sample, while pure-tone unaided audiometric threshold was assessed at 0.5, 1, 2 and 4 kHz, between 1984 and 1988. Principal components analysis (PCA) was carried out to determine a posteriori dietary patterns, and multivariate linear and logistic regression models were used to examine associations with hearing level (pure tone average (PTA) of frequencies 0.5, 1, 2 and 4 kHz in decibels (dB)) for linear regression and with hearing loss (PTA>25dB) for logistic regression. Three dietary patterns were determined using PCA on the FFQ data: Traditional, Healthy, and High sugar/Alcohol avoider. After adjustment for potential confounding factors, both linear and logistic regression analyses showed a significant and inverse association between the Healthy pattern and hearing loss (P<0.001), and linear regression analysis showed a significant association between the High sugar/Alcohol avoider pattern and hearing loss (P=0.04). Three similar dietary patterns were determined using PCA on the WI data: Traditional, Healthy, and High sugar/Alcohol avoider.
After adjustment for potential confounding factors, logistic regression analyses showed a significant and inverse association between the Healthy pattern and hearing loss (P=0.02) and a significant association between the Traditional pattern and hearing loss (P=0.04). A Healthy dietary pattern was found to be significantly inversely associated with hearing loss in middle-aged men in the Caerphilly study. Furthermore, a High sugar/Alcohol avoider pattern (FFQ) and a Traditional pattern (WI) were associated with poorer hearing levels. Consequently, the role of dietary factors in hearing loss remains to be fully established and warrants further investigation.
Keywords: ageing, diet, dietary patterns, hearing loss
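The two outcome definitions used in the abstract above (hearing level as the pure-tone average over 0.5, 1, 2 and 4 kHz, and hearing loss as PTA > 25 dB) can be sketched in Python as follows. The function names are illustrative assumptions, and the threshold values in the example are invented for demonstration; they are not study data.

```python
PTA_FREQUENCIES_KHZ = (0.5, 1.0, 2.0, 4.0)
HEARING_LOSS_CUTOFF_DB = 25.0


def pure_tone_average(thresholds_db: dict) -> float:
    """Mean audiometric threshold (dB) over the four PTA frequencies."""
    values = [thresholds_db[f] for f in PTA_FREQUENCIES_KHZ]
    return sum(values) / len(values)


def has_hearing_loss(thresholds_db: dict) -> bool:
    """Binary outcome for logistic regression: PTA strictly above 25 dB."""
    return pure_tone_average(thresholds_db) > HEARING_LOSS_CUTOFF_DB


# Hypothetical thresholds for one participant, keyed by frequency in kHz.
example = {0.5: 15.0, 1.0: 20.0, 2.0: 30.0, 4.0: 45.0}
pta = pure_tone_average(example)  # (15 + 20 + 30 + 45) / 4 = 27.5
```

The continuous `pta` value plays the role of the linear regression outcome, while `has_hearing_loss` gives the dichotomised outcome used for logistic regression.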
Procedia PDF Downloads 229
3772 A New Method to Estimate the Low Income Proportion: Monte Carlo Simulations
Authors: Encarnación Álvarez, Rosa M. García-Fernández, Juan F. Muñoz
Abstract:
Estimation of a proportion has many applications in economics and social studies. A common application is the estimation of the low income proportion, which gives the proportion of people classified as poor in a population. In this paper, we present this poverty indicator and propose to use the logistic regression estimator for the problem of estimating the low income proportion. Various sampling designs are presented. Assuming a real data set obtained from the European Survey on Income and Living Conditions, Monte Carlo simulation studies are carried out to analyze the empirical performance of the logistic regression estimator under the various sampling designs considered in this paper. Results derived from the Monte Carlo simulation studies indicate that the logistic regression estimator can be more accurate than the customary estimator under these sampling designs. The stratified sampling design can also provide more accurate results.
Keywords: poverty line, risk of poverty, auxiliary variable, ratio method
Procedia PDF Downloads 454
3771 The Use of Boosted Multivariate Trees in Medical Decision-Making for Repeated Measurements
Authors: Ebru Turgal, Beyza Doganay Erdogan
Abstract:
Machine learning aims to model the relationship between the response and features. Medical decision-making researchers would like to make decisions about patients’ course and treatment by examining repeated measurements over time. The boosting approach is now used in machine learning for these aims as an influential tool. The aim of this study is to show the usage of multivariate tree boosting in this field. The main reason for utilizing this approach in the field of decision-making is the ease with which it handles complex relationships. To show how the multivariate tree boosting method can be used to identify important features and feature-time interactions, we used data collected retrospectively from Ankara University Chest Diseases Department records. The dataset includes repeated PF ratio measurements. The follow-up time is planned for 120 hours. A set of different models is tested. In conclusion, classification with a weighted combination of classifiers is a reliable method, as has been shown with simulations several times. Furthermore, time-varying variables will be taken into consideration within this concept, which could make it possible to make accurate decisions about regression and survival problems.
Keywords: boosted multivariate trees, longitudinal data, multivariate regression tree, panel data
Procedia PDF Downloads 201
3770 Full Mini Nutritional Assessment Questionnaire and the Risk of Malnutrition and Mortality in Elderly, Hospitalized Patients: A Cross-Sectional Study
Authors: Christos E. Lampropoulos, Maria Konsta, Tamta Sirbilatze, Ifigenia Apostolou, Vicky Dradaki, Konstantina Panouria, Irini Dri, Christina Kordali, Vaggelis Lambas, Georgios Mavras
Abstract:
Objectives: The full Mini Nutritional Assessment (MNA) questionnaire is one of the most useful tools in the diagnosis of malnutrition in hospitalized patients, which is related to increased morbidity and mortality. The purpose of our study was to assess the nutritional status of elderly, hospitalized patients and examine the hypothesis that MNA may predict mortality and extension of hospitalization. Methods: One hundred fifty patients (78 men, 72 women, mean age 80±8.2) were included in this cross-sectional study. The following data were taken into account in analysis: anthropometric and laboratory data, physical activity (International Physical Activity Questionnaires, IPAQ), smoking status, dietary habits, cause and duration of current admission, medical history (co-morbidities, previous admissions). Primary endpoints were mortality (from admission until 6 months afterwards) and duration of admission. The latter was compared to national guidelines for closed consolidated medical expenses. Logistic regression and linear regression analyses were performed in order to identify independent predictors for mortality and extended hospitalization respectively. Results: According to MNA, nutrition was normal in 54/150 (36%) of patients, 46/150 (30.7%) of them were at risk of malnutrition and the remaining 50/150 (33.3%) were malnourished. After performing multivariate logistic regression analysis we found that the odds of death decreased 20% for each unit increase of full MNA score (OR=0.8, 95% CI 0.74-0.89, p < 0.0001). Patients admitted due to cancer were 23 times more likely to die, compared to those with infection (OR=23, 95% CI 3.8-141.6, p=0.001). Similarly, patients admitted due to stroke were 7 times more likely to die (OR=7, 95% CI 1.4-34.5, p=0.02), while those with all other causes of admission were less likely (OR=0.2, 95% CI 0.06-0.8, p=0.03), compared to patients with infection.
According to multivariate linear regression analysis, each unit increase of full MNA score decreased the admission duration by 0.3 days on average (b:-0.3, 95% CI -0.45 to -0.15, p < 0.0001). Patients admitted due to cancer had on average a 6.8-day longer extension of hospitalization, compared to those admitted for infection (b:6.8, 95% CI 3.2-10.3, p < 0.0001). Conclusion: Mortality and extension of hospitalization are significantly increased in elderly, malnourished patients. Full MNA score is a useful diagnostic tool for malnutrition.
Keywords: duration of admission, malnutrition, mini nutritional assessment score, prognostic factors for mortality
Procedia PDF Downloads 312