Search results for: mixed effect logistic regression model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 32020

Search results for: mixed effect logistic regression model

31960 The Use of Geographically Weighted Regression for Deforestation Analysis: Case Study in Brazilian Cerrado

Authors: Ana Paula Camelo, Keila Sanches

Abstract:

The Geographically Weighted Regression (GWR) was proposed in geography literature to allow relationship in a regression model to vary over space. In Brazil, the agricultural exploitation of the Cerrado Biome is the main cause of deforestation. In this study, we propose a methodology using geostatistical methods to characterize the spatial dependence of deforestation in the Cerrado based on agricultural production indicators. Therefore, it was used the set of exploratory spatial data analysis tools (ESDA) and confirmatory analysis using GWR. It was made the calibration a non-spatial model, evaluation the nature of the regression curve, election of the variables by stepwise process and multicollinearity analysis. After the evaluation of the non-spatial model was processed the spatial-regression model, statistic evaluation of the intercept and verification of its effect on calibration. In an analysis of Spearman’s correlation the results between deforestation and livestock was +0.783 and with soybeans +0.405. The model presented R²=0.936 and showed a strong spatial dependence of agricultural activity of soybeans associated to maize and cotton crops. The GWR is a very effective tool presenting results closer to the reality of deforestation in the Cerrado when compared with other analysis.

Keywords: deforestation, geographically weighted regression, land use, spatial analysis

Procedia PDF Downloads 365
31959 Naïve Bayes: A Classical Approach for the Epileptic Seizures Recognition

Authors: Bhaveek Maini, Sanjay Dhanka, Surita Maini

Abstract:

Electroencephalography (EEG) is used to classify several epileptic seizures worldwide. It is a very crucial task for the neurologist to identify the epileptic seizure with manual EEG analysis, as it takes lots of effort and time. Human error is always at high risk in EEG, as acquiring signals needs manual intervention. Disease diagnosis using machine learning (ML) has continuously been explored since its inception. Moreover, where a large number of datasets have to be analyzed, ML is acting as a boon for doctors. In this research paper, authors proposed two different ML models, i.e., logistic regression (LR) and Naïve Bayes (NB), to predict epileptic seizures based on general parameters. These two techniques are applied to the epileptic seizures recognition dataset, available on the UCI ML repository. The algorithms are implemented on an 80:20 train test ratio (80% for training and 20% for testing), and the performance of the model was validated by 10-fold cross-validation. The proposed study has claimed accuracy of 81.87% and 95.49% for LR and NB, respectively.

Keywords: epileptic seizure recognition, logistic regression, Naïve Bayes, machine learning

Procedia PDF Downloads 61
31958 Loan Repayment Prediction Using Machine Learning: Model Development, Django Web Integration and Cloud Deployment

Authors: Seun Mayowa Sunday

Abstract:

Loan prediction is one of the most significant and recognised fields of research in the banking, insurance, and the financial security industries. Some prediction systems on the market include the construction of static software. However, due to the fact that static software only operates with strictly regulated rules, they cannot aid customers beyond these limitations. Application of many machine learning (ML) techniques are required for loan prediction. Four separate machine learning models, random forest (RF), decision tree (DT), k-nearest neighbour (KNN), and logistic regression, are used to create the loan prediction model. Using the anaconda navigator and the required machine learning (ML) libraries, models are created and evaluated using the appropriate measuring metrics. From the finding, the random forest performs with the highest accuracy of 80.17% which was later implemented into the Django framework. For real-time testing, the web application is deployed on the Alibabacloud which is among the top 4 biggest cloud computing provider. Hence, to the best of our knowledge, this research will serve as the first academic paper which combines the model development and the Django framework, with the deployment into the Alibaba cloud computing application.

Keywords: k-nearest neighbor, random forest, logistic regression, decision tree, django, cloud computing, alibaba cloud

Procedia PDF Downloads 137
31957 The Extended Skew Gaussian Process for Regression

Authors: M. T. Alodat

Abstract:

In this paper, we propose a generalization to the Gaussian process regression(GPR) model called the extended skew Gaussian process for regression(ESGPr) model. The ESGPR model works better than the GPR model when the errors are skewed. We derive the predictive distribution for the ESGPR model at a new input. Also we apply the ESGPR model to FOREX data and we find that it fits the Forex data better than the GPR model.

Keywords: extended skew normal distribution, Gaussian process for regression, predictive distribution, ESGPr model

Procedia PDF Downloads 554
31956 Gender Estimation by Means of Quantitative Measurements of Foramen Magnum: An Analysis of CT Head Images

Authors: Thilini Hathurusinghe, Uthpalie Siriwardhana, W. M. Ediri Arachchi, Ranga Thudugala, Indeewari Herath, Gayani Senanayake

Abstract:

The foramen magnum is more prone to protect than other skeletal remains during high impact and severe disruptive injuries. Therefore, it is worthwhile to explore whether these measurements can be used to determine the human gender which is vital in forensic and anthropological studies. The idea was to find out the ability to use quantitative measurements of foramen magnum as an anatomical indicator for human gender estimation and to evaluate the gender-dependent variations of foramen magnum using quantitative measurements. Randomly selected 113 subjects who underwent CT head scans at Sri Jayawardhanapura General Hospital of Sri Lanka within a period of six months, were included in the study. The sample contained 58 males (48.76 ± 14.7 years old) and 55 females (47.04 ±15.9 years old). Maximum length of the foramen magnum (LFM), maximum width of the foramen magnum (WFM), minimum distance between occipital condyles (MnD) and maximum interior distance between occipital condyles (MxID) were measured. Further, AreaT and AreaR were also calculated. The gender was estimated using binomial logistic regression. The mean values of all explanatory variables (LFM, WFM, MnD, MxID, AreaT, and AreaR) were greater among male than female. All explanatory variables except MnD (p=0.669) were statistically significant (p < 0.05). Significant bivariate correlations were demonstrated by AreaT and AreaR with the explanatory variables. The results evidenced that WFM and MxID were the best measurements in predicting gender according to binomial logistic regression. The estimated model was: log (p/1-p) =10.391-0.136×MxID-0.231×WFM, where p is the probability of being a female. The classification accuracy given by the above model was 65.5%. The quantitative measurements of foramen magnum can be used as a reliable anatomical marker for human gender estimation in the Sri Lankan context.

Keywords: foramen magnum, forensic and anthropological studies, gender estimation, logistic regression

Procedia PDF Downloads 151
31955 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations

Authors: Kuei-Ling Sun, Emily Chia-Yu Su

Abstract:

Allergy is a hypersensitive overreaction of the immune system to environmental stimuli, and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthmatic, shock, or other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal wool, and other allergens. Their development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body’s immune system. Immunoglobulin E antibodies will bind to an allergen and then transfer to a receptor on mast cells or basophils triggering the release of inflammatory chemicals such as histamine. Based on the increasingly serious problem of environmental change, changes in lifestyle, air pollution problem, and other factors, in this study, we both collect allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT) and neural networks (NN) to do the model comparison and determine the permutations of allergen amino acid’s sequence.

Keywords: allergy, classification, decision tree, logistic regression, machine learning

Procedia PDF Downloads 305
31954 Semiparametric Regression Of Truncated Spline Biresponse On Farmer Loyalty And Attachment Modeling

Authors: Adji Achmad Rinaldo Fernandes

Abstract:

Regression analysis is a statistical method that is able to describe and predict causal relationships between individuals. Not all relationships have a known curve shape; often, there are relationship patterns that cannot be known in the shape of the curve; besides that, a cause can have an impact on more than one effect, so that between effects can also have a close relationship in it. Regression analysis that can be done to find out the relationship can be brought closer to the semiparametric regression of truncated spline biresponse. The purpose of this study is to examine the function estimator and determine the best model of truncated spline biresponse semiparametric regression. The results of the secondary data study showed that the best model with the highest order of quadratic and a maximum of two knots with a Goodness of fit value in the form of Adjusted R2 of 88.5%.

Keywords: biresponse, farmer attachment, farmer loyalty, truncated spline

Procedia PDF Downloads 41
31953 Behind Fuzzy Regression Approach: An Exploration Study

Authors: Lavinia B. Dulla

Abstract:

The exploration study of the fuzzy regression approach attempts to present that fuzzy regression can be used as a possible alternative to classical regression. It likewise seeks to assess the differences and characteristics of simple linear regression and fuzzy regression using the width of prediction interval, mean absolute deviation, and variance of residuals. Based on the simple linear regression model, the fuzzy regression approach is worth considering as an alternative to simple linear regression when the sample size is between 10 and 20. As the sample size increases, the fuzzy regression approach is not applicable to use since the assumption regarding large sample size is already operating within the framework of simple linear regression. Nonetheless, it can be suggested for a practical alternative when decisions often have to be made on the basis of small data.

Keywords: fuzzy regression approach, minimum fuzziness criterion, interval regression, prediction interval

Procedia PDF Downloads 302
31952 Comparison of the Logistic and the Gompertz Growth Functions Considering a Periodic Perturbation in the Model Parameters

Authors: Avan Al-Saffar, Eun-Jin Kim

Abstract:

Both the logistic growth model and the gompertz growth model are used to describe growth processes. Both models driven by perturbations in different cases are investigated using information theory as a useful measure of sustainability and the variability. Specifically, we study the effect of different oscillatory modulations in the system's parameters on the evolution of the system and Probability Density Function (PDF). We show the maintenance of the initial conditions for a long time. We offer Fisher information analysis in positive and/or negative feedback and explain its implications for the sustainability of population dynamics. We also display a finite amplitude solution due to the purely fluctuating growth rate whereas the periodic fluctuations in negative feedback can lead to break down the system's self-regulation with an exponentially growing solution. In the cases tested, the gompertz and logistic systems show similar behaviour in terms of information and sustainability although they develop differently in time.

Keywords: dynamical systems, fisher information, probability density function (pdf), sustainability

Procedia PDF Downloads 433
31951 Challenges in Achieving Profitability for MRO Companies in the Aviation Industry: An Analytical Approach

Authors: Nur Sahver Uslu, Ali̇ Hakan Büyüklü

Abstract:

Maintenance, Repair, and Overhaul (MRO) costs are significant in the aviation industry. On the other hand, companies that provide MRO services to the aviation industry but are not dominant in the sector, need to determine the right strategies for sustainable profitability in a competitive environment. This study examined the operational real data of a small medium enterprise (SME) MRO company where analytical methods are not widely applied. The company's customers were divided into two categories: airline companies and non-airline companies, and the variables that best explained profitability were analyzed with Logistic Regression for each category and the results were compared. First, data reduction was applied to the transformed variables that went through the data cleaning and preparation stages, and the variables to be included in the model were decided. The misclassification rates for the logistic regression results concerning both customer categories are similar, indicating consistent model performance across different segments. Less profit margin is obtained from airline customers, which can be explained by the variables part description, time to quotation (TTQ), turnaround time (TAT), manager, part cost, and labour cost. The higher profit margin obtained from non-airline customers is explained only by the variables part description, part cost, and labour cost. Based on the two models, it can be stated that it is significantly more challenging for the MRO company, which is the subject of our study, to achieve profitability from Airline customers. While operational processes and organizational structure also affect the profit from airline customers, only the type of parts and costs determine the profit for non-airlines.

Keywords: aircraft, aircraft components, aviation, data analytics, data science, gini index, maintenance, repair, and overhaul, MRO, logistic regression, profit, variable clustering, variable reduction

Procedia PDF Downloads 35
31950 Educational Data Mining: The Case of the Department of Mathematics and Computing in the Period 2009-2018

Authors: Mário Ernesto Sitoe, Orlando Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.

Keywords: evasion and retention, cross-validation, bagging, stacking

Procedia PDF Downloads 84
31949 Optimization of Machine Learning Regression Results: An Application on Health Expenditures

Authors: Songul Cinaroglu

Abstract:

Machine learning regression methods are recommended as an alternative to classical regression methods in the existence of variables which are difficult to model. Data for health expenditure is typically non-normal and have a heavily skewed distribution. This study aims to compare machine learning regression methods by hyperparameter tuning to predict health expenditure per capita. A multiple regression model was conducted and performance results of Lasso Regression, Random Forest Regression and Support Vector Machine Regression recorded when different hyperparameters are assigned. Lambda (λ) value for Lasso Regression, number of trees for Random Forest Regression, epsilon (ε) value for Support Vector Regression was determined as hyperparameters. Study results performed by using 'k' fold cross validation changed from 5 to 50, indicate the difference between machine learning regression results in terms of R², RMSE and MAE values that are statistically significant (p < 0.001). Study results reveal that Random Forest Regression (R² ˃ 0.7500, RMSE ≤ 0.6000 ve MAE ≤ 0.4000) outperforms other machine learning regression methods. It is highly advisable to use machine learning regression methods for modelling health expenditures.

Keywords: machine learning, lasso regression, random forest regression, support vector regression, hyperparameter tuning, health expenditure

Procedia PDF Downloads 226
31948 Modeling and Analysis Of Occupant Behavior On Heating And Air Conditioning Systems In A Higher Education And Vocational Training Building In A Mediterranean Climate

Authors: Abderrahmane Soufi

Abstract:

The building sector is the largest consumer of energy in France, accounting for 44% of French consumption. To reduce energy consumption and improve energy efficiency, France implemented an energy transition law targeting 40% energy savings by 2030 in the tertiary building sector. Building simulation tools are used to predict the energy performance of buildings but the reliability of these tools is hampered by discrepancies between the real and simulated energy performance of a building. This performance gap lies in the simplified assumptions of certain factors, such as the behavior of occupants on air conditioning and heating, which is considered deterministic when setting a fixed operating schedule and a fixed interior comfort temperature. However, the behavior of occupants on air conditioning and heating is stochastic, diverse, and complex because it can be affected by many factors. Probabilistic models are an alternative to deterministic models. These models are usually derived from statistical data and express occupant behavior by assuming a probabilistic relationship to one or more variables. In the literature, logistic regression has been used to model the behavior of occupants with regard to heating and air conditioning systems by considering univariate logistic models in residential buildings; however, few studies have developed multivariate models for higher education and vocational training buildings in a Mediterranean climate. Therefore, in this study, occupant behavior on heating and air conditioning systems was modeled using logistic regression. Occupant behavior related to the turn-on heating and air conditioning systems was studied through experimental measurements collected over a period of one year (June 2023–June 2024) in three classrooms occupied by several groups of students in engineering schools and professional training. Instrumentation was provided to collect indoor temperature and indoor relative humidity in 10-min intervals. Furthermore, the state of the heating/air conditioning system (off or on) and the set point were determined. The outdoor air temperature, relative humidity, and wind speed were collected as weather data. The number of occupants, age, and sex were also considered. Logistic regression was used for modeling an occupant turning on the heating and air conditioning systems. The results yielded a proposed model that can be used in building simulation tools to predict the energy performance of teaching buildings. Based on the first months (summer and early autumn) of the investigations, the results illustrate that the occupant behavior of the air conditioning systems is affected by the indoor relative humidity and temperature in June, July, and August and by the indoor relative humidity, temperature, and number of occupants in September and October. Occupant behavior was analyzed monthly, and univariate and multivariate models were developed.

Keywords: occupant behavior, logistic regression, behavior model, mediterranean climate, air conditioning, heating

Procedia PDF Downloads 62
31947 The Relationship Between Hourly Compensation and Unemployment Rate Using the Panel Data Regression Analysis

Authors: S. K. Ashiquer Rahman

Abstract:

the paper concentrations on the importance of hourly compensation, emphasizing the significance of the unemployment rate. There are the two most important factors of a nation these are its unemployment rate and hourly compensation. These are not merely statistics but they have profound effects on individual, families, and the economy. They are inversely related to one another. When we consider the unemployment rate that will probably decline as hourly compensations in manufacturing rise. But when we reduced the unemployment rates and increased job prospects could result from higher compensation. That’s why, the increased hourly compensation in the manufacturing sector that could have a favorable effect on job changing issues. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, we use panel data regression models to evaluate the expected link between hourly compensation and unemployment rate in order to determine the effect of hourly compensation on unemployment rate. We estimate the fixed effects model, evaluate the error components, and determine which model (the FEM or ECM) is better by pooling all 60 observations. We then analysis and review the data by comparing 3 several countries (United States, Canada and the United Kingdom) using panel data regression models. Finally, we provide result, analysis and a summary of the extensive research on how the hourly compensation effects on the unemployment rate. Additionally, this paper offers relevant and useful informational to help the government and academic community use an econometrics and social approach to lessen on the effect of the hourly compensation on Unemployment rate to eliminate the problem.

Keywords: hourly compensation, Unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, the linear regression model

Procedia PDF Downloads 82
31946 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection

Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew

Abstract:

The task of detecting email spam is a very important one in the era of digital technology that needs effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique assists in providing interpretable explanations for specific classifications of emails to help users understand the decision-making process by the model. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve understanding of categorization decisions by users. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.

Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression.

Procedia PDF Downloads 48
31945 A Study of Classification Models to Predict Drill-Bit Breakage Using Degradation Signals

Authors: Bharatendra Rai

Abstract:

Cutting tools are widely used in manufacturing processes and drilling is the most commonly used machining process. Although drill-bits used in drilling may not be expensive, their breakage can cause damage to expensive work piece being drilled and at the same time has major impact on productivity. Predicting drill-bit breakage, therefore, is important in reducing cost and improving productivity. This study uses twenty features extracted from two degradation signals viz., thrust force and torque. The methodology used involves developing and comparing decision tree, random forest, and multinomial logistic regression models for classifying and predicting drill-bit breakage using degradation signals.

Keywords: degradation signal, drill-bit breakage, random forest, multinomial logistic regression

Procedia PDF Downloads 353
31944 Prediction of Coronary Artery Stenosis Severity Based on Machine Learning Algorithms

Authors: Yu-Jia Jian, Emily Chia-Yu Su, Hui-Ling Hsu, Jian-Jhih Chen

Abstract:

Coronary artery is the major supplier of myocardial blood flow. When fat and cholesterol are deposit in the coronary arterial wall, narrowing and stenosis of the artery occurs, which may lead to myocardial ischemia and eventually infarction. According to the World Health Organization (WHO), estimated 740 million people have died of coronary heart disease in 2015. According to Statistics from Ministry of Health and Welfare in Taiwan, heart disease (except for hypertensive diseases) ranked the second among the top 10 causes of death from 2013 to 2016, and it still shows a growing trend. According to American Heart Association (AHA), the risk factors for coronary heart disease including: age (> 65 years), sex (men to women with 2:1 ratio), obesity, diabetes, hypertension, hyperlipidemia, smoking, family history, lack of exercise and more. We have collected a dataset of 421 patients from a hospital located in northern Taiwan who received coronary computed tomography (CT) angiography. There were 300 males (71.26%) and 121 females (28.74%), with age ranging from 24 to 92 years, and a mean age of 56.3 years. Prior to coronary CT angiography, basic data of the patients, including age, gender, obesity index (BMI), diastolic blood pressure, systolic blood pressure, diabetes, hypertension, hyperlipidemia, smoking, family history of coronary heart disease and exercise habits, were collected and used as input variables. The output variable of the prediction module is the degree of coronary artery stenosis. The output variable of the prediction module is the narrow constriction of the coronary artery. In this study, the dataset was randomly divided into 80% as training set and 20% as test set. Four machine learning algorithms, including logistic regression, stepwise regression, neural network and decision tree, were incorporated to generate prediction results. We used area under curve (AUC) / accuracy (Acc.) to compare the four models, the best model is neural network, followed by stepwise logistic regression, decision tree, and logistic regression, with 0.68 / 79 %, 0.68 / 74%, 0.65 / 78%, and 0.65 / 74%, respectively. Sensitivity of neural network was 27.3%, specificity was 90.8%, stepwise Logistic regression sensitivity was 18.2%, specificity was 92.3%, decision tree sensitivity was 13.6%, specificity was 100%, logistic regression sensitivity was 27.3%, specificity 89.2%. From the result of this study, we hope to improve the accuracy by improving the module parameters or other methods in the future and we hope to solve the problem of low sensitivity by adjusting the imbalanced proportion of positive and negative data.

Keywords: decision support, computed tomography, coronary artery, machine learning

Procedia PDF Downloads 229
31943 Association Between Advanced Parental Age and Implantation Failure: A Prospective Cohort Study in Anhui, China

Authors: Jiaqian Yin, Ruoling Chen, David Churchill, Huijuan Zou, Peipei Guo, Chunmei Liang, Xiaoqing Peng, Zhikang Zhang, Weiju Zhou, Yunxia Cao

Abstract:

Purpose: This study aimed to explore the interaction of male and female age on implantation failure from in vitro fertilisation (IVF)/ intracytoplasmic sperm injection (ICSI) treatments in couples following their first cycles using the Anhui Maternal-Child Health Study (AMCHS). Methods: The AMCHS recruited 2042 infertile couples who were physically fit for in vitro fertilisation (IVF) or intracytoplasmic sperm injection (ICSI) treatment at the Reproductive Centre of the First Affiliated Hospital of Anhui Medical University between May 2017 to April 2021. This prospective cohort study analysed the data from 1910 cohort couples for the current paper data analysis. The multivariate logistic regression model was used to identify the effect of male and female age on implantation failure after controlling for confounding factors. Male age and female age were examined as continuous and categorical (male age: 20-<25, 25-<30, 30-<35, 35-<40, ≥40; female age: 20-<25, 25-<30, 30-<35, 35-<40, ≥40) predictors. Results: Logistic regression indicated that advanced maternal age was associated with increased implantation failure (P<0.001). There was evidence of an interaction between maternal age (30-<35 and ≥ 35) and paternal age (≥35) on implantation failure. (p<0.05). Only when the male was ≥35 years of increased maternal age was associated with the risk of implantation failure. Conclusion: In conclusion, there was an additive effect on implantation failure with advanced parental age. The impact of advanced maternal age was only seen in the older paternal age group. The delay of childbearing in both men and women will be a serious public issue that may contribute to a higher risk of implantation failure in patients needing assisted reproductive technology (ART).

Keywords: parental age, infertility, cohort study, IVF

Procedia PDF Downloads 155
31942 Measurement Errors and Misclassifications in Covariates in Logistic Regression: Bayesian Adjustment of Main and Interaction Effects and the Sample Size Implications

Authors: Shahadut Hossain

Abstract:

Measurement errors in continuous covariates and/or misclassifications in categorical covariates are common in epidemiological studies. Regression analysis ignoring such mismeasurements seriously biases the estimated main and interaction effects of covariates on the outcome of interest. Thus, adjustments for such mismeasurements are necessary. In this research, we propose a Bayesian parametric framework for eliminating deleterious impacts of covariate mismeasurements in logistic regression. The proposed adjustment method is unified and thus can be applied to any generalized linear and non-linear regression models. Furthermore, adjustment for covariate mismeasurements requires validation data usually in the form of either gold standard measurements or replicates of the mismeasured covariates on a subset of the study population. Initial investigation shows that adequacy of such adjustment depends on the sizes of main and validation samples, especially when prevalences of the categorical covariates are low. Thus, we investigate the impact of main and validation sample sizes on the adjusted estimates, and provide a general guideline about these sample sizes based on simulation studies.

Keywords: measurement errors, misclassification, mismeasurement, validation sample, Bayesian adjustment

Procedia PDF Downloads 409
31941 Urban Household Waste Disposal Modes and Their Determinants: Evidence from Bure Town, North-Western Ethiopia

Authors: Mastawal Melese, Yismaw Assefa

Abstract:

This study aims to identify household-level determinants of solid waste disposal (SWD) practices in Bure Town, north-western Ethiopia. Using a cross-sectional design and a mixed-methods approach, data were collected from 238 randomly selected households through structured interviews, focus group discussions, and field observations. Descriptive analysis revealed that 14.7% of households used composting as a primary SWD method, 37.4% practiced open dumping, 25.6% used burning, and 22.3% resorted to burial. Multinomial logistic regression showed that factors such as monthly income, age, family size, length of residence, sex, home ownership, solid waste sorting procedures, and education significantly influenced the choice of disposal method. Households with lower education, income, home ownership, and shorter residence times were more likely to use improper disposal methods. Females were found to be more likely to engage in better waste disposal practices than males. These findings underscore the need for context-specific interventions in newly developing towns to enhance household-level SWM systems by addressing key socio-economic factors.

Keywords: multinomial logistic regression, solid waste management, solid waste disposal, urban household

Procedia PDF Downloads 24
31940 An Exploratory Study on 'Sub-Region Life Circle' in Chinese Big Cities Based on Human High-Probability Daily Activity: Characteristic and Formation Mechanism as a Case of Wuhan

Authors: Zhuoran Shan, Li Wan, Xianchun Zhang

Abstract:

With an increasing trend of regionalization and polycentricity in Chinese contemporary big cities, “sub-region life circle” turns to be an effective method on rational organization of urban function and spatial structure. By the method of questionnaire, network big data, route inversion on internet map, GIS spatial analysis and logistic regression, this article makes research on characteristic and formation mechanism of “sub-region life circle” based on human high-probability daily activity in Chinese big cities. Firstly, it shows that “sub-region life circle” has been a new general spatial sphere of residents' high-probability daily activity and mobility in China. Unlike the former analysis of the whole metropolitan or the micro community, “sub-region life circle” has its own characteristic on geographical sphere, functional element, spatial morphology and land distribution. Secondly, according to the analysis result with Binary Logistic Regression Model, the research also shows that seven factors including land-use mixed degree and bus station density impact the formation of “sub-region life circle” most, and then analyzes the index critical value of each factor. Finally, to establish a smarter “sub-region life circle”, this paper indicates that several strategies including jobs-housing fit, service cohesion and space reconstruction are the keys for its spatial organization optimization. This study expands the further understanding of cities' inner sub-region spatial structure based on human daily activity, and contributes to the theory of “life circle” in urban's meso-scale.

Keywords: sub-region life circle, characteristic, formation mechanism, human activity, spatial structure

Procedia PDF Downloads 301
31939 Evaluating the Logistic Performance Capability of Regeneration Processes

Authors: Thorben Kuprat, Julian Becker, Jonas Mayer, Peter Nyhuis

Abstract:

For years now, it has been recognized that logistic performance capability contributes enormously to a production enterprise’s competitiveness and as such is a critical control lever. In doing so, the orientation on customer wishes (e.g. delivery dates) represents a key parameter not only in the value-adding production but also in product regeneration. Since production and regeneration processes have different characteristics, production planning and control measures cannot be directly transferred to regeneration processes. As part of a special research project, the Institute of Production Systems and Logistics Hannover is focused on increasing the logistic performance capability of regeneration processes for complex capital goods. The aim is to ensure logistic targets are met by implementing a model specifically designed to align the capacities and load in regeneration processes.

Keywords: capacity planning, complex capital goods, logistic performance, regeneration process

Procedia PDF Downloads 490
31938 Evaluation of a Piecewise Linear Mixed-Effects Model in the Analysis of Randomized Cross-over Trial

Authors: Moses Mwangi, Geert Verbeke, Geert Molenberghs

Abstract:

Cross-over designs are commonly used in randomized clinical trials to estimate efficacy of a new treatment with respect to a reference treatment (placebo or standard). The main advantage of using cross-over design over conventional parallel design is its flexibility, where every subject become its own control, thereby reducing confounding effect. Jones & Kenward, discuss in detail more recent developments in the analysis of cross-over trials. We revisit the simple piecewise linear mixed-effects model, proposed by Mwangi et. al, (in press) for its first application in the analysis of cross-over trials. We compared performance of the proposed piecewise linear mixed-effects model with two commonly cited statistical models namely, (1) Grizzle model; and (2) Jones & Kenward model, used in estimation of the treatment effect, in the analysis of randomized cross-over trial. We estimate two performance measurements (mean square error (MSE) and coverage probability) for the three methods, using data simulated from the proposed piecewise linear mixed-effects model. Piecewise linear mixed-effects model yielded lowest MSE estimates compared to Grizzle and Jones & Kenward models for both small (Nobs=20) and large (Nobs=600) sample sizes. It’s coverage probability were highest compared to Grizzle and Jones & Kenward models for both small and large sample sizes. A piecewise linear mixed-effects model is a better estimator of treatment effect than its two competing estimators (Grizzle and Jones & Kenward models) in the analysis of cross-over trials. The data generating mechanism used in this paper captures two time periods for a simple 2-Treatments x 2-Periods cross-over design. Its application is extendible to more complex cross-over designs with multiple treatments and periods. In addition, it is important to note that, even for single response models, adding more random effects increases the complexity of the model and thus may be difficult or impossible to fit in some cases.

Keywords: Evaluation, Grizzle model, Jones & Kenward model, Performance measures, Simulation

Procedia PDF Downloads 124
31937 Establishment of a Nomogram Prediction Model for Postpartum Hemorrhage during Vaginal Delivery

Authors: Yinglisong, Jingge Chen, Jingxuan Chen, Yan Wang, Hui Huang, Jing Zhnag, Qianqian Zhang, Zhenzhen Zhang, Ji Zhang

Abstract:

Purpose: The study aims to establish a nomogram prediction model for postpartum hemorrhage (PPH) in vaginal delivery. Patients and Methods: Clinical data were retrospectively collected from vaginal delivery patients admitted to a hospital in Zhengzhou, China, from June 1, 2022 - October 31, 2022. Univariate and multivariate logistic regression were used to filter out independent risk factors. A nomogram model was established for PPH in vaginal delivery based on the risk factors coefficient. Bootstrapping was used for internal validation. To assess discrimination and calibration, receiver operator characteristics (ROC) and calibration curves were generated in the derivation and validation groups. Results: A total of 1340 cases of vaginal delivery were enrolled, with 81 (6.04%) having PPH. Logistic regression indicated that history of uterine surgery, induction of labor, duration of first labor, neonatal weight, WBC value (during the first stage of labor), and cervical lacerations were all independent risk factors of hemorrhage (P <0.05). The area-under-curve (AUC) of ROC curves of the derivation group and the validation group were 0.817 and 0.821, respectively, indicating good discrimination. Two calibration curves showed that nomogram prediction and practical results were highly consistent (P = 0.105, P = 0.113). Conclusion: The developed individualized risk prediction nomogram model can assist midwives in recognizing and diagnosing high-risk groups of PPH and initiating early warning to reduce PPH incidence.

Keywords: vaginal delivery, postpartum hemorrhage, risk factor, nomogram

Procedia PDF Downloads 79
31936 The Effect of Lande G-Factors on the Quantum and Thermal Entanglement in the Mixed Spin-(1/2,S) Heisenberg Dimer

Authors: H. Vargova, J. Strecka, N. Tomasovicova

Abstract:

A rigorous analytical treatment, with the help of a concept of negativity, is used to study the quantum and thermal entanglement in an isotropic mixed spin-(1/2,S) Heisenberg dimer. The effect of the spin-S magnitude, as well as the effect of diversity between Landé g-factors of magnetic constituents on system entanglement, is exhaustively analyzed upon the variation of the external magnetic and electric field, respectively. It was identified that the increasing magnitude of the spin-S species in a mixed spin-(1/2,S) Heisenberg dimer with comparative Landé g-factors have always a reduction effect on a degree of the quantum entanglement, but it strikingly shifts the thermal entanglement to the higher temperatures. Surprisingly, out of the limit of identical Landé g-factors, the increasing magnitude of spin-S entities can enhance the system entanglement in both low and high magnetic fields. Besides this, we identify that the analyzed dimer with a high-enough magnitude of the spin-S entities at a sufficiently high magnetic field can exhibit unconventional thermally driven re-entrance between the entangled and unentangled mixed state. The importance of the electric-field stimuli is also discussed in detail.

Keywords: quantum and thermal entantanglement, mixed spin Heisenberg model, negativity, reentrant phase transition

Procedia PDF Downloads 101
31935 Determinants of Diarrhoea Prevalence Variations in Mountainous Informal Settlements of Kigali City, Rwanda

Authors: Dieudonne Uwizeye

Abstract:

Introduction: Diarrhoea is one of the major causes of morbidity and mortality among communities living in urban informal settlements of developing countries. It is assumed that mountainous environment introduces variations of the burden among residents of the same settlements. Design and Objective: A cross-sectional study was done in Kigali to explore the effect of mountainous informal settlements on diarrhoea risk variations. Data were collected among 1,152 households through household survey and transect walk to observe the status of sanitation. The outcome variable was the incidence of diarrhoea among household members of any age. The study used the most knowledgeable person in the household as the main respondent. Mostly this was the woman of the house as she was more likely to know the health status of every household member as she plays various roles: mother, wife, and head of the household among others. The analysis used cross tabulation and logistic regression analysis. Results: Results suggest that risks for diarrhoea vary depending on home location in the settlements. Diarrhoea risk increased as the distance from the road increased. The results of the logistic regression analysis indicate the adjusted odds ratio of 2.97 with 95% confidence interval being 1.35-6.55 and 3.50 adjusted odds ratio with 95% confidence interval being 1.61-7.60 in level two and three respectively compared with level one. The status of sanitation within and around homes was also significantly associated with the increase of diarrhoea. Equally, it is indicated that stable households were less likely to have diarrhoea. The logistic regression analysis indicated the adjusted odds ratio of 0.45 with 95% confidence interval being 0.25-0.81. However, the study did not find evidence for a significant association between diarrhoea risks and household socioeconomic status in the multivariable model. It is assumed that environmental factors in mountainous settings prevailed. Households using the available public water sources were more likely to have diarrhoea in their households. Recommendation: The study recommends the provision and extension of infrastructure for improved water, drainage, sanitation and wastes management facilities. Equally, studies should be done to identify the level of contamination and potential origin of contaminants for water sources in the valleys to adequately control the risks for diarrhoea in mountainous urban settings.

Keywords: urbanisation, diarrhoea risk, mountainous environment, urban informal settlements in Rwanda

Procedia PDF Downloads 173
31934 Effect of Drying on the Concrete Structures

Authors: A. Brahma

Abstract:

The drying of hydraulics materials is unavoidable and conducted to important spontaneous deformations. In this study, we show that it is possible to describe the drying shrinkage of the high-performance concrete by a simple expression. A multiple regression model was developed for the prediction of the drying shrinkage of the high-performance concrete. The assessment of the proposed model has been done by a set of statistical tests. The model developed takes in consideration the main parameters of confection and conservation. There was a very good agreement between drying shrinkage predicted by the multiple regression model and experimental results. The developed model adjusts easily to all hydraulic concrete types.

Keywords: hydraulic concretes, drying, shrinkage, prediction, modeling

Procedia PDF Downloads 368
31933 Nuclear Fuel Safety Threshold Determined by Logistic Regression Plus Uncertainty

Authors: D. S. Gomes, A. T. Silva

Abstract:

Analysis of the uncertainty quantification related to nuclear safety margins applied to the nuclear reactor is an important concept to prevent future radioactive accidents. The nuclear fuel performance code may involve the tolerance level determined by traditional deterministic models producing acceptable results at burn cycles under 62 GWd/MTU. The behavior of nuclear fuel can simulate applying a series of material properties under irradiation and physics models to calculate the safety limits. In this study, theoretical predictions of nuclear fuel failure under transient conditions investigate extended radiation cycles at 75 GWd/MTU, considering the behavior of fuel rods in light-water reactors under reactivity accident conditions. The fuel pellet can melt due to the quick increase of reactivity during a transient. Large power excursions in the reactor are the subject of interest bringing to a treatment that is known as the Fuchs-Hansen model. The point kinetic neutron equations show similar characteristics of non-linear differential equations. In this investigation, the multivariate logistic regression is employed to a probabilistic forecast of fuel failure. A comparison of computational simulation and experimental results was acceptable. The experiments carried out use the pre-irradiated fuels rods subjected to a rapid energy pulse which exhibits the same behavior during a nuclear accident. The propagation of uncertainty utilizes the Wilk's formulation. The variables chosen as essential to failure prediction were the fuel burnup, the applied peak power, the pulse width, the oxidation layer thickness, and the cladding type.

Keywords: logistic regression, reactivity-initiated accident, safety margins, uncertainty propagation

Procedia PDF Downloads 292
31932 Food Insecurity Determinants Amidst the Covid-19 Pandemic: An Insight from Huntsville, Texas

Authors: Peter Temitope Agboola

Abstract:

Food insecurity continues to affect a large number of U.S households during this coronavirus COVID-19 pandemic. The pandemic has threatened the livelihoods of people, making them vulnerable to severe hardship and has had an unanticipated impact on the U.S economy. This study attempts to identify the food insecurity status of households and the determinant factors driving household food insecurity. Additionally, it attempts to discover the mitigation measures adopted by households during the pandemic in the city of Huntsville, Texas. A structured online sample survey was used to collect data, with a household expenditures survey used in evaluating the food security status of the household. Most survey respondents disclosed that the COVID-19 pandemic had affected their life and source of income. Furthermore, the main analytical tool used for the study is descriptive statistics and logistic regression modeling. A logistic regression model was used to determine the factors responsible for food insecurity in the study area. The result revealed that most households in the study area are food secure, with the remainder being food insecure.

Keywords: food insecurity, household expenditure survey, COVID-19, coping strategies, food pantry

Procedia PDF Downloads 209
31931 Supervised-Component-Based Generalised Linear Regression with Multiple Explanatory Blocks: THEME-SCGLR

Authors: Bry X., Trottier C., Mortier F., Cornu G., Verron T.

Abstract:

We address component-based regularization of a Multivariate Generalized Linear Model (MGLM). A set of random responses Y is assumed to depend, through a GLM, on a set X of explanatory variables, as well as on a set T of additional covariates. X is partitioned into R conceptually homogeneous blocks X1, ... , XR , viewed as explanatory themes. Variables in each Xr are assumed many and redundant. Thus, Generalised Linear Regression (GLR) demands regularization with respect to each Xr. By contrast, variables in T are assumed selected so as to demand no regularization. Regularization is performed searching each Xr for an appropriate number of orthogonal components that both contribute to model Y and capture relevant structural information in Xr. We propose a very general criterion to measure structural relevance (SR) of a component in a block, and show how to take SR into account within a Fisher-scoring-type algorithm in order to estimate the model. We show how to deal with mixed-type explanatory variables. The method, named THEME-SCGLR, is tested on simulated data.

Keywords: Component-Model, Fisher Scoring Algorithm, GLM, PLS Regression, SCGLR, SEER, THEME

Procedia PDF Downloads 397