Search results for: logistic regression model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18281

17981 Association Between Short-term NOx Exposure and Asthma Exacerbations in East London: A Time Series Regression Model

Authors: Hajar Hajmohammadi, Paul Pfeffer, Anna De Simoni, Jim Cole, Chris Griffiths, Sally Hull, Benjamin Heydecker

Abstract:

Background: There is strong interest in the relationship between short-term air pollution exposure and human health. Most studies in this field focus on serious health effects such as death or hospital admission, but air pollution exposure affects many people with less severe impacts, such as exacerbations of respiratory conditions. A lack of quantitative analysis and inconsistent findings suggest that improved methodology is needed to understand these effects more fully. Method: We developed a time series regression model to quantify the relationship between daily NOₓ concentration and asthma exacerbations requiring oral steroids in primary care settings. Explanatory variables include daily NOₓ concentration measurements from 8 available background and roadside monitoring stations in east London and daily ambient temperature recorded at London City Airport, also located in east London. Lags of NOₓ concentrations of up to 21 days (3 weeks) were used in the model. The dependent variable was the daily number of oral steroid courses prescribed for GP-registered patients with asthma in east London. A mixed distribution model was then fitted to the significant lags of the regression model. Result: The time series modelling showed a significant relationship between NOₓ concentrations on each day and the number of oral steroid courses prescribed in the following three weeks. In addition, the model using only roadside stations performed better than the model with a mixture of roadside and background stations.
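
To make the idea of "lags up to 21 days" concrete, the sketch below builds lagged predictor rows from a daily series. It is only an illustration under stated assumptions: the function name `lagged_features` and the sample readings are hypothetical, and the abstract does not say how the authors constructed their lag terms.

```python
def lagged_features(series, max_lag):
    """Build one row per day t >= max_lag: [x[t], x[t-1], ..., x[t-max_lag]].

    Days earlier than max_lag are dropped because their full lag history
    is not available.
    """
    rows = []
    for t in range(max_lag, len(series)):
        rows.append([series[t - lag] for lag in range(max_lag + 1)])
    return rows


# Hypothetical daily NOx readings (made-up values, for illustration only).
nox = [40.0, 55.0, 38.0, 61.0, 47.0]
design = lagged_features(nox, max_lag=2)
for row in design:
    print(row)
```

Each row could then serve as the explanatory variables for that day's count of prescriptions in a regression of the kind the abstract describes.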

Keywords: air pollution, time series modeling, public health, road transport

Procedia PDF Downloads 113
17980 Machine Learning Techniques to Predict Cyberbullying and Improve Social Work Interventions

Authors: Oscar E. Cariceo, Claudia V. Casal

Abstract:

Machine learning offers a set of techniques that can strengthen social work interventions and support practitioners' decisions by predicting new behaviors from data produced by organizations, service agencies, users, clients, or individuals. Machine learning techniques include a set of generalizable, data-driven algorithms, meaning that rules and solutions are derived by examining data and the patterns present within a data set. In other words, the goal of machine learning is to teach computers through 'examples': training data are used to test specific hypotheses, predict a likely outcome for a given scenario, and improve on that experience. Machine learning can be classified into two general categories depending on the nature of the problem to be tackled. The first, supervised learning, involves a dataset whose outputs are already known. Supervised learning problems are categorized into regression problems, which predict quantitative variables using a continuous function, and classification problems, which seek to predict discrete qualitative outcomes. For social work research, machine learning generates predictions as a key element in improving social interventions on complex social issues, by providing better inference from data and more precise estimated effects, for example in services that seek to improve their outcomes. This paper presents the results of a classification algorithm to predict cyberbullying among adolescents. Data were retrieved from the National Polyvictimization Survey conducted by the government of Chile in 2017. A logistic regression model was created to predict whether an adolescent would experience cyberbullying based on the interaction and behavior of gender, age, grade, type of school, and self-esteem sentiments. The model predicts whether an adolescent will suffer cyberbullying with an accuracy of 59.8%. These results can help to promote programs to prevent cyberbullying at schools and to improve evidence-based practice.

Keywords: cyberbullying, evidence based practice, machine learning, social work research

Procedia PDF Downloads 141
17979 Exploring the Applications of Neural Networks in the Adaptive Learning Environment

Authors: Baladitya Swaika, Rahul Khatry

Abstract:

Computer Adaptive Tests (CATs) are one of the most efficient ways of testing the cognitive abilities of students. CATs are based on Item Response Theory (IRT), which relies on item selection and ability estimation using statistical methods: maximum information selection or selection from the posterior for the former, and maximum-likelihood (ML) or maximum a posteriori (MAP) estimators for the latter. This study combines classical and Bayesian approaches to IRT to create a dataset, which is then fed to a neural network that automates ability estimation; the result is compared with traditional CAT models designed using IRT. The study uses Python as the base coding language, pymc for statistical modelling of the IRT, and scikit-learn for neural network implementations. On comparison, the neural-network-based model is found to perform 7-10% worse than the IRT model for score estimation. Despite this poorer performance, the neural network model can be used beneficially in back-ends to reduce time complexity: the IRT model would have to recalculate the ability every time it gets a request, whereas the prediction from a neural network can be made in a single step with an already trained regressor. This study also proposes a new kind of framework in which the neural network model incorporates feature sets beyond the usual IRT feature set and uses a neural network’s capacity for learning unknown functions to give rise to better CAT models. Categorical features such as test type could be learnt and incorporated into IRT functions with the help of techniques like logistic regression, yielding models for functions that may not be trivial to express via equations. Such a framework, when implemented, would be highly advantageous in psychometrics and cognitive assessments. This study gives a brief overview of how neural networks can be used in adaptive testing, not only by reducing time complexity but also by incorporating newer and better datasets, which would eventually lead to higher-quality testing.

Keywords: computer adaptive tests, item response theory, machine learning, neural networks

Procedia PDF Downloads 153
17978 Predicting Options Prices Using Machine Learning

Authors: Krishang Surapaneni

Abstract:

The goal of this project is to determine how to predict important aspects of options, including the ask price. We compare different machine learning models to find the best model, and the best hyperparameters for that model, for this purpose and data set. Option pricing is a relatively new field, and it can be very complicated and intimidating, especially to inexperienced people, so we want to create a machine learning model that can predict important aspects of an option, which can aid future research. We tested three models and experimented with hyperparameter tuning to find some of the best parameters for each: a Random Forest Regressor, a linear regressor, and an MLP (multi-layer perceptron) regressor. The most important quantity in this experiment is the ask price, which is what we were trying to predict. In stock price prediction there is a large potential for error, so we cannot judge the accuracy of the models by whether they predict the price perfectly. Instead, we measured accuracy as the average percentage difference between the predicted and actual values in the testing data. The linear regression model performed worst, with an average percentage error of 17.46%. The MLP regressor had an average percentage error of 11.45%, and the random forest regressor had an average percentage error of 7.42%.
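
The "average percentage difference between the predicted and actual values" can be sketched as below. This is a minimal illustration, not the author's code: the function name `mean_pct_error` and the sample prices are assumptions, and the abstract does not specify whether absolute differences were used (they are here).

```python
def mean_pct_error(actual, predicted):
    """Average of |predicted - actual| / |actual|, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical ask prices (made-up values, for illustration only).
actual_ask = [100.0, 200.0]
predicted_ask = [110.0, 180.0]
print(mean_pct_error(actual_ask, predicted_ask))
```

A lower value means predictions sit closer to the observed prices on average; the measure is undefined when an actual value is zero.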

Keywords: finance, linear regression model, machine learning model, neural network, stock price

Procedia PDF Downloads 54
17977 Determining Antecedents of Employee Turnover: A Study on Blue Collar vs White Collar Workers on Macro Level

Authors: Evy Rombaut, Marie-Anne Guerry

Abstract:

Predicting voluntary turnover of employees is an important topic of study, both in academia and industry. Researchers try to uncover determinants for a broader understanding and possible prevention of turnover. In the current study, we use a data set based approach to reveal determinants for turnover, differing for blue and white collar workers. Our data set based approach made it possible to study actual turnover for more than 500000 employees in 15692 Belgian corporations. We use logistic regression to calculate individual turnover probabilities and test the goodness of our model with the AUC (area under the ROC-curve) method. The results of the study confirm the relationship of known determinants to employee turnover such as age, seniority, pay and work distance. In addition, the study unravels unknown and verifies known differences between blue and white collar workers. It shows opposite relationships to turnover for gender, marital status, the number of children, nationality, and pay.
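
For readers unfamiliar with the AUC check mentioned above, the sketch below computes the area under the ROC curve directly from its pairwise definition: the fraction of (positive, negative) pairs in which the positive case receives the higher predicted probability. The function name `auc` and the toy labels and scores are assumptions for illustration; the study's own data and software are not described in the abstract.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = employee left, 0 = employee stayed (made-up values).
left = [0, 0, 1, 1]
predicted_prob = [0.1, 0.4, 0.35, 0.8]
print(auc(left, predicted_prob))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the sample size, so a rank-based formula would be preferable for a data set of several hundred thousand employees.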

Keywords: employee turnover, blue collar, white collar, dataset analysis

Procedia PDF Downloads 247
17976 Estimating Anthropometric Dimensions for Saudi Males Using Artificial Neural Networks

Authors: Waleed Basuliman

Abstract:

Anthropometric dimensions are considered one of the important factors when designing human-machine systems. In this study, the estimation of anthropometric dimensions was improved by using an Artificial Neural Network (ANN) model that is able to predict the anthropometric measurements of Saudi males in Riyadh City. A total of 1427 Saudi males aged 6 to 60 years participated in measuring 20 anthropometric dimensions. These anthropometric measurements are considered important for designing work and life applications in Saudi Arabia. The data were collected over eight months from different locations in Riyadh City. Five of these dimensions were used as predictor variables (inputs) of the model, and the remaining 15 dimensions were set as the measured variables (the model’s outcomes). The hidden layers were varied during the structuring stage, and the best performance was achieved with the network structure 6-25-15. The results showed that the developed neural network model was able to estimate the body dimensions of the Saudi male population in Riyadh City. The network's mean absolute percentage error (MAPE) and root mean squared error (RMSE) were found to be 0.0348 and 3.225, respectively. These errors are lower, and therefore better, than those reported in the literature. Finally, the accuracy of the developed neural network was evaluated by comparing its predicted outcomes with those of a regression model. The ANN model showed a higher coefficient of determination (R²) between the predicted and actual dimensions than the regression model.
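
The RMSE and coefficient of determination used to compare the two models can be computed as in the sketch below. The function names and the sample measurements are hypothetical; this is the standard textbook definition of each metric, not code from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up body dimensions in cm, for illustration only.
measured = [1.0, 2.0, 3.0]
estimated = [1.0, 2.0, 4.0]
print(rmse(measured, estimated), r_squared(measured, estimated))
```

RMSE is in the units of the measurement, whereas R² is unitless, which is why the two are usually reported together.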

Keywords: artificial neural network, anthropometric measurements, back-propagation

Procedia PDF Downloads 460
17975 The Strengths and Limitations of the Statistical Modeling of Complex Social Phenomenon: Focusing on SEM, Path Analysis, or Multiple Regression Models

Authors: Jihye Jeon

Abstract:

This paper analyzes the conceptual framework of three statistical methods: multiple regression, path analysis, and structural equation models. When establishing a research model for the statistical modeling of complex social phenomena, it is important to know the strengths and limitations of these three statistical models. This study explored the character, strengths, and limitations of each modeling approach and suggested some strategies for accurately explaining or predicting the causal relationships among variables. In particular, common mistakes in research modeling in studies of depression or mental health were discussed.

Keywords: multiple regression, path analysis, structural equation models, statistical modeling, social and psychological phenomenon

Procedia PDF Downloads 601
17974 The Relationship between Depression, HIV Stigma and Adherence to Antiretroviral Therapy among Adult Patients Living with HIV at a Tertiary Hospital in Durban, South Africa: The Mediating Roles of Self-Efficacy and Social Support

Authors: Muziwandile Luthuli

Abstract:

Although numerous factors predicting adherence to antiretroviral therapy (ART) among people living with HIV/AIDS (PLWHA) have been broadly studied at both regional and global levels, to date adherence of patients to ART remains an overarching, dynamic, and multifaceted problem that needs to be investigated over time and across various contexts. There is a scarcity of empirical data in the literature on the interactive mechanisms by which psychosocial factors influence adherence to ART among PLWHA within the South African context. Therefore, this study was designed to investigate the relationship between depression, HIV stigma, and adherence to ART among adult patients living with HIV at a tertiary hospital in Durban, South Africa, and the mediating roles of self-efficacy and social support. The health locus of control theory and the social support theory were the underlying theoretical frameworks for this study. Using a cross-sectional research design, a total of 201 male and female adult patients aged 18-75 years receiving ART at a tertiary hospital in Durban, KwaZulu-Natal were sampled using time location sampling (TLS). A self-administered questionnaire was employed to collect the data. Data were analysed with SPSS version 27. Several statistical analyses were conducted, namely univariate statistical analysis, correlational analysis, Pearson’s chi-square analysis, cross-tabulation analysis, binary logistic regression analysis, and mediational analysis. Univariate analysis indicated that the sample mean age was 39.28 years (SD=12.115), while most participants were female (71.0%, n=142), never married (74.2%, n=147), and most were also secondary school educated (48.3%, n=97), as well as unemployed (65.7%, n=132). The proportion of participants who had high adherence to ART was 53.7% (n=108), and 46.3% (n=93) of participants had low adherence to ART.
Chi-square analysis revealed that employment status was the only statistically significant socio-demographic influence on adherence to ART in this study (χ²(3) = 8.745; p < .033). Chi-square analysis also showed a statistically significant association between depression and adherence to ART (χ²(4) = 16.140; p < .003), while no statistically significant association was found between HIV stigma and adherence to ART (χ²(1) = .323; p > .570). Binary logistic regression indicated that depression was statistically associated with adherence to ART (OR = .853; 95% CI, .789–.922, P < .001), and the association between self-efficacy and adherence to ART was statistically significant (OR = 1.04; 95% CI, 1.001–1.078, P < .045) after controlling for the effect of depression. However, the findings showed that the effect of depression on adherence to ART was not significantly mediated by self-efficacy (Sobel test for indirect effect, Z = 1.01, P > 0.31). Binary logistic regression showed that the effect of HIV stigma on adherence to ART was not statistically significant (OR = .980; 95% CI, .937–1.025, P > .374), but the effect of social support on adherence to ART was statistically significant only after the effect of HIV stigma was controlled for (OR = 1.017; 95% CI, 1.000–1.035, P < .046). This study promotes behavioral and social change effected through evidence-based interventions by emphasizing the need for additional research that investigates the interactive mechanisms by which psychosocial factors influence adherence to ART. Depression is a significant predictor of adherence to ART. Thus, to alleviate the psychosocial impact of depression on adherence to ART, effective interventions must be devised, along with special consideration of self-efficacy and social support. This study is therefore helpful in informing and effecting change in health policy and healthcare services through its findings.
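
The reported odds ratios and confidence intervals can be sanity-checked with a little arithmetic. In logistic regression the odds ratio is exp(β), and a 95% Wald interval is exp(β ± 1.96·SE), so the standard error and the Wald z statistic can be recovered from a reported OR and CI. The sketch below applies this to the depression estimate quoted above (OR = .853; 95% CI .789–.922). It is a back-of-the-envelope check that assumes a Wald interval; the function name `wald_from_or` is hypothetical, and this is not the authors' SPSS output.

```python
import math


def wald_from_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, SE, Wald z and two-sided p from an OR and its 95% CI.

    Assumes the interval was built as exp(beta +/- z_crit * SE).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


beta, se, z, p = wald_from_or(0.853, 0.789, 0.922)
print(f"beta={beta:.4f} SE={se:.4f} z={z:.2f} p={p:.2g}")
```

Under that assumption the recovered p-value falls below .001, which agrees with the significance level reported for depression.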

Keywords: ART adherence, depression, HIV/AIDS, PLWHA

Procedia PDF Downloads 159
17973 An Efficient Discrete Chaos in Generalized Logistic Maps with Applications in Image Encryption

Authors: Ashish Ashish

Abstract:

In the last few decades, the discrete chaos of difference equations has gained massive attention from academics and scholars due to its many applications across branches of science, such as cryptography, traffic control models, secure communications, weather forecasting, and engineering. In this article, a generalized logistic discrete map is established, and discrete chaos is reported through period-doubling bifurcation, a period-three orbit, and the Lyapunov exponent. The generalized logistic map exhibits superior chaos due to the extra degree of freedom provided by an ordered parameter. The period-doubling bifurcation and Lyapunov exponent are demonstrated for some particular values of the parameter, and the discrete chaos is established in the sense of Devaney's definition of chaos, theoretically as well as numerically. Moreover, the study discusses an extended chaos-based image encryption and decryption scheme using this novel system. A larger key space for coding and more sensitive dependence on initial conditions are observed for the encryption and decryption of text messages, images, and videos, which strongly secures the system against external cyber attacks, coding attacks, statistical attacks, and differential attacks.
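
As background for the Lyapunov exponent mentioned above, the sketch below estimates it for the standard logistic map x_{n+1} = r·x_n·(1 − x_n), not for the generalized map of the paper, whose form the abstract does not give. The exponent is the long-run average of ln|f′(x_n)| with f′(x) = r(1 − 2x); a positive value indicates sensitive dependence on initial conditions. The function name, starting point, and iteration counts are assumptions chosen for illustration.

```python
import math


def lyapunov_logistic(r, x0=0.4, n_transient=1000, n_iter=10000):
    """Estimate the Lyapunov exponent of x -> r*x*(1-x) by orbit averaging."""
    x = x0
    # Discard the initial transient so the orbit settles onto its attractor.
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        deriv = abs(r * (1.0 - 2.0 * x))
        # Guard against log(0) if the orbit lands exactly on x = 0.5.
        total += math.log(max(deriv, 1e-300))
        x = r * x * (1.0 - x)
    return total / n_iter


print(lyapunov_logistic(3.2))  # stable period-2 regime: negative exponent
print(lyapunov_logistic(3.9))  # chaotic regime: positive exponent
```

Sweeping r across the interval between 3 and 4 with this function reproduces the familiar picture in which the exponent changes sign as the period-doubling cascade gives way to chaos.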

Keywords: chaos, period-doubling, logistic map, Lyapunov exponent, image encryption

Procedia PDF Downloads 112
17972 Assessment of Level of Sedation and Associated Factors Among Intubated Critically Ill Children in Pediatric Intensive Care Unit of Jimma University Medical Center: A Fourteen Months Prospective Observation Study, 2023

Authors: Habtamu Wolde Engudai

Abstract:

Background: Sedation can be provided to facilitate a procedure or to stabilize patients admitted to a pediatric intensive care unit (PICU). Sedation is often necessary to maintain optimal care for critically ill children requiring mechanical ventilation. However, sedation that is too deep or too light has its own adverse effects, so it is important to monitor the level of sedation and maintain an optimal level. Objectives: To assess the level of sedation and associated factors among intubated critically ill children admitted to the PICU of JUMC, Jimma. Methods: A prospective observational study was conducted in the PICU of JUMC in September 2021 in 105 patients admitted to the PICU who were aged less than 14 and had GCS >8. Data were collected by residents and nurses working in the PICU. Data entry was done with Epi data manager (version 4.6.0.2). Statistical analysis and the creation of charts were performed using SPSS version 26. Data were presented as mean, percentage, and standard deviation. The assumptions of logistic regression were checked. To find potential predictors, bi-variable logistic regression was used for each predictor and outcome variable. A p value of <0.05 was considered statistically significant. Finally, findings were presented using figures, AOR, percentages, and a summary table. Result: The study involved 105 critically ill children who were started on continuous or intermittent forms of sedative drugs. Sedation level was assessed using a comfort scale three times per day. Based on these observations, suboptimal sedation was found in 44.8% of patients at baseline, 36.2% at eight hours, and 24.8% at sixteen hours. There was a significant association between suboptimal sedation and both the duration of stay with mechanical ventilation and the rate of unplanned extubation (P < 0.05), and the Hosmer-Lemeshow test of goodness of fit gave p > 0.44.

Keywords: level of sedation, critically ill children, Pediatric intensive care unit, Jimma university

Procedia PDF Downloads 27
17971 On Differential Growth Equation to Stochastic Growth Model Using Hyperbolic Sine Function in Height/Diameter Modeling of Pines

Authors: S. O. Oyamakin, A. U. Chukwu

Abstract:

Richard's growth equation, a generalized logistic growth equation, was improved upon by introducing an allometric parameter using the hyperbolic sine function. The integral solution was called the hyperbolic Richard's growth model, the solution having been transformed from a deterministic to a stochastic growth model. Its predictive ability was compared with that of the classical Richard's growth model using the coefficient of determination (R²), Mean Absolute Error (MAE), and Mean Square Error (MSE). The approach mimicked the natural variability of height/diameter increment with respect to age and therefore provides more realistic height/diameter predictions. The Kolmogorov-Smirnov and Shapiro-Wilk tests were also used to check the behavior of the error term for possible violations. The mean function of top height/Dbh over age under the hyperbolic Richard's nonlinear growth model predicted the observed values of top height/Dbh more closely than the classical Richard's growth model.

Keywords: height, Dbh, forest, Pinus caribaea, hyperbolic, Richard's, stochastic

Procedia PDF Downloads 443
17970 Internet Addiction among Students: An Empirical Study in Pondicherry University

Authors: Mashood C., Abdul Vahid K., Ashique C. K.

Abstract:

Technology is growing beyond human expectation, and the internet is one of the most sophisticated products of information technology. It has various advantages, such as connecting the world and simplifying tasks that were difficult in the past. It also has demerits, namely a lack of authenticity and internet addiction. To investigate internet addiction, a study was conducted among the postgraduate students of Pondicherry University, and 454 samples were collected. The study focused on identifying internet addiction among students and the influence and interdependence of personality on internet addiction among first-year and second-year students. To evaluate this, we used two major analyses: Confirmatory Factor Analysis (CFA) to predict internet addiction from the observed data, and logistic regression to identify the difference between first-year and second-year students in internet addiction. Before the core analysis, the data were subjected to preliminary tests to check the model fit. The empirical findings show that the students of Pondicherry University are highly addicted to the internet, but that there is no large difference between first-year and second-year students in internet addiction.

Keywords: internet addiction, students, Pondicherry University, empirical study

Procedia PDF Downloads 436
17969 Factors Associated with Recruitment and Adherence for Virtual Mindfulness Interventions in Youths

Authors: Kimberly Belfry, Shavon Stafford, Fariha Chowdhury, Jennifer Crawford, Soyeon Kim

Abstract:

Intervention programs have mostly been delivered online during the pandemic. Screen fatigue has become a significant deterrent for virtually delivered interventions, and thus we aimed to examine factors associated with recruitment and adherence to an online mindfulness program for youths. Our preliminary analysis indicated that 40% of interested youths enrolled in the program. No difference in gender or age was found for those enrolled in the program. The adherence rate was approximately 25%, which warrants further examination. Building on the preliminary findings, we will conduct a binary logistic regression analysis to identify elements associated with recruitment and adherence. The model will include predictors such as age, sex, recruiter, mental health status, and time of the year. Odds ratios and 95% CIs will be reported. Our preliminary analysis showed low recruitment and adherence rates. By identifying elements associated with recruitment and adherence, our study provides transferable information that can improve recruitment and adherence for online-delivered interventions offered during the pandemic.

Keywords: virtual interventions, recruitment, youth, mindfulness

Procedia PDF Downloads 114
17968 Non-Parametric Regression over Its Parametric Counterparts with Large Sample Size

Authors: Jude Opara, Esemokumo Perewarebo Akpos

Abstract:

This paper compares non-parametric linear regression with its parametric counterpart for a large sample size. A data set of anthropometric measurements of primary school pupils was used for the analysis, with 50 randomly selected pupils. The data were subjected to a normality test using the Anderson-Darling technique, which showed that the residuals from the commonly used least squares method for fitting an equation to a set of (x,y) data points are not normally distributed (i.e. they do not follow a Gaussian distribution). The algorithms for the nonparametric Theil’s regression and its parametric OLS counterpart are stated in this paper. The analysis was carried out with the programming language software known as “R Development”. The results showed a significant relationship between the response and the explanatory variable for both the parametric and non-parametric regression. To compare the efficiency of the two methods, the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were used, and the nonparametric regression was found to perform better than its parametric counterpart, having lower values of both AIC and BIC. The study recommends that future researchers carry out similar work that examines the data set for outliers, expunges any that are detected, and re-analyzes the data to compare results.
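
Theil's regression estimates the slope as the median of the slopes between all pairs of points, which makes it far less sensitive to outliers than OLS. The sketch below shows the basic estimator in Python rather than R; the function name `theil_slope_intercept` and the sample data are hypothetical, and the intercept rule used here (median of y − slope·x) is one common convention that the paper may or may not follow.

```python
from itertools import combinations
from statistics import median


def theil_slope_intercept(xs, ys):
    """Theil's estimator: median pairwise slope, then median residual offset."""
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i, j in combinations(range(len(xs)), 2)
        if xs[j] != xs[i]  # skip pairs with identical x (undefined slope)
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Made-up data lying on y = 2x, except for one gross outlier at the end.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 100]
print(theil_slope_intercept(x, y))
```

With these data the single outlier does not move the fitted line away from y = 2x, whereas a least squares fit would be pulled strongly toward the outlying point.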

Keywords: Theil’s regression, Bayesian information criterion, Akaike information criterion, OLS

Procedia PDF Downloads 276
17967 Factors Related with Self-Care Behaviors among Iranian Type 2 Diabetic Patients: An Application of Health Belief Model

Authors: Ali Soroush, Mehdi Mirzaei Alavijeh, Touraj Ahmadi Jouybari, Fazel Zinat-Motlagh, Abbas Aghaei, Mari Ataee

Abstract:

Diabetes is a disease with long-term cardiovascular, renal, ophthalmic, and neural complications. It is prevalent all around the world, including Iran, and its prevalence is increasing. The aim of this study was to determine the factors related to self-care behavior, based on the health belief model, among a sample of Iranian diabetic patients. This cross-sectional study was conducted among 301 type 2 diabetic patients in Gachsaran, Iran. Data were collected by interview and analyzed with SPSS version 20 using ANOVA, t-tests, Pearson correlation, and linear regression at the 95% confidence level. Linear regression analyses showed that the health belief model variables accounted for 29% of the variation in self-care behavior, and that perceived severity and perceived self-efficacy were the more influential predictors of self-care behavior among diabetic patients.

Keywords: diabetes, patients, self-care behaviors, health belief model

Procedia PDF Downloads 438
17966 Robust Shrinkage Principal Component Parameter Estimator for Combating Multicollinearity and Outliers’ Problems in a Poisson Regression Model

Authors: Arum Kingsley Chinedu, Ugwuowo Fidelis Ifeanyi, Oranye Henrietta Ebele

Abstract:

The Poisson regression model (PRM) is a nonlinear model that belongs to the exponential family of distributions. The PRM is suitable for studying count variables using appropriate covariates, but it sometimes suffers from multicollinearity in the explanatory variables and outliers in the response variable. This study aims to address multicollinearity and outliers jointly in a Poisson regression model. We developed an estimator called the robust modified jackknife PCKL parameter estimator by combining the principal component estimator, the modified jackknife KL estimator, and the transformed M-estimator to address both problems in a PRM. The superiority conditions for this estimator were established, and its properties were derived. The estimator inherits the characteristics of the combined estimators, making it efficient in addressing both problems. It should also be of immediate interest to the research community and advances this line of work in terms of novelty compared with other studies in this area. The performance of the robust modified jackknife PCKL estimator was compared with that of existing estimators using mean squared error (MSE) as the evaluation criterion, through a Monte Carlo simulation study and real-life data. The results show that the estimator outperformed the existing estimators it was compared with, having the smallest MSE across all sample sizes, levels of correlation, percentages of outliers, and numbers of explanatory variables.

Keywords: jackknife modified KL, outliers, multicollinearity, principal component, transformed M-estimator.

Procedia PDF Downloads 21
17965 Landslide Susceptibility Mapping Using Soft Computing in Amhara Saint

Authors: Semachew M. Kassa, Africa M Geremew, Tezera F. Azmatch, Nandyala Darga Kumar

Abstract:

Frequency ratio (FR) and analytical hierarchy process (AHP) methods are developed based on past landslide failure points to identify the landslide susceptibility mapping because landslides can seriously harm both the environment and society. However, it is still difficult to select the most efficient method and correctly identify the main driving factors for particular regions. In this study, we used fourteen landslide conditioning factors (LCFs) and five soft computing algorithms, including Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), Artificial Neural Network (ANN), and Naïve Bayes (NB), to predict the landslide susceptibility at 12.5 m spatial scale. The performance of the RF (F1-score: 0.88, AUC: 0.94), ANN (F1-score: 0.85, AUC: 0.92), and SVM (F1-score: 0.82, AUC: 0.86) methods was significantly better than the LR (F1-score: 0.75, AUC: 0.76) and NB (F1-score: 0.73, AUC: 0.75) method, according to the classification results based on inventory landslide points. The findings also showed that around 35% of the study region was made up of places with high and very high landslide risk (susceptibility greater than 0.5). The very high-risk locations were primarily found in the western and southeastern regions, and all five models showed good agreement and similar geographic distribution patterns in landslide susceptibility. The towns with the highest landslide risk include Amhara Saint Town's western part, the Northern part, and St. Gebreal Church villages, with mean susceptibility values greater than 0.5. However, rainfall, distance to road, and slope were typically among the top leading factors for most villages. The primary contributing factors to landslide vulnerability were slightly varied for the five models. Decision-makers and policy planners can use the information from our study to make informed decisions and establish policies. 
It also suggests that various places should take different safeguards to reduce or prevent serious damage from landslide events.
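The two metrics quoted above (F1-score and AUC) and the 0.5 susceptibility cut-off can be illustrated with a minimal sketch. The scores below are hypothetical values chosen for illustration, not data from the study, and the helper names are assumptions.

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(scores_pos, scores_neg):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical susceptibility scores (illustration only).
landslide_points = [0.9, 0.8, 0.7, 0.4]   # known landslide locations
stable_points = [0.6, 0.3, 0.2, 0.1]      # known stable locations

threshold = 0.5  # susceptibility above 0.5 is treated as high risk
tp = sum(s > threshold for s in landslide_points)
fn = len(landslide_points) - tp
fp = sum(s > threshold for s in stable_points)

f1 = f1_score(tp, fp, fn)
auc_value = auc(landslide_points, stable_points)
print(f"F1 = {f1:.2f}, AUC = {auc_value:.4f}")
```

F1 depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason the two can order models differently.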

Keywords: artificial neural network, logistic regression, landslide susceptibility, naïve Bayes, random forest, support vector machine

Procedia PDF Downloads 37
17964 A Regression Model for Predicting Sugar Crystal Size in a Fed-Batch Vacuum Evaporative Crystallizer

Authors: Sunday B. Alabi, Edikan P. Felix, Aniediong M. Umo

Abstract:

Crystal size distribution is of great importance in the sugar factories. It determines the market value of granulated sugar and also influences the cost of production of sugar crystals. Typically, sugar is produced using fed-batch vacuum evaporative crystallizer. The crystallization quality is examined by crystal size distribution at the end of the process which is quantified by two parameters: the average crystal size of the distribution in the mean aperture (MA) and the width of the distribution of the coefficient of variation (CV). Lack of real-time measurement of the sugar crystal size hinders its feedback control and eventual optimisation of the crystallization process. An attractive alternative is to use a soft sensor (model-based method) for online estimation of the sugar crystal size. Unfortunately, the available models for sugar crystallization process are not suitable as they do not contain variables that can be measured easily online. The main contribution of this paper is the development of a regression model for estimating the sugar crystal size as a function of input variables which are easy to measure online. This has the potential to provide real-time estimates of crystal size for its effective feedback control. Using 7 input variables namely: initial crystal size (Lo), temperature (T), vacuum pressure (P), feed flowrate (Ff), steam flowrate (Fs), initial super-saturation (S0) and crystallization time (t), preliminary studies were carried out using Minitab 14 statistical software. Based on the existing sugar crystallizer models, and the typical ranges of these 7 input variables, 128 datasets were obtained from a 2-level factorial experimental design. These datasets were used to obtain a simple but online-implementable 6-input crystal size model. It seems the initial crystal size (Lₒ) does not play a significant role. The goodness of the resulting regression model was evaluated. 
The coefficient of determination, R² was obtained as 0.994, and the maximum absolute relative error (MARE) was obtained as 4.6%. The high R² (~1.0) and the reasonably low MARE values are an indication that the model is able to predict sugar crystal size accurately as a function of the 6 easy-to-measure online variables. Thus, the model can be used as a soft sensor to provide real-time estimates of sugar crystal size during sugar crystallization process in a fed-batch vacuum evaporative crystallizer.
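As a rough illustration of the numbers above, a 2-level full factorial design over 7 inputs yields 2^7 = 128 runs, and R² and MARE can be computed directly from measured and predicted values. The factor labels follow the abstract; the measured and predicted values are hypothetical, and the function names are assumptions.

```python
from itertools import product

# The seven inputs named in the abstract, each at a low (-1) and high (+1) level.
factors = ["L0", "T", "P", "Ff", "Fs", "S0", "t"]
design = list(product((-1, +1), repeat=len(factors)))
n_runs = len(design)  # 2**7 runs in a full factorial design


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot


def max_abs_relative_error(y_true, y_pred):
    """Maximum absolute relative error (MARE), as a fraction."""
    return max(abs(a - b) / abs(a) for a, b in zip(y_true, y_pred))


# Hypothetical crystal sizes in mm (illustration only, not the paper's data).
measured = [0.50, 0.60, 0.70, 0.80]
predicted = [0.51, 0.59, 0.70, 0.82]

r2 = r_squared(measured, predicted)
mare = max_abs_relative_error(measured, predicted)
print(n_runs, round(r2, 3), round(mare, 3))
```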

Keywords: crystal size, regression model, soft sensor, sugar, vacuum evaporative crystallizer

Procedia PDF Downloads 181
17963 Machine Learning Model to Predict TB Bacteria-Resistant Drugs from TB Isolates

Authors: Rosa Tsegaye Aga, Xuan Jiang, Pavel Vazquez Faci, Siqing Liu, Simon Rayner, Endalkachew Alemu, Markos Abebe

Abstract:

Tuberculosis (TB) is a major cause of disease globally. In most cases, TB is treatable and curable, but only with the proper treatment. Drug-resistant TB occurs when the bacteria become resistant to the drugs that are used to treat TB. Current strategies to identify drug-resistant TB bacteria are laboratory-based, and it takes a long time to identify the drug-resistant bacteria and treat the patient accordingly. Machine learning (ML) and data science, however, can offer new approaches to the problem. In this study, we propose to develop an ML-based model that predicts the antibiotic resistance phenotypes of TB isolates in minutes, so that the right treatment can be given to the patient immediately. The study uses whole genome sequences (WGS) of TB isolates, extracted from the NCBI repository and containing samples from different countries, as training data to build the ML models. Samples from different countries are included so that the models generalize across the large group of TB isolates from different regions of the world. This lets the model learn different behaviors of the TB bacteria and makes it robust. Model training considers three pieces of information extracted from the WGS data: all variants found within the candidate genes (F1), predetermined resistance-associated variants (F2), and resistance-associated gene information for the particular drug. Two major datasets have been constructed using these three pieces of information. The F1 and F2 information are treated as two independent datasets, and the third piece of information is used as the class label for the two datasets. Five machine learning algorithms have been considered to train the model: Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), Gradient Boosting, and AdaBoost. 
The models have been trained on the datasets F1, F2, and F1F2 that is the F1 and the F2 dataset merged. Additionally, an ensemble approach has been used to train the model. The ensemble approach has been considered to run F1 and F2 datasets on gradient boosting algorithm and use the output as one dataset that is called F1F2 ensemble dataset and train a model using this dataset on the five algorithms. As the experiment shows, the ensemble approach model that has been trained on the Gradient Boosting algorithm outperformed the rest of the models. In conclusion, this study suggests the ensemble approach, that is, the RF + Gradient boosting model, to predict the antibiotic resistance phenotypes of TB isolates by outperforming the rest of the models.

Keywords: machine learning, MTB, WGS, drug resistant TB

Procedia PDF Downloads 21
17962 Multi-Linear Regression Based Prediction of Mass Transfer by Multiple Plunging Jets

Authors: S. Deswal, M. Pal

Abstract:

The paper aims to compare the performance of vertical and inclined multiple plunging jets and to model and predict their mass transfer capacity using a multi-linear regression approach. The multiple vertical plunging jets have a jet impact angle of θ = 90°, whereas the multiple inclined plunging jets have a jet impact angle of θ = 60°. The results of the study suggest that mass transfer is higher for multiple jets, and that inclined multiple plunging jets have up to 1.6 times higher mass transfer than vertical multiple plunging jets under similar conditions. The derived relationship, based on the multi-linear regression approach, successfully predicted the volumetric mass transfer coefficient (KLa) from the operational parameters of multiple plunging jets with a correlation coefficient of 0.973, a root mean square error of 0.002 and a coefficient of determination of 0.946. The results suggest that the predicted overall mass transfer coefficient is in good agreement with the actual experimental values, which supports the utility of the derived relationship and indicates that it can be successfully employed in modelling mass transfer by multiple plunging jets.
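The two goodness-of-fit figures quoted above can be cross-checked against each other. The sketch below assumes that the reported correlation coefficient r is the correlation between predicted and measured KLa values, and that the regression includes an intercept; the abstract states neither explicitly.

```latex
% For a least-squares linear regression with an intercept, the coefficient of
% determination equals the squared correlation between observed and fitted values:
\[
  R^{2} = r^{2}, \qquad r = 0.973 \;\Rightarrow\; R^{2} = 0.973^{2} \approx 0.947 .
\]
% This agrees with the reported value of 0.946 up to rounding of r.
```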

Keywords: mass transfer, multiple plunging jets, multi-linear regression, earth sciences

Procedia PDF Downloads 428
17961 Sero-Prevalence of Hepatitis B Surface Antigen and Associated Factors among Pregnant Mothers Attending Antenatal Care Service, Mekelle, Ethiopia: Evidence from Institutional Based Quantitative Cross-Sectional Study

Authors: Semaw A., Awet H., Yohannes M.

Abstract:

Background: Hepatitis B Virus (HBV) is a major global public health problem. Individuals living in Sub-Saharan Africa have a 60% lifetime risk of acquiring HBV infection. Evidence shows that 80-90% of those born to infected mothers develop chronic HBV. Perinatal HBV transmission is a major determinant of HBV carrier status and its chronic sequelae, and it maintains HBV transmission across generations. Method: An institution-based cross-sectional study was conducted among 406 pregnant mothers attending antenatal clinics at Mekelle and Ayder referral hospitals from January 30 to April 1, 2014. Epidata version 3.1 was used for data entry, and SPSS version 21 statistical software was used for data cleaning and management and, finally, to determine factors associated with hepatitis B surface antigen, adjusting for important confounders using multivariable logistic regression analysis at a 5% level of significance. Result: The overall prevalence of hepatitis B surface antigen among pregnant women was 33 (8.1%). The socio-demographic characteristics of the study population showed high positivity among those with secondary schooling, 189 (46.6%). In the multivariable logistic regression analysis, a history of contact with individuals who had a history of hepatitis B infection or jaundice and a lifetime history of multiple sexual partners were found to be significantly associated with HBsAg positivity, at AOR = 3.73, 95% CI (1.373-10.182) and AOR = 2.57, 95% CI (1.173-5.654), respectively. Moreover, the Human Immunodeficiency Virus (HIV) and HBV co-infection rate was found to be 3.6%. Conclusion: This study has shown that HBV is highly prevalent (8.1%) among pregnant women in the study area. Contact with individuals who had a history of hepatitis or have jaundice and a report of multiple lifetime sexual partners were associated with hepatitis B infection. Education about HBV transmission and prevention, as well as screening of all pregnant mothers, should be pursued to reduce the serious public health crisis of HBV.
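The reported figures can be sanity-checked with a few lines of arithmetic. The sketch below assumes the confidence intervals are Wald intervals, symmetric on the log-odds scale (the usual output of logistic regression software, though the abstract does not say so); under that assumption the geometric mean of the bounds should reproduce the adjusted odds ratio. The function name is hypothetical.

```python
import math

# Prevalence: 33 HBsAg-positive women out of 406 participants.
positives, sample_size = 33, 406
prevalence = positives / sample_size


def log_scale_summary(lower, upper, z=1.96):
    """Implied point estimate and log-scale standard error from a 95% CI.

    Assumes a Wald interval: exp(log(OR) +/- z * SE).
    """
    point = math.sqrt(lower * upper)  # geometric mean of the bounds
    se_log = (math.log(upper) - math.log(lower)) / (2 * z)
    return point, se_log


contact_or, contact_se = log_scale_summary(1.373, 10.182)
partners_or, partners_se = log_scale_summary(1.173, 5.654)

print(f"prevalence = {prevalence:.1%}")
print(f"contact history: OR ~ {contact_or:.2f}, SE(log OR) ~ {contact_se:.2f}")
print(f"multiple partners: OR ~ {partners_or:.2f}, SE(log OR) ~ {partners_se:.2f}")
```

Both implied point estimates fall within rounding distance of the reported AORs (3.73 and 2.57), which is consistent with the Wald-interval assumption.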

Keywords: HBsAg, hepatitis B, pregnant women, prevalence

Procedia PDF Downloads 305
17960 Modeling Geogenic Groundwater Contamination Risk with the Groundwater Assessment Platform (GAP)

Authors: Joel Podgorski, Manouchehr Amini, Annette Johnson, Michael Berg

Abstract:

One-third of the world’s population relies on groundwater for its drinking water. Natural geogenic arsenic and fluoride contaminate ~10% of wells. Prolonged exposure to high levels of arsenic can result in various internal cancers, while high levels of fluoride are responsible for the development of dental and crippling skeletal fluorosis. In poor urban and rural settings, the provision of drinking water free of geogenic contamination can be a major challenge. In order to efficiently apply limited resources in the testing of wells, water resource managers need to know where geogenically contaminated groundwater is likely to occur. The Groundwater Assessment Platform (GAP) fulfills this need by providing state-of-the-art global arsenic and fluoride contamination hazard maps as well as enabling users to create their own groundwater quality models. The global risk models were produced by logistic regression of arsenic and fluoride measurements using predictor variables of various soil, geological and climate parameters. The maps display the probability of encountering concentrations of arsenic or fluoride exceeding the World Health Organization’s (WHO) stipulated concentration limits of 10 µg/L or 1.5 mg/L, respectively. In addition to a reconsideration of the relevant geochemical settings, these second-generation maps represent a great improvement over the previous risk maps due to a significant increase in data quantity and resolution. For example, there is a 10-fold increase in the number of measured data points, and the resolution of predictor variables is generally 60 times greater. These same predictor variable datasets are available on the GAP platform for visualization as well as for use with a modeling tool. The latter requires that users upload their own concentration measurements and select the predictor variables that they wish to incorporate in their models. 
In addition, users can upload additional predictor variable datasets either as features or coverages. Such models can represent an improvement over the global models already supplied, since (a) users may be able to use their own, more detailed datasets of measured concentrations and (b) the various processes leading to arsenic and fluoride groundwater contamination can be isolated more effectively on a smaller scale, thereby resulting in a more accurate model. All maps, including user-created risk models, can be downloaded as PDFs. There is also the option to share data in a secure environment as well as the possibility to collaborate in a secure environment through the creation of communities. In summary, GAP provides users with the means to reliably and efficiently produce models specific to their region of interest by making available the latest datasets of predictor variables along with the necessary modeling infrastructure.
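The modelling idea described above, logistic regression of threshold exceedance on environmental predictors, can be sketched as follows. This is a minimal illustration, not the GAP implementation: the predictor values, intercept, and coefficients are hypothetical, and only the WHO limits come from the abstract.

```python
import math

# WHO limits quoted in the abstract.
WHO_LIMIT_ARSENIC_UG_L = 10.0
WHO_LIMIT_FLUORIDE_MG_L = 1.5


def exceeds_limit(concentration, limit):
    """Binary target for logistic regression: 1 if the measurement exceeds the limit."""
    return 1 if concentration > limit else 0


def exceedance_probability(predictors, intercept, coefficients):
    """Logistic model: probability that the concentration exceeds the limit."""
    linear = intercept + sum(c * x for c, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical well measurements (arsenic, in µg/L) turned into binary labels.
arsenic_measurements = [2.0, 12.0, 55.0, 9.9]
labels = [exceeds_limit(c, WHO_LIMIT_ARSENIC_UG_L) for c in arsenic_measurements]

# Hypothetical fitted model with two standardized predictors
# (for example a soil parameter and a climate parameter).
p = exceedance_probability([0.8, -0.3], intercept=-1.0, coefficients=[1.5, 0.7])
print(labels, round(p, 3))
```

A hazard map is then the model's predicted probability evaluated at every grid cell of the predictor datasets.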

Keywords: arsenic, fluoride, groundwater contamination, logistic regression

Procedia PDF Downloads 314
17959 Climate Changes in Albania and Their Effect on Cereal Yield

Authors: Lule Basha, Eralda Gjika

Abstract:

This study focuses on analyzing climate change in Albania and its potential effects on cereal yields. Initially, monthly temperature and rainfall in Albania were studied for the period 1960-2021. Climatic variables are important when trying to model cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. Multiple linear regression analysis and the lasso regression method are applied to the data relating cereal yield to each independent variable: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, the data follow a normal distribution, and there is a low correlation between factors, so multicollinearity is not a problem. Machine-learning methods, such as random forest, are used to predict cereal yield responses to climatic and other variables. Random forest showed high accuracy compared to the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield. The coefficients of fertilizer consumption, arable land, and land under cereal production are positive, indicating a positive effect on production. Our results show that the random forest method is an effective and versatile machine-learning method for cereal yield prediction compared to the other two methods.

Keywords: cereal yield, climate change, machine learning, multiple regression model, random forest

Procedia PDF Downloads 57
17958 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are of great interest because they provide easy and fast management of all aspects relating to a patient, not necessarily medical. Moreover, there are more and more cases of pathologies in which diagnosis and treatment can only be carried out by using medical imaging techniques. With an ever-increasing prevalence, medical images are directly acquired in or converted into digital form for their storage as well as subsequent retrieval and processing. Data mining is the process of extracting information from large data sets using algorithms and techniques drawn from the fields of statistics, machine learning and database management systems. Forecasting is a prediction of what will occur in the future, and it is an uncertain process. Because of this uncertainty, the accuracy of a forecast is as important as the outcome predicted by forecasting the independent variables. A forecast control must be used to establish whether the accuracy of the forecast is within satisfactory limits. Fuzzy regression methods have commonly been used to develop consumer preference models that correlate engineering characteristics with consumer preferences regarding a new product; the consumer preference models provide a platform whereby product developers can decide the engineering characteristics needed to satisfy consumer preferences before developing the product. Recent research shows that these fuzzy regression methods are commonly used to model customer preferences. We propose testing the strength of an exponential regression model over a regression toward the mean model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 250
17957 Impact Logistic Management to Reduce Costs

Authors: Waleerak Sittisom

Abstract:

The objectives of this research were to analyze transportation route management, to identify potential cost reductions in logistic operation. In-depth interview techniques and small group discussions were utilized with 25 participants from various backgrounds in the areas of logistics. The findings of this research revealed that there were four areas that companies are able to effectively manage a logistic cost reduction: managing the space within the transportation vehicles, managing transportation personnel, managing transportation cost, and managing control of transportation. On the other hand, there were four areas that companies were unable to effectively manage a logistic cost reduction: the working process of transportation, the route planning of transportation, the service point management, and technology management. There are five areas that cost reduction is feasible: personnel management, process of working, map planning, service point planning, and technology implementation. To be able to reduce costs, the transportation companies should suggest that customers use a file system to save truck space. Also, the transportation companies need to adopt new technology to manage their information system so that packages can be reached easy, safe, and fast. Staff needs to be trained regularly to increase knowledge and skills. Teamwork is required to effectively reduce the costs.

Keywords: cost reduction, management, logistics, transportation

Procedia PDF Downloads 470
17956 Economic Analysis of Cowpea (Unguiculata spp) Production in Northern Nigeria: A Case Study of Kano Katsina and Jigawa States

Authors: Yakubu Suleiman, S. A. Musa

Abstract:

Nigeria is the largest cowpea producer in the world, accounting for about 45%, followed by Brazil with about 17%. Cowpea is grown in Kano, Bauchi, Katsina and Borno in the north, Oyo in the west, and to a lesser extent in Enugu in the east. This study was conducted to determine the input–output relationship of cowpea production in Kano, Katsina, and Jigawa states of Nigeria. The data were collected with the aid of 1000 structured questionnaires that were randomly distributed to cowpea farmers in the three states of the study area. The data collected were analyzed using regression analysis (Cobb–Douglas production function model). The regression analysis gave a coefficient of multiple determination, R², of 72.5% and an F-ratio of 106.20, which was significant (P < 0.01). The regression constant is 0.5382 and is significant (P < 0.01). The regression coefficients with respect to labor and seeds were 0.65554 and 0.4336, respectively, and both are highly significant (P < 0.01). The regression coefficient with respect to fertilizer is 0.26341, which is significant (P < 0.05). This implies that a unit increase in any one of the variable inputs, holding all other variable inputs constant, will significantly increase total cowpea output by the corresponding coefficient. This indicates that farmers in the study area are operating in stage II of the production function. The results revealed that cowpea farmers in Kano, Jigawa and Katsina States realized a profit of N15,997, N34,016 and N19,788 per hectare, respectively. It is hereby recommended that more attention should be given to cowpea production by government and research institutions.
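For readers unfamiliar with the model named above, the Cobb–Douglas production function is usually estimated in log-linear form. The sketch below uses assumed symbols (Y for output, L for labor, S for seed, F for fertilizer), and the abstract does not state whether the reported constant is A or ln A.

```latex
% Cobb-Douglas production function with three variable inputs
\[
  Y = A \, L^{\beta_L} \, S^{\beta_S} \, F^{\beta_F}
\]
% Taking logarithms gives the linear regression that is actually fitted:
\[
  \ln Y = \ln A + \beta_L \ln L + \beta_S \ln S + \beta_F \ln F + \varepsilon
\]
% In this form each coefficient is commonly read as an output elasticity:
\[
  \beta_j = \frac{\partial \ln Y}{\partial \ln X_j},
  \qquad \hat\beta_L = 0.65554,\quad \hat\beta_S = 0.4336,\quad \hat\beta_F = 0.26341 ,
\]
% i.e. a 1% increase in input X_j, other inputs held constant,
% is associated with roughly a \beta_j % increase in output.
```

Each estimated coefficient lies between 0 and 1, which corresponds to positive but diminishing marginal returns to that input.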

Keywords: coefficient, constant, inputs, regression

Procedia PDF Downloads 386
17955 Psychological Impact of the COVID-19 Pandemic on Health Care Workers in Tunisia: Risk and Protective Factor

Authors: Ahmed Sami Hammami, Mohamed Jellazi

Abstract:

Background: The aim of the study is to evaluate the magnitude of different psychological outcomes among Tunisian health care professionals (HCP) during the COVID-19 pandemic and to identify the associated factors. Methods: HCP completed a cross-sectional questionnaire from April 4th to April 28th, 2020. The survey collected demographic information, factors that may interfere with the psychological outcomes, behavior changes and mental health measurements. The latter were assessed through three scales: the 7-item Insomnia Severity Index, the 2-item Patient Health Questionnaire and the 2-item Generalized Anxiety Disorder scale. Multivariable logistic regression was conducted to identify factors associated with psychological outcomes. Results: A total of 503 HCP successfully completed the survey; among those, n=493 consented to enroll in the study, 411 [83.4%] were physicians, 323 [64.2%] were women and 271 [55%] had a second-line working position. A significant proportion of HCP had anxiety (35.7%), depression (35.1%) and insomnia (23.7%). Females, those with a psychiatric history and those using public transport exhibited the highest proportions of overall symptoms compared to other groups, e.g., depression among females vs. males: 44.9% vs. 18.2%, P=0.00. Those with a previous medical history and nurses had more anxiety and insomnia compared to other groups, e.g., anxiety among nurses vs. interns/residents vs. attending physicians: 45.1% vs. 36.1% vs. 27.5%; P=0.04. Multivariable logistic regression showed that female gender was a risk factor for all psychological outcomes (e.g., female sex increased the odds of anxiety by 2.86; 95% confidence interval [CI], 1.78-4.60; P=0.00), whereas having a psychiatric history was a risk factor for both anxiety and insomnia (e.g., for insomnia OR=2.86; 95% CI, 1.78-4.60; P=0.00). Having protective equipment was associated with lower risk for depression (OR=0.41; 95% CI, 0.27-0.62; P=0.00) and anxiety. 
Physical activity was also protective against depression and anxiety (OR=0.41; 95% CI, 0.25-0.67; P=0.00). Conclusion: Psychological symptoms are usually undervalued among HCP, though the COVID-19 pandemic played a major role in exacerbating this burden. Prompt psychological support should be endorsed, and simple measures such as physical activity and ensuring the necessary protection are paramount to improve mental health outcomes and the quality of care provided to patients.
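To make the odds-ratio scale concrete, the crude (unadjusted) odds ratio implied by two reported proportions can be computed directly. The sketch below uses the depression proportions for females and males from the abstract; it is not the adjusted estimate from the multivariable model, which also accounts for other covariates.

```python
def odds(p):
    """Convert a proportion to odds."""
    return p / (1.0 - p)


def odds_ratio(p_group, p_reference):
    """Crude odds ratio between two groups, from their outcome proportions."""
    return odds(p_group) / odds(p_reference)


# Depression among females vs. males, as reported: 44.9% vs. 18.2%.
crude_or = odds_ratio(0.449, 0.182)
print(f"crude OR (female vs. male, depression) ~ {crude_or:.2f}")
```

An odds ratio below 1, such as the 0.41 reported for protective equipment, reads the same way in the opposite direction: the odds of the outcome are lower in the exposed group.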

Keywords: COVID-19 pandemic, health care professionals, mental health, protective factors, psychological symptoms, risk factors

Procedia PDF Downloads 167
17954 Model Averaging in a Multiplicative Heteroscedastic Model

Authors: Alan Wan

Abstract:

In recent years, the body of literature on frequentist model averaging in statistics has grown significantly. Most of this work focuses on models with different mean structures but leaves out the variance consideration. In this paper, we consider a regression model with multiplicative heteroscedasticity and develop a model averaging method that combines maximum likelihood estimators of unknown parameters in both the mean and variance functions of the model. Our weight choice criterion is based on a minimisation of a plug-in estimator of the model average estimator's squared prediction risk. We prove that the new estimator possesses an asymptotic optimality property. Our investigation of finite-sample performance by simulations demonstrates that the new estimator frequently exhibits very favourable properties compared to some existing heteroscedasticity-robust model average estimators. The model averaging method hedges against the selection of very bad models and serves as a remedy to variance function misspecification, which often discourages practitioners from modeling heteroscedasticity altogether. The proposed model average estimator is applied to the analysis of two real data sets.
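The abstract gives no equations, so the following is only a sketch of a common textbook form of a regression with multiplicative heteroscedasticity and of a generic model average estimator. All symbols are assumed notation, not necessarily the author's.

```latex
% Mean function and multiplicative (exponential) variance function
\[
  y_i = x_i^{\top}\beta + \varepsilon_i, \qquad
  \operatorname{Var}(\varepsilon_i) = \sigma_i^{2} = \exp\!\left(z_i^{\top}\alpha\right),
  \qquad i = 1, \dots, n .
\]
% Model averaging over S candidate models, each estimated by maximum likelihood,
% with weights restricted to the unit simplex:
\[
  \hat\mu_i(w) = \sum_{s=1}^{S} w_s \, \hat\mu_{i,s}, \qquad
  w_s \ge 0, \quad \sum_{s=1}^{S} w_s = 1 .
\]
% The weights are chosen by minimising an estimate of squared prediction risk:
\[
  \hat w = \arg\min_{w} \; \widehat{R}(w) .
\]
```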

Keywords: heteroscedasticity-robust, model averaging, multiplicative heteroscedasticity, plug-in, squared prediction risk

Procedia PDF Downloads 336
17953 Is Socio-Economic Characteristic is Associated with Health-Related Quality of Life among Elderly: Evidence from SAGE Data in India

Authors: Mili Dutta, Lokender Prashad

Abstract:

Introduction: Population ageing is a phenomenon that can be observed around the globe. Health-related quality of life (HRQOL) is a measure of an individual's health status, and it describes the effect of physical and mental health disorders on the well-being of a person. The present study aims to describe the influence of the socio-economic characteristics of the elderly on their health-related quality of life in India. Methods: The EQ-5D instrument and a population-based EQ-5D index score were used to assess HRQOL among the elderly. The present study utilized data from the Study on Global Ageing and Adult Health (SAGE), which was conducted in 2007 in India. A multiple logistic regression model and a multivariate linear regression model were employed. Result: Females were found to be more likely to have problems in mobility (OR=1.41, 95% CI: 1.14 to 1.74), self-care (OR=1.26, 95% CI: 1.01 to 1.56) and pain or discomfort (OR=1.50, 95% CI: 1.16 to 1.94). Elderly people residing in rural areas are more likely to have problems with pain/discomfort (OR=1.28, 95% CI: 1.01 to 1.62). Older and non-working elderly people are more likely, whereas more highly educated elderly people and those in the highest wealth quintile are less likely, to have problems in all the dimensions of EQ-5D, viz. mobility, self-care, usual activity, pain/discomfort and anxiety/depression. The present study has also shown that the oldest old, those residing in rural areas and those currently not working are more likely to report a low EQ-5D index score, whereas elderly people with a high education level and in a high wealth quintile are more likely to report a high EQ-5D index score than their counterparts. Conclusion: The present study has found the EQ-5D instrument to be a valid measure for assessing the HRQOL of the elderly in India. 
The study identifies the elderly who are female, older, residing in rural areas, non-educated, poor and currently non-working as the major risk groups for poor HRQOL in India. The findings of the study will be helpful for programme and policy makers, researchers, academicians and social workers who are working in the field of ageing.

Keywords: ageing, HRQOL, India, EQ-5D, SAGE, socio-economic characteristics

Procedia PDF Downloads 375
17952 Free Fatty Acid Assessment of Crude Palm Oil Using a Non-Destructive Approach

Authors: Siti Nurhidayah Naqiah Abdull Rani, Herlina Abdul Rahim, Rashidah Ghazali, Noramli Abdul Razak

Abstract:

Near infrared (NIR) spectroscopy has always been of great interest in the food and agriculture industries. The development of prediction models has facilitated the estimation process in recent years. In this study, 110 crude palm oil (CPO) samples were used to build a free fatty acid (FFA) prediction model. 60% of the collected data were used for training purposes and the remaining 40% were used for testing. The visible peaks on the NIR spectrum were at 1725 nm and 1760 nm, indicating the existence of the first overtone of C-H bands. Principal component regression (PCR) was applied to the data in order to build this mathematical prediction model. The optimal number of principal components was 10. The results showed R² = 0.7147 for the training set and R² = 0.6404 for the testing set.
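Principal component regression as described above can be sketched in a few lines of NumPy: project the centered spectra onto the leading principal components, then regress the response on those scores. The sample count, 60/40 split and 10 components follow the abstract; the spectra are synthetic random data, the number of wavelengths is an assumption, and the function names are hypothetical.

```python
import numpy as np


def fit_pcr(X, y, n_components):
    """Fit principal component regression on centered data."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of vt are the principal axes of the centered spectra.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    components = vt[:n_components].T          # shape: (n_features, k)
    scores = Xc @ components                  # shape: (n_samples, k)
    coef, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return x_mean, y_mean, components, coef


def predict_pcr(model, X):
    """Predict the response for new spectra with a fitted PCR model."""
    x_mean, y_mean, components, coef = model
    return (X - x_mean) @ components @ coef + y_mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for 110 NIR spectra (50 wavelengths is an assumption).
rng = np.random.default_rng(0)
n_samples, n_wavelengths = 110, 50
X = rng.normal(size=(n_samples, n_wavelengths))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.1, size=n_samples)

n_train = n_samples * 60 // 100               # 60% for training
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

model = fit_pcr(X_train, y_train, n_components=10)
r2_train = r_squared(y_train, predict_pcr(model, X_train))
r2_test = r_squared(y_test, predict_pcr(model, X_test))
print(f"train R2 = {r2_train:.3f}, test R2 = {r2_test:.3f}")
```

In practice the number of components would be chosen by cross-validation on the training set; a gap between training and testing R² like the one reported is the usual sign of some overfitting.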

Keywords: palm oil, fatty acid, NIRS, regression

Procedia PDF Downloads 477