Search results for: logistic regression model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18240

Search results for: logistic regression model

17970 Ketones Emission during Pad Printing Process

Authors: Kiurski S. Jelena, Aksentijević M. Snežana, Oros B. Ivana, Kecić S. Vesna, Djogo Z. Maja

Abstract:

The paper investigates the effect of light intensity on the formation of two ketones, acetone and methyl ethyl ketone, in working premises of five pad printing departments in Novi Sad, Serbia. Multiple linear regression analysis examined the form of interdependency concentrations of methyl ethyl ketone, acetone and light intensity in five printing presses at seven sampling points, using Statistica software package version 10th. The results show an average stacking variation investigated variable and can be presented by the general regression model: y = b0 + b1xi1 + b2xi2.

Keywords: acetone, methyl ethyl ketone, multiple linear regression analysis, pad printing

Procedia PDF Downloads 386
17969 Analysis of Factors Affecting the Number of Infant and Maternal Mortality in East Java with Geographically Weighted Bivariate Generalized Poisson Regression Method

Authors: Luh Eka Suryani, Purhadi

Abstract:

Poisson regression is a non-linear regression model with response variable in the form of count data that follows Poisson distribution. Modeling for a pair of count data that show high correlation can be analyzed by Poisson Bivariate Regression. Data, the number of infant mortality and maternal mortality, are count data that can be analyzed by Poisson Bivariate Regression. The Poisson regression assumption is an equidispersion where the mean and variance values are equal. However, the actual count data has a variance value which can be greater or less than the mean value (overdispersion and underdispersion). Violations of this assumption can be overcome by applying Generalized Poisson Regression. Characteristics of each regency can affect the number of cases occurred. This issue can be overcome by spatial analysis called geographically weighted regression. This study analyzes the number of infant mortality and maternal mortality based on conditions in East Java in 2016 using Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR) method. Modeling is done with adaptive bisquare Kernel weighting which produces 3 regency groups based on infant mortality rate and 5 regency groups based on maternal mortality rate. Variables that significantly influence the number of infant and maternal mortality are the percentages of pregnant women visit health workers at least 4 times during pregnancy, pregnant women get Fe3 tablets, obstetric complication handled, clean household and healthy behavior, and married women with the first marriage age under 18 years.

Keywords: adaptive bisquare kernel, GWBGPR, infant mortality, maternal mortality, overdispersion

Procedia PDF Downloads 129
17968 River Catchment’s Demography and the Dynamics of Access to Clean Water in the Rural South Africa

Authors: Yiseyon Sunday Hosu, Motebang Dominic Vincent Nakin, Elphina N. Cishe

Abstract:

Universal access to clean and safe drinking water and basic sanitation is one of the targets of the 6th Sustainable Development Goals (SDGs). This paper explores the evidence-based indicators of Water Rights Acts (2013) among households in the rural communities in the Mthatha River catchment of OR Tambo District Municipality of South Africa. Daily access to minimum 25 litres/person and the factors influencing clean water access were investigated in the catchment. A total number of 420 households were surveyed in the upper, peri-urban, lower and coastal regions of Mthatha Rivier catchment. Descriptive and logistic regression analyses were conducted on the data collected from the households to elicit vital information on domestic water security among rural community dwellers. The results show that approximately 68 percent of total households surveyed have access to the required minimum 25 litre/person/day, with 66.3 percent in upper region, 76 per cent in the peri-urban, 1.1 percent in the lower and 2.3 percent in the coastal regions. Only 30 percent among the total surveyed households had access to piped water either in the house or public taps. The logistic regression showed that access to clean water was influenced by lack of water infrastructure, proximity to urban regions, daily flow of pipe-borne water, household size and distance to public taps. This paper recommends that viable integrated rural community-based water infrastructure provision strategies between NGOs and local authority and the promotion of point of use (POU) technologies to enhance better access to clean water.

Keywords: domestic water, household technology, water security, rural community

Procedia PDF Downloads 322
17967 Qsar Studies of Certain Novel Heterocycles Derived From bis-1, 2, 4 Triazoles as Anti-Tumor Agents

Authors: Madhusudan Purohit, Stephen Philip, Bharathkumar Inturi

Abstract:

In this paper we report the quantitative structure activity relationship of novel bis-triazole derivatives for predicting the activity profile. The full model encompassed a dataset of 46 Bis- triazoles. Tripos Sybyl X 2.0 program was used to conduct CoMSIA QSAR modeling. The Partial Least-Squares (PLS) analysis method was used to conduct statistical analysis and to derive a QSAR model based on the field values of CoMSIA descriptor. The compounds were divided into test and training set. The compounds were evaluated by various CoMSIA parameters to predict the best QSAR model. An optimum numbers of components were first determined separately by cross-validation regression for CoMSIA model, which were then applied in the final analysis. A series of parameters were used for the study and the best fit model was obtained using donor, partition coefficient and steric parameters. The CoMSIA models demonstrated good statistical results with regression coefficient (r2) and the cross-validated coefficient (q2) of 0.575 and 0.830 respectively. The standard error for the predicted model was 0.16322. In the CoMSIA model, the steric descriptors make a marginally larger contribution than the electrostatic descriptors. The finding that the steric descriptor is the largest contributor for the CoMSIA QSAR models is consistent with the observation that more than half of the binding site area is occupied by steric regions.

Keywords: 3D QSAR, CoMSIA, triazoles, novel heterocycles

Procedia PDF Downloads 411
17966 Ethanol in Carbon Monoxide Intoxication: Focus on Delayed Neuropsychological Sequelae

Authors: Hyuk-Hoon Kim, Young Gi Min

Abstract:

Background: In carbon monoxide (CO) intoxication, the pathophysiology of delayed neurological sequelae (DNS) is very complex and remains poorly understood. And predicting whether patients who exhibit resolved acute symptoms have escaped or will experience DNS represents a very important clinical issue. Brain magnetic resonance (MR) imaging has been conducted to assess the severity of brain damage as an objective method to predict prognosis. And co-ingestion of a second poison in patients with intentional CO poisoning occurs in almost one-half of patients. Among patients with co-ingestions, 66% ingested ethanol. We assessed the effects of ethanol on neurologic sequelae prevalence in acute CO intoxication by means of abnormal lesion in brain MR. Method: This study was conducted retrospectively by collecting data for patients who visited an emergency medical center during a period of 5 years. The enrollment criteria were diagnosis of acute CO poisoning and the measurement of the serum ethanol level and history of taking a brain MR during admission period. Official readout data by radiologist are used to decide whether abnormal lesion is existed or not. The enrolled patients were divided into two groups: patients with abnormal lesion and without abnormal lesion in Brain MR. A standardized extraction using medical record was performed; Mann Whitney U test and logistic regression analysis were performed. Result: A total of 112 patients were enrolled, and 68 patients presented abnormal brain lesion on MR. The abnormal brain lesion group had lower serum ethanol level (mean, 20.14 vs 46.71 mg/dL) (p-value<0.001). In addition, univariate logistic regression analysis showed the serum ethanol level (OR, 0.99; 95% CI, 0.98 -1.00) was independently associated with the development of abnormal lesion in brain MR. Conclusion: Ethanol could have neuroprotective effect in acute CO intoxication by sedative effect in stressful situation and mitigative effect in neuro-inflammatory reaction.

Keywords: carbon monoxide, delayed neuropsychological sequelae, ethanol, intoxication, magnetic resonance

Procedia PDF Downloads 229
17965 Prediction of the Thermodynamic Properties of Hydrocarbons Using Gaussian Process Regression

Authors: N. Alhazmi

Abstract:

Knowing the thermodynamics properties of hydrocarbons is vital when it comes to analyzing the related chemical reaction outcomes and understanding the reaction process, especially in terms of petrochemical industrial applications, combustions, and catalytic reactions. However, measuring the thermodynamics properties experimentally is time-consuming and costly. In this paper, Gaussian process regression (GPR) has been used to directly predict the main thermodynamic properties - standard enthalpy of formation, standard entropy, and heat capacity -for more than 360 cyclic and non-cyclic alkanes, alkenes, and alkynes. A simple workflow has been proposed that can be applied to directly predict the main properties of any hydrocarbon by knowing its descriptors and chemical structure and can be generalized to predict the main properties of any material. The model was evaluated by calculating the statistical error R², which was more than 0.9794 for all the predicted properties.

Keywords: thermodynamic, Gaussian process regression, hydrocarbons, regression, supervised learning, entropy, enthalpy, heat capacity

Procedia PDF Downloads 184
17964 Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison

Authors: Xiangtuo Chen, Paul-Henry Cournéde

Abstract:

Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find out efficient way to predict the yield of corn based meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to the different modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with their environment as dynamical systems. But the calibration process of the dynamic system comes up with much difficulty, because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is to propose a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of datasets (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction is free of the complex biophysical process. But it has some strict requirements about the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods, methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-folds cross-validation process and two accuracy metrics: root mean square error of prediction(RMSEP), mean absolute error of prediction(MAEP) were used to evaluate the crop prediction capacity. The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method to calibrate the mechanistic model from dataset easy to access offers several side-perspectives. The mechanistic model can potentially help to underline the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.

Keywords: crop yield prediction, crop model, sensitivity analysis, paramater estimation, particle swarm optimization, random forest

Procedia PDF Downloads 204
17963 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression

Authors: Wanatchapong Kongkaew

Abstract:

This paper proposes an application of probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence of the single machine with total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is utilized to predict an optimal sequence of the SMTWT problem, and its solution is improved by using an iterated local search based on simulated annealing scheme, called GPRISA algorithm. The results show that the proposed GPRISA method achieves a very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, in the comparison of deviation from the best-known solution, the proposed mechanism noticeably outperforms the recently existing approaches.

Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness

Procedia PDF Downloads 277
17962 Factors Associated with Self-Rated Health among Persons with Disabilities: A Korean National Survey

Authors: Won-Seok Kim, Hyung-Ik Shin

Abstract:

Self-rated health (SRH) is a subjective assessment of individual health and has been identified as a strong predictor for mortality and morbidity. However few studies have been directed to the factors associated with SRH in persons with disabilities (PWD). We used data of 7th Korean national survey for 5307 PWD in 2008. Multiple logistic regression analysis was performed to find out independent risk factors for poor SRH in PWD. As a result, indicators of physical condition (poor instrumental ADL), socioeconomic disadvantages (poor education, economically inactive, low self-rated social class, medicaid in health insurance, presence of unmet need for hospital use) and social participation and networks (no use of internet service) were selected as independent risk factors for poor SRH in final model. Findings in the present study would be helpful in making a program to promote the health and narrow the gap of health status between the PWD.

Keywords: disabilities, risk factors, self-rated health, socioeconomic disadvantages, social networks

Procedia PDF Downloads 364
17961 The Role of Brooding and Reflective as Subtypes of Rumination toward Psychological Distress in University of Indonesia First-Year Undergraduate Students

Authors: Hepinda Fajari Nuharini, Sugiarti A. Musabiq

Abstract:

Background: Various and continuous pressures that exceed individual resources can cause first-year undergraduate college students to experience psychological distress. Psychological distress can occur when individuals use rumination as cognitive coping strategies. Rumination is one of the cognitive coping strategies that can be used by individuals to respond to psychological distress that causes individuals to think about the causes and consequences of events that have occurred. Rumination had two subtypes, such as brooding and reflective. Therefore, the purpose of this study was determining the role of brooding and reflective as subtypes of rumination toward psychological distress in University of Indonesia first-year undergraduate students. Methods: Participants of this study were 403 University of Indonesia first-year undergraduate students aged between 18 and 21 years old. Psychological distress measured using self reporting questionnaire (SRQ-20) and brooding and reflective as subtypes of rumination measured using Ruminative Response Scale - Short Version (RRS - Short Version). Results: Binary logistic regression analyses showed that 22.8% of the variation in psychological distress could be explained by the brooding and reflective as subtypes of rumination, while 77.2% of the variation in psychological distress could be explained by other factors (Nagelkerke R² = 0,228). The results of the binary logistic regression analysis also showed rumination subtype brooding is a significant predictor of psychological distress (b = 0,306; p < 0.05), whereas rumination subtype reflective is not a significant predictor of psychological distress (b = 0,073; p > 0.05). Conclusion: The findings of this study showed a positive relationship between brooding and psychological distress indicates that a higher level of brooding will predict higher psychological distress. Meanwhile, a negative relationship between reflective and psychological distress indicates a higher level of reflective will predict lower psychological distress in University of Indonesia first-year undergraduate students. Added Values: The psychological distress among first-year undergraduate students would then have an impact on student academic performance. Therefore, the results of this study can be used as a reference for making preventive action to reduce the percentage and impact of psychological distress among first-year undergraduate students.

Keywords: brooding as subtypes of rumination, first-year undergraduate students, psychological distress, reflective as subtypes of rumination

Procedia PDF Downloads 83
17960 Modelling the Indonesian Goverment Securities Yield Curve Using Nelson-Siegel-Svensson and Support Vector Regression

Authors: Jamilatuzzahro, Rezzy Eko Caraka

Abstract:

The yield curve is the plot of the yield to maturity of zero-coupon bonds against maturity. In practice, the yield curve is not observed but must be extracted from observed bond prices for a set of (usually) incomplete maturities. There exist many methodologies and theory to analyze of yield curve. We use two methods (the Nelson-Siegel Method, the Svensson Method, and the SVR method) in order to construct and compare our zero-coupon yield curves. The objectives of this research were: (i) to study the adequacy of NSS model and SVR to Indonesian government bonds data, (ii) to choose the best optimization or estimation method for NSS model and SVR. To obtain that objective, this research was done by the following steps: data preparation, cleaning or filtering data, modeling, and model evaluation.

Keywords: support vector regression, Nelson-Siegel-Svensson, yield curve, Indonesian government

Procedia PDF Downloads 210
17959 Prediction Factor of Recurrence Supraventricular Tachycardia After Adenosine Treatment in the Emergency Department

Authors: Welawat Tienpratarn, Chaiyaporn Yuksen, Rungrawin Promkul, Chetsadakon Jenpanitpong, Pajit Bunta, Suthap Jaiboon

Abstract:

Supraventricular tachycardia (SVT) is an abnormally fast atrial tachycardia characterized by narrow (≤ 120 ms) and constant QRS. Adenosine was the drug of choice; the first dose was 6 mg. It can be repeated with the second and third doses of 12 mg, with greater than 90% success. The study found that patients observed at 4 hours after normal sinus rhythm was no recurrence within 24 hours. The objective of this study was to investigate the factors that influence the recurrence of SVT after adenosine in the emergency department (ED). The study was conducted retrospectively exploratory model, prognostic study at the Emergency Department (ED) in Faculty of Medicine, Ramathibodi Hospital, a university-affiliated super tertiary care hospital in Bangkok, Thailand. The study was conducted for ten years period between 2010 and 2020. The inclusion criteria were age > 15 years, visiting the ED with SVT, and treating with adenosine. Those patients were recorded with the recurrence SVT in ED. The multivariable logistic regression model developed the predictive model and prediction score for recurrence PSVT. 264 patients met the study criteria. Of those, 24 patients (10%) had recurrence PSVT. Five independent factors were predictive of recurrence PSVT. There was age>65 years, heart rate (after adenosine) > 100 per min, structural heart disease, and dose of adenosine. The clinical risk score to predict recurrence PSVT is developed accuracy 74.41%. The score of >6 had the likelihood ratio of recurrence PSVT by 5.71 times. The clinical predictive score of > 6 was associated with recurrence PSVT in ED.

Keywords: supraventricular tachycardia, recurrance, emergency department, adenosine

Procedia PDF Downloads 88
17958 An Alternative Richards’ Growth Model Based on Hyperbolic Sine Function

Authors: Samuel Oluwafemi Oyamakin, Angela Unna Chukwu

Abstract:

Richrads growth equation being a generalized logistic growth equation was improved upon by introducing an allometric parameter using the hyperbolic sine function. The integral solution to this was called hyperbolic Richards growth model having transformed the solution from deterministic to a stochastic growth model. Its ability in model prediction was compared with the classical Richards growth model an approach which mimicked the natural variability of heights/diameter increment with respect to age and therefore provides a more realistic height/diameter predictions using the coefficient of determination (R2), Mean Absolute Error (MAE) and Mean Square Error (MSE) results. The Kolmogorov-Smirnov test and Shapiro-Wilk test was also used to test the behavior of the error term for possible violations. The mean function of top height/Dbh over age using the two models under study predicted closely the observed values of top height/Dbh in the hyperbolic Richards nonlinear growth models better than the classical Richards growth model.

Keywords: height, diameter at breast height, DBH, hyperbolic sine function, Pinus caribaea, Richards' growth model

Procedia PDF Downloads 360
17957 The Predictors of Student Engagement: Instructional Support vs Emotional Support

Authors: Tahani Salman Alangari

Abstract:

Student success can be impacted by internal factors such as their emotional well-being and external factors such as organizational support and instructional support in the classroom. This study is to identify at least one factor that forecasts student engagement. It is a cross-sectional, conducted on 6206 teachers and encompassed three years of data collection and observations of math instruction in approximately 50 schools and 300 classrooms. A multiple linear regression revealed that a model predicting student engagement from emotional support, classroom organization, and instructional support was significant. Four linear regression models were tested using hierarchical regression to examine the effects of independent variables: emotional support was the highest predictor of student engagement while instructional support was the lowest.

Keywords: student engagement, emotional support, organizational support, instructional support, well-being

Procedia PDF Downloads 47
17956 Long-Term Indoor Air Monitoring for Students with Emphasis on Particulate Matter (PM2.5) Exposure

Authors: Seyedtaghi Mirmohammadi, Jamshid Yazdani, Syavash Etemadi Nejad

Abstract:

One of the main indoor air parameters in classrooms is dust pollution and it depends on the particle size and exposure duration. However, there is a lake of data about the exposure level to PM2.5 concentrations in rural area classrooms. The objective of the current study was exposure assessment for PM2.5 for students in the classrooms. One year monitoring was carried out for fifteen schools by time-series sampling to evaluate the indoor air PM2.5 in the rural district of Sari city, Iran. A hygrometer and thermometer were used to measure some psychrometric parameters (temperature, relative humidity, and wind speed) and Real-Time Dust Monitor, (MicroDust Pro, Casella, UK) was used to monitor particulate matters (PM2.5) concentration. The results show the mean indoor PM2.5 concentration in the studied classrooms was 135µg/m3. The regression model indicated that a positive correlation between indoor PM2.5 concentration and relative humidity, also with distance from city center and classroom size. Meanwhile, the regression model revealed that the indoor PM2.5 concentration, the relative humidity, and dry bulb temperature was significant at 0.05, 0.035, and 0.05 levels, respectively. A statistical predictive model was obtained from multiple regressions modeling for indoor PM2.5 concentration and indoor psychrometric parameters conditions.

Keywords: classrooms, concentration, humidity, particulate matters, regression

Procedia PDF Downloads 305
17955 A Research on Tourism Market Forecast and Its Evaluation

Authors: Min Wei

Abstract:

The traditional prediction methods of the forecast for tourism market are paid more attention to the accuracy of the forecasts, ignoring the results of the feasibility of forecasting and predicting operability, which had made it difficult to predict the results of scientific testing. With the application of Linear Regression Model, this paper attempts to construct a scientific evaluation system for predictive value, both to ensure the accuracy, stability of the predicted value, and to ensure the feasibility of forecasting and predicting the results of operation. The findings show is that a scientific evaluation system can implement the scientific concept of development, the harmonious development of man and nature co-ordinate.

Keywords: linear regression model, tourism market, forecast, tourism economics

Procedia PDF Downloads 294
17954 Exploration and Evaluation of the Effect of Multiple Countermeasures on Road Safety

Authors: Atheer Al-Nuaimi, Harry Evdorides

Abstract:

Every day many people die or get disabled or injured on roads around the world, which necessitates more specific treatments for transportation safety issues. International road assessment program (iRAP) model is one of the comprehensive road safety models which accounting for many factors that affect road safety in a cost-effective way in low and middle income countries. In iRAP model road safety has been divided into five star ratings from 1 star (the lowest level) to 5 star (the highest level). These star ratings are based on star rating score which is calculated by iRAP methodology depending on road attributes, traffic volumes and operating speeds. The outcome of iRAP methodology are the treatments that can be used to improve road safety and reduce fatalities and serious injuries (FSI) numbers. These countermeasures can be used separately as a single countermeasure or mix as multiple countermeasures for a location. There is general agreement that the adequacy of a countermeasure is liable to consistent losses when it is utilized as a part of mix with different countermeasures. That is, accident diminishment appraisals of individual countermeasures cannot be easily added together. The iRAP model philosophy makes utilization of a multiple countermeasure adjustment factors to predict diminishments in the effectiveness of road safety countermeasures when more than one countermeasure is chosen. A multiple countermeasure correction factors are figured for every 100-meter segment and for every accident type. However, restrictions of this methodology incorporate a presumable over-estimation in the predicted crash reduction. This study aims to adjust this correction factor by developing new models to calculate the effect of using multiple countermeasures on the number of fatalities for a location or an entire road. Regression models have been used to establish relationships between crash frequencies and the factors that affect their rates. Multiple linear regression, negative binomial regression, and Poisson regression techniques were used to develop models that can address the effectiveness of using multiple countermeasures. Analyses are conducted using The R Project for Statistical Computing showed that a model developed by negative binomial regression technique could give more reliable results of the predicted number of fatalities after the implementation of road safety multiple countermeasures than the results from iRAP model. The results also showed that the negative binomial regression approach gives more precise results in comparison with multiple linear and Poisson regression techniques because of the overdispersion and standard error issues.

Keywords: international road assessment program, negative binomial, road multiple countermeasures, road safety

Procedia PDF Downloads 208
17953 Analyzing Impacts of Road Network on Vegetation Using Geographic Information System and Remote Sensing Techniques

Authors: Elizabeth Malebogo Mosepele

Abstract:

Road transport has become increasingly common in the world; people rely on road networks for transportation purpose on a daily basis. However, environmental impact of roads on surrounding landscapes extends their potential effects even further. This study investigates the impact of road network on natural vegetation. The study will provide baseline knowledge regarding roadside vegetation and would be helpful in future for conservation of biodiversity along the road verges and improvements of road verges. The general hypothesis of this study is that the amount and condition of road side vegetation could be explained by road network conditions. Remote sensing techniques were used to analyze vegetation conditions. Landsat 8 OLI image was used to assess vegetation cover condition. NDVI image was generated and used as a base from which land cover classes were extracted, comprising four categories viz. healthy vegetation, degraded vegetation, bare surface, and water. The classification of the image was achieved using the supervised classification technique. Road networks were digitized from Google Earth. For observed data, transect based quadrats of 50*50 m were conducted next to road segments for vegetation assessment. Vegetation condition was related to road network, with the multinomial logistic regression confirming a significant relationship between vegetation condition and road network. The null hypothesis formulated was that 'there is no variation in vegetation condition as we move away from the road.' Analysis of vegetation condition revealed degraded vegetation within close proximity of a road segment and healthy vegetation as the distance increase away from the road. The Chi Squared value was compared with critical value of 3.84, at the significance level of 0.05 to determine the significance of relationship. Given that the Chi squared value was 395, 5004, the null hypothesis was therefore rejected; there is significant variation in vegetation the distance increases away from the road. The conclusion is that the road network plays an important role in the condition of vegetation.

Keywords: Chi squared, geographic information system, multinomial logistic regression, remote sensing, road side vegetation

Procedia PDF Downloads 397
17952 Women and Food Security: Evidence from Bangladesh Demographic Health Survey 2011

Authors: Abdullah Al. Morshed, Mohammad Nahid Mia

Abstract:

Introduction: Food security refers to the availability of food and a person’s access to it. It is a complex sustainable development issue, which is closely related to under-nutrition. Food security, in turn, can widely affect the living standard, and is rooted in poverty and leads to poor health, low productivity, low income, food shortage, and hunger. The study's aim was to identify the most vulnerable women who are in insecure positions. Method: 17,842 married women were selected for analysis from the Bangladesh Demographic and Health Survey 2011. Food security defined as dichotomous variables of skipped meals and eaten less food at least once in the last year. The outcome variables were cross-tabulated with women's socio-demographic characteristics and chi2 test was applied to see the significance. Logistic regression models were applied to identify the most vulnerable groups in terms of food security. Result: Only 18.5% of women said that they ever had to skip meals in the last year. 45.7% women from low socioeconomic status had skip meal for at least once whereas only 3.6% were from women with highest socioeconomic status. Women meal skipping was ranged from 1.4% to 34.2% by their educational status. 22% of women were eaten less food during the last year. The rate was higher among the poorest (51.6%), illiterate (39.9%) and household have no electricity connection (38.1) in compared with richest (4.4%), higher educated (2.0%), and household has electricity connection (14.0%). The logistic regression analysis indicated that household socioeconomic status, and women education show strong gradients to skip meals. Poorest have had higher odds (20.9) than richest and illiterate women had 7.7 higher odds than higher educated. In terms of religion, Christianity was 2.3 times more likely to skip their meals than Islam. On the other hand, a similar trend was observed in our other outcome variable eat less food. Conclusion: In this study we able to identify women with lower economics status and women with no education were mostly suffered group from starvation.

Keywords: food security, hunger, under-nutrition, women

Procedia PDF Downloads 342
17951 A Statistical Approach to Predict and Classify the Commercial Hatchability of Chickens Using Extrinsic Parameters of Breeders and Eggs

Authors: M. S. Wickramarachchi, L. S. Nawarathna, C. M. B. Dematawewa

Abstract:

Hatchery performance is critical for the profitability of poultry breeder operations. Some extrinsic parameters of eggs and breeders cause to increase or decrease the hatchability. This study aims to identify the affecting extrinsic parameters on the commercial hatchability of local chicken's eggs and determine the most efficient classification model with a hatchability rate greater than 90%. In this study, seven extrinsic parameters were considered: egg weight, moisture loss, breeders age, number of fertilised eggs, shell width, shell length, and shell thickness. Multiple linear regression was performed to determine the most influencing variable on hatchability. First, the correlation between each parameter and hatchability were checked. Then a multiple regression model was developed, and the accuracy of the fitted model was evaluated. Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), k-Nearest Neighbors (kNN), Support Vector Machines (SVM) with a linear kernel, and Random Forest (RF) algorithms were applied to classify the hatchability. This grouping process was conducted using binary classification techniques. Hatchability was negatively correlated with egg weight, breeders' age, shell width, shell length, and positive correlations were identified with moisture loss, number of fertilised eggs, and shell thickness. Multiple linear regression models were more accurate than single linear models regarding the highest coefficient of determination (R²) with 94% and minimum AIC and BIC values. According to the classification results, RF, CART, and kNN had performed the highest accuracy values 0.99, 0.975, and 0.972, respectively, for the commercial hatchery process. Therefore, the RF is the most appropriate machine learning algorithm for classifying the breeder outcomes, which are economically profitable or not, in a commercial hatchery.

Keywords: classification models, egg weight, fertilised eggs, multiple linear regression

Procedia PDF Downloads 52
17950 Probability Sampling in Matched Case-Control Study in Drug Abuse

Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell

Abstract:

Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users who then identified “friend controls” and the other using a random sample of non-drug users (controls) who then identified “friend cases.” Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using bootstrapping method and the predictive properties of each model using receiver operating characteristics (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. One the other hand, bootstrap analysis of the random-sample data set showed less variation, and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. Comparison of the area under the ROC curves using the model derived from the random-sample data set was similar when fitted to either data set (0.93, for random-sample data vs. 0.91 for snowball-sample data, p=0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to snowball sampling.

Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling

Procedia PDF Downloads 462
17949 Neighborhood Linking Social Capital as a Predictor of Drug Abuse: A Swedish National Cohort Study

Authors: X. Li, J. Sundquist, C. Sjöstedt, M. Winkleby, K. S. Kendler, K. Sundquist

Abstract:

Aims: This study examines the association between the incidence of drug abuse (DA) and linking (communal) social capital, a theoretical concept describing the amount of trust between individuals and societal institutions. Methods: We present results from an 8-year population-based cohort study that followed all residents in Sweden, aged 15-44, from 2003 through 2010, for a total of 1,700,896 men and 1,642,798 women. Social capital was conceptualized as the proportion of people in a geographically defined neighborhood who voted in local government elections. Multilevel logistic regression was used to estimate odds ratios (ORs) and between-neighborhood variance. Results: We found robust associations between linking social capital (scored as a three level variable) and DA in men and women. For men, the OR for DA in the crude model was 2.11 [95% confidence interval (CI) 2.02-2.21] for those living in areas with the lowest vs. highest level of social capital. After accounting for neighborhood-level deprivation, the OR fell to 1.59 (1.51-1-68), indicating that neighborhood deprivation lies in the pathway between linking social capital and DA. The ORs remained significant after accounting for age, sex, family income, marital status, country of birth, education level, and region of residence, and after further accounting for comorbidities and family history of comorbidities and family history of DA. For women, the OR decreased from 2.15 (2.03-2.27) in the crude model to 1.31 (1.22-1.40) in the final model, adjusted for multiple neighborhood-level and individual-level variables. Conclusions: Our study suggests that low linking social capital may have important independent effects on DA.

Keywords: drug abuse, social linking capital, environment, family

Procedia PDF Downloads 444
17948 Using Predictive Analytics to Identify First-Year Engineering Students at Risk of Failing

Authors: Beng Yew Low, Cher Liang Cha, Cheng Yong Teoh

Abstract:

Due to a lack of continual assessment or grade related data, identifying first-year engineering students in a polytechnic education at risk of failing is challenging. Our experience over the years tells us that there is no strong correlation between having good entry grades in Mathematics and the Sciences and excelling in hardcore engineering subjects. Hence, identifying students at risk of failure cannot be on the basis of entry grades in Mathematics and the Sciences alone. These factors compound the difficulty of early identification and intervention. This paper describes the development of a predictive analytics model in the early detection of students at risk of failing and evaluates its effectiveness. Data from continual assessments conducted in term one, supplemented by data of student psychological profiles such as interests and study habits, were used. Three classification techniques, namely Logistic Regression, K Nearest Neighbour, and Random Forest, were used in our predictive model. Based on our findings, Random Forest was determined to be the strongest predictor with an Area Under the Curve (AUC) value of 0.994. Correspondingly, the Accuracy, Precision, Recall, and F-Score were also highest among these three classifiers. Using this Random Forest Classification technique, students at risk of failure could be identified at the end of term one. They could then be assigned to a Learning Support Programme at the beginning of term two. This paper gathers the results of our findings. It also proposes further improvements that can be made to the model.

Keywords: continual assessment, predictive analytics, random forest, student psychological profile

Procedia PDF Downloads 95
17947 Employee Aggression, Labeling and Emotional Intelligence

Authors: Martin Popescu D. Dana Maria

Abstract:

The aims of this research are to broaden the study on the relationship between emotional intelligence and counterproductive work behavior (CWB). The study sample consisted in 441 Romanian employees from companies all over the country. Data has been collected through web surveys and processed with SPSS. The results indicated an average correlation between the two constructs and their sub variables, employees with a high level of emotional intelligence tend to be less aggressive. In addition, labeling was considered an individual difference which has the power to influence the level of employee aggression. A regression model was used to underline the importance of emotional intelligence together with labeling as predictors of CWB. Results have shown that this regression model enforces the assumption that labeling and emotional intelligence, taken together, predict CWB. Employees, who label themselves as victims and have a low degree of emotional intelligence, have a higher level of CWB.

Keywords: aggression, CWB, emotional intelligence, labeling

Procedia PDF Downloads 436
17946 Modelling Agricultural Commodity Price Volatility with Markov-Switching Regression, Single Regime GARCH and Markov-Switching GARCH Models: Empirical Evidence from South Africa

Authors: Yegnanew A. Shiferaw

Abstract:

Background: commodity price volatility originating from excessive commodity price fluctuation has been a global problem especially after the recent financial crises. Volatility is a measure of risk or uncertainty in financial analysis. It plays a vital role in risk management, portfolio management, and pricing equity. Objectives: the core objective of this paper is to examine the relationship between the prices of agricultural commodities with oil price, gas price, coal price and exchange rate (USD/Rand). In addition, the paper tries to fit an appropriate model that best describes the log return price volatility and estimate Value-at-Risk and expected shortfall. Data and methods: the data used in this study are the daily returns of agricultural commodity prices from 02 January 2007 to 31st October 2016. The data sets consists of the daily returns of agricultural commodity prices namely: white maize, yellow maize, wheat, sunflower, soya, corn, and sorghum. The paper applies the three-state Markov-switching (MS) regression, the standard single-regime GARCH and the two regime Markov-switching GARCH (MS-GARCH) models. Results: to choose the best fit model, the log-likelihood function, Akaike information criterion (AIC), Bayesian information criterion (BIC) and deviance information criterion (DIC) are employed under three distributions for innovations. The results indicate that: (i) the price of agricultural commodities was found to be significantly associated with the price of coal, price of natural gas, price of oil and exchange rate, (ii) for all agricultural commodities except sunflower, k=3 had higher log-likelihood values and lower AIC and BIC values. Thus, the three-state MS regression model outperformed the two-state MS regression model (iii) MS-GARCH(1,1) with generalized error distribution (ged) innovation performs best for white maize and yellow maize; MS-GARCH(1,1) with student-t distribution (std) innovation performs better for sorghum; MS-gjrGARCH(1,1) with ged innovation performs better for wheat, sunflower and soya and MS-GARCH(1,1) with std innovation performs better for corn. In conclusion, this paper provided a practical guide for modelling agricultural commodity prices by MS regression and MS-GARCH processes. This paper can be good as a reference when facing modelling agricultural commodity price problems.

Keywords: commodity prices, MS-GARCH model, MS regression model, South Africa, volatility

Procedia PDF Downloads 168
17945 Magnitude of Visual Impairment and Associated Factors among Adult Glaucoma Patients Attending University of Gondar, Comprehensive Specialized Hospital, Tertiary Eye Care and Training Center, Northwest Ethiopia, 2022

Authors: Getenet Shumet Birhan, Biruk Lelisa Eticha, Gizachew Tilahun Belete, Fisseha Admassu Ayele

Abstract:

Context: Glaucoma is a significant public health concern globally, being the second leading cause of blindness. This study focuses on adult glaucoma patients in Ethiopia, specifically at the University of Gondar. Research Aim: The main objective is to assess the prevalence of visual impairment and identify associated factors among adult glaucoma patients at the University of Gondar. Methodology: The study used an institution-based cross-sectional design, collecting data from 423 glaucoma patients through interviews and medical chart reviews. Descriptive statistics and logistic regression were employed for analysis. Findings: The study found a high prevalence of visual impairment (77.6%) among adult glaucoma patients, with factors such as female sex, rural residence, glaucoma type, disease stage, and duration of diagnosis significantly associated with visual impairment. Theoretical Importance: This research adds valuable insights into the prevalence and determinants of visual impairment among glaucoma patients in Ethiopia, contributing to the existing literature on eye health in low-resource settings. Data Collection: Data were collected through face-to-face interviews and medical chart reviews at the University of Gondar, utilizing a structured questionnaire. Analysis Procedures: Descriptive statistics, frequency analysis, and binary logistic regression were employed to analyze the data and identify factors associated with visual impairment in adult glaucoma patients. Question Addressed: The study sought to answer the question of the prevalence of visual impairment and its associated factors among adult glaucoma patients at the University of Gondar in Northwest Ethiopia. Conclusion: The research concludes that visual impairment is significantly high among adult glaucoma patients in this setting, with several factors playing a role in its occurrence.

Keywords: visual impairment, glaucoma, Ethiopia, Gondar

Procedia PDF Downloads 18
17944 Association Between Short-term NOx Exposure and Asthma Exacerbations in East London: A Time Series Regression Model

Authors: Hajar Hajmohammadi, Paul Pfeffer, Anna De Simoni, Jim Cole, Chris Griffiths, Sally Hull, Benjamin Heydecker

Abstract:

Background: There is strong interest in the relationship between short-term air pollution exposure and human health. Most studies in this field focus on serious health effects such as death or hospital admission, but air pollution exposure affects many people with less severe impacts, such as exacerbations of respiratory conditions. A lack of quantitative analysis and inconsistent findings suggest improved methodology is needed to understand these effectsmore fully. Method: We developed a time series regression model to quantify the relationship between daily NOₓ concentration and Asthma exacerbations requiring oral steroids from primary care settings. Explanatory variables include daily NOₓ concentration measurements extracted from 8 available background and roadside monitoring stations in east London and daily ambient temperature extracted for London City Airport, located in east London. Lags of NOx concentrations up to 21 days (3 weeks) were used in the model. The dependent variable was the daily number of oral steroid courses prescribed for GP registered patients with asthma in east London. A mixed distribution model was then fitted to the significant lags of the regression model. Result: Results of the time series modelling showed a significant relationship between NOₓconcentrations on each day and the number of oral steroid courses prescribed in the following three weeks. In addition, the model using only roadside stations performs better than the model with a mixture of roadside and background stations.

Keywords: air pollution, time series modeling, public health, road transport

Procedia PDF Downloads 109
17943 Forecasting Equity Premium Out-of-Sample with Sophisticated Regression Training Techniques

Authors: Jonathan Iworiso

Abstract:

Forecasting the equity premium out-of-sample is a major concern to researchers in finance and emerging markets. The quest for a superior model that can forecast the equity premium with significant economic gains has resulted in several controversies on the choice of variables and suitable techniques among scholars. This research focuses mainly on the application of Regression Training (RT) techniques to forecast monthly equity premium out-of-sample recursively with an expanding window method. A broad category of sophisticated regression models involving model complexity was employed. The RT models include Ridge, Forward-Backward (FOBA) Ridge, Least Absolute Shrinkage and Selection Operator (LASSO), Relaxed LASSO, Elastic Net, and Least Angle Regression were trained and used to forecast the equity premium out-of-sample. In this study, the empirical investigation of the RT models demonstrates significant evidence of equity premium predictability both statistically and economically relative to the benchmark historical average, delivering significant utility gains. They seek to provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk. Thus, the forecasting models appeared to guarantee an investor in a market setting who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts at minimal risk.

Keywords: regression training, out-of-sample forecasts, expanding window, statistical predictability, economic significance, utility gains

Procedia PDF Downloads 70
17942 Scoring System for the Prognosis of Sepsis Patients in Intensive Care Units

Authors: Javier E. García-Gallo, Nelson J. Fonseca-Ruiz, John F. Duitama-Munoz

Abstract:

Sepsis is a syndrome that occurs with physiological and biochemical abnormalities induced by severe infection and carries a high mortality and morbidity, therefore the severity of its condition must be interpreted quickly. After patient admission in an intensive care unit (ICU), it is necessary to synthesize the large volume of information that is collected from patients in a value that represents the severity of their condition. Traditional severity of illness scores seeks to be applicable to all patient populations, and usually assess in-hospital mortality. However, the use of machine learning techniques and the data of a population that shares a common characteristic could lead to the development of customized mortality prediction scores with better performance. This study presents the development of a score for the one-year mortality prediction of the patients that are admitted to an ICU with a sepsis diagnosis. 5650 ICU admissions extracted from the MIMICIII database were evaluated, divided into two groups: 70% to develop the score and 30% to validate it. Comorbidities, demographics and clinical information of the first 24 hours after the ICU admission were used to develop a mortality prediction score. LASSO (least absolute shrinkage and selection operator) and SGB (Stochastic Gradient Boosting) variable importance methodologies were used to select the set of variables that make up the developed score; each of this variables was dichotomized and a cut-off point that divides the population into two groups with different mean mortalities was found; if the patient is in the group that presents a higher mortality a one is assigned to the particular variable, otherwise a zero is assigned. These binary variables are used in a logistic regression (LR) model, and its coefficients were rounded to the nearest integer. The resulting integers are the point values that make up the score when multiplied with each binary variables and summed. The one-year mortality probability was estimated using the score as the only variable in a LR model. Predictive power of the score, was evaluated using the 1695 admissions of the validation subset obtaining an area under the receiver operating characteristic curve of 0.7528, which outperforms the results obtained with Sequential Organ Failure Assessment (SOFA), Oxford Acute Severity of Illness Score (OASIS) and Simplified Acute Physiology Score II (SAPSII) scores on the same validation subset. Observed and predicted mortality rates within estimated probabilities deciles were compared graphically and found to be similar, indicating that the risk estimate obtained with the score is close to the observed mortality, it is also observed that the number of events (deaths) is indeed increasing as the outcome go from the decile with the lowest probabilities to the decile with the highest probabilities. Sepsis is a syndrome that carries a high mortality, 43.3% for the patients included in this study; therefore, tools that help clinicians to quickly and accurately predict a worse prognosis are needed. This work demonstrates the importance of customization of mortality prediction scores since the developed score provides better performance than traditional scoring systems.

Keywords: intensive care, logistic regression model, mortality prediction, sepsis, severity of illness, stochastic gradient boosting

Procedia PDF Downloads 181
17941 Predicting Options Prices Using Machine Learning

Authors: Krishang Surapaneni

Abstract:

The goal of this project is to determine how to predict important aspects of options, including the ask price. We want to compare different machine learning models to learn the best model and the best hyperparameters for that model for this purpose and data set. Option pricing is a relatively new field, and it can be very complicated and intimidating, especially to inexperienced people, so we want to create a machine learning model that can predict important aspects of an option stock, which can aid in future research. We tested multiple different models and experimented with hyperparameter tuning, trying to find some of the best parameters for a machine-learning model. We tested three different models: a Random Forest Regressor, a linear regressor, and an MLP (multi-layer perceptron) regressor. The most important feature in this experiment is the ask price; this is what we were trying to predict. In the field of stock pricing prediction, there is a large potential for error, so we are unable to determine the accuracy of the models based on if they predict the pricing perfectly. Due to this factor, we determined the accuracy of the model by finding the average percentage difference between the predicted and actual values. We tested the accuracy of the machine learning models by comparing the actual results in the testing data and the predictions made by the models. The linear regression model performed worst, with an average percentage error of 17.46%. The MLP regressor had an average percentage error of 11.45%, and the random forest regressor had an average percentage error of 7.42%

Keywords: finance, linear regression model, machine learning model, neural network, stock price

Procedia PDF Downloads 52