Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 18893

Search results for: logistic regression model

18593 Exploration and Evaluation of the Effect of Multiple Countermeasures on Road Safety

Authors: Atheer Al-Nuaimi, Harry Evdorides

Abstract:

Every day, many people are killed, disabled, or injured on roads around the world, which calls for more specific treatments of transportation safety issues. The International Road Assessment Program (iRAP) model is one of the comprehensive road safety models that account for many factors affecting road safety in a cost-effective way in low- and middle-income countries. In the iRAP model, road safety is divided into five star ratings, from 1 star (the lowest level) to 5 stars (the highest level). These star ratings are based on a star rating score, which is calculated by the iRAP methodology from road attributes, traffic volumes, and operating speeds. The outcomes of the iRAP methodology are the treatments that can be used to improve road safety and reduce the number of fatalities and serious injuries (FSI). These countermeasures can be applied separately, as a single countermeasure, or in combination, as multiple countermeasures for a location. There is general agreement that the effectiveness of a countermeasure is reduced when it is used in combination with other countermeasures. That is, the crash reduction estimates of individual countermeasures cannot simply be added together. The iRAP methodology uses multiple countermeasure adjustment factors to predict the reduction in the effectiveness of road safety countermeasures when more than one countermeasure is chosen. Multiple countermeasure correction factors are calculated for every 100-meter segment and for every accident type. However, the limitations of this methodology include a probable overestimation of the predicted crash reduction. This study aims to adjust this correction factor by developing new models to calculate the effect of using multiple countermeasures on the number of fatalities for a location or an entire road. Regression models were used to establish relationships between crash frequencies and the factors that affect their rates. Multiple linear regression, negative binomial regression, and Poisson regression techniques were used to develop models that can address the effectiveness of using multiple countermeasures. Analyses conducted using the R Project for Statistical Computing showed that a model developed by the negative binomial regression technique could give more reliable predictions of the number of fatalities after the implementation of multiple road safety countermeasures than the iRAP model. The results also showed that the negative binomial regression approach gives more precise results than the multiple linear and Poisson regression techniques because of overdispersion and standard error issues.

Keywords: international road assessment program, negative binomial, road multiple countermeasures, road safety

Procedia PDF Downloads 240
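
The abstract above attributes the advantage of negative binomial over Poisson regression to overdispersion. As a rough illustration only (the counts below are hypothetical and not from the study, and the function name is ours), a quick check is the ratio of the sample variance to the sample mean: Poisson data should give a value near 1, while a value well above 1 suggests overdispersion.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance divided by sample mean (about 1 for Poisson data)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean of counts is zero; the index is undefined")
    return variance(counts) / m


# Hypothetical fatality counts per road segment (illustrative only).
segment_counts = [0, 0, 1, 0, 3, 0, 7, 1, 0, 12, 0, 2]
index = dispersion_index(segment_counts)
print(f"dispersion index = {index:.2f}")
```

A large index does not by itself select a model; it only indicates that the equal mean and variance assumption of Poisson regression is doubtful for the data at hand.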

18592 Neighborhood Linking Social Capital as a Predictor of Drug Abuse: A Swedish National Cohort Study

Authors: X. Li, J. Sundquist, C. Sjöstedt, M. Winkleby, K. S. Kendler, K. Sundquist

Abstract:

Aims: This study examines the association between the incidence of drug abuse (DA) and linking (communal) social capital, a theoretical concept describing the amount of trust between individuals and societal institutions. Methods: We present results from an 8-year population-based cohort study that followed all residents in Sweden, aged 15-44, from 2003 through 2010, for a total of 1,700,896 men and 1,642,798 women. Social capital was conceptualized as the proportion of people in a geographically defined neighborhood who voted in local government elections. Multilevel logistic regression was used to estimate odds ratios (ORs) and between-neighborhood variance. Results: We found robust associations between linking social capital (scored as a three-level variable) and DA in men and women. For men, the OR for DA in the crude model was 2.11 [95% confidence interval (CI) 2.02-2.21] for those living in areas with the lowest vs. highest level of social capital. After accounting for neighborhood-level deprivation, the OR fell to 1.59 (1.51-1.68), indicating that neighborhood deprivation lies in the pathway between linking social capital and DA. The ORs remained significant after accounting for age, sex, family income, marital status, country of birth, education level, and region of residence, and after further accounting for comorbidities, family history of comorbidities, and family history of DA. For women, the OR decreased from 2.15 (2.03-2.27) in the crude model to 1.31 (1.22-1.40) in the final model, adjusted for multiple neighborhood-level and individual-level variables. Conclusions: Our study suggests that low linking social capital may have important independent effects on DA.

Keywords: drug abuse, social linking capital, environment, family

Procedia PDF Downloads 473
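
The abstract above reports odds ratios with 95% confidence intervals from multilevel logistic regression. The sketch below shows only the simplest counterpart, a crude odds ratio from a 2x2 table with a Wald (Woolf) interval. The counts are hypothetical, the function name is ours, and this is not the multilevel model used in the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: low vs. high social capital, DA vs. no DA.
or_value, ci_low, ci_high = odds_ratio_ci(a=40, b=960, c=20, d=980)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An interval that excludes 1 corresponds to an association that is significant at roughly the 5% level.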

18591 Using Predictive Analytics to Identify First-Year Engineering Students at Risk of Failing

Authors: Beng Yew Low, Cher Liang Cha, Cheng Yong Teoh

Abstract:

Due to a lack of continual assessment or grade related data, identifying first-year engineering students in a polytechnic education at risk of failing is challenging. Our experience over the years tells us that there is no strong correlation between having good entry grades in Mathematics and the Sciences and excelling in hardcore engineering subjects. Hence, identifying students at risk of failure cannot be on the basis of entry grades in Mathematics and the Sciences alone. These factors compound the difficulty of early identification and intervention. This paper describes the development of a predictive analytics model in the early detection of students at risk of failing and evaluates its effectiveness. Data from continual assessments conducted in term one, supplemented by data of student psychological profiles such as interests and study habits, were used. Three classification techniques, namely Logistic Regression, K Nearest Neighbour, and Random Forest, were used in our predictive model. Based on our findings, Random Forest was determined to be the strongest predictor with an Area Under the Curve (AUC) value of 0.994. Correspondingly, the Accuracy, Precision, Recall, and F-Score were also highest among these three classifiers. Using this Random Forest Classification technique, students at risk of failure could be identified at the end of term one. They could then be assigned to a Learning Support Programme at the beginning of term two. This paper gathers the results of our findings. It also proposes further improvements that can be made to the model.

Keywords: continual assessment, predictive analytics, random forest, student psychological profile

Procedia PDF Downloads 134
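
The abstract above compares classifiers by Accuracy, Precision, Recall, and F-Score. As a minimal sketch, these four measures can be computed from confusion-matrix counts as follows; the counts are hypothetical and "at risk" is treated as the positive class by assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 18 at-risk students found, 2 missed, 2 false alarms.
metrics = classification_metrics(tp=18, fp=2, fn=2, tn=78)
print(metrics)
```

When at-risk students are a small minority, recall and precision are usually more informative than accuracy alone.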

18590 Machine Learning Techniques to Predict Cyberbullying and Improve Social Work Interventions

Authors: Oscar E. Cariceo, Claudia V. Casal

Abstract:

Machine learning offers a set of techniques that can strengthen social work interventions and support practitioners' decisions by predicting new behaviors from data produced by organizations, service agencies, users, clients, or individuals. Machine learning techniques include a set of generalizable algorithms that are data-driven, which means that rules and solutions are derived by examining data, based on the patterns that are present within any data set. In other words, the goal of machine learning is to teach computers through 'examples', using training data to test specific hypotheses, predict a certain outcome based on a current scenario, and improve on that experience. Machine learning can be classified into two general categories depending on the nature of the problem that the technique needs to tackle. First, supervised learning involves a dataset whose outputs are already known. Supervised learning problems are categorized into regression problems, which involve a prediction from quantitative variables using a continuous function, and classification problems, which seek to predict results from discrete qualitative variables. For social work research, machine learning generates predictions as a key element in improving social interventions on complex social issues by providing better inference from data and establishing more precise estimated effects, for example in services that seek to improve their outcomes. This paper presents the results of a classification algorithm to predict cyberbullying among adolescents. Data were retrieved from the National Polyvictimization Survey conducted by the government of Chile in 2017. A logistic regression model was created to predict whether an adolescent would experience cyberbullying based on the interaction and behavior of gender, age, grade, type of school, and self-esteem sentiments. The model can predict with an accuracy of 59.8% whether an adolescent will suffer cyberbullying. These results can help to promote programs to prevent cyberbullying at schools and improve evidence-based practice.

Keywords: cyberbullying, evidence based practice, machine learning, social work research

Procedia PDF Downloads 168
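
The abstract above fits a logistic regression classifier and reports its accuracy. The following is a minimal sketch of that idea with a single predictor, fitted by batch gradient descent. The data, learning rate, and number of epochs are all hypothetical choices for illustration, not the survey data or the authors' procedure.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def accuracy(xs, ys, w, b, threshold=0.5):
    """Share of observations whose thresholded prediction matches the label."""
    preds = [1 if sigmoid(w * x + b) >= threshold else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)


# Hypothetical data: x is a standardized score, y is a binary outcome.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, -0.2]
ys = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0]
w, b = fit_logistic(xs, ys)
acc = accuracy(xs, ys, w, b)
print(f"w = {w:.2f}, b = {b:.2f}, training accuracy = {acc:.2f}")
```

Accuracy measured on the training data, as here, is optimistic; a held-out test set or cross-validation gives a fairer estimate.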

18589 A Research on Tourism Market Forecast and Its Evaluation

Authors: Min Wei

Abstract:

Traditional methods of forecasting the tourism market pay more attention to the accuracy of the forecasts and ignore the feasibility and operability of forecasting, which has made it difficult to test the predicted results scientifically. Applying a linear regression model, this paper attempts to construct a scientific evaluation system for predicted values, both to ensure the accuracy and stability of the predicted values and to ensure the feasibility and operability of forecasting. The findings show that a scientific evaluation system can implement the scientific concept of development and the coordinated, harmonious development of man and nature.

Keywords: linear regression model, tourism market, forecast, tourism economics

Procedia PDF Downloads 332

18588 Assessment of Level of Sedation and Associated Factors Among Intubated Critically Ill Children in Pediatric Intensive Care Unit of Jimma University Medical Center: A Fourteen Months Prospective Observation Study, 2023

Authors: Habtamu Wolde Engudai

Abstract:

Background: Sedation can be provided to facilitate a procedure or to stabilize patients admitted to the pediatric intensive care unit (PICU). Sedation is often necessary to maintain optimal care for critically ill children requiring mechanical ventilation. However, sedation that is too deep or too light has its own adverse effects, and hence it is important to monitor the level of sedation and maintain an optimal level. Objectives: The objective is to assess the level of sedation and associated factors among intubated critically ill children admitted to the PICU of JUMC, Jimma. Methods: A prospective observation study was conducted in the PICU of JUMC in September 2021 among 105 patients admitted to the PICU who were aged less than 14 and had a GCS >8. Data were collected by residents and nurses working in the PICU. Data entry was done using EpiData Manager (version 4.6.0.2). Statistical analysis and the creation of charts were performed using SPSS version 26. Data were presented as mean, percentage, and standard deviation. The assumptions of logistic regression were checked. To find potential predictors, bi-variable logistic regression was used for each predictor and outcome variable. A p value of <0.05 was considered statistically significant. Finally, findings were presented using figures, AOR, percentages, and a summary table. Results: In this study, 105 critically ill children who were started on continuous or intermittent sedative drugs were included. Sedation level was assessed using a comfort scale three times per day. Based on this observation, suboptimal sedation was found in 44.8% of patients at baseline, 36.2% at eight hours, and 24.8% at sixteen hours. There was a significant association (p < 0.05) between suboptimal sedation and both the duration of stay on mechanical ventilation and the rate of unplanned extubation; the Hosmer-Lemeshow goodness-of-fit test gave p > 0.44.

Keywords: level of sedation, critically ill children, Pediatric intensive care unit, Jimma university

Procedia PDF Downloads 60

18587 Determining Antecedents of Employee Turnover: A Study on Blue Collar vs White Collar Workers on Marco Level

Authors: Evy Rombaut, Marie-Anne Guerry

Abstract:

Predicting voluntary turnover of employees is an important topic of study, both in academia and industry. Researchers try to uncover determinants for a broader understanding and possible prevention of turnover. In the current study, we use a data set based approach to reveal determinants for turnover, differing for blue and white collar workers. Our data set based approach made it possible to study actual turnover for more than 500000 employees in 15692 Belgian corporations. We use logistic regression to calculate individual turnover probabilities and test the goodness of our model with the AUC (area under the ROC-curve) method. The results of the study confirm the relationship of known determinants to employee turnover such as age, seniority, pay and work distance. In addition, the study unravels unknown and verifies known differences between blue and white collar workers. It shows opposite relationships to turnover for gender, marital status, the number of children, nationality, and pay.

Keywords: employee turnover, blue collar, white collar, dataset analysis

Procedia PDF Downloads 291
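
The abstract above assesses a logistic regression model with the area under the ROC curve (AUC). One way to read the AUC is as the probability that a randomly chosen positive case (here, an employee who left) receives a higher predicted probability than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical turnover labels (1 = left) and predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

The pairwise loop is quadratic in the number of cases, so for a data set of the size described in the abstract a rank-based formulation would be the practical choice.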

18586 Probability Sampling in Matched Case-Control Study in Drug Abuse

Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell

Abstract:

Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users who then identified “friend controls” and the other using a random sample of non-drug users (controls) who then identified “friend cases.” Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using the bootstrap method and the predictive properties of each model using receiver operating characteristic (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. On the other hand, bootstrap analysis of the random-sample data set showed less variation and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. The areas under the ROC curves for the model derived from the random-sample data set were similar when it was fitted to either data set (0.93 for random-sample data vs. 0.91 for snowball-sample data, p=0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to snowball sampling.

Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling

Procedia PDF Downloads 493
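
The abstract above uses 100 bootstrap samples to judge the precision of model coefficients. The sketch below shows the resampling idea on a much simpler statistic, the sample mean, as a stand-in for a regression coefficient. The data and the function name are hypothetical.

```python
import random
from statistics import mean, stdev


def bootstrap_se(data, statistic, n_resamples=100, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return stdev(estimates)


# Hypothetical risk scores (illustrative only).
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9, 3.1, 2.5]
se = bootstrap_se(sample, mean)
print(f"bootstrap standard error of the mean = {se:.3f}")
```

For a regression model, `statistic` would refit the model on each resample and return the coefficient of interest; for matched case-control data, whole matched sets would need to be resampled together.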

18585 A Statistical Approach to Predict and Classify the Commercial Hatchability of Chickens Using Extrinsic Parameters of Breeders and Eggs

Authors: M. S. Wickramarachchi, L. S. Nawarathna, C. M. B. Dematawewa

Abstract:

Hatchery performance is critical for the profitability of poultry breeder operations. Some extrinsic parameters of eggs and breeders can increase or decrease hatchability. This study aims to identify the extrinsic parameters that affect the commercial hatchability of local chicken eggs and to determine the most efficient classification model with a hatchability rate greater than 90%. In this study, seven extrinsic parameters were considered: egg weight, moisture loss, breeder age, number of fertilised eggs, shell width, shell length, and shell thickness. Multiple linear regression was performed to determine the variable with the greatest influence on hatchability. First, the correlation between each parameter and hatchability was checked. Then a multiple regression model was developed, and the accuracy of the fitted model was evaluated. Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), k-Nearest Neighbors (kNN), Support Vector Machines (SVM) with a linear kernel, and Random Forest (RF) algorithms were applied to classify hatchability. This grouping process was conducted using binary classification techniques. Hatchability was negatively correlated with egg weight, breeder age, shell width, and shell length, and positively correlated with moisture loss, number of fertilised eggs, and shell thickness. Multiple linear regression models were more accurate than single linear models, with the highest coefficient of determination (R² = 94%) and the lowest AIC and BIC values. According to the classification results, RF, CART, and kNN achieved the highest accuracy values of 0.99, 0.975, and 0.972, respectively, for the commercial hatchery process. Therefore, RF is the most appropriate machine learning algorithm for classifying breeder outcomes as economically profitable or not in a commercial hatchery.

Keywords: classification models, egg weight, fertilised eggs, multiple linear regression

Procedia PDF Downloads 87

18584 Forecasting Equity Premium Out-of-Sample with Sophisticated Regression Training Techniques

Authors: Jonathan Iworiso

Abstract:

Forecasting the equity premium out-of-sample is a major concern to researchers in finance and emerging markets. The quest for a superior model that can forecast the equity premium with significant economic gains has resulted in several controversies among scholars over the choice of variables and suitable techniques. This research focuses mainly on the application of Regression Training (RT) techniques to forecast the monthly equity premium out-of-sample recursively with an expanding window method. A broad category of sophisticated regression models involving model complexity was employed. The RT models, which include Ridge, Forward-Backward (FOBA) Ridge, Least Absolute Shrinkage and Selection Operator (LASSO), Relaxed LASSO, Elastic Net, and Least Angle Regression, were trained and used to forecast the equity premium out-of-sample. In this study, the empirical investigation of the RT models demonstrates significant evidence of equity premium predictability both statistically and economically relative to the benchmark historical average, delivering significant utility gains. They seek to provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk. Thus, the forecasting models appeared to guarantee an investor in a market setting who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts at minimal risk.

Keywords: regression training, out-of-sample forecasts, expanding window, statistical predictability, economic significance, utility gains

Procedia PDF Downloads 107
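
The abstract above evaluates recursive, expanding-window forecasts against the historical-average benchmark. The sketch below builds that benchmark and scores a competing forecast with an out-of-sample R² defined as one minus the ratio of squared forecast errors. That particular statistic is a common choice in this literature but is our assumption here, not something the abstract states, and all numbers are hypothetical.

```python
def expanding_mean_forecasts(returns, min_window):
    """Forecast each period t >= min_window with the mean of returns[:t]."""
    return [sum(returns[:t]) / t for t in range(min_window, len(returns))]


def out_of_sample_r2(actual, model_fc, benchmark_fc):
    """1 - SSE(model) / SSE(benchmark); positive values favour the model."""
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, model_fc))
    sse_bench = sum((a - f) ** 2 for a, f in zip(actual, benchmark_fc))
    if sse_bench == 0:
        raise ValueError("benchmark forecasts are perfect; R2 is undefined")
    return 1.0 - sse_model / sse_bench


# Hypothetical monthly equity premia and hypothetical model forecasts.
premia = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01, 0.04, 0.01]
min_window = 4
benchmark = expanding_mean_forecasts(premia, min_window)
actual = premia[min_window:]
model = [0.015, -0.005, 0.03, 0.012]
r2_os = out_of_sample_r2(actual, model, benchmark)
print(f"out-of-sample R2 = {r2_os:.3f}")
```

Each benchmark forecast uses only data available before the period being forecast, which is what makes the comparison out-of-sample.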

18583 Landslide Susceptibility Mapping Using Soft Computing in Amhara Saint

Authors: Semachew M. Kassa, Africa M Geremew, Tezera F. Azmatch, Nandyala Darga Kumar

Abstract:

Because landslides can seriously harm both the environment and society, frequency ratio (FR) and analytical hierarchy process (AHP) methods have been developed to map landslide susceptibility based on past landslide failure points. However, it is still difficult to select the most efficient method and correctly identify the main driving factors for particular regions. In this study, we used fourteen landslide conditioning factors (LCFs) and five soft computing algorithms, namely Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), Artificial Neural Network (ANN), and Naïve Bayes (NB), to predict landslide susceptibility at a 12.5 m spatial scale. According to the classification results based on inventory landslide points, the performance of the RF (F1-score: 0.88, AUC: 0.94), ANN (F1-score: 0.85, AUC: 0.92), and SVM (F1-score: 0.82, AUC: 0.86) methods was significantly better than that of the LR (F1-score: 0.75, AUC: 0.76) and NB (F1-score: 0.73, AUC: 0.75) methods. The findings also showed that around 35% of the study region was made up of places with high and very high landslide risk (susceptibility greater than 0.5). The very high-risk locations were primarily found in the western and southeastern regions, and all five models showed good agreement and similar geographic distribution patterns in landslide susceptibility. The towns with the highest landslide risk include Amhara Saint Town's western part, the Northern part, and St. Gebreal Church villages, with mean susceptibility values greater than 0.5. Rainfall, distance to road, and slope were typically among the leading factors for most villages. The primary contributing factors to landslide vulnerability varied slightly across the five models. Decision-makers and policy planners can use the information from our study to make informed decisions and establish policies. It also suggests that different places should take different safeguards to reduce or prevent serious damage from landslide events.

Keywords: artificial neural network, logistic regression, landslide susceptibility, naïve Bayes, random forest, support vector machine

Procedia PDF Downloads 82

18582 Employee Aggression, Labeling and Emotional Intelligence

Authors: Martin Popescu D. Dana Maria

Abstract:

The aim of this research is to broaden the study of the relationship between emotional intelligence and counterproductive work behavior (CWB). The study sample consisted of 441 Romanian employees from companies all over the country. Data were collected through web surveys and processed with SPSS. The results indicated an average correlation between the two constructs and their sub-variables: employees with a high level of emotional intelligence tend to be less aggressive. In addition, labeling was considered an individual difference that has the power to influence the level of employee aggression. A regression model was used to underline the importance of emotional intelligence together with labeling as predictors of CWB. Results have shown that this regression model supports the assumption that labeling and emotional intelligence, taken together, predict CWB. Employees who label themselves as victims and have a low degree of emotional intelligence have a higher level of CWB.

Keywords: aggression, CWB, emotional intelligence, labeling

Procedia PDF Downloads 473

18581 Modelling Agricultural Commodity Price Volatility with Markov-Switching Regression, Single Regime GARCH and Markov-Switching GARCH Models: Empirical Evidence from South Africa

Authors: Yegnanew A. Shiferaw

Abstract:

Background: Commodity price volatility originating from excessive commodity price fluctuation has been a global problem, especially after the recent financial crises. Volatility is a measure of risk or uncertainty in financial analysis. It plays a vital role in risk management, portfolio management, and pricing equity. Objectives: The core objective of this paper is to examine the relationship between the prices of agricultural commodities and the oil price, gas price, coal price, and exchange rate (USD/Rand). In addition, the paper tries to fit an appropriate model that best describes the log return price volatility and to estimate Value-at-Risk and expected shortfall. Data and methods: The data used in this study are the daily returns of agricultural commodity prices from 2 January 2007 to 31 October 2016. The data set consists of the daily returns of the following agricultural commodity prices: white maize, yellow maize, wheat, sunflower, soya, corn, and sorghum. The paper applies the three-state Markov-switching (MS) regression, the standard single-regime GARCH, and the two-regime Markov-switching GARCH (MS-GARCH) models. Results: To choose the best-fitting model, the log-likelihood function, Akaike information criterion (AIC), Bayesian information criterion (BIC), and deviance information criterion (DIC) are employed under three distributions for innovations. The results indicate that: (i) the price of agricultural commodities was found to be significantly associated with the price of coal, price of natural gas, price of oil, and exchange rate; (ii) for all agricultural commodities except sunflower, k=3 had higher log-likelihood values and lower AIC and BIC values, so the three-state MS regression model outperformed the two-state MS regression model; (iii) MS-GARCH(1,1) with generalized error distribution (ged) innovation performs best for white maize and yellow maize; MS-GARCH(1,1) with student-t distribution (std) innovation performs better for sorghum; MS-gjrGARCH(1,1) with ged innovation performs better for wheat, sunflower, and soya; and MS-GARCH(1,1) with std innovation performs better for corn. In conclusion, this paper provides a practical guide for modelling agricultural commodity prices by MS regression and MS-GARCH processes. It can serve as a reference when modelling agricultural commodity prices.

Keywords: commodity prices, MS-GARCH model, MS regression model, South Africa, volatility

Procedia PDF Downloads 202
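
The abstract above selects among GARCH-type models using the log-likelihood, AIC, and BIC. The sketch below covers only the simplest ingredient, a single-regime GARCH(1,1) variance recursion with Gaussian innovations, plus the two information criteria. The parameter values and returns are hypothetical and are not estimated; the Markov-switching and non-Gaussian variants used in the paper are not reproduced here.

```python
import math


def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances: s2[t] = omega + alpha * r[t-1]**2 + beta * s2[t-1]."""
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    sigma2 = [omega / (1.0 - alpha - beta)]  # start at the unconditional variance
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


def gaussian_loglik(returns, sigma2):
    """Log-likelihood of zero-mean returns under normal innovations."""
    return sum(-0.5 * (math.log(2 * math.pi * s) + r * r / s)
               for r, s in zip(returns, sigma2))


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


# Hypothetical daily log returns and hypothetical (not estimated) parameters.
returns = [0.012, -0.008, 0.021, -0.030, 0.004, 0.015, -0.011, 0.002]
sigma2 = garch11_variance(returns, omega=0.00002, alpha=0.1, beta=0.85)
ll = gaussian_loglik(returns, sigma2)
print(f"logL = {ll:.2f}, AIC = {aic(ll, 3):.2f}, BIC = {bic(ll, 3, len(returns)):.2f}")
```

Lower AIC and BIC indicate a better trade-off between fit and parameter count, which is the comparison rule applied in the abstract.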

18580 Exploring the Applications of Neural Networks in the Adaptive Learning Environment

Authors: Baladitya Swaika, Rahul Khatry

Abstract:

Computer Adaptive Tests (CATs) are one of the most efficient ways of testing the cognitive abilities of students. CATs are based on Item Response Theory (IRT), which relies on item selection and ability estimation using the statistical methods of maximum information selection/selection from the posterior and maximum-likelihood (ML)/maximum a posteriori (MAP) estimators, respectively. This study aims to combine classical and Bayesian approaches to IRT to create a dataset, which is then fed to a neural network that automates the process of ability estimation, and to compare the result to traditional CAT models designed using IRT. This study uses Python as the base coding language, pymc for statistical modelling of the IRT, and scikit-learn for neural network implementations. On creation of the model and on comparison, it is found that the neural network based model performs 7-10% worse than the IRT model for score estimations. Although it performs poorly compared to the IRT model, the neural network model can be beneficially used in back-ends to reduce time complexity, as the IRT model would have to re-calculate the ability every time it gets a request, whereas the prediction from a neural network could be done in a single step for an existing trained regressor. This study also proposes a new kind of framework whereby the neural network model could be used to incorporate feature sets other than the normal IRT feature set and use a neural network’s capacity for learning unknown functions to give rise to better CAT models. Categorical features such as test type could be learnt and incorporated in IRT functions with the help of techniques like logistic regression, and can be used to learn functions, expressed as models, that may not be trivial to express via equations. This kind of framework, when implemented, would be highly advantageous in psychometrics and cognitive assessments. This study gives a brief overview of how neural networks can be used in adaptive testing, not only by reducing time complexity but also by being able to incorporate newer and better datasets, which would eventually lead to higher quality testing.

Keywords: computer adaptive tests, item response theory, machine learning, neural networks

Procedia PDF Downloads 175
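
The abstract above mentions IRT-based item selection by maximum information. As a rough sketch, the code below assumes a two-parameter logistic (2PL) item model, which the abstract does not specify, and picks the item with the highest Fisher information at the current ability estimate. The item parameters are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_item(theta, items):
    """Return the index of the item with maximum information at theta."""
    return max(range(len(items)), key=lambda i: item_information(theta, *items[i]))


# Hypothetical item bank of (discrimination, difficulty) pairs.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.5)]
best = select_item(0.1, items)
print(f"next item index = {best}")
```

In a real CAT loop the ability estimate would be updated after each response (for example by ML or MAP, as the abstract notes) before the next item is selected.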

18579 Non-Parametric Regression over Its Parametric Counterparts with Large Sample Size

Authors: Jude Opara, Esemokumo Perewarebo Akpos

Abstract:

This paper compares non-parametric linear regression with its parametric counterpart for a large sample size. A data set of anthropometric measurements of primary school pupils was taken for the analysis. The study used 50 randomly selected pupils. The data were subjected to a normality test using the Anderson-Darling technique, which showed that the residuals from the commonly used least squares regression method for fitting an equation to a set of (x, y) data points are not normally distributed (i.e., they do not follow a Gaussian distribution). The algorithms for the nonparametric Theil’s regression are stated in this paper, as well as its parametric OLS counterpart. The analysis was carried out using the programming software known as “R Development”. The results showed that there exists a significant relationship between the response and the explanatory variable for both the parametric and non-parametric regression. To assess the efficiency of one method over the other, the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were used, and the nonparametric regression was found to perform better than its parametric counterpart, having lower values of both the AIC and BIC. The study recommends that future researchers carry out similar work that examines the presence of outliers in the data set, removes them if detected, and re-analyzes the data to compare results.

Keywords: Theil’s regression, Bayesian information criterion, Akaike information criterion, OLS

Procedia PDF Downloads 305
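
The abstract above refers to Theil’s nonparametric regression without stating its steps. A minimal sketch of one common variant follows: the slope is the median of all pairwise slopes, and the intercept is the median of the residuals y - slope * x. The choice of intercept rule is our assumption, and the data are hypothetical, with one deliberate outlier.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Theil's estimator: median pairwise slope, median-residual intercept."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
              if x2 != x1]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Hypothetical measurements; the last point is an outlier.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 100]
slope, intercept = theil_sen(xs, ys)
print(f"y = {intercept} + {slope} * x")
```

On these points the fit follows the four collinear observations and is not pulled toward the outlier, which is the robustness property that motivates the method when residuals are not normal.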

18578 Internet Addiction among Students: An Empirical Study in Pondicherry University

Authors: Mashood C., Abdul Vahid K., Ashique C. K.

Abstract:

Technology is growing beyond human expectation. The internet is one of the most sophisticated products of information technology. It has various advantages, such as connecting the world and simplifying tasks that were difficult in the past. At the same time, it has demerits, namely a lack of authenticity and internet addiction. To investigate the problem of internet addiction, a study was conducted among the postgraduate students of Pondicherry University, and 454 samples were collected. The study focused strictly on identifying internet addiction among students and the influence and interdependence of personality on internet addiction among first-year and second-year students. To evaluate this, we used two major analyses: Confirmatory Factor Analysis (CFA) to predict internet addiction from the observed data, and Logistic Regression to identify the difference between first-year and second-year students in internet addiction. Before the core analysis, the data were subjected to some preliminary tests to check the model fit. The empirical findings show that the students of Pondicherry University are very much addicted to the internet, but there is no large difference between first-year and second-year students in internet addiction.

Keywords: internet addiction, students, Pondicherry University, empirical study

Procedia PDF Downloads 459

18577 Association Between Short-term NOx Exposure and Asthma Exacerbations in East London: A Time Series Regression Model

Authors: Hajar Hajmohammadi, Paul Pfeffer, Anna De Simoni, Jim Cole, Chris Griffiths, Sally Hull, Benjamin Heydecker

Abstract:

Background: There is strong interest in the relationship between short-term air pollution exposure and human health. Most studies in this field focus on serious health effects such as death or hospital admission, but air pollution exposure affects many people with less severe impacts, such as exacerbations of respiratory conditions. A lack of quantitative analysis and inconsistent findings suggest improved methodology is needed to understand these effects more fully. Method: We developed a time series regression model to quantify the relationship between daily NOₓ concentration and asthma exacerbations requiring oral steroids from primary care settings. Explanatory variables include daily NOₓ concentration measurements extracted from 8 available background and roadside monitoring stations in east London and daily ambient temperature extracted for London City Airport, located in east London. Lags of NOₓ concentrations up to 21 days (3 weeks) were used in the model. The dependent variable was the daily number of oral steroid courses prescribed for GP registered patients with asthma in east London. A mixed distribution model was then fitted to the significant lags of the regression model. Result: Results of the time series modelling showed a significant relationship between NOₓ concentrations on each day and the number of oral steroid courses prescribed in the following three weeks. In addition, the model using only roadside stations performs better than the model with a mixture of roadside and background stations.
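To make the lag structure concrete, the sketch below builds the kind of design rows a lagged regression would use: for each day, the same-day value followed by the values from the preceding days. It illustrates only the lagging step, not the authors' model. The function name and the toy series are hypothetical.

```python
def lagged_rows(series, max_lag):
    """Return one row per day t: [x_t, x_(t-1), ..., x_(t-max_lag)].

    Days without a full history of max_lag earlier values are dropped.
    """
    return [
        [series[t - k] for k in range(max_lag + 1)]
        for t in range(max_lag, len(series))
    ]


if __name__ == "__main__":
    # Toy daily concentrations (made-up values); the study used max_lag=21.
    daily = [30, 42, 38, 55, 47]
    for row in lagged_rows(daily, max_lag=2):
        print(row)
```

With 21 lags, each row would hold 22 values, and the first 21 days of the series would not contribute a row.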

Keywords: air pollution, time series modeling, public health, road transport

Procedia PDF Downloads 142

18576 Factors Associated with Recruitment and Adherence for Virtual Mindfulness Interventions in Youths

Authors: Kimberly Belfry, Shavon Stafford, Fariha Chowdhury, Jennifer Crawford, Soyeon Kim

Abstract:

Intervention programs are mostly delivered online during the pandemic. Screen fatigue has become a significant deterrent for virtually delivered interventions, and thus we aimed to examine factors associated with recruitment and adherence to an online mindfulness program for youths. Our preliminary analysis indicated that 40% of interested youths enrolled in the program. No difference in gender and age was found for those enrolled in the program. The adherence rate was approximately 25%, which warrants further examination. Building on the preliminary findings, we will conduct a binary logistic regression analysis to identify elements associated with recruitment and adherence. The model will include predictors such as age, sex, recruiter, mental health status, and time of the year. Odds ratios and 95% CIs will be reported. Our preliminary analysis showed low recruitment and adherence rates. By identifying elements associated with recruitment and adherence, our study provides transferable information that can improve recruitment and adherence for online-delivered interventions offered during the pandemic.

Keywords: virtual interventions, recruitment, youth, mindfulness

Procedia PDF Downloads 147

18575 The Strengths and Limitations of the Statistical Modeling of Complex Social Phenomenon: Focusing on SEM, Path Analysis, or Multiple Regression Models

Authors: Jihye Jeon

Abstract:

This paper analyzes the conceptual framework of three statistical methods: multiple regression, path analysis, and structural equation models. When establishing a research model for the statistical modeling of complex social phenomena, it is important to know the strengths and limitations of these three statistical models. This study explored the characteristics, strengths, and limitations of each modeling approach and suggested some strategies for accurately explaining or predicting the causal relationships among variables. In particular, common mistakes in research modeling in the study of depression or mental health were discussed.

Keywords: multiple regression, path analysis, structural equation models, statistical modeling, social and psychological phenomenon

Procedia PDF Downloads 652

18574 Impact Logistic Management to Reduce Costs

Authors: Waleerak Sittisom

Abstract:

The objectives of this research were to analyze transportation route management and to identify potential cost reductions in logistic operations. In-depth interview techniques and small group discussions were utilized with 25 participants from various backgrounds in the areas of logistics. The findings revealed four areas in which companies are able to manage logistic cost reduction effectively: managing the space within the transportation vehicles, managing transportation personnel, managing transportation cost, and managing control of transportation. On the other hand, there were four areas in which companies were unable to manage logistic cost reduction effectively: the working process of transportation, the route planning of transportation, the service point management, and technology management. There are five areas in which cost reduction is feasible: personnel management, process of working, map planning, service point planning, and technology implementation. To be able to reduce costs, the transportation companies should suggest that customers use a file system to save truck space. Also, the transportation companies need to adopt new technology to manage their information system so that packages can be reached easily, safely, and quickly. Staff need to be trained regularly to increase knowledge and skills. Teamwork is required to reduce the costs effectively.

Keywords: cost reduction, management, logistics, transportation

Procedia PDF Downloads 498

18573 Sero-Prevalence of Hepatitis B Surface Antigen and Associated Factors among Pregnant Mothers Attending Antenatal Care Service, Mekelle, Ethiopia: Evidence from Institutional Based Quantitative Cross-Sectional Study

Authors: Semaw A., Awet H., Yohannes M.

Abstract:

Background: Hepatitis B Virus (HBV) is a major global public health problem. Individuals living in Sub-Saharan Africa have a 60% lifetime risk of acquiring HBV infection. Evidence shows that 80-90% of those born to infected mothers develop chronic HBV. Perinatal HBV transmission is a major determinant of HBV carrier status and its chronic sequelae, and it maintains HBV transmission across generations. Method: An institution-based cross-sectional study was conducted among 406 pregnant mothers attending antenatal clinics at Mekelle and Ayder referral hospitals from January 30 to April 1, 2014. Epidata version 3.1 was used for data entry, and SPSS version 21 statistical software was used for data cleaning and management and, finally, to determine factors associated with hepatitis B surface antigen, adjusting for important confounders using multivariable logistic regression analysis at the 5% level of significance. Result: The overall prevalence of hepatitis B surface antigen among pregnant women was 33 (8.1%). The socio-demographic characteristics of the study population showed high positivity among secondary school 189 (46.6%). In the multivariable logistic regression analysis, history of contact with individuals who had a history of hepatitis B infection or jaundice and a lifetime history of multiple sexual partners were found to be significantly associated with HBsAg positivity, at AOR = 3.73, 95% CI (1.373-10.182) and AOR = 2.57, 95% CI (1.173-5.654), respectively. Moreover, the Human Immunodeficiency Virus (HIV) and HBV coinfection rate was found to be 3.6%. Conclusion: This study has shown that HBV is highly prevalent (8.1%) among pregnant women in the study area. Contact with individuals who had a history of hepatitis or jaundice and a report of multiple lifetime sexual partners were associated with hepatitis B infection. Education about HBV transmission and prevention, as well as screening of all pregnant mothers, should be pursued to reduce the serious public health crisis of HBV.
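The reported odds ratios and intervals can be sanity-checked under one assumption: that they are Wald intervals, which are symmetric on the log-odds scale. The abstract does not state how SPSS computed them. Under that assumption, the point estimate is the geometric mean of the two bounds, and the standard error of the log odds ratio can be recovered from the interval width. The function name is hypothetical.

```python
import math


def wald_summary(lower, upper, z=1.96):
    """Recover the odds ratio and SE(log OR) implied by a Wald interval.

    Assumes the interval was built as exp(log(OR) +/- z * SE).
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    point = math.exp((log_lo + log_hi) / 2)  # geometric mean of the bounds
    se = (log_hi - log_lo) / (2 * z)         # half-width on the log scale / z
    return point, se


if __name__ == "__main__":
    # Intervals as reported in the abstract above.
    for label, lo, hi in [("contact history", 1.373, 10.182),
                          ("multiple partners", 1.173, 5.654)]:
        point, se = wald_summary(lo, hi)
        print(f"{label}: implied OR = {point:.2f}, SE(log OR) = {se:.3f}")
```

Both implied point estimates agree with the reported AORs of 3.73 and 2.57 to within rounding, which is consistent with the Wald assumption.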

Keywords: HBsAg, hepatitis B, pregnant women, prevalence

Procedia PDF Downloads 340

18572 Predicting Options Prices Using Machine Learning

Authors: Krishang Surapaneni

Abstract:

The goal of this project is to determine how to predict important aspects of options, including the ask price. We want to compare different machine learning models to learn the best model and the best hyperparameters for that model for this purpose and data set. Option pricing is a relatively new field, and it can be very complicated and intimidating, especially to inexperienced people, so we want to create a machine learning model that can predict important aspects of an option stock, which can aid in future research. We tested multiple different models and experimented with hyperparameter tuning, trying to find some of the best parameters for a machine-learning model. We tested three different models: a Random Forest Regressor, a linear regressor, and an MLP (multi-layer perceptron) regressor. The most important feature in this experiment is the ask price; this is what we were trying to predict. In the field of stock pricing prediction, there is a large potential for error, so we are unable to determine the accuracy of the models based on whether they predict the pricing perfectly. For this reason, we determined the accuracy of each model by finding the average percentage difference between the predicted and actual values in the testing data. The linear regression model performed worst, with an average percentage error of 17.46%. The MLP regressor had an average percentage error of 11.45%, and the random forest regressor had an average percentage error of 7.42%.
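The accuracy measure described above, the average percentage difference between predicted and actual values, can be sketched as follows. This assumes the common mean absolute percentage error definition; the abstract does not give the exact formula. The function name and the prices are hypothetical.

```python
def average_percentage_error(actual, predicted):
    """Mean of |predicted - actual| / |actual|, expressed in percent.

    Assumes no actual value is zero (true for ask prices).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    ratios = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Made-up ask prices and model predictions.
    actual_prices = [100.0, 200.0, 50.0]
    predicted_prices = [110.0, 180.0, 50.0]
    print(average_percentage_error(actual_prices, predicted_prices))
```

Because each error is scaled by the actual price, cheap and expensive options contribute equally, which suits a data set where ask prices span a wide range.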

Keywords: finance, linear regression model, machine learning model, neural network, stock price

Procedia PDF Downloads 75

18571 Detecting Cyberbullying, Spam and Bot Behavior and Fake News in Social Media Accounts Using Machine Learning

Authors: M. D. D. Chathurangi, M. G. K. Nayanathara, K. M. H. M. M. Gunapala, G. M. R. G. Dayananda, Kavinga Yapa Abeywardena, Deemantha Siriwardana

Abstract:

Given the growing popularity of social media platforms, there are various concerns, chiefly cyberbullying, spam, bot accounts, and the spread of incorrect information. With the aim of developing a risk score calculation system as a thorough method for identifying and exposing unethical social media profiles, this research explores the algorithms that are, to the best of our knowledge, most suitable for detecting these concerns. Multiple models, such as Naïve Bayes, CNN, KNN, Stochastic Gradient Descent, Gradient Boosting Classifier, etc., were examined, and the best-performing ones were used in the development of the risk score system. For cyberbullying, the Logistic Regression algorithm achieved an accuracy of 84.9%, while the spam-detecting MLP model achieved 98.02% accuracy. The Random Forest algorithm for identifying bot accounts obtained 91.06% accuracy, and 84% accuracy was obtained for fake news detection using SVM.

Keywords: cyberbullying, spam behavior, bot accounts, fake news, machine learning

Procedia PDF Downloads 36

18570 Estimating Anthropometric Dimensions for Saudi Males Using Artificial Neural Networks

Authors: Waleed Basuliman

Abstract:

Anthropometric dimensions are considered one of the important factors when designing human-machine systems. In this study, the estimation of anthropometric dimensions has been improved by using an Artificial Neural Network (ANN) model that is able to predict the anthropometric measurements of Saudi males in Riyadh City. A total of 1427 Saudi males aged 6 to 60 years participated in measuring 20 anthropometric dimensions. These anthropometric measurements are considered important for designing work and life applications in Saudi Arabia. The data were collected over eight months from different locations in Riyadh City. Five of these dimensions were used as predictor variables (inputs) of the model, and the remaining 15 dimensions were set to be the measured variables (the model’s outcomes). The hidden layers varied during the structuring stage, and the best performance was achieved with the network structure 6-25-15. The results showed that the developed neural network model was able to estimate the body dimensions of the Saudi male population in Riyadh City. The network's mean absolute percentage error (MAPE) and root mean squared error (RMSE) were found to be 0.0348 and 3.225, respectively. These errors were lower, and therefore better, than those reported in the literature. Finally, the accuracy of the developed neural network was evaluated by comparing its predicted outcomes with those of a regression model. The ANN model showed a higher coefficient of determination (R2) between the predicted and actual dimensions than the regression model.
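For readers who want to reproduce this style of comparison, the sketch below computes two of the reported metrics for a single predicted dimension. It assumes the standard definitions (RMSE as the square root of the mean squared residual, R2 as one minus the ratio of residual to total sum of squares); the abstract does not state which definitions were used. The function names and the sample measurements are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up measurements (e.g. in cm) and two sets of model predictions.
    measured = [160.0, 165.0, 170.0, 175.0]
    model_a = [161.0, 164.0, 171.0, 174.0]
    model_b = [158.0, 168.0, 166.0, 178.0]
    for name, preds in [("model A", model_a), ("model B", model_b)]:
        print(name, round(rmse(measured, preds), 3),
              round(r_squared(measured, preds), 3))
```

A model with lower RMSE on the same data also has higher R2 under these definitions, since both depend on the same residual sum of squares.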

Keywords: artificial neural network, anthropometric measurements, back-propagation

Procedia PDF Downloads 487

18569 Modeling Geogenic Groundwater Contamination Risk with the Groundwater Assessment Platform (GAP)

Authors: Joel Podgorski, Manouchehr Amini, Annette Johnson, Michael Berg

Abstract:

One-third of the world’s population relies on groundwater for its drinking water. Natural geogenic arsenic and fluoride contaminate ~10% of wells. Prolonged exposure to high levels of arsenic can result in various internal cancers, while high levels of fluoride are responsible for the development of dental and crippling skeletal fluorosis. In poor urban and rural settings, the provision of drinking water free of geogenic contamination can be a major challenge. In order to efficiently apply limited resources in the testing of wells, water resource managers need to know where geogenically contaminated groundwater is likely to occur. The Groundwater Assessment Platform (GAP) fulfills this need by providing state-of-the-art global arsenic and fluoride contamination hazard maps as well as enabling users to create their own groundwater quality models. The global risk models were produced by logistic regression of arsenic and fluoride measurements using predictor variables of various soil, geological and climate parameters. The maps display the probability of encountering concentrations of arsenic or fluoride exceeding the World Health Organization’s (WHO) stipulated concentration limits of 10 µg/L or 1.5 mg/L, respectively. In addition to a reconsideration of the relevant geochemical settings, these second-generation maps represent a great improvement over the previous risk maps due to a significant increase in data quantity and resolution. For example, there is a 10-fold increase in the number of measured data points, and the resolution of predictor variables is generally 60 times greater. These same predictor variable datasets are available on the GAP platform for visualization as well as for use with a modeling tool. The latter requires that users upload their own concentration measurements and select the predictor variables that they wish to incorporate in their models. 
In addition, users can upload additional predictor variable datasets either as features or coverages. Such models can represent an improvement over the global models already supplied, since (a) users may be able to use their own, more detailed datasets of measured concentrations and (b) the various processes leading to arsenic and fluoride groundwater contamination can be isolated more effectively on a smaller scale, thereby resulting in a more accurate model. All maps, including user-created risk models, can be downloaded as PDFs. There is also the option to share data in a secure environment as well as the possibility to collaborate in a secure environment through the creation of communities. In summary, GAP provides users with the means to reliably and efficiently produce models specific to their region of interest by making available the latest datasets of predictor variables along with the necessary modeling infrastructure.
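To illustrate how a logistic regression model turns predictor values into the exceedance probabilities shown on such maps, here is a minimal sketch. The coefficients, the intercept, and the predictor values are invented for illustration; they are not taken from GAP, and the function name is hypothetical.

```python
import math


def exceedance_probability(predictors, coefficients, intercept):
    """Logistic model: P(concentration > limit) = 1 / (1 + exp(-z)),
    where z = intercept + sum(coefficient_i * predictor_i)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Hypothetical standardized predictors for one map cell,
    # e.g. a soil parameter and a climate parameter.
    cell = [0.8, -1.2]
    # Hypothetical fitted coefficients and intercept.
    betas = [1.5, 0.4]
    beta0 = -2.0
    p = exceedance_probability(cell, betas, beta0)
    print(f"Estimated probability of exceeding the limit: {p:.2f}")
```

Evaluating this function for every cell of a predictor grid yields a probability surface of the kind the abstract describes.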

Keywords: arsenic, fluoride, groundwater contamination, logistic regression

Procedia PDF Downloads 348

18568 Psychological Impact of the COVID-19 Pandemic on Health Care Workers in Tunisia: Risk and Protective Factor

Authors: Ahmed Sami Hammami, Mohamed Jellazi

Abstract:

Background: The aim of the study is to evaluate the magnitude of different psychological outcomes among Tunisian health care professionals (HCP) during the COVID-19 pandemic and to identify the associated factors. Methods: HCP completed a cross-sectional questionnaire from April 4th to April 28th, 2020. The survey collected demographic information, factors that may interfere with the psychological outcomes, behavior changes and mental health measurements. The latter were assessed through 3 scales: the 7-item Insomnia Severity Index, the 2-item Patient Health Questionnaire and the 2-item Generalized Anxiety Disorder scale. Multivariable logistic regression was conducted to identify factors associated with psychological outcomes. Results: A total of 503 HCP successfully completed the survey; among those, n=493 consented to enroll in the study, 411 [83.4%] were physicians, 323 [64.2%] were women and 271 [55%] had a second-line working position. A significant proportion of HCP had anxiety (35.7%), depression (35.1%) and insomnia (23.7%). Females, those with a psychiatric history and those using public transport exhibited the highest proportions for overall symptoms compared to other groups (e.g., depression among females vs. males: 44.9% vs. 18.2%; P=0.00). Those with a previous medical history and nurses had more anxiety and insomnia compared to other groups (e.g., anxiety among nurses vs. interns/residents vs. attending: 45.1% vs. 36.1% vs. 27.5%; P=0.04). Multivariable logistic regression showed that female gender was a risk factor for all psychological outcomes (e.g., female sex was associated with higher odds of anxiety: OR=2.86; 95% confidence interval [CI], 1.78-4.60; P=0.00), whereas having a psychiatric history was a risk factor for both anxiety and insomnia (e.g., for insomnia: OR=2.86; 95% CI, 1.78-4.60; P=0.00). Having protective equipment was associated with lower risk for depression (OR=0.41; 95% CI, 0.27-0.62; P=0.00) and anxiety. 
Physical activity was also protective against depression and anxiety (OR=0.41; 95% CI, 0.25-0.67; P=0.00). Conclusion: Psychological symptoms are usually undervalued among HCP, though the COVID-19 pandemic played a major role in exacerbating this burden. Prompt psychological support should be endorsed, and simple measures such as physical activity and ensuring the necessary protection are paramount to improve mental health outcomes and the quality of care provided to patients.

Keywords: COVID-19 pandemic, health care professionals, mental health, protective factors, psychological symptoms, risk factors

Procedia PDF Downloads 195

18567 Factors Affecting Bus Use as a Sustainable Mode of Transportation: Insights from Kerman, Iran

Authors: Fatemeh Rahmani, Navid Nadimi, Vahid Khalifeh

Abstract:

In the near future, cities with medium populations will face traffic congestion, air pollution, high fuel consumption, and noise pollution. It is possible to improve the sustainability of cities by utilizing public transportation. A study of the factors that influence citizens' bus usage in medium-sized cities is presented in this paper. For this purpose, Kerman's citizens were surveyed online. The model was based on a binary logistic regression. A descriptive analysis revealed that simple measures like renewing the fleet, upgrading the stations, establishing a schedule program, and cleaning the buses could improve passenger satisfaction. In addition, the modeling results showed that future traffic congestion can be prevented by implementing road and parking lot pricing plans. Further, as the number and length of trips increases, the probability of citizens taking the bus increases. In conclusion, Kerman's bus system is both secure and fast, but these two characteristics can be improved to increase bus ridership.

Keywords: sustainability, transportation, bus, congestion, satisfaction

Procedia PDF Downloads 10

18566 On Increase and Development Prospects of Competitiveness of Georgia’s Transport-Logistical System on the Contemporary Stage

Authors: Ketevan Goletiani

Abstract:

Multimodal transport is Europe-Asia’s rational decision of the XXI century. The prerequisite for the success of this form of cargo carriage is not a technological decision but a comprehensive attitude towards it. Integration of the transport industry must cover both the technical and the organizational-economic fields. Support for multimodal transport must be a priority of transport policy in the different organizations of Europe and Asia. The approach to transport as a unified system has changed to a certain extent under market conditions. Nowadays, competition between the different kinds of transport is not to be considered as competition of one kind of transport against another, but as a stimulator of transport development. Basically, transport logistics, as the recent methodology and organization of the rational flow of cargo through specialized logistic centres during its processing, makes such cargo flows more effective, decreases non-operating expenses, and gives transport companies the opportunity to keep up with the times and to meet the requirements of market clients. Accordingly, the advanced transport-forwarding and logistics firms are analyzed.

Keywords: transport systems, multimodal transport, competition, transport logistics

Procedia PDF Downloads 437

18565 On Differential Growth Equation to Stochastic Growth Model Using Hyperbolic Sine Function in Height/Diameter Modeling of Pines

Authors: S. O. Oyamakin, A. U. Chukwu

Abstract:

Richard's growth equation, being a generalized logistic growth equation, was improved upon by introducing an allometric parameter using the hyperbolic sine function. The integral solution to this was called the hyperbolic Richard's growth model, the solution having been transformed from a deterministic to a stochastic growth model. Its predictive ability was compared with that of the classical Richard's growth model using the coefficient of determination (R2), Mean Absolute Error (MAE) and Mean Square Error (MSE). The approach mimicked the natural variability of height/diameter increment with respect to age and therefore provides more realistic height/diameter predictions. The Kolmogorov-Smirnov and Shapiro-Wilk tests were also used to test the behavior of the error term for possible violations. The mean function of top height/Dbh over age predicted the observed values of top height/Dbh more closely under the hyperbolic Richard's nonlinear growth model than under the classical Richard's growth model.

Keywords: height, Dbh, forest, Pinus caribaea, hyperbolic, Richard's, stochastic

Procedia PDF Downloads 480

18564 Unraveling Language Contact through Syntactic Dynamics of ‘Also’ in Hong Kong and Britain English

Authors: Xu Zhang

Abstract:

This article unveils an indicator of language contact between English and Cantonese in one of the Outer Circle Englishes, Hong Kong (HK) English, through an empirical investigation into 1000 tokens from the Global Web-based English (GloWbE) corpus, employing frequency analysis and logistic regression analysis. It is perceived that Cantonese and general Chinese are contextually marked by an integral underlying thinking pattern. Chinese speakers exhibit a reliance on semantic context over syntactic rules and lexical forms. This linguistic trait carries over to their use of English, affording greater flexibility to formal elements in constructing English sentences. The study focuses on the syntactic positioning of the focusing subjunct ‘also’, a linguistic element used to add new or contrasting prominence to specific sentence constituents. The English language generally allows flexibility in the relative position of 'also’, while there is a preference for close marking relationships. This article shifts attention to Hong Kong, where Cantonese and English converge, and 'also' finds counterparts in Cantonese ‘jaa’ and Mandarin ‘ye’. Employing a corpus-based data-driven method, we investigate the syntactic position of 'also' in both HK and GB English. The study aims to ascertain whether HK English exhibits a greater 'syntactic freedom,' allowing for a more distant marking relationship with 'also' compared to GB English. The analysis involves a random extraction of 500 samples from both HK and GB English from the GloWbE corpus, forming a dataset (N=1000). Exclusions are made for cases where 'also' functions as an additive conjunct or serves as a copulative adverb, as well as sentences lacking sufficient indication that 'also' functions as a focusing particle. The final dataset comprises 820 tokens, with 416 for GB and 404 for HK, annotated according to the focused constituent and the relative position of ‘also’. 
Frequency analysis reveals significant differences in the relative position of 'also' and marking relationships between HK and GB English. Regression analysis indicates a preference in HK English for a distant marking relationship between 'also' and its focused constituent. Notably, the subject and other constituents emerge as significant predictors of a distant position for 'also.' Together, these findings underscore the nuanced linguistic dynamics in HK English and contribute to our understanding of language contact. They suggest that future pedagogical practice should consider incorporating the syntactic variation within English varieties, facilitating learners’ effective communication in diverse English-speaking environments and enhancing their intercultural communication competence.

Keywords: also, Cantonese, English, focus marker, frequency analysis, language contact, logistic regression analysis

Procedia PDF Downloads 55