Search results for: multinomial logistic regression model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18327

Search results for: multinomial logistic regression model

18147 Modeling Standpipe Pressure Using Multivariable Regression Analysis by Combining Drilling Parameters and a Herschel-Bulkley Model

Authors: Seydou Sinde

Abstract:

The aims of this paper are to formulate mathematical expressions that can be used to estimate the standpipe pressure (SPP). The developed formulas take into account the main factors that, directly or indirectly, affect the behavior of SPP values. Fluid rheology and well hydraulics are some of these essential factors. Mud Plastic viscosity, yield point, flow power, consistency index, flow rate, drillstring, and annular geometries are represented by the frictional pressure (Pf), which is one of the input independent parameters and is calculated, in this paper, using Herschel-Bulkley rheological model. Other input independent parameters include the rate of penetration (ROP), applied load or weight on the bit (WOB), bit revolutions per minute (RPM), bit torque (TRQ), and hole inclination and direction coupled in the hole curvature or dogleg (DL). The technique of repeating parameters and Buckingham PI theorem are used to reduce the number of the input independent parameters into the dimensionless revolutions per minute (RPMd), the dimensionless torque (TRQd), and the dogleg, which is already in the dimensionless form of radians. Multivariable linear and polynomial regression technique using PTC Mathcad Prime 4.0 is used to analyze and determine the exact relationships between the dependent parameter, which is SPP, and the remaining three dimensionless groups. Three models proved sufficiently satisfactory to estimate the standpipe pressure: multivariable linear regression model 1 containing three regression coefficients for vertical wells; multivariable linear regression model 2 containing four regression coefficients for deviated wells; and multivariable polynomial quadratic regression model containing six regression coefficients for both vertical and deviated wells. Although that the linear regression model 2 (with four coefficients) is relatively more complex and contains an additional term over the linear regression model 1 (with three coefficients), the former did not really add significant improvements to the later except for some minor values. Thus, the effect of the hole curvature or dogleg is insignificant and can be omitted from the input independent parameters without significant losses of accuracy. The polynomial quadratic regression model is considered the most accurate model due to its relatively higher accuracy for most of the cases. Data of nine wells from the Middle East were used to run the developed models with satisfactory results provided by all of them, even if the multivariable polynomial quadratic regression model gave the best and most accurate results. Development of these models is useful not only to monitor and predict, with accuracy, the values of SPP but also to early control and check for the integrity of the well hydraulics as well as to take the corrective actions should any unexpected problems appear, such as pipe washouts, jet plugging, excessive mud losses, fluid gains, kicks, etc.

Keywords: standpipe, pressure, hydraulics, nondimensionalization, parameters, regression

Procedia PDF Downloads 61
18146 Effect of Pregnancy Intention, Postnatal Depressive Symptoms and Social Support on Early Childhood Stunting: Findings from India

Authors: Swati Srivastava, Ashish Kumar Upadhyay

Abstract:

Background: According to United Nation Children’s Fund, it has been estimated that worldwide about 165 million children were stunted in 2012 and India alone accounts for 38% of global burden of stunting. In terms of incidence, India is home of more than 60 million stunted children worldwide. Our study aims to examine the effect of pregnancy intention and maternal postnatal depressive symptoms on early childhood stunting in India. We hypothesized that effect of pregnancy intention and postnatal maternal depressive symptoms were mediated by social support. Methods: We used data from first wave of Young Lives Study India. Out of 2011 children recruited in original cohort, 1833 children had complete information on pregnancy intention, maternal depression and other variables. A series of multivariate logistic regression model were used to examine the effect of pregnancy intention and postnatal depressive symptoms on early childhood stunting. Results: Bivariate result indicates that a higher percent of children born after unintended pregnancy (40%) were stunted than children of intended pregnancy (26%). Likewise, proportion of stunted children was also higher among women of high postnatal depressive symptoms (35%) than low level of depression (24%). Results of multivariate logistic regression model indicate that children born after unintended pregnancy were significantly more likely to be stunted than children born after intended pregnancy (Coefficient: 1.70, CI: 1.17, 2.48). Likewise, early childhood stunting was also associated with maternal postnatal depressive symptoms among women (Coefficient: 1.48, CI: 1.16, 1.88). The effect of pregnancy intention and postnatal depressive symptoms on early childhood stunting remains unchanged after controlling for social support and other variables. Conclusions: The findings of this study provide conclusive evidence regarding consequences of pregnancy intention and postnatal depressive symptoms on early childhood stunting in India. Therefore, there is need to identify the women with unintended pregnancy and incorporate the promotion of mental health into their national reproductive and child health programme.

Keywords: pregnancy intention, postnatal depressive symptoms, social support, childhood stunting, young lives study, India

Procedia PDF Downloads 274
18145 Hospital Malnutrition and its Impact on 30-day Mortality in Hospitalized General Medicine Patients in a Tertiary Hospital in South India

Authors: Vineet Agrawal, Deepanjali S., Medha R., Subitha L.

Abstract:

Background. Hospital malnutrition is a highly prevalent issue and is known to increase the morbidity, mortality, length of hospital stay, and cost of care. In India, studies on hospital malnutrition have been restricted to ICU, post-surgical, and cancer patients. We designed this study to assess the impact of hospital malnutrition on 30-day post-discharge and in-hospital mortality in patients admitted in the general medicine department, irrespective of diagnosis. Methodology. All patients aged above 18 years admitted in the medicine wards, excluding medico-legal cases, were enrolled in the study. Nutritional assessment was done within 72 h of admission, using Subjective Global Assessment (SGA), which classifies patients into three categories: Severely malnourished, Mildly/moderately malnourished, and Normal/well-nourished. Anthropometric measurements like Body Mass Index (BMI), Triceps skin-fold thickness (TSF), and Mid-upper arm circumference (MUAC) were also performed. Patients were followed-up during hospital stay and 30 days after discharge through telephonic interview, and their final diagnosis, comorbidities, and cause of death were noted. Multivariate logistic regression and cox regression model were used to determine if the nutritional status at admission independently impacted mortality at one month. Results. The prevalence of malnourishment by SGA in our study was 67.3% among 395 hospitalized patients, of which 155 patients (39.2%) were moderately malnourished, and 111 (28.1%) were severely malnourished. Of 395 patients, 61 patients (15.4%) expired, of which 30 died in the hospital, and 31 died within 1 month of discharge from hospital. On univariate analysis, malnourished patients had significantly higher morality (24.3% in 111 Cat C patients) than well-nourished patients (10.1% in 129 Cat A patients), with OR 9.17, p-value 0.007. On multivariate logistic regression, age and higher Charlson Comorbidity Index (CCI) were independently associated with mortality. Higher CCI indicates higher burden of comorbidities on admission, and the CCI in the expired patient group (mean=4.38) was significantly higher than that of the alive cohort (mean=2.85). Though malnutrition significantly contributed to higher mortality on univariate analysis, it was not an independent predictor of outcome on multivariate logistic regression. Length of hospitalisation was also longer in the malnourished group (mean= 9.4 d) compared to the well-nourished group (mean= 8.03 d) with a trend towards significance (p=0.061). None of the anthropometric measurements like BMI, MUAC, or TSF showed any association with mortality or length of hospitalisation. Inference. The results of our study highlight the issue of hospital malnutrition in medicine wards and reiterate that malnutrition contributes significantly to patient outcomes. We found that SGA performs better than anthropometric measurements in assessing under-nutrition. We are of the opinion that the heterogeneity of the study population by diagnosis was probably the primary reason why malnutrition by SGA was not found to be an independent risk factor for mortality. Strategies to identify high-risk patients at admission and treat malnutrition in the hospital and post-discharge are needed.

Keywords: hospitalization outcome, length of hospital stay, mortality, malnutrition, subjective global assessment (SGA)

Procedia PDF Downloads 121
18144 Urban Energy Demand Modelling: Spatial Analysis Approach

Authors: Hung-Chu Chen, Han Qi, Bauke de Vries

Abstract:

Energy consumption in the urban environment has attracted numerous researches in recent decades. However, it is comparatively rare to find literary works which investigated 3D spatial analysis of urban energy demand modelling. In order to analyze the spatial correlation between urban morphology and energy demand comprehensively, this paper investigates their relation by using the spatial regression tool. In addition, the spatial regression tool which is applied in this paper is ordinary least squares regression (OLS) and geographically weighted regression (GWR) model. Normalized Difference Built-up Index (NDBI), Normalized Difference Vegetation Index (NDVI), and building volume are explainers of urban morphology, which act as independent variables of Energy-land use (E-L) model. NDBI and NDVI are used as the index to describe five types of land use: urban area (U), open space (O), artificial green area (G), natural green area (V), and water body (W). Accordingly, annual electricity, gas demand and energy demand are dependent variables of the E-L model. Based on the analytical result of E-L model relation, it revealed that energy demand and urban morphology are closely connected and the possible causes and practical use are discussed. Besides, the spatial analysis methods of OLS and GWR are compared.

Keywords: energy demand model, geographically weighted regression, normalized difference built-up index, normalized difference vegetation index, spatial statistics

Procedia PDF Downloads 120
18143 Logistics Information Systems in the Distribution of Flour in Nigeria

Authors: Cornelius Femi Popoola

Abstract:

This study investigated logistics information systems in the distribution of flour in Nigeria. A case study design was used and 50 staff of Honeywell Flour Mill was sampled for the study. Data generated through a questionnaire were analysed using correlation and regression analysis. The findings of the study revealed that logistic information systems such as e-commerce, interactive telephone systems and electronic data interchange positively correlated with the distribution of flour in Honeywell Flour Mill. Finding also deduced that e-commerce, interactive telephone systems and electronic data interchange jointly and positively contribute to the distribution of flour in Honeywell Flour Mill in Nigeria (R = .935; Adj. R2 = .642; F (3,47) = 14.739; p < .05). The study therefore recommended that Honeywell Flour Mill should upgrade their logistic information systems to computer-to-computer communication of business transactions and documents, as well adopt new technology such as, tracking-and-tracing systems (barcode scanning for packages and palettes), tracking vehicles with Global Positioning System (GPS), measuring vehicle performance with ‘black boxes’ (containing logistic data), and Automatic Equipment Identification (AEI) into their systems.

Keywords: e-commerce, electronic data interchange, flour distribution, information system, interactive telephone systems

Procedia PDF Downloads 522
18142 The Reproducibility and Repeatability of Modified Likelihood Ratio for Forensics Handwriting Examination

Authors: O. Abiodun Adeyinka, B. Adeyemo Adesesan

Abstract:

The forensic use of handwriting depends on the analysis, comparison, and evaluation decisions made by forensic document examiners. When using biometric technology in forensic applications, it is necessary to compute Likelihood Ratio (LR) for quantifying strength of evidence under two competing hypotheses, namely the prosecution and the defense hypotheses wherein a set of assumptions and methods for a given data set will be made. It is therefore important to know how repeatable and reproducible our estimated LR is. This paper evaluated the accuracy and reproducibility of examiners' decisions. Confidence interval for the estimated LR were presented so as not get an incorrect estimate that will be used to deliver wrong judgment in the court of Law. The estimate of LR is fundamentally a Bayesian concept and we used two LR estimators, namely Logistic Regression (LoR) and Kernel Density Estimator (KDE) for this paper. The repeatability evaluation was carried out by retesting the initial experiment after an interval of six months to observe whether examiners would repeat their decisions for the estimated LR. The experimental results, which are based on handwriting dataset, show that LR has different confidence intervals which therefore implies that LR cannot be estimated with the same certainty everywhere. Though the LoR performed better than the KDE when tested using the same dataset, the two LR estimators investigated showed a consistent region in which LR value can be estimated confidently. These two findings advance our understanding of LR when used in computing the strength of evidence in handwriting using forensics.

Keywords: confidence interval, handwriting, kernel density estimator, KDE, logistic regression LoR, repeatability, reproducibility

Procedia PDF Downloads 97
18141 Neural Network Modelling for Turkey Railway Load Carrying Demand

Authors: Humeyra Bolakar Tosun

Abstract:

The transport sector has an undisputed place in human life. People need transport access to continuous increase day by day with growing population. The number of rail network, urban transport planning, infrastructure improvements, transportation management and other related areas is a key factor affecting our country made it quite necessary to improve the work of transportation. In this context, it plays an important role in domestic rail freight demand planning. Alternatives that the increase in the transportation field and has made it mandatory requirements such as the demand for improving transport quality. In this study generally is known and used in studies by the definition, rail freight transport, railway line length, population, energy consumption. In this study, Iron Road Load Net Demand was modeled by multiple regression and ANN methods. In this study, model dependent variable (Output) is Iron Road Load Net demand and 6 entries variable was determined. These outcome values extracted from the model using ANN and regression model results. In the regression model, some parameters are considered as determinative parameters, and the coefficients of the determinants give meaningful results. As a result, ANN model has been shown to be more successful than traditional regression model.

Keywords: railway load carrying, neural network, modelling transport, transportation

Procedia PDF Downloads 125
18140 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race

Authors: Joonas Pääkkönen

Abstract:

In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.

Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling

Procedia PDF Downloads 104
18139 Estimation of Functional Response Model by Supervised Functional Principal Component Analysis

Authors: Hyon I. Paek, Sang Rim Kim, Hyon A. Ryu

Abstract:

In functional linear regression, one typical problem is to reduce dimension. Compared with multivariate linear regression, functional linear regression is regarded as an infinite-dimensional case, and the main task is to reduce dimensions of functional response and functional predictors. One common approach is to adapt functional principal component analysis (FPCA) on functional predictors and then use a few leading functional principal components (FPC) to predict the functional model. The leading FPCs estimated by the typical FPCA explain a major variation of the functional predictor, but these leading FPCs may not be mostly correlated with the functional response, so they may not be significant in the prediction for response. In this paper, we propose a supervised functional principal component analysis method for a functional response model with FPCs obtained by considering the correlation of the functional response. Our method would have a better prediction accuracy than the typical FPCA method.

Keywords: supervised, functional principal component analysis, functional response, functional linear regression

Procedia PDF Downloads 42
18138 A Comparative Study on Sampling Techniques of Polynomial Regression Model Based Stochastic Free Vibration of Composite Plates

Authors: S. Dey, T. Mukhopadhyay, S. Adhikari

Abstract:

This paper presents an exhaustive comparative investigation on sampling techniques of polynomial regression model based stochastic natural frequency of composite plates. Both individual and combined variations of input parameters are considered to map the computational time and accuracy of each modelling techniques. The finite element formulation of composites is capable to deal with both correlated and uncorrelated random input variables such as fibre parameters and material properties. The results obtained by Polynomial regression (PR) using different sampling techniques are compared. Depending on the suitability of sampling techniques such as 2k Factorial designs, Central composite design, A-Optimal design, I-Optimal, D-Optimal, Taguchi’s orthogonal array design, Box-Behnken design, Latin hypercube sampling, sobol sequence are illustrated. Statistical analysis of the first three natural frequencies is presented to compare the results and its performance.

Keywords: composite plate, natural frequency, polynomial regression model, sampling technique, uncertainty quantification

Procedia PDF Downloads 483
18137 Remittances and Water Access: A Cross-Sectional Study of Sub Saharan Africa Countries

Authors: Narges Ebadi, Davod Ahmadi, Hiliary Monteith, Hugo Melgar-Quinonez

Abstract:

Migration cannot necessarily relieve pressure on water resources in origin communities, and male out-migration can increase the water management burden of women. However, inflows of financial remittances seem to offer possibilities of investing in improving drinking-water access. Therefore, remittances may be an important pathway for migrants to support water security. This paper explores the association between water access and the receipt of remittances in households in sub-Saharan Africa. Data from round 6 of the 'Afrobarometer' surveys in 2016 were used (n= 49,137). Descriptive, bivariate and multivariate statistical analyses were carried out in this study. Regardless of country, findings from descriptive analyses showed that approximately 80% of the respondents never received remittance, and 52% had enough clean water. Only one-fifth of the respondents had piped water supply inside the house (19.9%), and approximately 25% had access to a toilet inside the house. Bivariate analyses revealed that even though receiving remittances was significantly associated with water supply, the strength of association was very weak. However, other factors such as the area of residence (rural vs. urban), cash income frequencies, electricity access, and asset ownership were strongly associated with water access. Results from unadjusted multinomial logistic regression revealed that the probability of having no access to piped water increased among remittance recipients who received financial support at least once a month (OR=1.324) (p < 0.001). In contrast, those not receiving remittances were more likely to regularly have a water access concern (OR=1.294) (p < 0.001), and not have access to a latrine (OR=1.665) (p < 0.001). In conclusion, receiving remittances is significantly related to water access as the strength of odds ratios for socio-demographic factors was stronger.

Keywords: remittances, water access, SSA, migration

Procedia PDF Downloads 147
18136 HIV Disclosure Status and Factors among Women to Their Sexual Partner in Victory plus, Yogyakarta, Indonesia

Authors: Dwi Kartika Rukmi, Miftafu Darussalam

Abstract:

Background: The disclosure of women’s HIV status toward their sexual partners is an important issue that should be regarded as one of the efforts to prevent and control the spread of HIV. Research on the disclosure of seropositive HIV status as well as women-related factors in Indonesia, especially Yogyakarta is only a few. Methods: This is a correlational descriptive research along with its cross-sectional approach on 329 women with HIV/AIDS at the Victory Plus NGO from June to July 2016. This research used a purposive sampling method and a questionnaire as the data collection technique. The bivariate analysis test was undertaken by using a chi-square and multivariate test along with a logistic regression. Result: The multivariate analysis and logistic regression show five independent variables related to the disclosure of seropositive HIV status of women with HIV/AIDS toward their sexual partners, namely ethnicity (aOR = 36,859; 95% CI; (6,544-207,616)) religion (aOR =0,255; 95%CI; (0,075-0,868)), discussion with partners prior to the HIV test (aOR =0,069; 95%CI; (0,065-0,438)) , types of sexual partners (aOR = 0.191; 95% CI; (0.082-0,445)) and knowledge on the partners’ HIV status (aOR = 0.036; 95% CI; (0.008-0.160)). The highest level of reason for seropositive HIV women not to be open about their partners’ status is the fear of being rejected by their partners and the environmental stigma of HIV AIDS disease. Conclusion: The disclosure of seropositive HIV status in women with HIV/AIDS in the Victory Plus NGO of Yogyakarta was 79.4% or classified as a high category with some related factors such as ethnicity, religion, discussion with partners prior to the HIV test, types of partners and knowledge on the partners’ HIV status.

Keywords: women, HIV, disclosure, sexual partner

Procedia PDF Downloads 234
18135 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition

Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar

Abstract:

In the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN),logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features like Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and MelSpectogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results.

Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers

Procedia PDF Downloads 19
18134 Giving Right-of-Way to Emergency Ambulances: Attitude and Behavior of Road Users in Developing Countries

Authors: Mahmoud T. Alwidyan, Ahmad Alrawashdeh, Alaa O. Oteir

Abstract:

Background: Emergency medical service (EMS) providers, oftentimes, use the lights and sirens (L&S) of their ambulances to warn road users, navigate through traffic, and expedite transport to save lives of ill and injured patients. Despite the contribution of road users in the effectiveness of reducing transport time of EMS ambulances using L&S, there is a lack of empirical assessments exploring the road user’s attitude and behavior in such situations. This study, therefore, aimed to assess the attitude and behavior of road users in response to EMS ambulances with warning L&S in use. Methods: This was a cross-sectional survey developed and distributed to adult road users in Northern Jordan. The questionnaire included 20 items addressing demographics, attitudes, and behavior toward emergency ambulances. We described the participants’ responses and assessed the association between demographics and attitude statements using logistic regression. Results: A total of 1302 questionnaires were complete and appropriate for analysis. The mean age was 34.2 (SD± 11.4) years, and the majority were males (72.6%). About half of road users (47.9%) in our sample would perform inappropriate action in response to EMS ambulances with L&S in use. The multivariate logistic regression model show that being female (OR, 0.63; 95% CI = 0.48-0.81), more educated (OR, 0.68; 95% CI = 0.53-0.86), or public transport driver (OR, 0.55; 95% CI = 0.34-0.90) is significantly associated with inappropriate response to EMS ambulances. Additionally, a significant proportion of road users may perform inappropriate and lawless driving practices such as crossing red traffic lights or following the passing by EMS ambulances, which would, in turn, increase the risk on ambulances and other road users. Conclusions: A large proportion of road users in Jordan may respond inappropriately to the EMS ambulances, and many engage in risky driving behaviors due perhaps to the lack of procedural knowledge. Policy-related interventions and educational programs are crucially needed to increase public awareness of the traffic law concerning EMS ambulances and to enhance appropriate driving behavior, which, in turn, improves the efficiency of ambulance services.

Keywords: EMS ambulances, lights and sirens, road users, attitude and behavior

Procedia PDF Downloads 61
18133 The Relationship Between Hourly Compensation and Unemployment Rate Using the Panel Data Regression Analysis

Authors: S. K. Ashiquer Rahman

Abstract:

the paper concentrations on the importance of hourly compensation, emphasizing the significance of the unemployment rate. There are the two most important factors of a nation these are its unemployment rate and hourly compensation. These are not merely statistics but they have profound effects on individual, families, and the economy. They are inversely related to one another. When we consider the unemployment rate that will probably decline as hourly compensations in manufacturing rise. But when we reduced the unemployment rates and increased job prospects could result from higher compensation. That’s why, the increased hourly compensation in the manufacturing sector that could have a favorable effect on job changing issues. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, we use panel data regression models to evaluate the expected link between hourly compensation and unemployment rate in order to determine the effect of hourly compensation on unemployment rate. We estimate the fixed effects model, evaluate the error components, and determine which model (the FEM or ECM) is better by pooling all 60 observations. We then analysis and review the data by comparing 3 several countries (United States, Canada and the United Kingdom) using panel data regression models. Finally, we provide result, analysis and a summary of the extensive research on how the hourly compensation effects on the unemployment rate. Additionally, this paper offers relevant and useful informational to help the government and academic community use an econometrics and social approach to lessen on the effect of the hourly compensation on Unemployment rate to eliminate the problem.

Keywords: hourly compensation, Unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, the linear regression model

Procedia PDF Downloads 46
18132 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 20
18131 Multiobjective Optimization of a Pharmaceutical Formulation Using Regression Method

Authors: J. Satya Eswari, Ch. Venkateswarlu

Abstract:

The formulation of a commercial pharmaceutical product involves several composition factors and response characteristics. When the formulation requires to satisfy multiple response characteristics which are conflicting, an optimal solution requires the need for an efficient multiobjective optimization technique. In this work, a regression is combined with a non-dominated sorting differential evolution (NSDE) involving Naïve & Slow and ε constraint techniques to derive different multiobjective optimization strategies, which are then evaluated by means of a trapidil pharmaceutical formulation. The analysis of the results show the effectiveness of the strategy that combines the regression model and NSDE with the integration of both Naïve & Slow and ε constraint techniques for Pareto optimization of trapidil formulation. With this strategy, the optimal formulation at pH=6.8 is obtained with the decision variables of micro crystalline cellulose, hydroxypropyl methylcellulose and compression pressure. The corresponding response characteristics of rate constant and release order are also noted down. The comparison of these results with the experimental data and with those of other multiple regression model based multiobjective evolutionary optimization strategies signify the better performance for optimal trapidil formulation.

Keywords: pharmaceutical formulation, multiple regression model, response surface method, radial basis function network, differential evolution, multiobjective optimization

Procedia PDF Downloads 388
18130 Agile Software Effort Estimation Using Regression Techniques

Authors: Mikiyas Adugna

Abstract:

Effort estimation is among the activities carried out in software development processes. An accurate model of estimation leads to project success. The method of agile effort estimation is a complex task because of the dynamic nature of software development. Researchers are still conducting studies on agile effort estimation to enhance prediction accuracy. Due to these reasons, we investigated and proposed a model on LASSO and Elastic Net regression to enhance estimation accuracy. The proposed model has major components: preprocessing, train-test split, training with default parameters, and cross-validation. During the preprocessing phase, the entire dataset is normalized. After normalization, a train-test split is performed on the dataset, setting training at 80% and testing set to 20%. We chose two different phases for training the two algorithms (Elastic Net and LASSO) regression following the train-test-split. In the first phase, the two algorithms are trained using their default parameters and evaluated on the testing data. In the second phase, the grid search technique (the grid is used to search for tuning and select optimum parameters) and 5-fold cross-validation to get the final trained model. Finally, the final trained model is evaluated using the testing set. The experimental work is applied to the agile story point dataset of 21 software projects collected from six firms. The results show that both Elastic Net and LASSO regression outperformed the compared ones. Compared to the proposed algorithms, LASSO regression achieved better predictive performance and has acquired PRED (8%) and PRED (25%) results of 100.0, MMRE of 0.0491, MMER of 0.0551, MdMRE of 0.0593, MdMER of 0.063, and MSE of 0.0007. The result implies LASSO regression algorithm trained model is the most acceptable, and higher estimation performance exists in the literature.

Keywords: agile software development, effort estimation, elastic net regression, LASSO

Procedia PDF Downloads 29
18129 Development of the Logistic Service Providers under the Pandemic Affects during COVID-19 in Turkey

Authors: Süleyman Günes

Abstract:

The crucial effects of the COVID-19 pandemic have on social and economic systems in Turkey as well as all over the world. It has impacted logistic providers and worldwide supply chains. Unexpected risks played a central role in creating vulnerabilities for logistics service operations during the pandemic terms. This study aims to research and design qualitative and quantitive contributions to logistic services. The COVID-19 pandemic brought unavoidable risks to the logistics industry in Turkey. The Logistic Service Providers (LSPs) have learned how to ensure uncertainties and risks triggered by main and adverse effects. The risks that LSPs encounter during the COVID-19 pandemic have been investigated and unveiled, and identified uncertainties and risks. The cause-effect structures were displayed by the qualitative and quantitive studies. The results suggest that supply chains and demand changes triggered by the COVID-19 pandemic while it influenced financial failure and forecast horizon with operational performances.

Keywords: logistic service providers, COVID-19, development, financial failure

Procedia PDF Downloads 55
18128 Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression

Authors: Galal Elkobrosy, Amr M. Abdelrazek, Bassuny M. Elsouhily, Mohamed E. Khidr

Abstract:

Crank shaft length, connecting rod length, crank angle, engine rpm, cylinder bore, mass of piston and compression ratio are the inputs that can control the performance of the slider crank mechanism and then its efficiency. Several combinations of these seven inputs are used and compared. The throughput engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve system of algebraic equations. These models are validated. A regression model in seven inputs including their interaction terms lowered the polynomial degree from 3rd degree to 1st degree and suggested valid predictions and stable explanations.

Keywords: design of experiments, regression analysis, SI engine, statistical modeling

Procedia PDF Downloads 159
18127 Probabilistic Crash Prediction and Prevention of Vehicle Crash

Authors: Lavanya Annadi, Fahimeh Jafari

Abstract:

Transportation brings immense benefits to society, but it also has its costs. Costs include such as the cost of infrastructure, personnel and equipment, but also the loss of life and property in traffic accidents on the road, delays in travel due to traffic congestion and various indirect costs in terms of air transport. More research has been done to identify the various factors that affect road accidents, such as road infrastructure, traffic, sociodemographic characteristics, land use, and the environment. The aim of this research is to predict the probabilistic crash prediction of vehicles using machine learning due to natural and structural reasons by excluding spontaneous reasons like overspeeding etc., in the United States. These factors range from weather factors, like weather conditions, precipitation, visibility, wind speed, wind direction, temperature, pressure, and humidity to human made structures like road structure factors like bump, roundabout, no exit, turning loop, give away, etc. Probabilities are dissected into ten different classes. All the predictions are based on multiclass classification techniques, which are supervised learning. This study considers all crashes that happened in all states collected by the US government. To calculate the probability, multinomial expected value was used and assigned a classification label as the crash probability. We applied three different classification models, including multiclass Logistic Regression, Random Forest and XGBoost. The numerical results show that XGBoost achieved a 75.2% accuracy rate which indicates the part that is being played by natural and structural reasons for the crash. The paper has provided in-deep insights through exploratory data analysis.

Keywords: road safety, crash prediction, exploratory analysis, machine learning

Procedia PDF Downloads 81
18126 Competition between Verb-Based Implicit Causality and Theme Structure's Influence on Anaphora Bias in Mandarin Chinese Sentences: Evidence from Corpus

Authors: Linnan Zhang

Abstract:

Linguists, as well as psychologists, have shown great interests in implicit causality in reference processing. However, most frequently-used approaches to this issue are psychological experiments (such as eye tracking or self-paced reading, etc.). This research is a corpus-based one and is assisted with statistical tool – software R. The main focus of the present study is about the competition between verb-based implicit causality and theme structure’s influence on anaphora bias in Mandarin Chinese sentences. In Accessibility Theory, it is believed that salience, which is also known as accessibility, and relevance are two important factors in reference processing. Theme structure, which is a special syntactic structure in Chinese, determines the salience of an antecedent on the syntactic level while verb-based implicit causality is a key factor to the relevance between antecedent and anaphora. Therefore, it is a study about anaphora, combining psychology with linguistics. With analysis of the sentences from corpus as well as the statistical analysis of Multinomial Logistic Regression, major findings of the present study are as follows: 1. When the sentence is stated in a ‘cause-effect’ structure, the theme structure will always be the antecedent no matter forward biased verbs or backward biased verbs co-occur; in non-theme structure, the anaphora bias will tend to be the opposite of the verb bias; 2. When the sentence is stated in a ‘effect-cause’ structure, theme structure will not always be the antecedent and the influence of verb-based implicit causality will outweigh that of theme structure; moreover, the anaphora bias will be the same with the bias of verbs. All the results indicate that implicit causality functions conditionally and the noun in theme structure will not be the high-salience antecedent under any circumstances.

Keywords: accessibility theory, anaphora, theme strcture, verb-based implicit causality

Procedia PDF Downloads 173
18125 A Study of Mode Choice Model Improvement Considering Age Grouping

Authors: Young-Hyun Seo, Hyunwoo Park, Dong-Kyu Kim, Seung-Young Kho

Abstract:

The purpose of this study is providing an improved mode choice model considering parameters including age grouping of prime-aged and old age. In this study, 2010 Household Travel Survey data were used and improper samples were removed through the analysis. Chosen alternative, date of birth, mode, origin code, destination code, departure time, and arrival time are considered from Household Travel Survey. By preprocessing data, travel time, travel cost, mode, and ratio of people aged 45 to 55 years, 55 to 65 years and over 65 years were calculated. After the manipulation, the mode choice model was constructed using LIMDEP by maximum likelihood estimation. A significance test was conducted for nine parameters, three age groups for three modes. Then the test was conducted again for the mode choice model with significant parameters, travel cost variable and travel time variable. As a result of the model estimation, as the age increases, the preference for the car decreases and the preference for the bus increases. This study is meaningful in that the individual and households characteristics are applied to the aggregate model.

Keywords: age grouping, aging, mode choice model, multinomial logit model

Procedia PDF Downloads 303
18124 Charting Sentiments with Naive Bayes and Logistic Regression

Authors: Jummalla Aashrith, N. L. Shiva Sai, K. Bhavya Sri

Abstract:

The swift progress of web technology has not only amassed a vast reservoir of internet data but also triggered a substantial surge in data generation. The internet has metamorphosed into one of the dynamic hubs for online education, idea dissemination, as well as opinion-sharing. Notably, the widely utilized social networking platform Twitter is experiencing considerable expansion, providing users with the ability to share viewpoints, participate in discussions spanning diverse communities, and broadcast messages on a global scale. The upswing in online engagement has sparked a significant curiosity in subjective analysis, particularly when it comes to Twitter data. This research is committed to delving into sentiment analysis, focusing specifically on the realm of Twitter. It aims to offer valuable insights into deciphering information within tweets, where opinions manifest in a highly unstructured and diverse manner, spanning a spectrum from positivity to negativity, occasionally punctuated by neutrality expressions. Within this document, we offer a comprehensive exploration and comparative assessment of modern approaches to opinion mining. Employing a range of machine learning algorithms such as Naive Bayes and Logistic Regression, our investigation plunges into the domain of Twitter data streams. We delve into overarching challenges and applications inherent in the realm of subjectivity analysis over Twitter.

Keywords: machine learning, sentiment analysis, visualisation, python

Procedia PDF Downloads 28
18123 Coverage Probability Analysis of WiMAX Network under Additive White Gaussian Noise and Predicted Empirical Path Loss Model

Authors: Chaudhuri Manoj Kumar Swain, Susmita Das

Abstract:

This paper explores a detailed procedure of predicting a path loss (PL) model and its application in estimating the coverage probability in a WiMAX network. For this a hybrid approach is followed in predicting an empirical PL model of a 2.65 GHz WiMAX network deployed in a suburban environment. Data collection, statistical analysis, and regression analysis are the phases of operations incorporated in this approach and the importance of each of these phases has been discussed properly. The procedure of collecting data such as received signal strength indicator (RSSI) through experimental set up is demonstrated. From the collected data set, empirical PL and RSSI models are predicted with regression technique. Furthermore, with the aid of the predicted PL model, essential parameters such as PL exponent as well as the coverage probability of the network are evaluated. This research work may assist in the process of deployment and optimisation of any cellular network significantly.

Keywords: WiMAX, RSSI, path loss, coverage probability, regression analysis

Procedia PDF Downloads 144
18122 A Quadratic Model to Early Predict the Blastocyst Stage with a Time Lapse Incubator

Authors: Cecile Edel, Sandrine Giscard D'Estaing, Elsa Labrune, Jacqueline Lornage, Mehdi Benchaib

Abstract:

Introduction: The use of incubator equipped with time-lapse technology in Artificial Reproductive Technology (ART) allows a continuous surveillance. With morphocinetic parameters, algorithms are available to predict the potential outcome of an embryo. However, the different proposed time-lapse algorithms do not take account the missing data, and then some embryos could not be classified. The aim of this work is to construct a predictive model even in the case of missing data. Materials and methods: Patients: A retrospective study was performed, in biology laboratory of reproduction at the hospital ‘Femme Mère Enfant’ (Lyon, France) between 1 May 2013 and 30 April 2015. Embryos (n= 557) obtained from couples (n=108) were cultured in a time-lapse incubator (Embryoscope®, Vitrolife, Goteborg, Sweden). Time-lapse incubator: The morphocinetic parameters obtained during the three first days of embryo life were used to build the predictive model. Predictive model: A quadratic regression was performed between the number of cells and time. N = a. T² + b. T + c. N: number of cells at T time (T in hours). The regression coefficients were calculated with Excel software (Microsoft, Redmond, WA, USA), a program with Visual Basic for Application (VBA) (Microsoft) was written for this purpose. The quadratic equation was used to find a value that allows to predict the blastocyst formation: the synthetize value. The area under the curve (AUC) obtained from the ROC curve was used to appreciate the performance of the regression coefficients and the synthetize value. A cut-off value has been calculated for each regression coefficient and for the synthetize value to obtain two groups where the difference of blastocyst formation rate according to the cut-off values was maximal. The data were analyzed with SPSS (IBM, Il, Chicago, USA). Results: Among the 557 embryos, 79.7% had reached the blastocyst stage. The synthetize value corresponds to the value calculated with time value equal to 99, the highest AUC was then obtained. The AUC for regression coefficient ‘a’ was 0.648 (p < 0.001), 0.363 (p < 0.001) for the regression coefficient ‘b’, 0.633 (p < 0.001) for the regression coefficient ‘c’, and 0.659 (p < 0.001) for the synthetize value. The results are presented as follow: blastocyst formation rate under cut-off value versus blastocyst rate formation above cut-off value. For the regression coefficient ‘a’ the optimum cut-off value was -1.14.10-3 (61.3% versus 84.3%, p < 0.001), 0.26 for the regression coefficient ‘b’ (83.9% versus 63.1%, p < 0.001), -4.4 for the regression coefficient ‘c’ (62.2% versus 83.1%, p < 0.001) and 8.89 for the synthetize value (58.6% versus 85.0%, p < 0.001). Conclusion: This quadratic regression allows to predict the outcome of an embryo even in case of missing data. Three regression coefficients and a synthetize value could represent the identity card of an embryo. ‘a’ regression coefficient represents the acceleration of cells division, ‘b’ regression coefficient represents the speed of cell division. We could hypothesize that ‘c’ regression coefficient could represent the intrinsic potential of an embryo. This intrinsic potential could be dependent from oocyte originating the embryo. These hypotheses should be confirmed by studies analyzing relationship between regression coefficients and ART parameters.

Keywords: ART procedure, blastocyst formation, time-lapse incubator, quadratic model

Procedia PDF Downloads 287
18121 Disadvantaged Adolescents and Educational Delay in South Africa: Impacts of Personal, Family, and School Characteristics

Authors: Rocio Herrero Romero, Lucie Cluver, James Hall, Janina Steinert

Abstract:

Educational delay and non-completion are major policy concerns in South Africa. However, little research has focused on predictors for educational delay amongst adolescents in disadvantaged areas. This study has two aims: first, to use data integration approaches to compare the educational delay of 599 adolescents aged 16 to 18 from disadvantaged communities to national and provincial representative estimates in South Africa. Second, the paper also explores predictors for educational delay by comparing adolescents out of school (n=64) and at least one year behind (n=380), with adolescents in the age-appropriate grade or higher (n=155). Multinomial logistic regression models using self-report and administrative data were applied to look for significant associations of risk and protective factors. Significant risk factors for being behind (rather than in age-appropriate grade) were: male gender, past grade repetition, rural location and larger school size. Risk factors for being out of school (rather than in the age-appropriate grade) were: past grade repetition, having experienced problems concentrating at school, household poverty, and food insecurity. Significant protective factors for being in the age-appropriate grade (rather than out of school) were: living with biological parents or grandparents and access to school counselling. Attending school in wealthier communities was a significant protective factor for being in the age-appropriate grade (rather than behind). Our results suggest that both personal and contextual factors –family and school- predicted educational delay. This study provides new evidence to the significant effects of personal, family, and school characteristics on the educational outcomes of adolescents from disadvantaged communities in South Africa. This is the first longitudinal and quantitative study to systematically investigate risk and protective factors for post-compulsory educational outcomes amongst South African adolescents living in disadvantaged communities.

Keywords: disadvantaged communities, quantitative analysis, school delay, South Africa

Procedia PDF Downloads 321
18120 Factors Affecting At-Grade Railway Level Crossing Accidents in Bangladesh

Authors: Armana Huq

Abstract:

Railway networks have a significant role in the economy of any country. Similar to other transportation modes, many lives suffer from fatalities or injuries caused by accidents related to the railway. Railway accidents are not as common as roadway accidents yet they are more devastating and damaging than other roadway accidents. Despite that, issues related to railway accidents are not taken into consideration with significant attention as a major threat because of their less frequency compared to other accident categories perhaps. However, the Federal Railroad Administration reported nearly twelve thousand train accidents related to the railroad in the year 2014, resulting in more than eight hundred fatalities and thousands of injuries in the United States alone of which nearly one third fatalities resulted from railway crossing accidents. From an analysis of railway accident data of six years (2005-2010), it has been revealed that 344 numbers of the collision were occurred resulting 200 people dead and 443 people injured in Bangladesh. This paper includes a comprehensive overview of the railway safety situation in Bangladesh from 1998 to 2015. Each year on average, eight fatalities are reported in at-grade level crossings due to railway accidents in Bangladesh. In this paper, the number of railway accidents that occurred in Bangladesh has been presented and a fatality rate of 58.62% has been estimated as the percentage of total at-grade railway level crossing accidents. For this study, analysis of railway accidents in Bangladesh for the period 1998 to 2015 was obtained from the police reported accident database using MAAP (Microcomputer Accident Analysis Package). Investigation of the major contributing factors to the railway accidents has been performed using the Multinomial Logit model. Furthermore, hotspot analysis has been conducted using ArcGIS. Eventually, some suggestions have been provided to mitigate those accidents.

Keywords: safety, human factors, multinomial logit model, railway

Procedia PDF Downloads 126
18119 Apricot Insurance Portfolio Risk

Authors: Kasirga Yildirak, Ismail Gur

Abstract:

We propose a model to measure hail risk of an Agricultural Insurance portfolio. Hail is one of the major catastrophic event that causes big amount of loss to an insurer. Moreover, it is very hard to predict due to its strange atmospheric characteristics. We make use of parcel based claims data on apricot damage collected by the Turkish Agricultural Insurance Pool (TARSIM). As our ultimate aim is to compute the loadings assigned to specific parcels, we build a portfolio risk model that makes use of PD and the severity of the exposures. PD is computed by Spherical-Linear and Circular –Linear regression models as the data carries coordinate information and seasonality. Severity is mapped into integer brackets so that Probability Generation Function could be employed. Individual regressions are run on each clusters estimated on different criteria. Loss distribution is constructed by Panjer Recursion technique. We also show that one risk-one crop model can easily be extended to the multi risk–multi crop model by assuming conditional independency.

Keywords: hail insurance, spherical regression, circular regression, spherical clustering

Procedia PDF Downloads 232
18118 Survival Analysis Based Delivery Time Estimates for Display FAB

Authors: Paul Han, Jun-Geol Baek

Abstract:

In the flat panel display industry, the scheduler and dispatching system to meet production target quantities and the deadline of production are the major production management system which controls each facility production order and distribution of WIP (Work in Process). In dispatching system, delivery time is a key factor for the time when a lot can be supplied to the facility. In this paper, we use survival analysis methods to identify main factors and a forecasting model of delivery time. Of survival analysis techniques to select important explanatory variables, the cox proportional hazard model is used to. To make a prediction model, the Accelerated Failure Time (AFT) model was used. Performance comparisons were conducted with two other models, which are the technical statistics model based on transfer history and the linear regression model using same explanatory variables with AFT model. As a result, the Mean Square Error (MSE) criteria, the AFT model decreased by 33.8% compared to the existing prediction model, decreased by 5.3% compared to the linear regression model. This survival analysis approach is applicable to implementing a delivery time estimator in display manufacturing. And it can contribute to improve the productivity and reliability of production management system.

Keywords: delivery time, survival analysis, Cox PH model, accelerated failure time model

Procedia PDF Downloads 506