Search results for: logistic regression model
18743 Efficient Model Selection in Linear and Non-Linear Quantile Regression by Cross-Validation
Authors: Yoonsuh Jung, Steven N. MacEachern
Abstract:
Check loss function is used to define quantile regression. In the prospect of cross validation, it is also employed as a validation function when underlying truth is unknown. However, our empirical study indicates that the validation with check loss often leads to choosing an over estimated fits. In this work, we suggest a modified or L2-adjusted check loss which rounds the sharp corner in the middle of check loss. It has a large effect of guarding against over fitted model in some extent. Through various simulation settings of linear and non-linear regressions, the improvement of check loss by L2 adjustment is empirically examined. This adjustment is devised to shrink to zero as sample size grows.Keywords: cross-validation, model selection, quantile regression, tuning parameter selection
Procedia PDF Downloads 43818742 Prediction of Bariatric Surgery Publications by Using Different Machine Learning Algorithms
Authors: Senol Dogan, Gunay Karli
Abstract:
Identification of relevant publications based on a Medline query is time-consuming and error-prone. An all based process has the potential to solve this problem without any manual work. To the best of our knowledge, our study is the first to investigate the ability of machine learning to identify relevant articles accurately. 5 different machine learning algorithms were tested using 23 predictors based on several metadata fields attached to publications. We find that the Boosted model is the best-performing algorithm and its overall accuracy is 96%. In addition, specificity and sensitivity of the algorithm is 97 and 93%, respectively. As a result of the work, we understood that we can apply the same procedure to understand cancer gene expression big data.Keywords: prediction of publications, machine learning, algorithms, bariatric surgery, comparison of algorithms, boosted, tree, logistic regression, ANN model
Procedia PDF Downloads 20918741 Global Positioning System Match Characteristics as a Predictor of Badminton Players’ Group Classification
Authors: Yahaya Abdullahi, Ben Coetzee, Linda Van Den Berg
Abstract:
The study aimed at establishing the global positioning system (GPS) determined singles match characteristics that act as predictors of successful and less-successful male singles badminton players’ group classification. Twenty-two (22) male single players (aged: 23.39 ± 3.92 years; body stature: 177.11 ± 3.06cm; body mass: 83.46 ± 14.59kg) who represented 10 African countries participated in the study. Players were categorised as successful and less-successful players according to the results of five championships’ of the 2014/2015 season. GPS units (MinimaxX V4.0), Polar Heart Rate Transmitter Belts and digital video cameras were used to collect match data. GPS-related variables were corrected for match duration and independent t-tests, a cluster analysis and a binary forward stepwise logistic regression were calculated. A Receiver Operating Characteristic Curve (ROC) was used to determine the validity of the group classification model. High-intensity accelerations per second were identified as the only GPS-determined variable that showed a significant difference between groups. Furthermore, only high-intensity accelerations per second (p=0.03) and low-intensity efforts per second (p=0.04) were identified as significant predictors of group classification with 76.88% of players that could be classified back into their original groups by making use of the GPS-based logistic regression formula. The ROC showed a value of 0.87. The identification of the last-mentioned GPS-related variables for the attainment of badminton performances, emphasizes the importance of using badminton drills and conditioning techniques to not only improve players’ physical fitness levels but also their abilities to accelerate at high intensities.Keywords: badminton, global positioning system, match analysis, inertial movement analysis, intensity, effort
Procedia PDF Downloads 19118740 An Exploratory Study on 'Sub-Region Life Circle' in Chinese Big Cities Based on Human High-Probability Daily Activity: Characteristic and Formation Mechanism as a Case of Wuhan
Authors: Zhuoran Shan, Li Wan, Xianchun Zhang
Abstract:
With an increasing trend of regionalization and polycentricity in Chinese contemporary big cities, “sub-region life circle” turns to be an effective method on rational organization of urban function and spatial structure. By the method of questionnaire, network big data, route inversion on internet map, GIS spatial analysis and logistic regression, this article makes research on characteristic and formation mechanism of “sub-region life circle” based on human high-probability daily activity in Chinese big cities. Firstly, it shows that “sub-region life circle” has been a new general spatial sphere of residents' high-probability daily activity and mobility in China. Unlike the former analysis of the whole metropolitan or the micro community, “sub-region life circle” has its own characteristic on geographical sphere, functional element, spatial morphology and land distribution. Secondly, according to the analysis result with Binary Logistic Regression Model, the research also shows that seven factors including land-use mixed degree and bus station density impact the formation of “sub-region life circle” most, and then analyzes the index critical value of each factor. Finally, to establish a smarter “sub-region life circle”, this paper indicates that several strategies including jobs-housing fit, service cohesion and space reconstruction are the keys for its spatial organization optimization. This study expands the further understanding of cities' inner sub-region spatial structure based on human daily activity, and contributes to the theory of “life circle” in urban's meso-scale.Keywords: sub-region life circle, characteristic, formation mechanism, human activity, spatial structure
Procedia PDF Downloads 30018739 Arabic Character Recognition Using Regression Curves with the Expectation Maximization Algorithm
Authors: Abdullah A. AlShaher
Abstract:
In this paper, we demonstrate how regression curves can be used to recognize 2D non-rigid handwritten shapes. Each shape is represented by a set of non-overlapping uniformly distributed landmarks. The underlying models utilize 2nd order of polynomials to model shapes within a training set. To estimate the regression models, we need to extract the required coefficients which describe the variations for a set of shape class. Hence, a least square method is used to estimate such modes. We then proceed by training these coefficients using the apparatus Expectation Maximization algorithm. Recognition is carried out by finding the least error landmarks displacement with respect to the model curves. Handwritten isolated Arabic characters are used to evaluate our approach.Keywords: character recognition, regression curves, handwritten Arabic letters, expectation maximization algorithm
Procedia PDF Downloads 14518738 Hospital Malnutrition and its Impact on 30-day Mortality in Hospitalized General Medicine Patients in a Tertiary Hospital in South India
Authors: Vineet Agrawal, Deepanjali S., Medha R., Subitha L.
Abstract:
Background. Hospital malnutrition is a highly prevalent issue and is known to increase the morbidity, mortality, length of hospital stay, and cost of care. In India, studies on hospital malnutrition have been restricted to ICU, post-surgical, and cancer patients. We designed this study to assess the impact of hospital malnutrition on 30-day post-discharge and in-hospital mortality in patients admitted in the general medicine department, irrespective of diagnosis. Methodology. All patients aged above 18 years admitted in the medicine wards, excluding medico-legal cases, were enrolled in the study. Nutritional assessment was done within 72 h of admission, using Subjective Global Assessment (SGA), which classifies patients into three categories: Severely malnourished, Mildly/moderately malnourished, and Normal/well-nourished. Anthropometric measurements like Body Mass Index (BMI), Triceps skin-fold thickness (TSF), and Mid-upper arm circumference (MUAC) were also performed. Patients were followed-up during hospital stay and 30 days after discharge through telephonic interview, and their final diagnosis, comorbidities, and cause of death were noted. Multivariate logistic regression and cox regression model were used to determine if the nutritional status at admission independently impacted mortality at one month. Results. The prevalence of malnourishment by SGA in our study was 67.3% among 395 hospitalized patients, of which 155 patients (39.2%) were moderately malnourished, and 111 (28.1%) were severely malnourished. Of 395 patients, 61 patients (15.4%) expired, of which 30 died in the hospital, and 31 died within 1 month of discharge from hospital. On univariate analysis, malnourished patients had significantly higher morality (24.3% in 111 Cat C patients) than well-nourished patients (10.1% in 129 Cat A patients), with OR 9.17, p-value 0.007. On multivariate logistic regression, age and higher Charlson Comorbidity Index (CCI) were independently associated with mortality. Higher CCI indicates higher burden of comorbidities on admission, and the CCI in the expired patient group (mean=4.38) was significantly higher than that of the alive cohort (mean=2.85). Though malnutrition significantly contributed to higher mortality on univariate analysis, it was not an independent predictor of outcome on multivariate logistic regression. Length of hospitalisation was also longer in the malnourished group (mean= 9.4 d) compared to the well-nourished group (mean= 8.03 d) with a trend towards significance (p=0.061). None of the anthropometric measurements like BMI, MUAC, or TSF showed any association with mortality or length of hospitalisation. Inference. The results of our study highlight the issue of hospital malnutrition in medicine wards and reiterate that malnutrition contributes significantly to patient outcomes. We found that SGA performs better than anthropometric measurements in assessing under-nutrition. We are of the opinion that the heterogeneity of the study population by diagnosis was probably the primary reason why malnutrition by SGA was not found to be an independent risk factor for mortality. Strategies to identify high-risk patients at admission and treat malnutrition in the hospital and post-discharge are needed.Keywords: hospitalization outcome, length of hospital stay, mortality, malnutrition, subjective global assessment (SGA)
Procedia PDF Downloads 14918737 In Silico Modeling of Drugs Milk/Plasma Ratio in Human Breast Milk Using Structures Descriptors
Authors: Navid Kaboudi, Ali Shayanfar
Abstract:
Introduction: Feeding infants with safe milk from the beginning of their life is an important issue. Drugs which are used by mothers can affect the composition of milk in a way that is not only unsuitable, but also toxic for infants. Consuming permeable drugs during that sensitive period by mother could lead to serious side effects to the infant. Due to the ethical restrictions of drug testing on humans, especially women, during their lactation period, computational approaches based on structural parameters could be useful. The aim of this study is to develop mechanistic models to predict the M/P ratio of drugs during breastfeeding period based on their structural descriptors. Methods: Two hundred and nine different chemicals with their M/P ratio were used in this study. All drugs were categorized into two groups based on their M/P value as Malone classification: 1: Drugs with M/P>1, which are considered as high risk 2: Drugs with M/P>1, which are considered as low risk Thirty eight chemical descriptors were calculated by ACD/labs 6.00 and Data warrior software in order to assess the penetration during breastfeeding period. Later on, four specific models based on the number of hydrogen bond acceptors, polar surface area, total surface area, and number of acidic oxygen were established for the prediction. The mentioned descriptors can predict the penetration with an acceptable accuracy. For the remaining compounds (N= 147, 158, 160, and 174 for models 1 to 4, respectively) of each model binary regression with SPSS 21 was done in order to give us a model to predict the penetration ratio of compounds. Only structural descriptors with p-value<0.1 remained in the final model. Results and discussion: Four different models based on the number of hydrogen bond acceptors, polar surface area, and total surface area were obtained in order to predict the penetration of drugs into human milk during breastfeeding period About 3-4% of milk consists of lipids, and the amount of lipid after parturition increases. Lipid soluble drugs diffuse alongside with fats from plasma to mammary glands. lipophilicity plays a vital role in predicting the penetration class of drugs during lactation period. It was shown in the logistic regression models that compounds with number of hydrogen bond acceptors, PSA and TSA above 5, 90 and 25 respectively, are less permeable to milk because they are less soluble in the amount of fats in milk. The pH of milk is acidic and due to that, basic compounds tend to be concentrated in milk than plasma while acidic compounds may consist lower concentrations in milk than plasma. Conclusion: In this study, we developed four regression-based models to predict the penetration class of drugs during the lactation period. The obtained models can lead to a higher speed in drug development process, saving energy, and costs. Milk/plasma ratio assessment of drugs requires multiple steps of animal testing, which has its own ethical issues. QSAR modeling could help scientist to reduce the amount of animal testing, and our models are also eligible to do that.Keywords: logistic regression, breastfeeding, descriptors, penetration
Procedia PDF Downloads 7118736 Predicting Survival in Cancer: How Cox Regression Model Compares to Artifial Neural Networks?
Authors: Dalia Rimawi, Walid Salameh, Amal Al-Omari, Hadeel AbdelKhaleq
Abstract:
Predication of Survival time of patients with cancer, is a core factor that influences oncologist decisions in different aspects; such as offered treatment plans, patients’ quality of life and medications development. For a long time proportional hazards Cox regression (ph. Cox) was and still the most well-known statistical method to predict survival outcome. But due to the revolution of data sciences; new predication models were employed and proved to be more flexible and provided higher accuracy in that type of studies. Artificial neural network is one of those models that is suitable to handle time to event predication. In this study we aim to compare ph Cox regression with artificial neural network method according to data handling and Accuracy of each model.Keywords: Cox regression, neural networks, survival, cancer.
Procedia PDF Downloads 20018735 Survival and Hazard Maximum Likelihood Estimator with Covariate Based on Right Censored Data of Weibull Distribution
Authors: Al Omari Mohammed Ahmed
Abstract:
This paper focuses on Maximum Likelihood Estimator with Covariate. Covariates are incorporated into the Weibull model. Under this regression model with regards to maximum likelihood estimator, the parameters of the covariate, shape parameter, survival function and hazard rate of the Weibull regression distribution with right censored data are estimated. The mean square error (MSE) and absolute bias are used to compare the performance of Weibull regression distribution. For the simulation comparison, the study used various sample sizes and several specific values of the Weibull shape parameter.Keywords: weibull regression distribution, maximum likelihood estimator, survival function, hazard rate, right censoring
Procedia PDF Downloads 44118734 Estimate of Maximum Expected Intensity of One-Half-Wave Lines Dancing
Authors: A. Bekbaev, M. Dzhamanbaev, R. Abitaeva, A. Karbozova, G. Nabyeva
Abstract:
In this paper, the regression dependence of dancing intensity from wind speed and length of span was established due to the statistic data obtained from multi-year observations on line wires dancing accumulated by power systems of Kazakhstan and the Russian Federation. The lower and upper limitations of the equations parameters were estimated, as well as the adequacy of the regression model. The constructed model will be used in research of dancing phenomena for the development of methods and means of protection against dancing and for zoning plan of the territories of line wire dancing.Keywords: power lines, line wire dancing, dancing intensity, regression equation, dancing area intensity
Procedia PDF Downloads 31218733 Effect of Pregnancy Intention, Postnatal Depressive Symptoms and Social Support on Early Childhood Stunting: Findings from India
Authors: Swati Srivastava, Ashish Kumar Upadhyay
Abstract:
Background: According to United Nation Children’s Fund, it has been estimated that worldwide about 165 million children were stunted in 2012 and India alone accounts for 38% of global burden of stunting. In terms of incidence, India is home of more than 60 million stunted children worldwide. Our study aims to examine the effect of pregnancy intention and maternal postnatal depressive symptoms on early childhood stunting in India. We hypothesized that effect of pregnancy intention and postnatal maternal depressive symptoms were mediated by social support. Methods: We used data from first wave of Young Lives Study India. Out of 2011 children recruited in original cohort, 1833 children had complete information on pregnancy intention, maternal depression and other variables. A series of multivariate logistic regression model were used to examine the effect of pregnancy intention and postnatal depressive symptoms on early childhood stunting. Results: Bivariate result indicates that a higher percent of children born after unintended pregnancy (40%) were stunted than children of intended pregnancy (26%). Likewise, proportion of stunted children was also higher among women of high postnatal depressive symptoms (35%) than low level of depression (24%). Results of multivariate logistic regression model indicate that children born after unintended pregnancy were significantly more likely to be stunted than children born after intended pregnancy (Coefficient: 1.70, CI: 1.17, 2.48). Likewise, early childhood stunting was also associated with maternal postnatal depressive symptoms among women (Coefficient: 1.48, CI: 1.16, 1.88). The effect of pregnancy intention and postnatal depressive symptoms on early childhood stunting remains unchanged after controlling for social support and other variables. Conclusions: The findings of this study provide conclusive evidence regarding consequences of pregnancy intention and postnatal depressive symptoms on early childhood stunting in India. Therefore, there is need to identify the women with unintended pregnancy and incorporate the promotion of mental health into their national reproductive and child health programme.Keywords: pregnancy intention, postnatal depressive symptoms, social support, childhood stunting, young lives study, India
Procedia PDF Downloads 30018732 An Overbooking Model for Car Rental Service with Different Types of Cars
Authors: Naragain Phumchusri, Kittitach Pongpairoj
Abstract:
Overbooking is a very useful revenue management technique that could help reduce costs caused by either undersales or oversales. In this paper, we propose an overbooking model for two types of cars that can minimize the total cost for car rental service. With two types of cars, there is an upgrade possibility for lower type to upper type. This makes the model more complex than one type of cars scenario. We have found that convexity can be proved in this case. Sensitivity analysis of the parameters is conducted to observe the effects of relevant parameters on the optimal solution. Model simplification is proposed using multiple linear regression analysis, which can help estimate the optimal overbooking level using appropriate independent variables. The results show that the overbooking level from multiple linear regression model is relatively close to the optimal solution (with the adjusted R-squared value of at least 72.8%). To evaluate the performance of the proposed model, the total cost was compared with the case where the decision maker uses a naïve method for the overbooking level. It was found that the total cost from optimal solution is only 0.5 to 1 percent (on average) lower than the cost from regression model, while it is approximately 67% lower than the cost obtained by the naïve method. It indicates that our proposed simplification method using regression analysis can effectively perform in estimating the overbooking level.Keywords: overbooking, car rental industry, revenue management, stochastic model
Procedia PDF Downloads 17218731 Logistics Information Systems in the Distribution of Flour in Nigeria
Authors: Cornelius Femi Popoola
Abstract:
This study investigated logistics information systems in the distribution of flour in Nigeria. A case study design was used and 50 staff of Honeywell Flour Mill was sampled for the study. Data generated through a questionnaire were analysed using correlation and regression analysis. The findings of the study revealed that logistic information systems such as e-commerce, interactive telephone systems and electronic data interchange positively correlated with the distribution of flour in Honeywell Flour Mill. Finding also deduced that e-commerce, interactive telephone systems and electronic data interchange jointly and positively contribute to the distribution of flour in Honeywell Flour Mill in Nigeria (R = .935; Adj. R2 = .642; F (3,47) = 14.739; p < .05). The study therefore recommended that Honeywell Flour Mill should upgrade their logistic information systems to computer-to-computer communication of business transactions and documents, as well adopt new technology such as, tracking-and-tracing systems (barcode scanning for packages and palettes), tracking vehicles with Global Positioning System (GPS), measuring vehicle performance with ‘black boxes’ (containing logistic data), and Automatic Equipment Identification (AEI) into their systems.Keywords: e-commerce, electronic data interchange, flour distribution, information system, interactive telephone systems
Procedia PDF Downloads 55318730 Analyzing the Influence of Hydrometeorlogical Extremes, Geological Setting, and Social Demographic on Public Health
Authors: Irfan Ahmad Afip
Abstract:
This main research objective is to accurately identify the possibility for a Leptospirosis outbreak severity of a certain area based on its input features into a multivariate regression model. The research question is the possibility of an outbreak in a specific area being influenced by this feature, such as social demographics and hydrometeorological extremes. If the occurrence of an outbreak is being subjected to these features, then the epidemic severity for an area will be different depending on its environmental setting because the features will influence the possibility and severity of an outbreak. Specifically, this research objective was three-fold, namely: (a) to identify the relevant multivariate features and visualize the patterns data, (b) to develop a multivariate regression model based from the selected features and determine the possibility for Leptospirosis outbreak in an area, and (c) to compare the predictive ability of multivariate regression model and machine learning algorithms. Several secondary data features were collected locations in the state of Negeri Sembilan, Malaysia, based on the possibility it would be relevant to determine the outbreak severity in the area. The relevant features then will become an input in a multivariate regression model; a linear regression model is a simple and quick solution for creating prognostic capabilities. A multivariate regression model has proven more precise prognostic capabilities than univariate models. The expected outcome from this research is to establish a correlation between the features of social demographic and hydrometeorological with Leptospirosis bacteria; it will also become a contributor for understanding the underlying relationship between the pathogen and the ecosystem. The relationship established can be beneficial for the health department or urban planner to inspect and prepare for future outcomes in event detection and system health monitoring.Keywords: geographical information system, hydrometeorological, leptospirosis, multivariate regression
Procedia PDF Downloads 11518729 The Reproducibility and Repeatability of Modified Likelihood Ratio for Forensics Handwriting Examination
Authors: O. Abiodun Adeyinka, B. Adeyemo Adesesan
Abstract:
The forensic use of handwriting depends on the analysis, comparison, and evaluation decisions made by forensic document examiners. When using biometric technology in forensic applications, it is necessary to compute Likelihood Ratio (LR) for quantifying strength of evidence under two competing hypotheses, namely the prosecution and the defense hypotheses wherein a set of assumptions and methods for a given data set will be made. It is therefore important to know how repeatable and reproducible our estimated LR is. This paper evaluated the accuracy and reproducibility of examiners' decisions. Confidence interval for the estimated LR were presented so as not get an incorrect estimate that will be used to deliver wrong judgment in the court of Law. The estimate of LR is fundamentally a Bayesian concept and we used two LR estimators, namely Logistic Regression (LoR) and Kernel Density Estimator (KDE) for this paper. The repeatability evaluation was carried out by retesting the initial experiment after an interval of six months to observe whether examiners would repeat their decisions for the estimated LR. The experimental results, which are based on handwriting dataset, show that LR has different confidence intervals which therefore implies that LR cannot be estimated with the same certainty everywhere. Though the LoR performed better than the KDE when tested using the same dataset, the two LR estimators investigated showed a consistent region in which LR value can be estimated confidently. These two findings advance our understanding of LR when used in computing the strength of evidence in handwriting using forensics.Keywords: confidence interval, handwriting, kernel density estimator, KDE, logistic regression LoR, repeatability, reproducibility
Procedia PDF Downloads 12418728 Modeling Standpipe Pressure Using Multivariable Regression Analysis by Combining Drilling Parameters and a Herschel-Bulkley Model
Authors: Seydou Sinde
Abstract:
The aims of this paper are to formulate mathematical expressions that can be used to estimate the standpipe pressure (SPP). The developed formulas take into account the main factors that, directly or indirectly, affect the behavior of SPP values. Fluid rheology and well hydraulics are some of these essential factors. Mud Plastic viscosity, yield point, flow power, consistency index, flow rate, drillstring, and annular geometries are represented by the frictional pressure (Pf), which is one of the input independent parameters and is calculated, in this paper, using Herschel-Bulkley rheological model. Other input independent parameters include the rate of penetration (ROP), applied load or weight on the bit (WOB), bit revolutions per minute (RPM), bit torque (TRQ), and hole inclination and direction coupled in the hole curvature or dogleg (DL). The technique of repeating parameters and Buckingham PI theorem are used to reduce the number of the input independent parameters into the dimensionless revolutions per minute (RPMd), the dimensionless torque (TRQd), and the dogleg, which is already in the dimensionless form of radians. Multivariable linear and polynomial regression technique using PTC Mathcad Prime 4.0 is used to analyze and determine the exact relationships between the dependent parameter, which is SPP, and the remaining three dimensionless groups. Three models proved sufficiently satisfactory to estimate the standpipe pressure: multivariable linear regression model 1 containing three regression coefficients for vertical wells; multivariable linear regression model 2 containing four regression coefficients for deviated wells; and multivariable polynomial quadratic regression model containing six regression coefficients for both vertical and deviated wells. Although that the linear regression model 2 (with four coefficients) is relatively more complex and contains an additional term over the linear regression model 1 (with three coefficients), the former did not really add significant improvements to the later except for some minor values. Thus, the effect of the hole curvature or dogleg is insignificant and can be omitted from the input independent parameters without significant losses of accuracy. The polynomial quadratic regression model is considered the most accurate model due to its relatively higher accuracy for most of the cases. Data of nine wells from the Middle East were used to run the developed models with satisfactory results provided by all of them, even if the multivariable polynomial quadratic regression model gave the best and most accurate results. Development of these models is useful not only to monitor and predict, with accuracy, the values of SPP but also to early control and check for the integrity of the well hydraulics as well as to take the corrective actions should any unexpected problems appear, such as pipe washouts, jet plugging, excessive mud losses, fluid gains, kicks, etc.Keywords: standpipe, pressure, hydraulics, nondimensionalization, parameters, regression
Procedia PDF Downloads 8418727 HIV Disclosure Status and Factors among Women to Their Sexual Partner in Victory plus, Yogyakarta, Indonesia
Authors: Dwi Kartika Rukmi, Miftafu Darussalam
Abstract:
Background: The disclosure of women’s HIV status toward their sexual partners is an important issue that should be regarded as one of the efforts to prevent and control the spread of HIV. Research on the disclosure of seropositive HIV status as well as women-related factors in Indonesia, especially Yogyakarta is only a few. Methods: This is a correlational descriptive research along with its cross-sectional approach on 329 women with HIV/AIDS at the Victory Plus NGO from June to July 2016. This research used a purposive sampling method and a questionnaire as the data collection technique. The bivariate analysis test was undertaken by using a chi-square and multivariate test along with a logistic regression. Result: The multivariate analysis and logistic regression show five independent variables related to the disclosure of seropositive HIV status of women with HIV/AIDS toward their sexual partners, namely ethnicity (aOR = 36,859; 95% CI; (6,544-207,616)) religion (aOR =0,255; 95%CI; (0,075-0,868)), discussion with partners prior to the HIV test (aOR =0,069; 95%CI; (0,065-0,438)) , types of sexual partners (aOR = 0.191; 95% CI; (0.082-0,445)) and knowledge on the partners’ HIV status (aOR = 0.036; 95% CI; (0.008-0.160)). The highest level of reason for seropositive HIV women not to be open about their partners’ status is the fear of being rejected by their partners and the environmental stigma of HIV AIDS disease. Conclusion: The disclosure of seropositive HIV status in women with HIV/AIDS in the Victory Plus NGO of Yogyakarta was 79.4% or classified as a high category with some related factors such as ethnicity, religion, discussion with partners prior to the HIV test, types of partners and knowledge on the partners’ HIV status.Keywords: women, HIV, disclosure, sexual partner
Procedia PDF Downloads 26118726 Urban Energy Demand Modelling: Spatial Analysis Approach
Authors: Hung-Chu Chen, Han Qi, Bauke de Vries
Abstract:
Energy consumption in the urban environment has attracted numerous researches in recent decades. However, it is comparatively rare to find literary works which investigated 3D spatial analysis of urban energy demand modelling. In order to analyze the spatial correlation between urban morphology and energy demand comprehensively, this paper investigates their relation by using the spatial regression tool. In addition, the spatial regression tool which is applied in this paper is ordinary least squares regression (OLS) and geographically weighted regression (GWR) model. Normalized Difference Built-up Index (NDBI), Normalized Difference Vegetation Index (NDVI), and building volume are explainers of urban morphology, which act as independent variables of Energy-land use (E-L) model. NDBI and NDVI are used as the index to describe five types of land use: urban area (U), open space (O), artificial green area (G), natural green area (V), and water body (W). Accordingly, annual electricity, gas demand and energy demand are dependent variables of the E-L model. Based on the analytical result of E-L model relation, it revealed that energy demand and urban morphology are closely connected and the possible causes and practical use are discussed. Besides, the spatial analysis methods of OLS and GWR are compared.Keywords: energy demand model, geographically weighted regression, normalized difference built-up index, normalized difference vegetation index, spatial statistics
Procedia PDF Downloads 14818725 Urban Household Waste Disposal Modes and Their Determinants: Evidence from Bure Town, North-Western Ethiopia
Authors: Mastawal Melese, Yismaw Assefa
Abstract:
This study aims to identify household-level determinants of solid waste disposal (SWD) practices in Bure Town, north-western Ethiopia. Using a cross-sectional design and a mixed-methods approach, data were collected from 238 randomly selected households through structured interviews, focus group discussions, and field observations. Descriptive analysis revealed that 14.7% of households used composting as a primary SWD method, 37.4% practiced open dumping, 25.6% used burning, and 22.3% resorted to burial. Multinomial logistic regression showed that factors such as monthly income, age, family size, length of residence, sex, home ownership, solid waste sorting procedures, and education significantly influenced the choice of disposal method. Households with lower education, income, home ownership, and shorter residence times were more likely to use improper disposal methods. Females were found to be more likely to engage in better waste disposal practices than males. These findings underscore the need for context-specific interventions in newly developing towns to enhance household-level SWM systems by addressing key socio-economic factors.Keywords: multinomial logistic regression, solid waste management, solid waste disposal, urban household
Procedia PDF Downloads 2018724 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition
Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar
Abstract:
In the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN),logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features like Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and MelSpectogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results.Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers
Procedia PDF Downloads 4518723 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race
Authors: Joonas Pääkkönen
Abstract:
In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling
Procedia PDF Downloads 12418722 Development of the Logistic Service Providers under the Pandemic Affects during COVID-19 in Turkey
Authors: Süleyman Günes
Abstract:
The crucial effects of the COVID-19 pandemic have on social and economic systems in Turkey as well as all over the world. It has impacted logistic providers and worldwide supply chains. Unexpected risks played a central role in creating vulnerabilities for logistics service operations during the pandemic terms. This study aims to research and design qualitative and quantitive contributions to logistic services. The COVID-19 pandemic brought unavoidable risks to the logistics industry in Turkey. The Logistic Service Providers (LSPs) have learned how to ensure uncertainties and risks triggered by main and adverse effects. The risks that LSPs encounter during the COVID-19 pandemic have been investigated and unveiled, and identified uncertainties and risks. The cause-effect structures were displayed by the qualitative and quantitive studies. The results suggest that supply chains and demand changes triggered by the COVID-19 pandemic while it influenced financial failure and forecast horizon with operational performances.Keywords: logistic service providers, COVID-19, development, financial failure
Procedia PDF Downloads 7318721 Neural Network Modelling for Turkey Railway Load Carrying Demand
Authors: Humeyra Bolakar Tosun
Abstract:
The transport sector has an undisputed place in human life. People need transport access to continuous increase day by day with growing population. The number of rail network, urban transport planning, infrastructure improvements, transportation management and other related areas is a key factor affecting our country made it quite necessary to improve the work of transportation. In this context, it plays an important role in domestic rail freight demand planning. Alternatives that the increase in the transportation field and has made it mandatory requirements such as the demand for improving transport quality. In this study generally is known and used in studies by the definition, rail freight transport, railway line length, population, energy consumption. In this study, Iron Road Load Net Demand was modeled by multiple regression and ANN methods. In this study, model dependent variable (Output) is Iron Road Load Net demand and 6 entries variable was determined. These outcome values extracted from the model using ANN and regression model results. In the regression model, some parameters are considered as determinative parameters, and the coefficients of the determinants give meaningful results. As a result, ANN model has been shown to be more successful than traditional regression model.Keywords: railway load carrying, neural network, modelling transport, transportation
Procedia PDF Downloads 14318720 Estimation of Functional Response Model by Supervised Functional Principal Component Analysis
Authors: Hyon I. Paek, Sang Rim Kim, Hyon A. Ryu
Abstract:
In functional linear regression, one typical problem is to reduce dimension. Compared with multivariate linear regression, functional linear regression is regarded as an infinite-dimensional case, and the main task is to reduce dimensions of functional response and functional predictors. One common approach is to adapt functional principal component analysis (FPCA) on functional predictors and then use a few leading functional principal components (FPC) to predict the functional model. The leading FPCs estimated by the typical FPCA explain a major variation of the functional predictor, but these leading FPCs may not be mostly correlated with the functional response, so they may not be significant in the prediction for response. In this paper, we propose a supervised functional principal component analysis method for a functional response model with FPCs obtained by considering the correlation of the functional response. Our method would have a better prediction accuracy than the typical FPCA method.Keywords: supervised, functional principal component analysis, functional response, functional linear regression
Procedia PDF Downloads 7518719 Modelling the Effect of Physical Environment Factors on Child Pedestrian Severity Collisions in Malaysia: A Multinomial Logistic Regression Analysis
Authors: Muhamad N. Borhan, Nur S. Darus, Siti Z. Ishak, Rozmi Ismail, Siti F. M. Razali
Abstract:
Children are at the greater risk to be involved in road traffic collisions due to the complex interaction of various elements in our transportation system. It encompasses interactions between the elements of children and driver behavior along with physical and social environment factors. The present study examined the effect between the collisions severity and physical environment factors on child pedestrian collisions. The severity of collisions is categorized into four injury outcomes: fatal, serious injury, slight injury, and damage. The sample size comprised of 2487 cases of child pedestrian-vehicle collisions in which children aged 7 to 12 years old was involved in Malaysia for the years 2006-2015. A multinomial logistic regression was applied to establish the effect between severity levels and physical environment factors. The results showed that eight contributing factors influence the probability of an injury road surface material, traffic system, road marking, control type, lighting condition, type of location, land use and road surface condition. Understanding the effect of physical environment factors may contribute to the improvement of physical environment design and decrease the collision involvement.Keywords: child pedestrian, collisions, primary school, road injuries
Procedia PDF Downloads 16418718 Giving Right-of-Way to Emergency Ambulances: Attitude and Behavior of Road Users in Developing Countries
Authors: Mahmoud T. Alwidyan, Ahmad Alrawashdeh, Alaa O. Oteir
Abstract:
Background: Emergency medical service (EMS) providers, oftentimes, use the lights and sirens (L&S) of their ambulances to warn road users, navigate through traffic, and expedite transport to save lives of ill and injured patients. Despite the contribution of road users in the effectiveness of reducing transport time of EMS ambulances using L&S, there is a lack of empirical assessments exploring the road user’s attitude and behavior in such situations. This study, therefore, aimed to assess the attitude and behavior of road users in response to EMS ambulances with warning L&S in use. Methods: This was a cross-sectional survey developed and distributed to adult road users in Northern Jordan. The questionnaire included 20 items addressing demographics, attitudes, and behavior toward emergency ambulances. We described the participants’ responses and assessed the association between demographics and attitude statements using logistic regression. Results: A total of 1302 questionnaires were complete and appropriate for analysis. The mean age was 34.2 (SD± 11.4) years, and the majority were males (72.6%). About half of road users (47.9%) in our sample would perform inappropriate action in response to EMS ambulances with L&S in use. The multivariate logistic regression model show that being female (OR, 0.63; 95% CI = 0.48-0.81), more educated (OR, 0.68; 95% CI = 0.53-0.86), or public transport driver (OR, 0.55; 95% CI = 0.34-0.90) is significantly associated with inappropriate response to EMS ambulances. Additionally, a significant proportion of road users may perform inappropriate and lawless driving practices such as crossing red traffic lights or following the passing by EMS ambulances, which would, in turn, increase the risk on ambulances and other road users. Conclusions: A large proportion of road users in Jordan may respond inappropriately to the EMS ambulances, and many engage in risky driving behaviors due perhaps to the lack of procedural knowledge. Policy-related interventions and educational programs are crucially needed to increase public awareness of the traffic law concerning EMS ambulances and to enhance appropriate driving behavior, which, in turn, improves the efficiency of ambulance services.Keywords: EMS ambulances, lights and sirens, road users, attitude and behavior
Procedia PDF Downloads 8818717 A Comparative Study on Sampling Techniques of Polynomial Regression Model Based Stochastic Free Vibration of Composite Plates
Authors: S. Dey, T. Mukhopadhyay, S. Adhikari
Abstract:
This paper presents an exhaustive comparative investigation on sampling techniques of polynomial regression model based stochastic natural frequency of composite plates. Both individual and combined variations of input parameters are considered to map the computational time and accuracy of each modelling techniques. The finite element formulation of composites is capable to deal with both correlated and uncorrelated random input variables such as fibre parameters and material properties. The results obtained by Polynomial regression (PR) using different sampling techniques are compared. Depending on the suitability of sampling techniques such as 2k Factorial designs, Central composite design, A-Optimal design, I-Optimal, D-Optimal, Taguchi’s orthogonal array design, Box-Behnken design, Latin hypercube sampling, sobol sequence are illustrated. Statistical analysis of the first three natural frequencies is presented to compare the results and its performance.Keywords: composite plate, natural frequency, polynomial regression model, sampling technique, uncertainty quantification
Procedia PDF Downloads 51318716 Charting Sentiments with Naive Bayes and Logistic Regression
Authors: Jummalla Aashrith, N. L. Shiva Sai, K. Bhavya Sri
Abstract:
The swift progress of web technology has not only amassed a vast reservoir of internet data but also triggered a substantial surge in data generation. The internet has metamorphosed into one of the dynamic hubs for online education, idea dissemination, as well as opinion-sharing. Notably, the widely utilized social networking platform Twitter is experiencing considerable expansion, providing users with the ability to share viewpoints, participate in discussions spanning diverse communities, and broadcast messages on a global scale. The upswing in online engagement has sparked a significant curiosity in subjective analysis, particularly when it comes to Twitter data. This research is committed to delving into sentiment analysis, focusing specifically on the realm of Twitter. It aims to offer valuable insights into deciphering information within tweets, where opinions manifest in a highly unstructured and diverse manner, spanning a spectrum from positivity to negativity, occasionally punctuated by neutrality expressions. Within this document, we offer a comprehensive exploration and comparative assessment of modern approaches to opinion mining. Employing a range of machine learning algorithms such as Naive Bayes and Logistic Regression, our investigation plunges into the domain of Twitter data streams. We delve into overarching challenges and applications inherent in the realm of subjectivity analysis over Twitter.Keywords: machine learning, sentiment analysis, visualisation, python
Procedia PDF Downloads 5618715 Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression
Authors: Galal Elkobrosy, Amr M. Abdelrazek, Bassuny M. Elsouhily, Mohamed E. Khidr
Abstract:
Crank shaft length, connecting rod length, crank angle, engine rpm, cylinder bore, mass of piston and compression ratio are the inputs that can control the performance of the slider crank mechanism and then its efficiency. Several combinations of these seven inputs are used and compared. The throughput engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve system of algebraic equations. These models are validated. A regression model in seven inputs including their interaction terms lowered the polynomial degree from 3rd degree to 1st degree and suggested valid predictions and stable explanations.Keywords: design of experiments, regression analysis, SI engine, statistical modeling
Procedia PDF Downloads 18618714 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course
Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu
Abstract:
This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN
Procedia PDF Downloads 44