Search results for: logistic regression models
9318 A Study on Characteristics of Hedonic Price Models in Korea Based on Meta-Regression Analysis
Authors: Minseo Jo
Abstract:
The purpose of this paper is to examine the factors in the hedonic price models, that has significance impact in determining the price of apartments. There are many variables employed in the hedonic price models and their effectiveness vary differently according to the researchers and the regions they are analysing. In order to consider various conditions, the meta-regression analysis has been selected for the study. In this paper, four meta-independent variables, from the 65 hedonic price models to analysis. The factors that influence the prices of apartments, as well as including factors that influence the prices of apartments, regions, which are divided into two of the research performed, years of research performed, the coefficients of the functions employed. The covariance between the four meta-variables and p-value of the coefficients and the four meta-variables and number of data used in the 65 hedonic price models have been analyzed in this study. The six factors that are most important in deciding the prices of apartments are positioning of apartments, the noise of the apartments, points of the compass and views from the apartments, proximity to the public transportations, companies that have constructed the apartments, social environments (such as schools etc.).Keywords: hedonic price model, housing price, meta-regression analysis, characteristics
Procedia PDF Downloads 4029317 Breast Cancer Detection Using Machine Learning Algorithms
Authors: Jiwan Kumar, Pooja, Sandeep Negi, Anjum Rouf, Amit Kumar, Naveen Lakra
Abstract:
In modern times where, health issues are increasing day by day, breast cancer is also one of them, which is very crucial and really important to find in the early stages. Doctors can use this model in order to tell their patients whether a cancer is not harmful (benign) or harmful (malignant). We have used the knowledge of machine learning in order to produce the model. we have used algorithms like Logistic Regression, Random forest, support Vector Classifier, Bayesian Network and Radial Basis Function. We tried to use the data of crucial parts and show them the results in pictures in order to make it easier for doctors. By doing this, we're making ML better at finding breast cancer, which can lead to saving more lives and better health care.Keywords: Bayesian network, radial basis function, ensemble learning, understandable, data making better, random forest, logistic regression, breast cancer
Procedia PDF Downloads 529316 Lean Implementation Analysis on the Safety Performance of Construction Projects in the Philippines
Authors: Kim Lindsay F. Restua, Jeehan Kyra A. Rivero, Joneka Myles D. Taguba
Abstract:
Lean construction is defined as an approach in construction with the purpose of reducing waste in the process without compromising the value of the project. There are numerous lean construction tools that are applied in the construction process, which maximizes the efficiency of work and satisfaction of customers while minimizing waste. However, the complexity and differences of construction projects cause a rise in challenges on achieving the lean benefits construction can give, such as improvement in safety performance. The objective of this study is to determine the relationship between lean construction tools and their effects on safety performance. The relationship between construction tools applied in construction and safety performance is identified through Logistic Regression Analysis, and Correlation Analysis was conducted thereafter. Based on the findings, it was concluded that almost 60% of the factors listed in the study, which are different tools and effects of lean construction, were determined to have a significant relationship with the level of safety in construction projects.Keywords: correlation analysis, lean construction tools, lean construction, logistic regression analysis, risk management, safety
Procedia PDF Downloads 1869315 Prediction of Energy Storage Areas for Static Photovoltaic System Using Irradiation and Regression Modelling
Authors: Kisan Sarda, Bhavika Shingote
Abstract:
This paper aims to evaluate regression modelling for prediction of Energy storage of solar photovoltaic (PV) system using Semi parametric regression techniques because there are some parameters which are known while there are some unknown parameters like humidity, dust etc. Here irradiation of solar energy is different for different places on the basis of Latitudes, so by finding out areas which give more storage we can implement PV systems at those places and our need of energy will be fulfilled. This regression modelling is done for daily, monthly and seasonal prediction of solar energy storage. In this, we have used R modules for designing the algorithm. This algorithm will give the best comparative results than other regression models for the solar PV cell energy storage.Keywords: semi parametric regression, photovoltaic (PV) system, regression modelling, irradiation
Procedia PDF Downloads 3819314 Efficient Estimation for the Cox Proportional Hazards Cure Model
Authors: Khandoker Akib Mohammad
Abstract:
While analyzing time-to-event data, it is possible that a certain fraction of subjects will never experience the event of interest, and they are said to be cured. When this feature of survival models is taken into account, the models are commonly referred to as cure models. In the presence of covariates, the conditional survival function of the population can be modelled by using the cure model, which depends on the probability of being uncured (incidence) and the conditional survival function of the uncured subjects (latency), and a combination of logistic regression and Cox proportional hazards (PH) regression is used to model the incidence and latency respectively. In this paper, we have shown the asymptotic normality of the profile likelihood estimator via asymptotic expansion of the profile likelihood and obtain the explicit form of the variance estimator with an implicit function in the profile likelihood. We have also shown the efficient score function based on projection theory and the profile likelihood score function are equal. Our contribution in this paper is that we have expressed the efficient information matrix as the variance of the profile likelihood score function. A simulation study suggests that the estimated standard errors from bootstrap samples (SMCURE package) and the profile likelihood score function (our approach) are providing similar and comparable results. The numerical result of our proposed method is also shown by using the melanoma data from SMCURE R-package, and we compare the results with the output obtained from the SMCURE package.Keywords: Cox PH model, cure model, efficient score function, EM algorithm, implicit function, profile likelihood
Procedia PDF Downloads 1439313 Lifestyle Factors Associated With Overweight/obesity Status In Croatian Adolescents: A Population-Based Study
Authors: Lovro Štefan
Abstract:
The main purpose of the present study was to investigate the associations between the overweight/obesity status and lifestyle factors. In this cross-sectional study, participants were 1950 urban secondary-school students (54.7% of female students) aged 17-18 years old. Dependent variable was body-mass index status derived from self-reported height and weight. The outcome was binarised, where participants with value <25 kg/m2 were collapsed into „normal“, while those ≥25 kg/m2 into „overweight/obesity“ category. Independent variables were gender, type of school, physical activity, sedentary behaviour, self-rated health, self-perceived socioeconomic status and psychological distress. The associations between the dependent and independent variables were analyzed by using multiple logistic regression analysis. In the univariate model, being overweight/obese was significantly associated with being a male student (OR 0.31; 95% CI 0.23 to 0.42), attending a vocational school (OR 1.87; 95% CI 1.42 to 2.48), not meeting the recommendations for moderate-to-vigorous physical activity (OR 0.44; 95% CI 0.22 to 0.88), more time spending in sedentary behaviour (OR 1.53; 95% CI 1.07 to 2.19), poor self-rated health (OR 0.35, 95% CI 0.20 to 0.56) and lower socioeconomic status (OR 0.63; 95% CI 0.48 to 0.84). In the multivariate model, the same associations occured between the dependent and independent variable. In both models, psychological distress was not associated with being overweight/obese. In conclusion, our findings suggest, that lifestyle factors are independently associated with body-mass indexKeywords: body mass index, secondary-school students, Croatia, physical activity, sedentary behaviour, logistic regression
Procedia PDF Downloads 899312 Heart Ailment Prediction Using Machine Learning Methods
Authors: Abhigyan Hedau, Priya Shelke, Riddhi Mirajkar, Shreyash Chaple, Mrunali Gadekar, Himanshu Akula
Abstract:
The heart is the coordinating centre of the major endocrine glandular structure of the body, which produces hormones that profoundly affect the operations of the body, and diagnosing cardiovascular disease is a difficult but critical task. By extracting knowledge and information about the disease from patient data, data mining is a more practical technique to help doctors detect disorders. We use a variety of machine learning methods here, including logistic regression and support vector classifiers (SVC), K-nearest neighbours Classifiers (KNN), Decision Tree Classifiers, Random Forest classifiers and Gradient Boosting classifiers. These algorithms are applied to patient data containing 13 different factors to build a system that predicts heart disease in less time with more accuracy.Keywords: logistic regression, support vector classifier, k-nearest neighbour, decision tree, random forest and gradient boosting
Procedia PDF Downloads 509311 A Multinomial Logistic Regression Analysis of Factors Influencing Couples' Fertility Preferences in Kenya
Authors: Naomi W. Maina
Abstract:
Fertility preference is a subject of great significance in developing countries. Studies reveal that the preferences of fertility are actually significant in determining the society’s fertility levels because the fertility behavior of the future has a high likelihood of falling under the effect of currently observed fertility inclinations. The objective of this study was to establish the factors associated with fertility preference amongst couples in Kenya by fitting a multinomial logistic regression model against 5,265 couple data obtained from Kenya demographic health survey 2014. Results revealed that the type of place of residence, the region of residence, age and spousal age gap significantly influence desire for additional children among couples in Kenya. There was the notable high likelihood of couples living in rural settlements having similar fertility preference compared to those living in urban settlements. Moreover, geographical disparities such as in northern Kenya revealed significant differences in a couples desire to have additional children compared to Nairobi. The odds of a couple’s desire for additional children were further observed to vary dependent on either the wife or husbands age and to a large extent the spousal age gap. Evidenced from the study, was the fact that as spousal age gap increases, the desire for more children amongst couples decreases. Insights derived from this study would be attractive to demographers, health practitioners, policymakers, and non-governmental organizations implementing fertility related interventions in Kenya among other stakeholders. Moreover, with the adoption of devolution, there is a clear need for adoption of population policies that are County specific as opposed to a national population policy as is the current practice in Kenya. Additionally, researchers or students who have little understanding in the application of multinomial logistic regression, both theoretical understanding and practical analysis in SPSS as well as application on real datasets, will find this article useful.Keywords: couples' desire, fertility, fertility preference, multinomial regression analysis
Procedia PDF Downloads 1819310 Machine Learning Approach for Stress Detection Using Wireless Physical Activity Tracker
Authors: B. Padmaja, V. V. Rama Prasad, K. V. N. Sunitha, E. Krishna Rao Patro
Abstract:
Stress is a psychological condition that reduces the quality of sleep and affects every facet of life. Constant exposure to stress is detrimental not only for mind but also body. Nevertheless, to cope with stress, one should first identify it. This paper provides an effective method for the cognitive stress level detection by using data provided from a physical activity tracker device Fitbit. This device gathers people’s daily activities of food, weight, sleep, heart rate, and physical activities. In this paper, four major stressors like physical activities, sleep patterns, working hours and change in heart rate are used to assess the stress levels of individuals. The main motive of this system is to use machine learning approach in stress detection with the help of Smartphone sensor technology. Individually, the effect of each stressor is evaluated using logistic regression and then combined model is built and assessed using variants of ordinal logistic regression models like logit, probit and complementary log-log. Then the quality of each model is evaluated using Akaike Information Criterion (AIC) and probit is assessed as the more suitable model for our dataset. This system is experimented and evaluated in a real time environment by taking data from adults working in IT and other sectors in India. The novelty of this work lies in the fact that stress detection system should be less invasive as possible for the users.Keywords: physical activity tracker, sleep pattern, working hours, heart rate, smartphone sensor
Procedia PDF Downloads 2569309 Developing a Cybernetic Model of Interdepartmental Logistic Interactions in SME
Authors: Jonas Mayer, Kai-Frederic Seitz, Thorben Kuprat
Abstract:
In today’s competitive environment production’s logistic objectives such as ‘delivery reliability’ and ‘delivery time’ and distribution’s logistic objectives such as ‘service level’ and ‘delivery delay’ are attributed great importance. Especially for small and mid-sized enterprises (SME) attaining these objectives pose a key challenge. Within this context, one of the difficulties is that interactions between departments within the enterprise and their specific objectives are insufficiently taken into account and aligned. Interdepartmental independencies along with contradicting targets set within the different departments result in enterprises having sub-optimal logistic performance capability. This paper presents a research project which will systematically describe the interactions between departments and convert them into a quantifiable form.Keywords: department-specific actuating and control variables, interdepartmental interactions, cybernetic model, logistic objectives
Procedia PDF Downloads 3729308 Chemometric QSRR Evaluation of Behavior of s-Triazine Pesticides in Liquid Chromatography
Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević
Abstract:
This study considers the selection of the most suitable in silico molecular descriptors that could be used for s-triazine pesticides characterization. Suitable descriptors among topological, geometrical and physicochemical are used for quantitative structure-retention relationships (QSRR) model establishment. Established models were obtained using linear regression (LR) and multiple linear regression (MLR) analysis. In this paper, MLR models were established avoiding multicollinearity among the selected molecular descriptors. Statistical quality of established models was evaluated by standard and cross-validation statistical parameters. For detection of similarity or dissimilarity among investigated s-triazine pesticides and their classification, principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used and gave similar grouping. This study is financially supported by COST action TD1305.Keywords: chemometrics, classification analysis, molecular descriptors, pesticides, regression analysis
Procedia PDF Downloads 3939307 Behind Fuzzy Regression Approach: An Exploration Study
Authors: Lavinia B. Dulla
Abstract:
The exploration study of the fuzzy regression approach attempts to present that fuzzy regression can be used as a possible alternative to classical regression. It likewise seeks to assess the differences and characteristics of simple linear regression and fuzzy regression using the width of prediction interval, mean absolute deviation, and variance of residuals. Based on the simple linear regression model, the fuzzy regression approach is worth considering as an alternative to simple linear regression when the sample size is between 10 and 20. As the sample size increases, the fuzzy regression approach is not applicable to use since the assumption regarding large sample size is already operating within the framework of simple linear regression. Nonetheless, it can be suggested for a practical alternative when decisions often have to be made on the basis of small data.Keywords: fuzzy regression approach, minimum fuzziness criterion, interval regression, prediction interval
Procedia PDF Downloads 2989306 Assessment of Pastoralist-Crop Farmers Conflict and Food Security of Farming Households in Kwara State, Nigeria
Authors: S. A. Salau, I. F. Ayanda, I. Afe, M. O. Adesina, N. B. Nofiu
Abstract:
Food insecurity is still a critical challenge among rural and urban households in Nigeria. The country’s food insecurity situation became more pronounced due to frequent conflict between pastoralist and crop farmers. Thus, this study assesses pastoralist-crop farmers’ conflict and food security of farming households in Kwara state, Nigeria. The specific objectives are to measure the food security status of the respondents, quantify pastoralist- crop farmers’ conflict, determine the effect of pastoralist- crop farmers conflict on food security and describe the effective coping strategies adopted by the respondents to reduce the effect of food insecurity. A combination of purposive and simple random sampling techniques will be used to select 250 farming households for the study. The analytical tools include descriptive statistics, Likert-scale, logistic regression, and food security index. Using the food security index approach, the percentage of households that were food secure and insecure will be known. Pastoralist- crop farmers’ conflict will be measured empirically by quantifying loses due to the conflict. The logistic regression will indicate if pastoralist- crop farmers’ conflict is a critical determinant of food security among farming households in the study area. The coping strategies employed by the respondents in cushioning the effects of food insecurity will also be revealed. Empirical studies on the effect of pastoralist- crop farmers’ conflict on food security are rare in the literature. This study will quantify conflict and reveal the direction as well as the extent of the relationship between conflict and food security. It could contribute to the identification and formulation of strategies for the minimization of conflict among pastoralist and crop farmers in an attempt to reduce food insecurity. Moreover, this study could serve as valuable reference material for future researches and open up new areas for further researches.Keywords: agriculture, conflict, coping strategies, food security, logistic regression
Procedia PDF Downloads 1919305 Landslide Susceptibility Mapping Using Soft Computing in Amhara Saint
Authors: Semachew M. Kassa, Africa M Geremew, Tezera F. Azmatch, Nandyala Darga Kumar
Abstract:
Frequency ratio (FR) and analytical hierarchy process (AHP) methods are developed based on past landslide failure points to identify the landslide susceptibility mapping because landslides can seriously harm both the environment and society. However, it is still difficult to select the most efficient method and correctly identify the main driving factors for particular regions. In this study, we used fourteen landslide conditioning factors (LCFs) and five soft computing algorithms, including Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), Artificial Neural Network (ANN), and Naïve Bayes (NB), to predict the landslide susceptibility at 12.5 m spatial scale. The performance of the RF (F1-score: 0.88, AUC: 0.94), ANN (F1-score: 0.85, AUC: 0.92), and SVM (F1-score: 0.82, AUC: 0.86) methods was significantly better than the LR (F1-score: 0.75, AUC: 0.76) and NB (F1-score: 0.73, AUC: 0.75) method, according to the classification results based on inventory landslide points. The findings also showed that around 35% of the study region was made up of places with high and very high landslide risk (susceptibility greater than 0.5). The very high-risk locations were primarily found in the western and southeastern regions, and all five models showed good agreement and similar geographic distribution patterns in landslide susceptibility. The towns with the highest landslide risk include Amhara Saint Town's western part, the Northern part, and St. Gebreal Church villages, with mean susceptibility values greater than 0.5. However, rainfall, distance to road, and slope were typically among the top leading factors for most villages. The primary contributing factors to landslide vulnerability were slightly varied for the five models. Decision-makers and policy planners can use the information from our study to make informed decisions and establish policies. It also suggests that various places should take different safeguards to reduce or prevent serious damage from landslide events.Keywords: artificial neural network, logistic regression, landslide susceptibility, naïve Bayes, random forest, support vector machine
Procedia PDF Downloads 829304 Efficient Management of Construction Logistics: A Challenge to Both Conventional and Technological Systems in the Developing Nations
Authors: Nuruddeen Usman, Ahmad Muhammad Ibrahim
Abstract:
Management of construction logistics at construction sites becomes increasingly complex with rising construction volume, which made it relatively inefficient in the developing nations even with the technological advancement. The objective of this research is to conceptually synthesise the approaches and challenges befall in the course of construction logistic management, with the aim to proffer possible solution to it. Therefore, this study appraised the glitches associated with both conventional and technological methods of construction logistic management that result in its inefficiency. Thus, this investigation found that, both conventional and the technological issues were due to certain obstacles that affect the construction logistic management which resulted into delays, accidents, fraudulent activities, time and cost overrun. Therefore, this study has developed a framework that might bring a lasting solution to the challenges of construction logistic management.Keywords: construction, conventional, logistic, technological
Procedia PDF Downloads 5549303 Modeling of the Effect of Explosives, Geological and Geotechnical Parameters on the Stability of Rock Masses Case of Marrakech: Agadir Highway, Morocco
Authors: Taoufik Benchelha, Toufik Remmal, Rachid El Hamdouni, Hamou Mansouri, Houssein Ejjaouani, Halima Jounaid, Said Benchelha
Abstract:
During the earthworks for the construction of Marrakech-Agadir highway in southern Morocco, which crosses mountainous areas of the High Western Atlas, the main problem faced is the stability of the slopes. Indeed, the use of explosives as a means of excavation associated with the geological structure of the terrain encountered can trigger major ruptures and cause damage which depends on the intrinsic characteristics of the rock mass. The study consists of a geological and geotechnical analysis of several unstable zones located along the route, mobilizing millions of cubic meters of rock, with deduction of the parameters influencing slope stability. From this analysis, a predictive model for rock mass stability is carried out, based on a statistic method of logistic regression, in order to predict the geomechanical behavior of the rock slopes constrained by earthworks.Keywords: explosive, logistic regression, rock mass, slope stability
Procedia PDF Downloads 3769302 Examining Bulling Rates among Youth with Intellectual Disabilities
Authors: Kaycee L. Bills
Abstract:
Adolescents and youth who are members of a minority group are more likely to experience higher rates of bullying in comparison to other student demographics. Specifically, adolescents with intellectual disabilities are a minority population that is more susceptible to experience unfair treatment in social settings. This study employs the 2015 Wave of the National Crime Victimization Survey – School Crime Supplement (NCVS/SCS) longitudinal dataset to explore bullying rates experienced among adolescents with intellectual disabilities. This study uses chi-square testing and a logistic regression to analyze if having a disability influences the likelihood of being bullied in comparison to other student demographics. Results of the chi-square testing and the logistic regression indicate that adolescent students who were identified as having a disability were approximately four times more likely to experience higher bullying rates in comparison to all other majority and minority student populations. Thus, it means having a disability resulted in higher bullying rates in comparison to all student groups.Keywords: disability, bullying, social work, school bullying
Procedia PDF Downloads 1319301 Development of Computational Approach for Calculation of Hydrogen Solubility in Hydrocarbons for Treatment of Petroleum
Authors: Abdulrahman Sumayli, Saad M. AlShahrani
Abstract:
For the hydrogenation process, knowing the solubility of hydrogen (H2) in hydrocarbons is critical to improve the efficiency of the process. We investigated the H2 solubility computation in four heavy crude oil feedstocks using machine learning techniques. Temperature, pressure, and feedstock type were considered as the inputs to the models, while the hydrogen solubility was the sole response. Specifically, we employed three different models: Support Vector Regression (SVR), Gaussian process regression (GPR), and Bayesian ridge regression (BRR). To achieve the best performance, the hyper-parameters of these models are optimized using the whale optimization algorithm (WOA). We evaluated the models using a dataset of solubility measurements in various feedstocks, and we compared their performance based on several metrics. Our results show that the WOA-SVR model tuned with WOA achieves the best performance overall, with an RMSE of 1.38 × 10− 2 and an R-squared of 0.991. These findings suggest that machine learning techniques can provide accurate predictions of hydrogen solubility in different feedstocks, which could be useful in the development of hydrogen-related technologies. Besides, the solubility of hydrogen in the four heavy oil fractions is estimated in different ranges of temperatures and pressures of 150 ◦C–350 ◦C and 1.2 MPa–10.8 MPa, respectivelyKeywords: temperature, pressure variations, machine learning, oil treatment
Procedia PDF Downloads 699300 Gender Estimation by Means of Quantitative Measurements of Foramen Magnum: An Analysis of CT Head Images
Authors: Thilini Hathurusinghe, Uthpalie Siriwardhana, W. M. Ediri Arachchi, Ranga Thudugala, Indeewari Herath, Gayani Senanayake
Abstract:
The foramen magnum is more prone to protect than other skeletal remains during high impact and severe disruptive injuries. Therefore, it is worthwhile to explore whether these measurements can be used to determine the human gender which is vital in forensic and anthropological studies. The idea was to find out the ability to use quantitative measurements of foramen magnum as an anatomical indicator for human gender estimation and to evaluate the gender-dependent variations of foramen magnum using quantitative measurements. Randomly selected 113 subjects who underwent CT head scans at Sri Jayawardhanapura General Hospital of Sri Lanka within a period of six months, were included in the study. The sample contained 58 males (48.76 ± 14.7 years old) and 55 females (47.04 ±15.9 years old). Maximum length of the foramen magnum (LFM), maximum width of the foramen magnum (WFM), minimum distance between occipital condyles (MnD) and maximum interior distance between occipital condyles (MxID) were measured. Further, AreaT and AreaR were also calculated. The gender was estimated using binomial logistic regression. The mean values of all explanatory variables (LFM, WFM, MnD, MxID, AreaT, and AreaR) were greater among male than female. All explanatory variables except MnD (p=0.669) were statistically significant (p < 0.05). Significant bivariate correlations were demonstrated by AreaT and AreaR with the explanatory variables. The results evidenced that WFM and MxID were the best measurements in predicting gender according to binomial logistic regression. The estimated model was: log (p/1-p) =10.391-0.136×MxID-0.231×WFM, where p is the probability of being a female. The classification accuracy given by the above model was 65.5%. The quantitative measurements of foramen magnum can be used as a reliable anatomical marker for human gender estimation in the Sri Lankan context.Keywords: foramen magnum, forensic and anthropological studies, gender estimation, logistic regression
Procedia PDF Downloads 1519299 A Comparative Analysis of Machine Learning Techniques for PM10 Forecasting in Vilnius
Authors: Mina Adel Shokry Fahim, Jūratė Sužiedelytė Visockienė
Abstract:
With the growing concern over air pollution (AP), it is clear that this has gained more prominence than ever before. The level of consciousness has increased and a sense of knowledge now has to be forwarded as a duty by those enlightened enough to disseminate it to others. This realisation often comes after an understanding of how poor air quality indices (AQI) damage human health. The study focuses on assessing air pollution prediction models specifically for Lithuania, addressing a substantial need for empirical research within the region. Concentrating on Vilnius, it specifically examines particulate matter concentrations 10 micrometers or less in diameter (PM10). Utilizing Gaussian Process Regression (GPR) and Regression Tree Ensemble, and Regression Tree methodologies, predictive forecasting models are validated and tested using hourly data from January 2020 to December 2022. The study explores the classification of AP data into anthropogenic and natural sources, the impact of AP on human health, and its connection to cardiovascular diseases. The study revealed varying levels of accuracy among the models, with GPR achieving the highest accuracy, indicated by an RMSE of 4.14 in validation and 3.89 in testing.Keywords: air pollution, anthropogenic and natural sources, machine learning, Gaussian process regression, tree ensemble, forecasting models, particulate matter
Procedia PDF Downloads 539298 Multiple Linear Regression for Rapid Estimation of Subsurface Resistivity from Apparent Resistivity Measurements
Authors: Sabiu Bala Muhammad, Rosli Saad
Abstract:
Multiple linear regression (MLR) models for fast estimation of true subsurface resistivity from apparent resistivity field measurements are developed and assessed in this study. The parameters investigated were apparent resistivity (ρₐ), horizontal location (X) and depth (Z) of measurement as the independent variables; and true resistivity (ρₜ) as the dependent variable. To achieve linearity in both resistivity variables, datasets were first transformed into logarithmic domain following diagnostic checks of normality of the dependent variable and heteroscedasticity to ensure accurate models. Four MLR models were developed based on hierarchical combination of the independent variables. The generated MLR coefficients were applied to another data set to estimate ρₜ values for validation. Contours of the estimated ρₜ values were plotted and compared to the observed data plots at the colour scale and blanking for visual assessment. The accuracy of the models was assessed using coefficient of determination (R²), standard error (SE) and weighted mean absolute percentage error (wMAPE). It is concluded that the MLR models can estimate ρₜ for with high level of accuracy.Keywords: apparent resistivity, depth, horizontal location, multiple linear regression, true resistivity
Procedia PDF Downloads 2769297 Customer Churn Prediction by Using Four Machine Learning Algorithms Integrating Features Selection and Normalization in the Telecom Sector
Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh
Abstract:
A crucial component of maintaining a customer-oriented business as in the telecom industry is understanding the reasons and factors that lead to customer churn. Competition between telecom companies has greatly increased in recent years. It has become more important to understand customers’ needs in this strong market of telecom industries, especially for those who are looking to turn over their service providers. So, predictive churn is now a mandatory requirement for retaining those customers. Machine learning can be utilized to accomplish this. Churn Prediction has become a very important topic in terms of machine learning classification in the telecommunications industry. Understanding the factors of customer churn and how they behave is very important to building an effective churn prediction model. This paper aims to predict churn and identify factors of customers’ churn based on their past service usage history. Aiming at this objective, the study makes use of feature selection, normalization, and feature engineering. Then, this study compared the performance of four different machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Evaluation of the performance was conducted by using the F1 score and ROC-AUC. Comparing the results of this study with existing models has proven to produce better results. The results showed the Gradients Boosting with feature selection technique outperformed in this study by achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.Keywords: machine learning, gradient boosting, logistic regression, churn, random forest, decision tree, ROC, AUC, F1-score
Procedia PDF Downloads 1349296 Food Insecurity Assessment, Consumption Pattern and Implications of Integrated Food Security Phase Classification: Evidence from Sudan
Authors: Ahmed A. A. Fadol, Guangji Tong, Wlaa Mohamed
Abstract:
This paper provides a comprehensive analysis of food insecurity in Sudan, focusing on consumption patterns and their implications, employing the Integrated Food Security Phase Classification (IPC) assessment framework. Years of conflict and economic instability have driven large segments of the population in Sudan into crisis levels of acute food insecurity according to the (IPC). A substantial number of people are estimated to currently face emergency conditions, with an additional sizeable portion categorized under less severe but still extreme hunger levels. In this study, we explore the multifaceted nature of food insecurity in Sudan, considering its historical, political, economic, and social dimensions. An analysis of consumption patterns and trends was conducted, taking into account cultural influences, dietary shifts, and demographic changes. Furthermore, we employ logistic regression and random forest analysis to identify significant independent variables influencing food security status in Sudan. Random forest clearly outperforms logistic regression in terms of area under curve (AUC), accuracy, precision and recall. Forward projections of the IPC for Sudan estimate that 15 million individuals are anticipated to face Crisis level (IPC Phase 3) or worse acute food insecurity conditions between October 2023 and February 2024. Of this, 60% are concentrated in Greater Darfur, Greater Kordofan, and Khartoum State, with Greater Darfur alone representing 29% of this total. These findings emphasize the urgent need for both short-term humanitarian aid and long-term strategies to address Sudan's deepening food insecurity crisis.Keywords: food insecurity, consumption patterns, logistic regression, random forest analysis
Procedia PDF Downloads 739295 Proportion and Factors Associated with Presumptive Tuberculosis among Suspected Pediatric Tuberculosis Patients
Authors: Naima Nur, Safa Islam, Saeema Islam, Md. Faridul Alam
Abstract:
Background: The worldwide increase in pediatric presumptive tuberculosis (TB) is the most life-threatening challenge in effectively controlling TB. The objective of this study was to determine the proportion of presumptive TB and the factors associated with it. Methods: A cross-sectional study was conducted between March and November 2013 at ICDDR-Bangladesh. Two hundred twelve pulmonary and extra-pulmonary specimens were collected from 84 suspected pediatric patients diagnosed with TB based on their clinical symptoms/radiological findings. Presumptive TB and confirmed TB were considered presumptive TB and non-presumptive TB and were isolated by smear-microscopy, culture, and GeneXpert. Logistic regression was used to analyze associations between outcome and predictor variables. Results: The proportion of presumptive TB was 85.7%, and 14.3% of non-presumptive TB. In presumptive TB, vaccine scars, family TB history, and school-going children were 16.6%, 33.3%, and 56.9%, respectively. In contrast, vaccine scars and family TB history were 8.3%, and school-going children were 58.3% in non-presumptive TB. Significant factors did not appear in the logistic regression analysis. Conclusion: Despite the high proportion of presumptive TB, there was no statistically significant between presumptive TB and non-presumptive TB.Keywords: presumptive tuberculosis, confirmed tuberculosis, patient's characteristics, diagnosis
Procedia PDF Downloads 499294 Agriculture Yield Prediction Using Predictive Analytic Techniques
Authors: Nagini Sabbineni, Rajini T. V. Kanth, B. V. Kiranmayee
Abstract:
India’s economy primarily depends on agriculture yield growth and their allied agro industry products. The agriculture yield prediction is the toughest task for agricultural departments across the globe. The agriculture yield depends on various factors. Particularly countries like India, majority of agriculture growth depends on rain water, which is highly unpredictable. Agriculture growth depends on different parameters, namely Water, Nitrogen, Weather, Soil characteristics, Crop rotation, Soil moisture, Surface temperature and Rain water etc. In our paper, lot of Explorative Data Analysis is done and various predictive models were designed. Further various regression models like Linear, Multiple Linear, Non-linear models are tested for the effective prediction or the forecast of the agriculture yield for various crops in Andhra Pradesh and Telangana states.Keywords: agriculture yield growth, agriculture yield prediction, explorative data analysis, predictive models, regression models
Procedia PDF Downloads 3149293 Determinants of Child Anthropometric Indicators: A Case Study of Mali in 2015
Authors: Davod Ahmadigheidari
Abstract:
The main objective of this study was to explore prevalence of anthropometric indicators as well the factors associated with the anthropometric indications in Mali. Data on 2015, downloaded from the website of Unicef, were analyzed. A total of 16,467 women (ages 15-49 years) and 16,467 children (ages 0-59 months) were selected for the sample. Different statistical analyses, such as descriptive, crosstabs and binary logistic regression form the basis of this study. Child anthropometric indicators (i.e., wasting, stunting, underweight and BMI for age) were used as the dependent variables. SPSS Syntax from WHO was used to create anthropometric indicators. Different factors, such as child’s sex, child’s age groups, child’s diseases symptoms (i.e., diarrhea, cough and fever), maternal education, household wealth index and area of residence were used as independent variables. Results showed more than forty percent of Malian households were in nutritional crises (stunting (42%) and underweight (34%). Findings from logistic regression analyses indicated that low score of wealth index, low maternal education and experience of diarrhea in last two weeks increase the probability of child malnutrition.Keywords: Mali, wasting, stunting, underweight, BMI for age and wealth index
Procedia PDF Downloads 1559292 Evaluating the Logistic Performance Capability of Regeneration Processes
Authors: Thorben Kuprat, Julian Becker, Jonas Mayer, Peter Nyhuis
Abstract:
For years now, it has been recognized that logistic performance capability contributes enormously to a production enterprise’s competitiveness and as such is a critical control lever. In doing so, the orientation on customer wishes (e.g. delivery dates) represents a key parameter not only in the value-adding production but also in product regeneration. Since production and regeneration processes have different characteristics, production planning and control measures cannot be directly transferred to regeneration processes. As part of a special research project, the Institute of Production Systems and Logistics Hannover is focused on increasing the logistic performance capability of regeneration processes for complex capital goods. The aim is to ensure logistic targets are met by implementing a model specifically designed to align the capacities and load in regeneration processes.Keywords: capacity planning, complex capital goods, logistic performance, regeneration process
Procedia PDF Downloads 4899291 A Five-Year Follow-up Survey Using Regression Analysis Finds Only Maternal Age to Be a Significant Medical Predictor for Infertility Treatment
Authors: Lea Stein, Sabine Rösner, Alessandra Lo Giudice, Beate Ditzen, Tewes Wischmann
Abstract:
For many couples bearing children is a consistent life goal; however, it cannot always be fulfilled. Undergoing infertility treatment does not guarantee pregnancies and live births. Couples have to deal with miscarriages and sometimes even discontinue infertility treatment. Significant medical predictors for the outcome of infertility treatment have yet to be fully identified. To further our understanding, a cross-sectional five-year follow-up survey was undertaken, in which 95 women and 82 men that have been treated at the Women’s Hospital of Heidelberg University participated. Binary logistic regressions, parametric and non-parametric methods were used for our sample to determine the relevance of biological (infertility diagnoses, maternal and paternal age) and lifestyle factors (smoking, drinking, over- and underweight) on the outcome of infertility treatment (clinical pregnancy, live birth, miscarriage, dropout rate). During infertility treatment, 72.6% of couples became pregnant and 69.5% were able to give birth. Suffering from miscarriages 27.5% of couples and 20.5% decided to discontinue an unsuccessful fertility treatment. The binary logistic regression models for clinical pregnancies, live births and dropouts were statistically significant for the maternal age, whereas the paternal age in addition to maternal and paternal BMI, smoking, infertility diagnoses and infections, showed no significant predicting effect on any of the outcome variables. The results confirm an effect of maternal age on infertility treatment, whereas the relevance of other medical predictors remains unclear. Further investigations should be considered to increase our knowledge of medical predictors.Keywords: advanced maternal age, assisted reproductive technology, female factor, male factor, medical predictors, infertility treatment, reproductive medicine
Procedia PDF Downloads 1099290 Food Insecurity Determinants Amidst the Covid-19 Pandemic: An Insight from Huntsville, Texas
Authors: Peter Temitope Agboola
Abstract:
Food insecurity continues to affect a large number of U.S households during this coronavirus COVID-19 pandemic. The pandemic has threatened the livelihoods of people, making them vulnerable to severe hardship and has had an unanticipated impact on the U.S economy. This study attempts to identify the food insecurity status of households and the determinant factors driving household food insecurity. Additionally, it attempts to discover the mitigation measures adopted by households during the pandemic in the city of Huntsville, Texas. A structured online sample survey was used to collect data, with a household expenditures survey used in evaluating the food security status of the household. Most survey respondents disclosed that the COVID-19 pandemic had affected their life and source of income. Furthermore, the main analytical tool used for the study is descriptive statistics and logistic regression modeling. A logistic regression model was used to determine the factors responsible for food insecurity in the study area. The result revealed that most households in the study area are food secure, with the remainder being food insecure.Keywords: food insecurity, household expenditure survey, COVID-19, coping strategies, food pantry
Procedia PDF Downloads 2099289 Using the Bootstrap for Problems Statistics
Authors: Brahim Boukabcha, Amar Rebbouh
Abstract:
The bootstrap method based on the idea of exploiting all the information provided by the initial sample, allows us to study the properties of estimators. In this article we will present a theoretical study on the different methods of bootstrapping and using the technique of re-sampling in statistics inference to calculate the standard error of means of an estimator and determining a confidence interval for an estimated parameter. We apply these methods tested in the regression models and Pareto model, giving the best approximations.Keywords: bootstrap, error standard, bias, jackknife, mean, median, variance, confidence interval, regression models
Procedia PDF Downloads 380