Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 33

Logistic Regression Related Abstracts

33 Comparative Study of Three Artificial Intelligence Techniques for Rain Domain in Precipitation Forecast

Authors: Zalinda Othman, Abdul Razak Hamdan, Azuraliza Abu Bakar, Nabilah Filzah Mohd Radzuan, Andi Putra

Abstract:

Precipitation forecasting is important for avoiding natural disasters that can cause losses in the affected area. This paper reviews three techniques used in precipitation forecasting: logistic regression, decision tree, and random forest. Combining these techniques through the vector auto-regression (VAR) model helps in identifying the advantages and strengths of each technique in the forecast process. The data set contains variables of the rain domain. Adapting artificial intelligence techniques to the rain domain makes the precipitation forecast process easier and more systematic.

Keywords: Logistic Regression, decision tree, random forest, VAR model

Procedia PDF Downloads 240
32 Organic Farming Profitability: Evidence from South Korea

Authors: Thanh Nguyen, Saem Lee, Hio-Jung Shin, Thomas Koellner

Abstract:

Land-use management influences the provision of ecosystem services in dynamic agricultural landscapes. Agricultural land use is important for maintaining the productivity and sustainability of agricultural ecosystems. However, in Korea, intensive farming in the highland agricultural zone upstream of Soyang has contaminated the soil through overuse of pesticides and fertilizers. This has led to a decrease in water and soil quality, which has consequences for ecosystem services and human wellbeing. Conventional farming still accounts for a high percentage of farms in this area, and there is no special measure to prevent the low water quality caused by farming activities. Therefore, the adoption of environmentally friendly farming has been considered one of the alternatives that lead to improved water quality and increased biomass production. At the same time, the share of farm households practicing environmentally friendly farming remains low. Our research therefore involved a farm household survey spanning conventional farming, farms in transition and organic farming in the Soyang watershed. Another purpose of our research was to compare the economic advantage of farmers adopting environmentally friendly farming with that of non-adopters, and to investigate the differentiating factors by logistic regression analysis with socio-economic and benefit-cost ratio variables. The results show that farmers practicing environmentally friendly farming tended to be younger than conventional farmers and farmers in transition. The groups were similar in terms of gender, which was predominantly male. Farmers practicing environmentally friendly farming were more educated and had less farming experience than conventional farmers and farmers in transition. Based on the benefit-cost analysis, the total annual costs of farmers in transition are about twice the total costs of environmentally friendly farming. The benefit of organic farmers was assessed at 2,800 KRW per household per year. In the logistic regression, the statistically significant factors are subsidy, district, residence period and benefit-cost ratio; district and residence period have a negative impact on the practice of environmentally friendly farming techniques. The results of our research provide important information for Korean policy-making on agricultural and water management and for considering policy approaches that would support sustainable resource management.

Keywords: Profitability, Organic Farming, Logistic Regression, agricultural land-use

Procedia PDF Downloads 266
31 Determinants of Poverty: A Logit Regression Analysis of Zakat Applicants

Authors: Abd Halim Mohd Noor, Zunaidah Ab Hasan, Azhana Othman, Nor Shahrina Mohd Rafien

Abstract:

Zakat is a portion of wealth contributed by financially able Muslims to be distributed to predetermined recipients, chief among them the poor and the needy. The zakat fund is distributed with the objective of lifting the recipients out of poverty. Because poverty is multidimensional and multifaceted, it is imperative that its causes are properly identified so that assistance given by zakat authorities reaches the intended target. Despite various studies undertaken to identify the poor correctly, there are reports of the poor not receiving adequate assistance from zakat. Thus, this study examines the determinants of poverty among applicants for zakat assistance distributed by the State Islamic Religious Council in Malacca (SIRCM). Malacca is a state in Malaysia. The respondents were drawn from the list of new zakat applicants for April and May 2014 provided by SIRCM. A binary logistic regression was estimated on these data, with the outcome of the zakat application (rejected or accepted) as the dependent variable and a set of demographic variables and health as the explanatory variables. Overall, the logistic model successfully predicted factors of acceptance of zakat applications. Three independent variables namely gender, age; size of households and health significantly explain the likelihood of a successful zakat application. Among others, the findings suggest the importance of focusing on providing educational opportunities in helping the poor.

Keywords: Education, Poverty, Logistic Regression, zakat distribution, status of zakat applications

Procedia PDF Downloads 197
30 The Theory behind Logistic Regression

Authors: Jan Henrik Wosnitza

Abstract:

The logistic regression has developed into a standard approach for estimating conditional probabilities in a wide range of applications including credit risk prediction. The article at hand contributes to the current literature on logistic regression fourfold: First, it is demonstrated that the binary logistic regression automatically meets its model assumptions under very general conditions. This result explains, at least in part, the logistic regression's popularity. Second, the requirement of homoscedasticity in the context of binary logistic regression is theoretically substantiated. The variances among the groups of defaulted and non-defaulted obligors have to be the same across the level of the aggregated default indicators in order to achieve linear logits. Third, this article sheds some light on the question why nonlinear logits might be superior to linear logits in case of a small amount of data. Fourth, an innovative methodology for estimating correlations between obligor-specific log-odds is proposed. In order to crystallize the key ideas, this paper focuses on the example of credit risk prediction. However, the results presented in this paper can easily be transferred to any other field of application.
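
The link between equal group variances and linear logits can be illustrated with a standard textbook case. The derivation below is a sketch under assumptions that are not taken from the article: a single score x that is Gaussian within each group, with means \mu_1, \mu_0, standard deviations \sigma_1, \sigma_0 and prior probabilities \pi_1, \pi_0 for defaulted (Y=1) and non-defaulted (Y=0) obligors.

```latex
% General case (Bayes' rule with Gaussian class densities):
\begin{align*}
\log\frac{P(Y=1\mid x)}{P(Y=0\mid x)}
  &= \log\frac{\pi_1}{\pi_0} + \log\frac{\sigma_0}{\sigma_1}
     - \frac{(x-\mu_1)^2}{2\sigma_1^2} + \frac{(x-\mu_0)^2}{2\sigma_0^2}
\end{align*}
% Equal variances, \sigma_1 = \sigma_0 = \sigma: the x^2 terms cancel.
\begin{align*}
\log\frac{P(Y=1\mid x)}{P(Y=0\mid x)}
  &= \underbrace{\log\frac{\pi_1}{\pi_0} + \frac{\mu_0^2-\mu_1^2}{2\sigma^2}}_{\beta_0}
     + \underbrace{\frac{\mu_1-\mu_0}{\sigma^2}}_{\beta_1}\,x
\end{align*}
```

With unequal variances the coefficient of x^2 is 1/(2\sigma_0^2) - 1/(2\sigma_1^2), which is non-zero, so in this illustrative setting the logit is quadratic rather than linear.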

Keywords: correlation, Logistic Regression, default correlation, credit risk estimation, homoscedasticity, nonlinear logistic regression

Procedia PDF Downloads 228
29 Multiplying Vulnerability of Child Health Outcome and Food Diversity in India

Authors: Mukesh Ravi Raushan

Abstract:

Although obesity is considered a deadly public health issue, contributing to 2.6 million deaths worldwide every year, a developing country like India faces malnutrition, which is more common there than in Sub-Saharan Africa. About one in every three malnourished children in the world lives in India. The paper assesses the nutritional health of children using data on 43,737 infants and young children aged 0-59 months (µ = 29.54; SD = 17.21) from the households selected by the National Family Health Survey, 2005-06. Wasting was measured by a Z-score of standardized weight-for-height according to the WHO child growth standards. The impact of education together with place of residence was found to be significantly associated with the complementary food diversity score (CFDS) in India. The mother's education was positively associated with the CFDS, but the degree of association was lower in rural India than in urban India. The results of binary logistic regression of wasting on the seven WHO-recommended food types for children in India suggest that children who consumed milk products (OR: 0.87, p<0.0001) were less likely to be malnourished than those who did not; likewise, children who consumed seed-based food products (OR: 0.75, p<0.0001) were less likely to be malnourished than those who did not. Wasting among children was negatively associated with protein-containing complementary food: children who received pulses in the last 24 hours were less likely to be wasted (OR: 0.87, p<0.00001) than the reference category. When the feeding frequency of the index child increases by 10 per cent, the expected wasting decreases by 2 per cent in India, controlling for place of residence, education, religion, and birth order. The index improves as the risk of malnutrition among children in India decreases.

Keywords: India, Logistic Regression, CFDS, food diversity index

Procedia PDF Downloads 159
28 Applying the Regression Technique for Prediction of the Acute Heart Attack

Authors: Paria Soleimani, Arezoo Neshati

Abstract:

Myocardial infarction is one of the leading causes of death in the world. Some of these deaths occur even before the patient reaches the hospital. Myocardial infarction occurs as a result of impaired blood supply. Because most of these deaths are due to coronary artery disease, awareness of the warning signs of a heart attack is essential. Some heart attacks are sudden and intense, but most start slowly, with mild pain or discomfort, so early detection and successful treatment of these symptoms are vital to saving patients. The importance and usefulness of a system designed to assist physicians in the early diagnosis of acute heart attacks is therefore obvious. The purpose of this study is to determine how well a predictive model would perform based only on patient-reportable clinical history factors, without using diagnostic tests or physical exams. This type of prediction model might have application outside the hospital setting, giving accurate advice to patients to encourage them to seek care in appropriate situations. For this purpose, data were collected on 711 heart patients in Iranian hospitals. Twenty-eight clinical attributes that can be reported by patients were studied. Three logistic regression models were built on the basis of the 28 features to predict the risk of heart attacks. The best logistic regression model in terms of performance had a C-index of 0.955 and an accuracy of 94.9%. The variables severe chest pain, back pain, cold sweats, shortness of breath, nausea, and vomiting were selected as the main features.
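
For a binary outcome, the C-index reported for a model like this is the share of (case, non-case) pairs in which the model assigns the higher predicted risk to the case, with ties counted as one half. The Python sketch below shows that computation; the function name `c_index` and the scores and labels are made up for illustration and are not the study's data or code.

```python
def c_index(scores, labels):
    """Concordance index for a binary outcome (equals the area under the ROC curve).

    scores: predicted risks, one per patient (hypothetical values here)
    labels: 1 for a case (e.g. heart attack), 0 for a non-case
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0   # case ranked above non-case
            elif c == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(non_cases))


# Illustrative data only: two cases, two non-cases.
example_scores = [0.9, 0.2, 0.8, 0.1]
example_labels = [1, 1, 0, 0]
print(c_index(example_scores, example_labels))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from non-cases.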

Keywords: coronary heart disease, Logistic Regression, prediction, Acute heart attacks

Procedia PDF Downloads 312
27 Generalized Additive Model for Estimating Propensity Score

Authors: Tahmidul Islam

Abstract:

The Propensity Score Matching (PSM) technique has been widely used for estimating the causal effect of treatment in observational studies. One major step in implementing PSM is estimating the propensity score (PS). A logistic regression model with additive linear terms of the covariates is the most commonly used technique in many studies. The logistic regression model is also used with cubic splines to retain flexibility in the model. However, choosing the functional form of the logistic regression model has been an open question, since the effectiveness of PSM depends on how accurately the PS has been estimated. In many situations, the linearity assumption of linear logistic regression may not hold, and a non-linear relation between the logit and the covariates may be appropriate. One can estimate the PS using machine learning techniques such as random forests and neural networks for more accuracy in non-linear situations. In this study, an attempt has been made to assess the efficacy of the Generalized Additive Model (GAM) in various linear and non-linear settings and to compare its performance with the usual logistic regression. GAM is a non-parametric technique in which the functional form of the covariates can be left unspecified and a flexible regression model can be fitted. Various simple and complex models for treatment have been considered under several situations (small/large sample, low/high number of treatment units), and we examined which method leads to better covariate balance in the matched dataset. It is found that the logistic regression model is impressively robust to the inclusion of quadratic and interaction terms and reduces the mean difference between the treatment and control sets as efficiently as GAM does. GAM provided no significantly better covariate balance than logistic regression in either simple or complex models. The analysis also suggests that a larger proportion of controls than treatment units leads to better balance for both methods.

Keywords: Logistic Regression, Accuracy, Non-Linearity, propensity score matching, covariate balances, generalized additive model

Procedia PDF Downloads 236
26 MapReduce Logistic Regression Algorithms with RHadoop

Authors: Dong Hoon Lim, Byung Ho Jung

Abstract:

Logistic regression is a statistical method for analyzing a dataset in which one or more independent variables determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating parameters in logistic regression based on the MapReduce framework with RHadoop, which integrates the R and Hadoop environments and is applicable to large-scale data. There exist three learning algorithms for logistic regression, namely the gradient descent method, the cost minimization method and the Newton-Raphson method. The Newton-Raphson method does not require a learning rate, while the gradient descent and cost minimization methods require a manually chosen learning rate. The experimental results demonstrated that our learning algorithms using RHadoop scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Raphson method with the gradient descent and cost minimization methods. The results showed that our Newton-Raphson method appeared to be the most robust on all data tested.
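
The contrast between a method that needs a learning rate and one that does not can be sketched on a single machine. The Python example below is an illustration only, not the authors' RHadoop implementation: it fits an intercept and one slope to a small made-up dataset, and the function names, data, learning rate and step counts are all assumptions. The gradient and Hessian are plain sums over records, which is the kind of computation a map step (per-record terms) and a reduce step (summation) could distribute.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient(beta, xs, ys):
    """Gradient of the negative log-likelihood (intercept + one feature)."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        residual = sigmoid(beta[0] + beta[1] * x) - y
        g0 += residual
        g1 += residual * x
    return g0, g1


def fit_gradient_descent(xs, ys, learning_rate=0.1, steps=20000):
    """Gradient descent: the learning rate must be chosen by hand."""
    beta = [0.0, 0.0]
    n = len(xs)
    for _ in range(steps):
        g0, g1 = gradient(beta, xs, ys)
        beta[0] -= learning_rate * g0 / n
        beta[1] -= learning_rate * g1 / n
    return beta


def fit_newton_raphson(xs, ys, steps=25):
    """Newton-Raphson: the step is set by the Hessian, no learning rate."""
    beta = [0.0, 0.0]
    for _ in range(steps):
        g0, g1 = gradient(beta, xs, ys)
        h00 = h01 = h11 = 0.0
        for x in xs:
            p = sigmoid(beta[0] + beta[1] * x)
            w = p * (1.0 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * delta = g in closed form.
        beta[0] -= (h11 * g0 - h01 * g1) / det
        beta[1] -= (h00 * g1 - h01 * g0) / det
    return beta


# Made-up, non-separable data so that a finite maximum-likelihood fit exists.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
print("gradient descent:", fit_gradient_descent(xs, ys))
print("Newton-Raphson:  ", fit_newton_raphson(xs, ys))
```

On this toy data both routines approach the same estimate; Newton-Raphson does so in a few dozen iterations, while gradient descent needs many more and depends on the chosen learning rate.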

Keywords: Big Data, Logistic Regression, MapReduce, RHadoop

Procedia PDF Downloads 160
25 Measuring Enterprise Growth: Pitfalls and Implications

Authors: N. Šarlija, S. Pfeifer, M. Jeger, A. Bilandžić

Abstract:

Enterprise growth is generally considered a key driver of competitiveness, employment, economic development and social inclusion. As such, it is perceived to be a highly desirable outcome of entrepreneurship for scholars and decision makers. The extensive academic debate has resulted in a multitude of theoretical frameworks focused on explaining growth stages, determinants and future prospects. It has been widely accepted that enterprise growth is most likely nonlinear, temporal and related to a variety of factors which reflect the individual, firm, organizational, industry or environmental determinants of growth. However, factors that affect growth are not easily captured, instruments to measure those factors are often arbitrary, and causality between variables and growth is elusive, indicating that growth is not easily modeled. Furthermore, in line with the heterogeneous nature of the growth phenomenon, there is a vast number of measurement constructs assessing growth which are used interchangeably. Differences among various growth measures, at the conceptual as well as the operationalization level, can hinder theory development, which emphasizes the need for more empirically robust studies. In line with these highlights, the main purpose of this paper is threefold. Firstly, to compare the structure and performance of three growth prediction models based on the main growth measures: revenues, employment and assets growth. Secondly, to explore the prospects of financial indicators, set as exact, visible, standardized and accessible variables, to serve as determinants of enterprise growth. Finally, to contribute to the understanding of the implications on research results and recommendations for growth caused by different growth measures. The models include a range of financial indicators as lag determinants of the enterprises’ performances during 2008-2013, extracted from the national register of the financial statements of SMEs in Croatia.
The design and testing stage of the modeling used logistic regression procedures. Findings confirm that growth prediction models based on different measures of growth have different sets of predictors. Moreover, the relationship between particular predictors and growth measures is inconsistent: the same predictor positively related to one growth measure may exert a negative effect on a different growth measure. Overall, financial indicators alone can serve as a good proxy of growth and yield adequate predictive power of the models. The paper sheds light on both the methodology and the conceptual framework of enterprise growth by using a range of variables which serve as a proxy for the multitude of internal and external determinants but are, unlike them, accessible, available, exact and free of perceptual nuances in building up the model. Selection of the growth measure seems to have a significant impact on the implications and recommendations related to growth. Furthermore, the paper points out potential pitfalls of measuring and predicting growth. Overall, the results and the implications of the study are relevant for advancing academic debates on growth-related methodology, and can contribute to evidence-based decisions of policy makers.

Keywords: Logistic Regression, small and medium-sized enterprises, growth measurement constructs, prediction of growth potential

Procedia PDF Downloads 117
24 Modeling Geogenic Groundwater Contamination Risk with the Groundwater Assessment Platform (GAP)

Authors: Joel Podgorski, Manouchehr Amini, Annette Johnson, Michael Berg

Abstract:

One-third of the world’s population relies on groundwater for its drinking water. Natural geogenic arsenic and fluoride contaminate ~10% of wells. Prolonged exposure to high levels of arsenic can result in various internal cancers, while high levels of fluoride are responsible for the development of dental and crippling skeletal fluorosis. In poor urban and rural settings, the provision of drinking water free of geogenic contamination can be a major challenge. In order to efficiently apply limited resources in the testing of wells, water resource managers need to know where geogenically contaminated groundwater is likely to occur. The Groundwater Assessment Platform (GAP) fulfills this need by providing state-of-the-art global arsenic and fluoride contamination hazard maps as well as enabling users to create their own groundwater quality models. The global risk models were produced by logistic regression of arsenic and fluoride measurements using predictor variables of various soil, geological and climate parameters. The maps display the probability of encountering concentrations of arsenic or fluoride exceeding the World Health Organization’s (WHO) stipulated concentration limits of 10 µg/L or 1.5 mg/L, respectively. In addition to a reconsideration of the relevant geochemical settings, these second-generation maps represent a great improvement over the previous risk maps due to a significant increase in data quantity and resolution. For example, there is a 10-fold increase in the number of measured data points, and the resolution of predictor variables is generally 60 times greater. These same predictor variable datasets are available on the GAP platform for visualization as well as for use with a modeling tool. The latter requires that users upload their own concentration measurements and select the predictor variables that they wish to incorporate in their models. 
In addition, users can upload additional predictor variable datasets either as features or coverages. Such models can represent an improvement over the global models already supplied, since (a) users may be able to use their own, more detailed datasets of measured concentrations and (b) the various processes leading to arsenic and fluoride groundwater contamination can be isolated more effectively on a smaller scale, thereby resulting in a more accurate model. All maps, including user-created risk models, can be downloaded as PDFs. There is also the option to share data in a secure environment as well as the possibility to collaborate in a secure environment through the creation of communities. In summary, GAP provides users with the means to reliably and efficiently produce models specific to their region of interest by making available the latest datasets of predictor variables along with the necessary modeling infrastructure.

Keywords: Arsenic, Logistic Regression, Groundwater Contamination, fluoride

Procedia PDF Downloads 233
23 Instability Index Method and Logistic Regression to Assess Landslide Susceptibility in County Route 89, Taiwan

Authors: Y. H. Wu, Ji-Yuan Lin, Yu-Ming Liou

Abstract:

This study aims to produce a landslide susceptibility map of County Route 89 at Ren-Ai Township in Nantou County using the Instability Index Method and logistic regression. Seven susceptibility factors, namely slope angle, aspect, elevation, distance to fold, distance to river, distance to road and accumulated rainfall, were obtained by GIS based on the Typhoon Toraji landslide area identified by the Industrial Technology Research Institute in 2001. The Instability Index Method was used to calculate the landslide percentage of each factor, derive the weights, and grade the grid. Landslide susceptibility was classified into four grades (high, medium high, medium low and low) in order to determine the advantages and disadvantages of the two models. The precision of the models is verified by a classification error matrix and the SRC curve. The results suggest that the logistic regression model is preferable to the instability index method for assessing landslide susceptibility. It is suitable for landslide prediction and precaution in this area in the future.

Keywords: Logistic Regression, landslide susceptibility, instability index method, SRC curve

Procedia PDF Downloads 159
22 An Analysis of Classification of Imbalanced Datasets by Using Synthetic Minority Over-Sampling Technique

Authors: Ghada A. Alfattni

Abstract:

Analysing unbalanced datasets is one of the challenges that practitioners in the machine learning field face. Many studies have been carried out to determine the effectiveness of the synthetic minority over-sampling technique (SMOTE) in addressing this issue. The aim of this study was therefore to compare the effectiveness of SMOTE across different models on unbalanced datasets. Three classification models (Logistic Regression, Support Vector Machine and Nearest Neighbour) were tested with multiple datasets; the same datasets were then oversampled using SMOTE and applied again to the three models to compare the differences in performance. Results of the experiments show that the highest number of nearest neighbours gives lower error rates.
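
The core idea of SMOTE is to create each synthetic minority sample by interpolating between a minority point and one of its k nearest minority neighbours. The Python sketch below illustrates that step only; the function name `smote`, its parameters and the sample points are assumptions for illustration, not the study's code, and a production analysis would more likely use an established library implementation.

```python
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority samples by neighbour interpolation.

    minority:    list of feature tuples from the minority class
    n_synthetic: number of synthetic samples to create
    k:           number of nearest minority neighbours to choose from
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Other minority points, sorted by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Made-up minority-class points with two features each.
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
for point in smote(minority_points, n_synthetic=3, k=2, seed=42):
    print(point)
```

Because every synthetic point lies on a segment between two existing minority points, the oversampled class stays inside the region the original minority samples already occupy.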

Keywords: Machine Learning, Logistic Regression, support vector machine, nearest neighbour, SMOTE, imbalanced datasets

Procedia PDF Downloads 201
21 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis

Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin

Abstract:

Diagnosing male infertility by laboratory tests is expensive and sometimes intolerable for patients. Filling out a questionnaire and then applying a classification method can be the first step in the decision-making process, so that laboratory tests are used only in cases with a high probability of infertility. In this paper, we evaluated the performance of four classification methods, namely naive Bayesian, neural network, logistic regression and fuzzy c-means clustering used as a classifier, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, ROC curves are the most suitable method for the comparison. We also selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each method; generally, most of the methods performed better after applying the filter. We have shown that fuzzy c-means clustering used as a classifier performs well according to the ROC curves and that its performance is comparable to other classification methods such as logistic regression.

Keywords: Neural Network, classification, Logistic Regression, Naïve Bayesian, fuzzy c-means, ROC curve

Procedia PDF Downloads 195
20 Detection Efficient Enterprises via Data Envelopment Analysis

Authors: S. Turkan

Abstract:

In this paper, Turkey’s Top 500 Industrial Enterprises data for 2014 were analyzed by data envelopment analysis. Data envelopment analysis is used to detect efficient decision-making units such as universities, hospitals and schools by using inputs and outputs. The decision-making units in this study are enterprises. To detect efficient enterprises, some financial ratios are determined as inputs and outputs. For this reason, financial indicators related to the productivity of enterprises are considered. The efficient foreign weighted owned capital enterprises are detected via the super efficiency model. According to the results, Mercedes-Benz is the most efficient foreign weighted owned capital enterprise in Turkey.

Keywords: Data Envelopment Analysis, Logistic Regression, financial ratios, super efficiency

Procedia PDF Downloads 167
19 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations

Authors: Emily Chia-Yu Su, Kuei-Ling Sun

Abstract:

Allergy is a hypersensitive overreaction of the immune system to environmental stimuli and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthma, shock, or other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal wool, and other allergens. The development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body’s immune system. Immunoglobulin E antibodies bind to an allergen and then transfer to a receptor on mast cells or basophils, triggering the release of inflammatory chemicals such as histamine. Given the increasingly serious problems of environmental change, changes in lifestyle, air pollution, and other factors, in this study we collect both allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT) and neural networks (NN), to compare the models and determine the permutations of allergen amino acid sequences.

Keywords: Machine Learning, Allergy, classification, Logistic Regression, Decision Tree

Procedia PDF Downloads 169
18 Nuclear Fuel Safety Threshold Determined by Logistic Regression Plus Uncertainty

Authors: D. S. Gomes, A. T. Silva

Abstract:

Quantifying the uncertainty of nuclear safety margins applied to a nuclear reactor is important for preventing future radioactive accidents. Nuclear fuel performance codes may rely on tolerance levels determined by traditional deterministic models, which produce acceptable results at burn cycles under 62 GWd/MTU. The behavior of nuclear fuel can be simulated by applying a series of material properties under irradiation and physics models to calculate the safety limits. In this study, theoretical predictions of nuclear fuel failure under transient conditions investigate extended radiation cycles at 75 GWd/MTU, considering the behavior of fuel rods in light-water reactors under reactivity accident conditions. The fuel pellet can melt due to the quick increase of reactivity during a transient. Large power excursions in the reactor are the subject of interest, leading to a treatment known as the Fuchs-Hansen model. The point kinetic neutron equations show characteristics similar to non-linear differential equations. In this investigation, multivariate logistic regression is employed for a probabilistic forecast of fuel failure. The agreement between computational simulation and experimental results was acceptable. The experiments use pre-irradiated fuel rods subjected to a rapid energy pulse, which exhibit the same behavior as during a nuclear accident. The propagation of uncertainty uses Wilks' formulation. The variables chosen as essential to failure prediction were the fuel burnup, the applied peak power, the pulse width, the oxidation layer thickness, and the cladding type.

Keywords: Logistic Regression, Uncertainty Propagation, reactivity-initiated accident, safety margins

Procedia PDF Downloads 179
17 Modeling of the Effect of Explosives, Geological and Geotechnical Parameters on the Stability of Rock Masses Case of Marrakech: Agadir Highway, Morocco

Authors: Taoufik Benchelha, Toufik Remmal, Rachid El Hamdouni, Hamou Mansouri, Houssein Ejjaouani, Halima Jounaid, Said Benchelha

Abstract:

During the earthworks for the construction of the Marrakech-Agadir highway in southern Morocco, which crosses mountainous areas of the High Western Atlas, the main problem faced is the stability of the slopes. Indeed, the use of explosives as a means of excavation, combined with the geological structure of the terrain encountered, can trigger major ruptures and cause damage which depends on the intrinsic characteristics of the rock mass. The study consists of a geological and geotechnical analysis of several unstable zones located along the route, mobilizing millions of cubic meters of rock, from which the parameters influencing slope stability are deduced. From this analysis, a predictive model for rock mass stability is developed, based on the statistical method of logistic regression, in order to predict the geomechanical behavior of the rock slopes constrained by earthworks.

Keywords: Slope Stability, Logistic Regression, Rock Mass, Explosive

Procedia PDF Downloads 176
16 Determining the Causality Variables in Female Genital Mutilation: A Factor Screening Approach

Authors: Ekele Alih, Enejo Jalija

Abstract:

Female Genital Mutilation (FGM) comprises three types, namely Clitoridectomy, Excision and Infibulation. In this study, we examine the factors responsible for FGM in order to identify the causal variables using a logistic regression approach. Using the results of the survey conducted by the Public Health Division, Nigeria Institute of Medical Research, Yaba, Lagos State, the tau statistic, τ, was used to screen 9 factors that cause FGM in order to select a few of the predictors before the multiple regression equation is obtained. This screening may be needed because the sample size may not be able to sustain a regression with all the predictors, or to avoid multi-collinearity. A total of 300 respondents, comprising 150 adult males and 150 adult females, were selected for the household survey based on the multi-stage sampling procedure. The tau statistic,

Keywords: Female genital mutilation, Logistic Regression, tau statistic, African society

Procedia PDF Downloads 114
15 Modelling the Impacts of Geophysical Parameters on Deforestation and Forest Degradation in Pre and Post Ban Logging Periods in Hindu Kush Himalayas

Authors: MUHAMMAD QASIM, Alam Zeb, Glen W. Armstrong

Abstract:

Loss of forest cover is one of the most important land cover changes and has been of great concern to policy makers. This study quantified forest cover changes over the pre-logging-ban (1973-1993) and post-logging-ban (1993-2015) periods to examine the role of geophysical factors and spatial attributes of land in the two periods. We show that despite a complete ban on green felling, forest cover decreased by 28% and was mostly converted to rangeland. Nevertheless, the logging ban was completely effective in controlling agriculture expansion. The binary logistic regression revealed that the south-facing aspects at low elevation witnessed more deforestation in the pre-ban period compared to post-ban. In contrast to deforestation, forest degradation was more prominent on the northern aspects at higher elevation during the policy period. Agriculture expansion was widespread in the low-elevation flat areas with gentle slope, while during the policy period agriculture contraction in the form of regeneration was observed on the low-elevation areas of north-facing slopes. All proximity variables, except distance to administrative boundary, showed a similar trend across the two periods and were important explanatory variables in understanding forest and agriculture expansion. The changes in determinants of forest and agriculture expansion and contraction over the two periods might be attributed to the influence of policy and a general decrease in resource availability.

Keywords: Deforestation, Pakistan, Forest conservation, Logistic Regression, Forest Degradation, wood harvesting ban, agriculture expansion, Chitral

Procedia PDF Downloads 104
14 Reminiscence Therapy for Alzheimer’s Disease Restrained on Logistic Regression Based Linear Bootstrap Aggregating

Authors: P. S. Jagadeesh Kumar, Tracy Lin Huan, Mingmin Pan, Xianpei Li, Yanmin Yuan

Abstract:

Researchers are conducting extensive research into the inherited features of Alzheimer’s disease and potential therapies. In Alzheimer’s, memories are lost in reverse order; recently formed memories are more transient than older ones. Reminiscence therapy involves discussing past activities, events and experiences with another individual or group of people, frequently with the help of tangible reminders such as photos, household and other familiar items from the past, music and tape recordings. In this manuscript, the efficacy of reminiscence therapy for Alzheimer’s disease is measured using logistic regression based linear bootstrap aggregating. Logistic regression is used to predict the experiential features of the patient’s memory through various therapies. Linear bootstrap aggregating shows better stability and accuracy of reminiscence therapy used in statistical classification and regression of memories related to validation therapy, supportive psychotherapy, sensory integration and simulated presence therapy.

Keywords: alzheimer’s disease, Logistic Regression, linear bootstrap aggregating, reminiscence therapy

Procedia PDF Downloads 109
13 Modelling the Impact of Installation of Heat Cost Allocators in District Heating Systems Using Machine Learning

Authors: Danica Maljkovic, Igor Balen, Bojana Dalbelo Basic

Abstract:

Following the EU Directive on Energy Efficiency, specifically Article 9, individual metering in district heating systems has to be introduced by the end of 2016. These provisions have been implemented in the member states’ legal frameworks; Croatia is one of these states. The directive allows installation of both heat metering devices and heat cost allocators. Mainly due to poor communication and PR, the general public formed the false impression that heat cost allocators are devices that save energy. Although this notion is wrong, the aim of this work is to develop a model that would precisely express the influence of installing heat cost allocators on potential energy savings in each unit within multifamily buildings. At the same time, in recent years, machine learning has gained wider application in various fields, as it has proven to give good results in cases where large amounts of data are to be processed with the aim of recognizing patterns and the correlation of each relevant parameter, as well as in cases where the problem is too complex for human intelligence to solve. One machine learning method, the decision tree method, has shown an accuracy of over 92% in predicting general building consumption. In this paper, machine learning algorithms will be used to isolate the sole impact of installing heat cost allocators on a single building in multifamily houses connected to district heating systems. Special emphasis will be given to regression analysis, logistic regression, support vector machines, decision trees and the random forest method.

Keywords: Machine Learning, Energy Efficiency, Support Vector Machines, Regression analysis, Logistic Regression, district heating, heat cost allocator, decision tree model, decision trees and random forest method

Procedia PDF Downloads 83
12 Qualitative and Quantitative Analysis of Motivation Letters to Model Turnover in Non-Governmental Organization

Authors: A. Porshnev, A. Zaporozhtchuk

Abstract:

Motivation, regarded as a key factor of labor turnover, is especially important for volunteers working on an altruistic basis in NGOs. Despite the motivational letter, candidate selection depends on the impression of the selection committee, which can be subject to human bias. We expect that structured and unstructured information provided in motivation letters could be used to improve candidate selection procedures. In our paper, we perform qualitative and quantitative analysis of 2280 motivation letters, create a logistic regression model, and build a decision tree to improve selection procedures. Our analysis showed that motivation factors are significant and enable the human resources department to forecast labor turnover and add extra information to demographic, professional and timing questions. Despite its average level of accuracy, the model demonstrates that the selection procedures of the company under consideration can be improved. We also discuss the interrelation between answers to open and closed motivation questions, recommend changes in motivational letter templates to ensure more relevant information about applicants, and outline further steps to create a more accurate model.

Keywords: model, Decision trees, Logistic Regression, Retention, turnover, non-governmental organization, motivational letter

Procedia PDF Downloads 61
11 Prediction of Bariatric Surgery Publications by Using Different Machine Learning Algorithms

Authors: Senol Dogan, Gunay Karli

Abstract:

Identification of relevant publications based on a Medline query is time-consuming and error-prone. An AI-based process has the potential to solve this problem without any manual work. To the best of our knowledge, our study is the first to investigate the ability of machine learning to identify relevant articles accurately. Five different machine learning algorithms were tested using 23 predictors based on several metadata fields attached to publications. We find that the Boosted model is the best-performing algorithm, with an overall accuracy of 96%. In addition, the specificity and sensitivity of the algorithm are 97% and 93%, respectively. As a result of the work, we understood that we can apply the same procedure to understand cancer gene expression big data.
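The three reported figures (accuracy, specificity, sensitivity) all derive from a single confusion matrix. The Python sketch below shows how they relate. The counts are hypothetical, chosen only so that the resulting rates match the percentages quoted in the abstract; they are not the study's data, and the function name is illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of relevant articles correctly flagged
    specificity = tn / (tn + fp)   # share of irrelevant articles correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts (assumption): 100 relevant and 300 irrelevant articles.
sens, spec, acc = classification_metrics(tp=93, fn=7, tn=291, fp=9)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so it always lies between the two, as it does in the reported results.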

Keywords: Algorithms, Machine Learning, Bariatric surgery, Logistic Regression, tree, boosted, prediction of publications, comparison of algorithms, ANN model

Procedia PDF Downloads 85
10 Examining the Predictors of Non-Urgent Emergency Department Visits: A Population Based Study

Authors: Maher El-Masri, Jamie Crawley, Judy Bornais, Abeer Omar

Abstract:

Background: Misuse of the Emergency Department (ED) for non-urgent healthcare results in unnecessary crowdedness that can result in long ED waits and delays in treatment, diversion of ambulances to other hospitals, poor health outcomes for patients, and increased risk of death. Objectives: The main purpose of this study was to explore the independent predictors of non-urgent ED visits in Erie St. Clair LHIN. Secondary purposes of the study include comparison of the rates of non-urgent ED visits between urban and rural hospitals. Design: A secondary analysis of archived population-based data on 597,373 ED visits in southwestern Ontario. Results: The results suggest that older (OR = .992; 95% CI .992 – .993) and female patients (OR = .940; 95% CI .929 – .950) were less likely to visit the ED for non-urgent causes. Non-urgent ED visits during the winter, spring, and fall were 13%, 5.8%, and 7.5% lower, respectively, than during the summer. The data further suggest that non-urgent visits were 19.6% and 21.3% less likely to occur in evening and overnight shifts compared to the day shift. Non-urgent visits were 2.76 times more likely to present to small community hospitals than large community hospitals. Health care providers were 1.92 times more likely to refer patients with non-urgent health problems to the ED, compared with decisions taken by patients, family members or caretakers. Conclusion: Our study highlights a number of important factors that are associated with inappropriate use of the ED for non-urgent health problems. Knowledge of these factors could be used to address the issue of unnecessary ED crowdedness.
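An odds ratio (OR) from a logistic regression can be restated as a percentage change in the odds, which is how figures such as "6% lower" are usually obtained. The Python sketch below illustrates the conversion using the two ORs quoted above. It assumes the age OR is per one-year increase, which the abstract does not state; the function names are illustrative. Note that an OR describes odds, not probabilities, so "less likely" is a loose reading.

```python
def pct_change_in_odds(odds_ratio):
    """Percentage change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def compound_odds_ratio(odds_ratio_per_unit, units):
    """Odds ratio for a change of `units` in a continuous predictor."""
    return odds_ratio_per_unit ** units


# Female vs. male patients, OR = .940 as reported.
print(f"female: {pct_change_in_odds(0.940):+.1f}% odds of a non-urgent visit")

# Age, OR = .992; assumed here to be per additional year (not stated in the abstract).
ten_year_or = compound_odds_ratio(0.992, 10)
print(f"10 years older: OR = {ten_year_or:.3f}")
```

Because ORs multiply across units, a per-unit OR very close to 1 can still correspond to a noticeable difference over a wide age range.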

Keywords: Emergency Department, Logistic Regression, predictors, non-urgent visits

Procedia PDF Downloads 107
9 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The first objective of this paper is to determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as sensitivity (recall), specificity and overall accuracy. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month, the return for each stock portfolio was computed and compared with the stock market index returns. The returns were 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio, while the stock market return was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top-performing stock portfolios that outperform the stock market.
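The claim that every portfolio beat the market by more than eight times can be checked from the one-month returns quoted in the abstract. The Python sketch below divides each portfolio return by the index return; the variable and function names are illustrative, and the figures are taken as reported, without any adjustment for risk or trading costs.

```python
# One-month returns in percent, as quoted in the abstract.
MARKET_RETURN = 0.38
PORTFOLIO_RETURNS = {
    "principal component analysis": 23.87,
    "logistic regression": 11.65,
    "k-means cluster": 8.88,
}


def return_multiples(portfolio_returns, market_return):
    """Express each portfolio return as a multiple of the market return."""
    return {name: r / market_return for name, r in portfolio_returns.items()}


multiples = return_multiples(PORTFOLIO_RETURNS, MARKET_RETURN)
for name, multiple in multiples.items():
    print(f"{name}: {multiple:.1f}x the index return")
```

A single one-month window with three stocks per portfolio is a small sample, so these multiples describe the reported trial rather than expected long-run performance.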

Keywords: Neural Networks, Machine Learning, Cluster Analysis, Decision trees, Logistic Regression, Factor Analysis, stock market trading, automated stock investment system

Procedia PDF Downloads 54
8 The Effect of Sustainable Land Management Technologies on Food Security of Farming Households in Kwara State, Nigeria

Authors: Shehu A. Salau, Robiu O. Aliu, Nofiu B. Nofiu

Abstract:

Nigeria is among the countries of the world confronted with the problem of food insecurity. The agricultural production systems that produce food for the teeming population are not sustainable. Attention is thus being given to alternative approaches of intensification such as the use of Sustainable Land Management (SLM) technologies. Thus, this study assessed the effect of SLM technologies on food security of farming households in Kwara State, Nigeria. A three-stage sampling technique was used to select a sample of 200 farming households for this study. Descriptive statistics, Shriar index, Likert scale, food security index and logistic regression were employed for the analysis. The result indicated that the largest group (41%) of the household heads were between the ages of 51 and 70 years, with an average of 60.5 years. The food security index revealed that 35% and 65% of the households were food secure and food insecure, respectively. The logistic regression showed that SLM technologies, estimated income, household size, gender and age of the household heads were the critical determinants of food security among farming households. The most effective coping strategies adopted by households to lessen the effects of food insecurity are reducing the quality of food consumed, taking off-farm jobs to raise household income and diverting money budgeted for other uses to purchase food. Governments should encourage the adoption and use of SLM technologies at all levels. Policies and strategies that reduce household size should be enthusiastically pursued to reduce food insecurity.

Keywords: Food Security, Logistic Regression, Agricultural Practices, coping strategies, farming households, SLM technologies

Procedia PDF Downloads 15
7 Gender Estimation by Means of Quantitative Measurements of Foramen Magnum: An Analysis of CT Head Images

Authors: W. M. Ediri Arachchi, Thilini Hathurusinghe, Uthpalie Siriwardhana, Ranga Thudugala, Indeewari Herath, Gayani Senanayake

Abstract:

The foramen magnum is more likely to be preserved than other skeletal remains during high-impact and severely disruptive injuries. Therefore, it is worthwhile to explore whether its measurements can be used to determine human gender, which is vital in forensic and anthropological studies. The aim was to assess the ability of quantitative measurements of the foramen magnum to serve as an anatomical indicator for human gender estimation and to evaluate the gender-dependent variations of the foramen magnum using quantitative measurements. A total of 113 randomly selected subjects who underwent CT head scans at Sri Jayawardhanapura General Hospital of Sri Lanka within a period of six months were included in the study. The sample contained 58 males (48.76 ± 14.7 years old) and 55 females (47.04 ± 15.9 years old). Maximum length of the foramen magnum (LFM), maximum width of the foramen magnum (WFM), minimum distance between occipital condyles (MnD) and maximum interior distance between occipital condyles (MxID) were measured. Further, AreaT and AreaR were also calculated. The gender was estimated using binomial logistic regression. The mean values of all explanatory variables (LFM, WFM, MnD, MxID, AreaT, and AreaR) were greater among males than females. All explanatory variables except MnD (p = 0.669) were statistically significant (p < 0.05). Significant bivariate correlations were demonstrated by AreaT and AreaR with the explanatory variables. The results showed that WFM and MxID were the best measurements for predicting gender according to binomial logistic regression. The estimated model was: log(p/(1-p)) = 10.391 - 0.136×MxID - 0.231×WFM, where p is the probability of being a female. The classification accuracy given by the above model was 65.5%. The quantitative measurements of the foramen magnum can be used as a reliable anatomical marker for human gender estimation in the Sri Lankan context.
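The fitted model above can be evaluated directly by inverting the logit. The Python sketch below plugs MxID and WFM into the reported equation and returns the probability of being female. The function names, the 0.5 decision threshold and the example measurements are assumptions for illustration; the abstract does not state measurement units (millimetres are assumed here) or the cut-off used.

```python
import math

# Coefficients as reported in the abstract; p is the probability of being female.
INTERCEPT = 10.391
COEF_MXID = -0.136
COEF_WFM = -0.231


def prob_female(mxid, wfm):
    """Return P(female) from the reported logit: log(p/(1-p)) = b0 + b1*MxID + b2*WFM."""
    logit = INTERCEPT + COEF_MXID * mxid + COEF_WFM * wfm
    return 1.0 / (1.0 + math.exp(-logit))


def classify(mxid, wfm, threshold=0.5):
    """Label a subject using an assumed probability threshold of 0.5."""
    return "female" if prob_female(mxid, wfm) >= threshold else "male"


# Hypothetical measurements (assumed to be in millimetres), not taken from the study.
for mxid, wfm in [(30.0, 25.0), (40.0, 30.0)]:
    print(mxid, wfm, round(prob_female(mxid, wfm), 3), classify(mxid, wfm))
```

Both coefficients are negative, so larger MxID and WFM values lower the estimated probability of being female, which agrees with the finding that mean values were greater among males.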

Keywords: Logistic Regression, foramen magnum, forensic and anthropological studies, gender estimation

Procedia PDF Downloads 15
6 Assessment of Pastoralist-Crop Farmers Conflict and Food Security of Farming Households in Kwara State, Nigeria

Authors: S. A. Salau, I. F. Ayanda, I. Afe, M. O. Adesina, N. B. Nofiu

Abstract:

Food insecurity is still a critical challenge among rural and urban households in Nigeria. The country’s food insecurity situation became more pronounced due to frequent conflict between pastoralists and crop farmers. Thus, this study assesses pastoralist-crop farmers’ conflict and food security of farming households in Kwara State, Nigeria. The specific objectives are to measure the food security status of the respondents, quantify pastoralist-crop farmers’ conflict, determine the effect of pastoralist-crop farmers’ conflict on food security and describe the effective coping strategies adopted by the respondents to reduce the effect of food insecurity. A combination of purposive and simple random sampling techniques will be used to select 250 farming households for the study. The analytical tools include descriptive statistics, Likert scale, logistic regression, and food security index. Using the food security index approach, the percentage of households that were food secure and insecure will be known. Pastoralist-crop farmers’ conflict will be measured empirically by quantifying losses due to the conflict. The logistic regression will indicate whether pastoralist-crop farmers’ conflict is a critical determinant of food security among farming households in the study area. The coping strategies employed by the respondents in cushioning the effects of food insecurity will also be revealed. Empirical studies on the effect of pastoralist-crop farmers’ conflict on food security are rare in the literature. This study will quantify conflict and reveal the direction as well as the extent of the relationship between conflict and food security. It could contribute to the identification and formulation of strategies for the minimization of conflict among pastoralists and crop farmers in an attempt to reduce food insecurity. Moreover, this study could serve as valuable reference material for future research and open up new areas for further research.

Keywords: Food Security, Agriculture, Conflict, Logistic Regression, coping strategies

Procedia PDF Downloads 17
5 Exploring the Factors Affecting the Presence of Farmers’ Markets in Rural British Columbia

Authors: Amirmohsen Behjat, Aleck Ostry, Christina Miewald, Bernie Pauly

Abstract:

Farmers’ markets have become one of the important healthy food suppliers in both rural communities and urban settings. Farmers’ markets are evolving and their number has rapidly increased in the past decade. Despite this drastic increase, the distribution of farmers’ markets is not even across different areas. The main goal of this study is to explore the socioeconomic, geographic, and demographic variables which affect the establishment of farmers’ markets in rural communities in British Columbia (BC). Thus, the data on available farmers’ markets in rural areas were collected from the BC Association of Farmers’ Markets and spatially joined to the BC map at Dissemination Area (DA) level using ArcGIS software to link the farmers’ markets to the respective communities that they serve. Then, in order to understand in which rural communities farmers’ markets tend to operate, a binary logistic regression analysis was performed with the availability of farmers’ markets at DA level as the dependent variable and Deprivation Index (DI), Metro Influence Zone (MIZ) and population as independent variables. The results indicated that the DI and MIZ variables are not statistically significant, whereas population is the only variable that made a significant contribution to predicting the availability of farmers’ markets in rural BC. Moreover, this study found that farmers’ markets usually do not operate in rural food deserts where other healthy food providers such as supermarkets and grocery stores are non-existent. In conclusion, the presence of farmers’ markets is not associated with socioeconomic and geographic characteristics of rural communities in BC, but farmers’ markets tend to operate in more populated rural communities in BC.

Keywords: Logistic Regression, ArcGIS, farmers’ markets, socioeconomic and demographic variables, metro influence zone

Procedia PDF Downloads 17
4 Myers-Briggs Type Index Personality Type Classification Based on an Individual’s Spotify Playlists

Authors: Ibrahim Demir, Sefik Can Karakaya

Abstract:

In this study, the relationship between musical preferences and personality traits has been investigated in terms of Spotify audio analysis features. The aim of this paper is to build a classifier capable of segmenting people into their Myers-Briggs Type Index (MBTI) personality type based on their Spotify playlists. Music takes an important place in the lives of people all over the world, and online music streaming platforms make it easier to reach musical content. In this context, the motivation to build such a classifier is to allow people to gain access to their MBTI personality type, perhaps more reliably and more quickly. For this purpose, logistic regression and deep neural networks have been selected as classifiers and their performances are compared. In conclusion, it has been found that musical preferences differ statistically between personality traits, and the evaluated models are able to distinguish personality types based on the given musical data structure with an accuracy rate of over 60%.

Keywords: music Psychology, Logistic Regression, Deep Neural Networks, Myers-Briggs Type indicator, Spotify, behavioural user profiling

Procedia PDF Downloads 1