Search results for: jackknife
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7

Search results for: jackknife

7 Robust Shrinkage Principal Component Parameter Estimator for Combating Multicollinearity and Outliers’ Problems in a Poisson Regression Model

Authors: Arum Kingsley Chinedu, Ugwuowo Fidelis Ifeanyi, Oranye Henrietta Ebele

Abstract:

The Poisson regression model (PRM) is a nonlinear model that belongs to the exponential family of distribution. PRM is suitable for studying count variables using appropriate covariates and sometimes experiences the problem of multicollinearity in the explanatory variables and outliers on the response variable. This study aims to address the problem of multicollinearity and outliers jointly in a Poisson regression model. We developed an estimator called the robust modified jackknife PCKL parameter estimator by combining the principal component estimator, modified jackknife KL and transformed M-estimator estimator to address both problems in a PRM. The superiority conditions for this estimator were established, and the properties of the estimator were also derived. The estimator inherits the characteristics of the combined estimators, thereby making it efficient in addressing both problems. And will also be of immediate interest to the research community and advance this study in terms of novelty compared to other studies undertaken in this area. The performance of the estimator (robust modified jackknife PCKL) with other existing estimators was compared using mean squared error (MSE) as a performance evaluation criterion through a Monte Carlo simulation study and the use of real-life data. The results of the analytical study show that the estimator outperformed other existing estimators compared with by having the smallest MSE across all sample sizes, different levels of correlation, percentages of outliers and different numbers of explanatory variables.

Keywords: jackknife modified KL, outliers, multicollinearity, principal component, transformed M-estimator.

Procedia PDF Downloads 18
6 Cross-Validation of the Data Obtained for ω-6 Linoleic and ω-3 α-Linolenic Acids Concentration of Hemp Oil Using Jackknife and Bootstrap Resampling

Authors: Vibha Devi, Shabina Khanam

Abstract:

Hemp (Cannabis sativa) possesses a rich content of ω-6 linoleic and ω-3 linolenic essential fatty acid in the ratio of 3:1, which is a rare and most desired ratio that enhances the quality of hemp oil. These components are beneficial for the development of cell and body growth, strengthen the immune system, possess anti-inflammatory action, lowering the risk of heart problem owing to its anti-clotting property and a remedy for arthritis and various disorders. The present study employs supercritical fluid extraction (SFE) approach on hemp seed at various conditions of parameters; temperature (40 - 80) °C, pressure (200 - 350) bar, flow rate (5 - 15) g/min, particle size (0.430 - 1.015) mm and amount of co-solvent (0 - 10) % of solvent flow rate through central composite design (CCD). CCD suggested 32 sets of experiments, which was carried out. As SFE process includes large number of variables, the present study recommends the application of resampling techniques for cross-validation of the obtained data. Cross-validation refits the model on each data to achieve the information regarding the error, variability, deviation etc. Bootstrap and jackknife are the most popular resampling techniques, which create a large number of data through resampling from the original dataset and analyze these data to check the validity of the obtained data. Jackknife resampling is based on the eliminating one observation from the original sample of size N without replacement. For jackknife resampling, the sample size is 31 (eliminating one observation), which is repeated by 32 times. Bootstrap is the frequently used statistical approach for estimating the sampling distribution of an estimator by resampling with replacement from the original sample. For bootstrap resampling, the sample size is 32, which was repeated by 100 times. Estimands for these resampling techniques are considered as mean, standard deviation, variation coefficient and standard error of the mean. For ω-6 linoleic acid concentration, mean value was approx. 58.5 for both resampling methods, which is the average (central value) of the sample mean of all data points. Similarly, for ω-3 linoleic acid concentration, mean was observed as 22.5 through both resampling. Variance exhibits the spread out of the data from its mean. Greater value of variance exhibits the large range of output data, which is 18 for ω-6 linoleic acid (ranging from 48.85 to 63.66 %) and 6 for ω-3 linoleic acid (ranging from 16.71 to 26.2 %). Further, low value of standard deviation (approx. 1 %), low standard error of the mean (< 0.8) and low variance coefficient (< 0.2) reflect the accuracy of the sample for prediction. All the estimator value of variance coefficients, standard deviation and standard error of the mean are found within the 95 % of confidence interval.

Keywords: resampling, supercritical fluid extraction, hemp oil, cross-validation

Procedia PDF Downloads 119
5 Using the Bootstrap for Problems Statistics

Authors: Brahim Boukabcha, Amar Rebbouh

Abstract:

The bootstrap method based on the idea of exploiting all the information provided by the initial sample, allows us to study the properties of estimators. In this article we will present a theoretical study on the different methods of bootstrapping and using the technique of re-sampling in statistics inference to calculate the standard error of means of an estimator and determining a confidence interval for an estimated parameter. We apply these methods tested in the regression models and Pareto model, giving the best approximations.

Keywords: bootstrap, error standard, bias, jackknife, mean, median, variance, confidence interval, regression models

Procedia PDF Downloads 352
4 Invasive Ranges of Gorse (Ulex europaeus) in South Australia and Sri Lanka Using Species Distribution Modelling

Authors: Champika S. Kariyawasam

Abstract:

The distribution of gorse (Ulex europaeus) plants in South Australia has been modelled using 126 presence-only location data as a function of seven climate parameters. The predicted range of U. europaeus is mainly along the Mount Lofty Ranges in the Adelaide Hills and on Kangaroo Island. Annual precipitation and yearly average aridity index appeared to be the highest contributing variables to the final model formulation. The Jackknife procedure was employed to identify the contribution of different variables to gorse model outputs and response curves were used to predict changes with changing environmental variables. Based on this analysis, it was revealed that the combined effect of one or more variables could make a completely different impact to the original variables on their own to the model prediction. This work also demonstrates the need for a careful approach when selecting environmental variables for projecting correlative models to climatically distinct area. Maxent acts as a robust model when projecting the fitted species distribution model to another area with changing climatic conditions, whereas the generalized linear model, bioclim, and domain models to be less robust in this regard. These findings are important not only for predicting and managing invasive alien gorse in South Australia and Sri Lanka but also in other countries of the invasive range.

Keywords: invasive species, Maxent, species distribution modelling, Ulex europaeus

Procedia PDF Downloads 102
3 Efficacy of Conservation Strategies for Endangered Garcinia gummi gutta under Climate Change in Western Ghats

Authors: Malay K. Pramanik

Abstract:

Climate change is continuously affecting the ecosystem, species distribution as well as global biodiversity. The assessment of the species potential distribution and the spatial changes under various climate change scenarios is a significant step towards the conservation and mitigation of habitat shifts, and species' loss and vulnerability. In this context, the present study aimed to predict the influence of current and future climate on an ecologically vulnerable medicinal species, Garcinia gummi-gutta, of the southern Western Ghats using Maximum Entropy (MaxEnt) modeling. The future projections were made for the period of 2050 and 2070 with RCP (Representative Concentration Pathways) scenario of 4.5 and 8.5 using 84 species occurrence data, and climatic variables from three different models of Intergovernmental Panel for Climate Change (IPCC) fifth assessment. Climatic variables contributions were assessed using jackknife test and AOC value 0.888 indicates the model perform with high accuracy. The major influencing variables will be annual precipitation, precipitation of coldest quarter, precipitation seasonality, and precipitation of driest quarter. The model result shows that the current high potential distribution of the species is around 1.90% of the study area, 7.78% is good potential; about 90.32% is moderate to very low potential for species suitability. Finally, the results of all model represented that there will be a drastic decline in the suitable habitat distribution by 2050 and 2070 for all the RCP scenarios. The study signifies that MaxEnt model might be an efficient tool for ecosystem management, biodiversity protection, and species re-habitation planning under climate change.

Keywords: Garcinia gummi gutta, maximum entropy modeling, medicinal plants, climate change, western ghats, MaxEnt

Procedia PDF Downloads 351
2 Bioclimatic Niches of Endangered Garcinia indica Species on the Western Ghats: Predicting Habitat Suitability under Current and Future Climate

Authors: Malay K. Pramanik

Abstract:

In recent years, climate change has become a major threat and has been widely documented in the geographic distribution of many plant species. However, the impacts of climate change on the distribution of ecologically vulnerable medicinal species remain largely unknown. The identification of a suitable habitat for a species under climate change scenario is a significant step towards the mitigation of biodiversity decline. The study, therefore, aims to predict the impact of current, and future climatic scenarios on the distribution of the threatened Garcinia indica across the northern Western Ghats using Maximum Entropy (MaxEnt) modelling. The future projections were made for the year 2050 and 2070 with all Representative Concentration Pathways (RCPs) scenario (2.6, 4.5, 6.0, and 8.5) using 56 species occurrence data, and 19 bioclimatic predictors from the BCC-CSM1.1 model of the Intergovernmental Panel for Climate Change’s (IPCC) 5th assessment. The bioclimatic variables were minimised to a smaller number of variables after a multicollinearity test, and their contributions were assessed using jackknife test. The AUC value of 0.956 ± 0.023 indicates that the model performs with excellent accuracy. The study identified that temperature seasonality (39.5 ± 3.1%), isothermality (19.2 ± 1.6%), and annual precipitation (12.7 ± 1.7%) would be the major influencing variables in the current and future distribution. The model predicted 10.5% (19318.7 sq. km) of the study area as moderately to very highly suitable, while 82.60% (151904 sq. km) of the study area was identified as ‘unsuitable’ or ‘very low suitable’. Our predictions of climate change impact on habitat suitability suggest that there will be a drastic reduction in the suitability by 5.29% and 5.69% under RCP 8.5 for 2050 and 2070, respectively. Finally, the results signify that the model might be an effective tool for biodiversity protection, ecosystem management, and species re-habitation planning under future climate change scenarios.

Keywords: Garcinia Indica, maximum entropy modelling, climate change, MaxEnt, Western Ghats, medicinal plants

Procedia PDF Downloads 128
1 Association of Temperature Factors with Seropositive Results against Selected Pathogens in Dairy Cow Herds from Central and Northern Greece

Authors: Marina Sofia, Alexios Giannakopoulos, Antonia Touloudi, Dimitris C Chatzopoulos, Zoi Athanasakopoulou, Vassiliki Spyrou, Charalambos Billinis

Abstract:

Fertility of dairy cattle can be affected by heat stress when the ambient temperature increases above 30°C and the relative humidity ranges from 35% to 50%. The present study was conducted on dairy cattle farms during summer months in Greece and aimed to identify the serological profile against pathogens that could affect fertility and to associate the positive serological results at herd level with temperature factors. A total of 323 serum samples were collected from clinically healthy dairy cows of 8 herds, located in Central and Northern Greece. ELISA tests were performed to detect antibodies against selected pathogens that affect fertility, namely Chlamydophila abortus, Coxiella burnetii, Neospora caninum, Toxoplasma gondii and Infectious Bovine Rhinotracheitis Virus (IBRV). Eleven climatic variables were derived from the WorldClim version 1.4. and ArcGIS V.10.1 software was used for analysis of the spatial information. Five different MaxEnt models were applied to associate the temperature variables with the locations of seropositive Chl. abortus, C. burnetii, N. caninum, T. gondii and IBRV herds (one for each pathogen). The logistic outputs were used for the interpretation of the results. ROC analyses were performed to evaluate the goodness of fit of the models’ predictions. Jackknife tests were used to identify the variables with a substantial contribution to each model. The seropositivity rates of pathogens varied among the 8 herds (0.85-4.76% for Chl. abortus, 4.76-62.71% for N. caninum, 3.8-43.47% for C. burnetii, 4.76-39.28% for T. gondii and 47.83-78.57% for IBRV). The variables of annual temperature range, mean diurnal range and maximum temperature of the warmest month gave a contribution to all five models. The regularized training gains, the training AUCs and the unregularized training gains were estimated. The mean diurnal range gave the highest gain when used in isolation and decreased the gain the most when it was omitted in the two models for seropositive Chl.abortus and IBRV herds. The annual temperature range increased the gain when used alone and decreased the gain the most when it was omitted in the models for seropositive C. burnetii, N. caninum and T. gondii herds. In conclusion, antibodies against Chl. abortus, C. burnetii, N. caninum, T. gondii and IBRV were detected in most herds suggesting circulation of pathogens that could cause infertility. The results of the spatial analyses demonstrated that the annual temperature range, mean diurnal range and maximum temperature of the warmest month could affect positively the possible pathogens’ presence. Acknowledgment: This research has been co‐financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH–CREATE–INNOVATE (project code: T1EDK-01078).

Keywords: dairy cows, seropositivity, spatial analysis, temperature factors

Procedia PDF Downloads 167