Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3279

Search results for: random sampling

3279 Estimation of Population Mean under Random Non-Response in Two-Phase Successive Sampling

Authors: M. Khalid, G. N. Singh


In this paper, we consider the problem of estimating the population mean on the current (second) occasion in the presence of random non-response in two-occasion successive sampling under a two-phase set-up. Modified exponential-type estimators are proposed, and their properties are studied under the assumption that the numbers of sampling units follow a distribution due to random non-response situations. The performances of the proposed estimators are compared with linear combinations of two estimators, (a) the sample mean estimator for the fresh sample and (b) the ratio estimator for the matched sample, under complete-response situations. Results are demonstrated through empirical studies which show the effectiveness of the proposed estimators. Suitable recommendations are made to survey practitioners.

Keywords: successive sampling, random non-response, auxiliary variable, bias, mean square error

Procedia PDF Downloads 332
3278 Spatially Random Sampling for Retail Food Risk Factors Study

Authors: Guilan Huang


In 2013 and 2014, the U.S. Food and Drug Administration (FDA) collected data from selected fast food restaurants and full service restaurants to track changes in the occurrence of foodborne illness risk factors. This paper discusses how we customized a spatial random sampling method by considering the financial position and availability of FDA resources, and how we enriched restaurant data with location. Location information of restaurants provides an opportunity for quantitatively determining random sampling within non-government units (e.g., 240 kilometers around each data collector). Spatial analysis could also optimize data collectors’ work plans and resource allocation. A spatial analytic and processing platform helped us handle the spatial random sampling challenges. Our method fits FDA’s ability to pinpoint features of foodservice establishments, and reduced both the time and expense of data collection.

Keywords: geospatial technology, restaurant, retail food risk factor study, spatially random sampling

Procedia PDF Downloads 276
3277 Estimation of Population Mean under Random Non-Response in Two-Occasion Successive Sampling

Authors: M. Khalid, G. N. Singh


In this paper, we consider the problem of estimating the population mean on the current (second) occasion in two-occasion successive sampling under random non-response situations. Some modified exponential-type estimators are proposed, and their properties are studied under the assumption that the number of sampling units follows a discrete distribution due to random non-response situations. The performances of the proposed estimators are compared with linear combinations of two estimators, (a) the sample mean estimator for the fresh sample and (b) the ratio estimator for the matched sample, under complete-response situations. Results are demonstrated through empirical studies which show the effectiveness of the proposed estimators. Suitable recommendations are made to survey practitioners.

Keywords: modified exponential estimator, successive sampling, random non-response, auxiliary variable, bias, mean square error

Procedia PDF Downloads 267
3276 Probability Sampling in Matched Case-Control Study in Drug Abuse

Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell


Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users who then identified “friend controls” and the other using a random sample of non-drug users (controls) who then identified “friend cases.” Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using the bootstrap method and the predictive properties of each model using receiver operating characteristic (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. On the other hand, bootstrap analysis of the random-sample data set showed less variation, and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. The area under the ROC curve for the model derived from the random-sample data set was similar when fitted to either data set (0.93 for random-sample data vs. 0.91 for snowball-sample data, p=0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to snowball sampling.

Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling

Procedia PDF Downloads 404
3275 Investigating the Efficiency of Stratified Double Median Ranked Set Sample for Estimating the Population Mean

Authors: Mahmoud I. Syam


The stratified double median ranked set sampling (SDMRSS) method is suggested for estimating the population mean. SDMRSS is compared with simple random sampling (SRS), stratified simple random sampling (SSRS), and stratified ranked set sampling (SRSS). It is shown that the SDMRSS estimator is an unbiased estimator of the population mean and is more efficient than the SRS, SSRS, and SRSS estimators. Also, with SDMRSS, we can increase the efficiency of the mean estimator for a specific value of the sample size. SDMRSS is applied to real-life examples, and the results of the examples agree with the theoretical results.

Keywords: efficiency, double ranked set sampling, median ranked set sampling, ranked set sampling, stratified

Procedia PDF Downloads 154
3274 Different Sampling Schemes for Semi-Parametric Frailty Model

Authors: Nursel Koyuncu, Nihal Ata Tutkun


A frailty model is a survival model that takes unobserved heterogeneity into account when exploring the relationship between the survival of an individual and several covariates. In recent years, proposed survival models have become more complex, and this causes convergence problems, especially in large data sets. Therefore, selecting a sample from these big data sets is very important for parameter estimation. In the sampling literature, some authors have defined new sampling schemes to estimate the parameters correctly. To this end, we examine the effect of the sampling design in the semi-parametric frailty model. We conducted a simulation study in R to estimate the parameters of the semi-parametric frailty model for different sample sizes and censoring rates under classical simple random sampling and ranked set sampling schemes. In the simulation study, we used as the population a data set recording 17260 male civil servants aged 40–64 years with complete 10-year follow-up. Time to death from coronary heart disease is treated as the survival time, and age and systolic blood pressure are used as covariates. We select 1000 samples from the population using the different sampling schemes and estimate the parameters. From the simulation study, we conclude that the ranked set sampling design performs better than simple random sampling in each scenario.
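The contrast between simple random sampling and ranked set sampling can be illustrated with a much smaller experiment than the frailty-model simulation described above. The sketch below is not the authors' R study: it only compares the variance of the sample mean under the two designs, assumes perfect ranking on the study variable itself, and uses hypothetical names (`srs_sample`, `rss_sample`, `compare`).

```python
import random
import statistics


def srs_sample(population, m, rng):
    """Simple random sample of m units, drawn without replacement."""
    return rng.sample(population, m)


def rss_sample(population, m, rng):
    """One cycle of ranked set sampling.

    Draw m independent sets of m units, rank each set, and keep the
    i-th smallest unit from the i-th set (perfect ranking is assumed).
    """
    sample = []
    for i in range(m):
        ranked = sorted(rng.sample(population, m))
        sample.append(ranked[i])
    return sample


def compare(population, m=5, reps=2000, seed=1):
    """Return (variance of SRS means, variance of RSS means) over reps draws."""
    rng = random.Random(seed)
    srs_means = [statistics.mean(srs_sample(population, m, rng)) for _ in range(reps)]
    rss_means = [statistics.mean(rss_sample(population, m, rng)) for _ in range(reps)]
    return statistics.variance(srs_means), statistics.variance(rss_means)
```

Both designs measure only m units per sample; ranked set sampling spends extra effort on ranking m sets, which is why it is usually considered when ranking is cheap relative to measurement.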

Keywords: frailty model, ranked set sampling, efficiency, simple random sampling

Procedia PDF Downloads 122
3273 Bayesian Approach for Moving Extremes Ranked Set Sampling

Authors: Said Ali Al-Hadhrami, Amer Ibrahim Al-Omari


In this paper, Bayesian estimation of the mean of the exponential distribution is considered using Moving Extremes Ranked Set Sampling (MERSS). Three priors are used, namely the Jeffreys, conjugate and constant priors, under both MERSS and Simple Random Sampling (SRS). Some properties of the proposed estimators are investigated. It is found that the suggested estimators using MERSS are more efficient than their counterparts based on SRS.

Keywords: Bayesian, efficiency, moving extreme ranked set sampling, ranked set sampling

Procedia PDF Downloads 408
3272 Estimating The Population Mean by Using Stratified Double Extreme Ranked Set Sample

Authors: Mahmoud I. Syam, Kamarulzaman Ibrahim, Amer I. Al-Omari


The stratified double extreme ranked set sampling (SDERSS) method is introduced and considered for estimating the population mean. SDERSS is compared with simple random sampling (SRS), stratified ranked set sampling (SRSS) and stratified simple random sampling (SSRS). It is shown that the SDERSS estimator is an unbiased estimator of the population mean and is more efficient than the estimators using SRS, SRSS and SSRS when the underlying distribution of the variable of interest is symmetric or asymmetric.

Keywords: double extreme ranked set sampling, extreme ranked set sampling, ranked set sampling, stratified double extreme ranked set sampling

Procedia PDF Downloads 364
3271 Constant Factor Approximation Algorithm for p-Median Network Design Problem with Multiple Cable Types

Authors: Chaghoub Soraya, Zhang Xiaoyan


This research presents the first constant-factor approximation algorithm for the p-median network design problem with multiple cable types. The problem has previously been addressed with a single cable type, for which a bifactor approximation algorithm exists. To the best of our knowledge, the algorithm proposed in this paper is the first constant-factor approximation algorithm for p-median network design with multiple cable types. The addressed problem is a combination of two well-studied problems, the p-median problem and the network design problem. The introduced algorithm is a constant-factor random sampling approximation algorithm, conceived by using some random sampling techniques from the literature. It is based on a redistribution lemma from the literature and uses a Steiner tree problem as a subproblem. The algorithm is simple, and it relies on the notions of random sampling and probability. The proposed approach gives an approximate solution with a single constant ratio without violating any of the constraints, in contrast to the one proposed in the literature. This paper provides a (21 + 2)-approximation algorithm for the p-median network design problem with multiple cable types using random sampling techniques.

Keywords: approximation algorithms, buy-at-bulk, combinatorial optimization, network design, p-median

Procedia PDF Downloads 59
3270 Some Generalized Multivariate Estimators for Population Mean under Multi Phase Stratified Systematic Sampling

Authors: Muqaddas Javed, Muhammad Hanif


The generalized multivariate ratio and regression type estimators for population mean are suggested under multi-phase stratified systematic sampling (MPSSS) using multi auxiliary information. Estimators are developed under the two different situations of availability of auxiliary information. The expressions of bias and mean square error (MSE) are developed. Special cases of suggested estimators are also discussed and simulation study is conducted to observe the performance of estimators.

Keywords: generalized estimators, multi-phase sampling, stratified random sampling, systematic sampling

Procedia PDF Downloads 592
3269 Estimation of a Finite Population Mean under Random Non Response Using Improved Nadaraya and Watson Kernel Weights

Authors: Nelson Bii, Christopher Ouma, John Odhiambo


Non-response is a potential source of errors in sample surveys. It introduces bias and large variance in the estimation of finite population parameters. Regression models have been recognized as one of the techniques of reducing bias and variance due to random non-response using auxiliary data. In this study, it is assumed that random non-response occurs in the survey variable in the second stage of cluster sampling, assuming full auxiliary information is available throughout. Auxiliary information is used at the estimation stage via a regression model to address the problem of random non-response. In particular, the auxiliary information is used via an improved Nadaraya-Watson kernel regression technique to compensate for random non-response. The asymptotic bias and mean squared error of the estimator proposed are derived. Besides, a simulation study conducted indicates that the proposed estimator has smaller values of the bias and smaller mean squared error values compared to existing estimators of finite population mean. The proposed estimator is also shown to have tighter confidence interval lengths at a 95% coverage rate. The results obtained in this study are useful, for instance, in choosing efficient estimators of the finite population mean in demographic sample surveys.
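As a rough illustration of how auxiliary information can compensate for random non-response, the following sketch uses the standard Nadaraya-Watson estimator with a Gaussian kernel to fill in missing survey values before averaging. It is not the improved weighting scheme proposed in the paper, it ignores the two-stage cluster design, and the function names and the bandwidth `h` are assumptions made for the example.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_predict(x0, xs, ys, h):
    """Nadaraya-Watson estimate at x0: a kernel-weighted average of ys."""
    weights = [gaussian_kernel((x0 - x) / h) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; try a larger bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


def imputed_mean(x_all, y_obs, h):
    """Mean of y after filling non-respondents (None) with NW predictions.

    x_all holds the auxiliary value for every sampled unit; y_obs holds
    the survey value, or None where the unit did not respond.
    """
    xs = [x for x, y in zip(x_all, y_obs) if y is not None]
    ys = [y for y in y_obs if y is not None]
    filled = [
        y if y is not None else nw_predict(x, xs, ys, h)
        for x, y in zip(x_all, y_obs)
    ]
    return sum(filled) / len(filled)
```

The bandwidth controls the bias-variance trade-off: a small `h` follows the respondents closely, while a large `h` pulls every prediction toward the overall respondent mean.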

Keywords: mean squared error, random non-response, two-stage cluster sampling, confidence interval lengths

Procedia PDF Downloads 47
3268 A Comparative Study on Sampling Techniques of Polynomial Regression Model Based Stochastic Free Vibration of Composite Plates

Authors: S. Dey, T. Mukhopadhyay, S. Adhikari


This paper presents an exhaustive comparative investigation of sampling techniques for polynomial regression model based stochastic natural frequency analysis of composite plates. Both individual and combined variations of input parameters are considered to map the computational time and accuracy of each modelling technique. The finite element formulation of composites is capable of dealing with both correlated and uncorrelated random input variables such as fibre parameters and material properties. The results obtained by polynomial regression (PR) using different sampling techniques are compared. The suitability of sampling techniques such as 2^k factorial designs, central composite design, A-optimal, I-optimal and D-optimal designs, Taguchi’s orthogonal array design, Box-Behnken design, Latin hypercube sampling and the Sobol sequence is illustrated. Statistical analysis of the first three natural frequencies is presented to compare the results and their performance.
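Of the designs listed above, Latin hypercube sampling is the simplest to sketch. The version below is a minimal standard-library illustration on the unit hypercube, not the design-of-experiments software used in the study; the function name and arguments are hypothetical.

```python
import random


def latin_hypercube(n, dims, seed=None):
    """Return n points in [0, 1)^dims forming a Latin hypercube.

    Each axis is cut into n equal strata, and every stratum on every
    axis contains exactly one point.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(dims):
        # One uniform draw inside each of the n strata of this axis.
        column = [(i + rng.random()) / n for i in range(n)]
        # Shuffle so that strata are paired at random across axes.
        rng.shuffle(column)
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n)]
```

To cover a physical input range, each coordinate can be rescaled afterwards, for example `low + u * (high - low)` for a uniformly distributed parameter.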

Keywords: composite plate, natural frequency, polynomial regression model, sampling technique, uncertainty quantification

Procedia PDF Downloads 422
3267 Stochastic Simulation of Random Numbers Using Linear Congruential Method

Authors: Melvin Ballera, Aldrich Olivar, Mary Soriano


Digital computers nowadays need a utility capable of generating random numbers. Usually, computer-generated random numbers are not truly random given predefined values such as starting and end points, making the sequence almost predictable. There are many applications of random numbers, such as business simulation, manufacturing, the services domain, the entertainment sector and other equally important areas, making it worthwhile to design a unique method that allows unpredictable random numbers. Applying stochastic simulation using the linear congruential algorithm shows that as the seed and range increase, the number randomly produced or selected by the computer becomes unique. If this is implemented in an environment where random numbers are very much needed, the reliability of the random number is guaranteed.
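For readers unfamiliar with the linear congruential method, a minimal sketch follows. The recurrence is state = (a * state + c) mod m; the default constants are the widely published Numerical Recipes parameters and are an assumption of this example, not values taken from the paper.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield an endless stream of pseudo-random integers in [0, m)."""
    state = seed % m
    while True:
        # Core recurrence of the linear congruential method.
        state = (a * state + c) % m
        yield state


def lcg_uniform(seed, count):
    """Return count pseudo-random floats in [0, 1) from the default LCG."""
    gen = lcg(seed)
    return [next(gen) / 2**32 for _ in range(count)]
```

The stream is fully determined by the seed and the constants: the same seed always reproduces the same numbers, and the sequence repeats after at most m steps.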

Keywords: stochastic simulation, random numbers, linear congruential algorithm, pseudorandomness

Procedia PDF Downloads 223
3266 Empirical Study of Running Correlations in Exam Marks: Same Statistical Pattern as Chance

Authors: Weisi Guo


It is well established that there may be running correlations in sequential exam marks due to students sitting in the order of course registration patterns. As such, a random and non-sequential sampling of exam marks is a standard recommended practice. Here, the paper examines a large number of exam data stretching several years across different modules to see the degree to which it is true. Using the real mark distribution as a generative process, it was found that random simulated data had no more sequential randomness than the real data. That is to say, the running correlations that one often observes are statistically identical to chance. Digging deeper, it was found that some high running correlations have students that indeed share a common course history and make similar mistakes. However, at the statistical scale of a module question, the combined effect is statistically similar to the random shuffling of papers. As such, there may not be the need to take random samples for marks, but it still remains good practice to mark papers in a random sequence to reduce the repetitive marking bias and errors.

Keywords: data analysis, empirical study, exams, marking

Procedia PDF Downloads 78
3265 Existence Result of Third Order Functional Random Integro-Differential Inclusion

Authors: D. S. Palimkar


The functional random integro-differential inclusion (FRIGDI) considered here seems to be new and includes, as special cases, several known random differential inclusions whose solutions have been studied from various aspects in the literature. In this paper, we prove an existence result for the FRIGDI in the non-convex case of the multi-valued function involved in it, using a random fixed point theorem of B. C. Dhage and the Carathéodory condition. This result is new to the theory of differential inclusions.

Keywords: caratheodory condition, random differential inclusion, random solution, integro-differential inclusion

Procedia PDF Downloads 334
3264 Rural Development through Women Participation in Livestock Care and Management in District Faisalabad

Authors: Arfan Riasat, M. Iqbal Zafar, Gulfam Riasat


Pakistani women actively participate in livestock management activities, along with their normal domestic chores. The study was designed to measure the position and contribution of rural women, their constraints in livestock management activities and, mainly, how rural women contribute to development in the district of Faisalabad. It was envisioned that women's participation in livestock activities has rarely been investigated. A multistage random sampling technique was used to collect the data from Tehsil Summandry of the district, which was selected at random. Two union councils were chosen using a simple random sampling technique. Four chaks (villages) from each union council were selected at random, and fifteen women were then selected randomly from each selected chak. The results show that a vast majority of the women were illiterate, with an annual family income of one to two lac. They live in a joint family system. Their main occupation is agriculture, and they spend long hours on livestock-related activities to support their families. A large proportion of the respondents reported that they had to face problems and constraints in livestock activities in the context of decision making, medication, awareness and training, along with social and economic issues. Analysis indicated that the education level of the women, household income and age were significantly associated with the level of participation. Women's participation in livestock activities increased production, and the women were involved in income-generating activities for better economic conditions for their families.

Keywords: women, participation, livestock, management, rural development

Procedia PDF Downloads 320
3263 Existence Theory for First Order Functional Random Differential Equations

Authors: Rajkumar N. Ingle


In this paper, the existence of a solution of nonlinear functional random differential equations of the first order is proved under the Carathéodory condition. The study of functional random differential equations has gained importance in the random analysis of the dynamical systems of universal phenomena. Objectives: Nonlinear functional random differential equations (N.F.R.D.E.) are useful to scientists, engineers, and mathematicians who are engaged in analyzing universal random phenomena governed by nonlinear random initial value problems of differential equations (D.E.). They have applications in the theory of diffusion and heat conduction. Methodology: Using the concepts of probability theory and functional analysis, the existence theorems for the nonlinear F.R.D.E. are generally proved using tools such as fixed point theorems. The significance of the study: Our contribution is the generalization of some well-known results in the theory of nonlinear F.R.D.E.s. Further, it seems that our study will be useful to scientists, engineers, economists and mathematicians in their endeavors to analyse the nonlinear random problems of the universe in a better way.

Keywords: Random Fixed Point Theorem, functional random differential equation, N.F.R.D.E., universal random phenomenon

Procedia PDF Downloads 385
3262 A Very Efficient Pseudo-Random Number Generator Based On Chaotic Maps and S-Box Tables

Authors: M. Hamdi, R. Rhouma, S. Belghith


Random number generation is mainly used to create secret keys or random sequences, and it can be carried out by various techniques. In this paper, we present a very simple and efficient pseudo-random number generator (PRNG) based on chaotic maps and S-Box tables. The technique adopts two main operations: the first generates chaotic values using two logistic maps, and the second transforms them into binary words using random S-Box tables. The simulation analysis indicates that our PRNG possesses excellent statistical and cryptographic properties.
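The two operations described above can be sketched in a few lines. This is only a toy illustration of the general idea (logistic-map iteration followed by a table substitution); the way the two maps are combined, the parameter r = 3.99, and the construction of the S-box from Python's `random` module are all assumptions of this example, not the authors' design, and the result should not be treated as cryptographically secure.

```python
import random


def make_sbox(seed):
    """Build a hypothetical 8-bit S-box: a random permutation of 0..255."""
    rng = random.Random(seed)
    box = list(range(256))
    rng.shuffle(box)
    return box


def chaotic_bytes(x0, y0, count, r=3.99, sbox_seed=0):
    """Generate count bytes from two logistic maps and an S-box lookup.

    x0 and y0 are the initial conditions of the two maps and must lie
    strictly between 0 and 1.
    """
    sbox = make_sbox(sbox_seed)
    x, y = x0, y0
    out = []
    for _ in range(count):
        # Operation 1: iterate both logistic maps, x <- r * x * (1 - x).
        x = r * x * (1.0 - x)
        y = r * y * (1.0 - y)
        # Operation 2: quantize to 8 bits, mix, and substitute via the S-box.
        index = (int(x * 256) ^ int(y * 256)) & 0xFF
        out.append(sbox[index])
    return out
```

Any generator of this kind would still need to be checked with a statistical test suite before use, as the abstract indicates the authors did for their own design.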

Keywords: Random Numbers, Chaotic map, S-box, cryptography, statistical tests

Procedia PDF Downloads 276
3261 Optimal ECG Sampling Frequency for Multiscale Entropy-Based HRV

Authors: Manjit Singh


Multiscale entropy (MSE) is an extensively used index that provides a general understanding of the multiple complexity of the physiologic mechanisms of heart rate variability (HRV), which operate on a wide range of time scales. Accurate selection of the electrocardiogram (ECG) sampling frequency is an essential concern for clinically significant HRV quantification: a high ECG sampling rate increases memory requirements and processing time, whereas a low sampling rate degrades signal quality and results in clinically misinterpreted HRV. In this work, the impact of ECG sampling frequency on MSE-based HRV has been quantified. MSE measures are found to be sensitive to ECG sampling frequency, and the effect of sampling frequency is a function of time scale.

Keywords: ECG (electrocardiogram), heart rate variability (HRV), multiscale entropy, sampling frequency

Procedia PDF Downloads 176
3260 Influence of Random Fibre Packing on the Compressive Strength of Fibre Reinforced Plastic

Authors: Y. Wang, S. Zhang, X. Chen


The longitudinal compressive strength of fibre reinforced plastic (FRP) possesses a large stochastic variability, which limits the efficient application of composite structures. This study aims to address how random fibre packing affects the uncertainty of FRP compressive strength. A novel approach is proposed to generate the random fibre packing status by a combination of Latin hypercube sampling and random sequential expansion. A 3D nonlinear finite element model is built which incorporates both the matrix plasticity and fibre geometrical instability. The matrix is modeled by isotropic ideal elasto-plastic solid elements, and the fibres are modeled by linear-elastic rebar elements. Composites with a series of different nominal fibre volume fractions are studied. Premature fibre waviness of different magnitudes and directions is introduced in the finite element model. Compressive tests on uni-directional CFRP (carbon fibre reinforced plastic) are conducted following ASTM D6641. By comparing the 3D FE models and the compressive tests, it is clearly shown that the stochastic variation of compressive strength is partly caused by the random fibre packing, and that a normal or lognormal distribution tends to be a good fit to the probabilistic compressive strength. Furthermore, it is also observed that different random fibre packings can trigger two different fibre micro-buckling modes under longitudinal compression: out-of-plane buckling and twisted buckling. The out-of-plane buckling mode results in a much larger compressive strength, and this is the major reason why the random fibre packing results in a large uncertainty in the FRP compressive strength. This study would contribute to new approaches to the quality control of FRP considering higher compressive strength or lower uncertainty.

Keywords: compressive strength, FRP, micro-buckling, random fibre packing

Procedia PDF Downloads 188
3259 Heuristic to Generate Random X-Monotone Polygons

Authors: Kamaljit Pati, Manas Kumar Mohanty, Sanjib Sadhu


A heuristic has been designed to generate a random simple monotone polygon from a given set of ‘n’ points lying on a 2-dimensional plane. Our heuristic generates a random monotone polygon in O(n) time after O(n log n) preprocessing time, which is an improvement over the previous work, where a random monotone polygon is produced in the same O(n) time but the preprocessing time is O(k) for n < k < n^2. However, our heuristic does not generate all possible random polygons with uniform probability. The space complexity of our proposed heuristic is O(n).

Keywords: sorting, monotone polygon, visibility, chain

Procedia PDF Downloads 353
3258 Efficient Alias-Free Level Crossing Sampling

Authors: Negar Riazifar, Nigel G. Stocks


This paper proposes strategies in level crossing (LC) sampling and reconstruction that provide alias-free high-fidelity signal reconstruction for speech signals without exponentially increasing sample number with increasing bit-depth. We introduce methods in LC sampling that reduce the sampling rate close to the Nyquist frequency even for large bit-depth. The results indicate that larger variation in the sampling intervals leads to an alias-free sampling scheme; this is achieved by either reducing the bit-depth or adding jitter to the system for high bit-depths. In conjunction with windowing, the signal is reconstructed from the LC samples using an efficient Toeplitz reconstruction algorithm.
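The basic level crossing idea, in which a sample is recorded whenever the signal crosses one of a set of uniformly spaced amplitude levels rather than at fixed time intervals, can be sketched as follows. This is a bare-bones illustration on a densely pre-sampled signal with linear interpolation of the crossing instants; it does not include the jitter, windowing or Toeplitz reconstruction steps of the paper, and all names are hypothetical.

```python
import math


def level_crossing_sample(times, values, delta):
    """Return (time, level) pairs where the signal crosses multiples of delta.

    times and values describe a densely sampled signal. Between two
    consecutive points the signal is assumed linear, so each crossing
    instant is found by linear interpolation.
    """
    samples = []
    for i in range(len(times) - 1):
        t0, v0 = times[i], values[i]
        t1, v1 = times[i + 1], values[i + 1]
        if v0 == v1:
            continue  # flat segment: no level is crossed
        lo, hi = min(v0, v1), max(v0, v1)
        # Indices of the levels k * delta lying in the interval (lo, hi].
        k_first = math.floor(lo / delta) + 1
        k_last = math.floor(hi / delta)
        levels = list(range(k_first, k_last + 1))
        if v1 < v0:
            levels.reverse()  # falling signal meets higher levels first
        for k in levels:
            level = k * delta
            frac = (level - v0) / (v1 - v0)
            samples.append((t0 + frac * (t1 - t0), level))
    return samples
```

A smaller `delta` corresponds to a larger bit-depth: the number of recorded samples grows with the number of levels the signal sweeps through, which is the growth the paper's strategies aim to contain.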

Keywords: alias-free, level crossing sampling, spectrum, trigonometric polynomial

Procedia PDF Downloads 133
3257 A Comparative Study of Sampling-Based Uncertainty Propagation with First Order Error Analysis and Percentile-Based Optimization

Authors: M. Gulam Kibria, Shourav Ahmed, Kais Zaman


In system analysis, the information on the uncertain input variables causes uncertainty in the system responses. Different probabilistic approaches for uncertainty representation and propagation in such cases exist in the literature. Different uncertainty representation approaches result in different outputs. Some of the approaches might result in a better estimation of the system response than others. The NASA Langley Multidisciplinary Uncertainty Quantification Challenge (MUQC) has posed challenges about uncertainty quantification. Subproblem A of the challenge, the uncertainty characterization subproblem, is addressed in this study. In this subproblem, the challenge is to gather knowledge about unknown model inputs, which have inherent aleatory and epistemic uncertainties, from the responses (output) of the given computational model. We use two different methodologies to approach the problem. In the first methodology, we use sampling-based uncertainty propagation with first order error analysis. In the other approach, we place emphasis on the use of Percentile-Based Optimization (PBO). The NASA Langley MUQC’s subproblem A is developed in such a way that both aleatory and epistemic uncertainties need to be managed. The challenge problem classifies each uncertain parameter as belonging to one of the following three types: (i) An aleatory uncertainty modeled as a random variable. It has a fixed functional form and known coefficients. This uncertainty cannot be reduced. (ii) An epistemic uncertainty modeled as a fixed but poorly known physical quantity that lies within a given interval. This uncertainty is reducible. (iii) A parameter that might be aleatory, but for which sufficient data might not be available to adequately model it as a single random variable. For example, the parameters of a normal variable, e.g., the mean and standard deviation, might not be precisely known but could be assumed to lie within some intervals. This results in a distributional p-box in which the physical parameter has an aleatory uncertainty, but the parameters prescribing its mathematical model are subject to epistemic uncertainties. Each of the parameters of the random variable is an unknown element of a known interval. This uncertainty is reducible. From the study, it is observed that, due to practical limitations or computational expense, the sampling is not exhaustive in the sampling-based methodology. That is why the sampling-based methodology has a high probability of underestimating the output bounds. Therefore, an optimization-based strategy to convert uncertainty described by interval data into a probabilistic framework is necessary. This is achieved in this study by using PBO.
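The remark that non-exhaustive sampling tends to underestimate output bounds can be seen in a toy setting. The sketch below propagates a single interval-valued input through a hypothetical model and compares the bounds found by a small random sample with those found by a dense grid search; the grid search merely stands in for an optimizer and is not PBO, and none of the names come from the MUQC problem.

```python
import random


def sampled_bounds(model, lo, hi, n, seed=0):
    """Output bounds estimated from n uniform random draws of the input."""
    rng = random.Random(seed)
    outputs = [model(rng.uniform(lo, hi)) for _ in range(n)]
    return min(outputs), max(outputs)


def grid_bounds(model, lo, hi, steps=10001):
    """Output bounds from a dense grid that includes both interval endpoints."""
    outputs = [model(lo + (hi - lo) * i / (steps - 1)) for i in range(steps)]
    return min(outputs), max(outputs)


def toy_model(x):
    """Hypothetical response: its extremes on [-1, 2] are at x = 0 and x = 2."""
    return x * x
```

For `toy_model` on the interval [-1, 2] the true output range is [0, 4]. Every sampled output lies inside that range, so the sampled bounds can only be as wide as the true bounds or narrower, and they reach the upper bound only if a draw lands at the endpoint x = 2.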

Keywords: aleatory uncertainty, epistemic uncertainty, first order error analysis, uncertainty quantification, percentile-based optimization

Procedia PDF Downloads 159
3256 The Staff Performance Efficiency of the Faculty of Management Science, Suan Sunandha Rajabhat University

Authors: Nipawan Tharasak, Ladda Hirunyava


The objective of the research was to study factors affecting working efficiency and the relationship between the working environment, satisfaction with human resources management and the working efficiency of operational employees of the Faculty of Management Science, Suan Sunandha Rajabhat University. The sample of the research consisted of 33 employees of the Faculty of Management Science. The researchers classified the support employees into 4 divisions by using Stratified Random Sampling. Individuals were then selected by using Simple Random Sampling. Data were collected through the research instrument. The Statistical Package for the Windows was utilized for data processing. Percentage, mean, standard deviation, the t-test, one-way ANOVA, and the Pearson product moment correlation coefficient were applied. The results showed that the support employees’ satisfaction with the human resources management of the Faculty of Management Science was at a good level overall in the following areas: remuneration; employee recruitment & selection; manpower planning; performance evaluation; staff training & development; and spirit & fairness.
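The two-step selection described above (strata first, then simple random sampling within each stratum) can be sketched as follows. Proportional allocation, the division labels and the stratum sizes used in any call are assumptions of this example, since the abstract does not report them.

```python
import random


def stratified_sample(strata, total_n, seed=None):
    """Draw a proportionally allocated stratified random sample.

    strata maps a stratum label to the list of units in that stratum.
    Each stratum contributes a share of total_n proportional to its size
    (at least one unit), chosen by simple random sampling.
    """
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for label, units in strata.items():
        k = max(1, round(total_n * len(units) / population_size))
        k = min(k, len(units))
        sample[label] = rng.sample(units, k)
    return sample
```

Because each stratum's share is rounded independently, the realized total can differ slightly from `total_n` when the proportions are not whole numbers.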

Keywords: faculty of management science, operational factors, practice performance, staff working

Procedia PDF Downloads 167
3255 Design of Bayesian MDS Sampling Plan Based on the Process Capability Index

Authors: Davood Shishebori, Mohammad Saber Fallah Nezhad, Sina Seifi


In this paper, a variable multiple dependent state (MDS) sampling plan is developed based on the process capability index using a Bayesian approach. The optimal parameters of the developed sampling plan with respect to constraints related to the consumer's and producer's risks are presented. Two comparison studies have been done. First, the methods of the double sampling model, the sampling plan for resubmitted lots and the repetitive group sampling (RGS) plan are elaborated, and the average sample numbers of the developed MDS plan and the other classical methods are compared. Second, a comparison study between the developed MDS plan based on the Bayesian approach and the exact probability distribution is carried out.

Keywords: MDS sampling plan, RGS plan, sampling plan for resubmitted lots, process capability index (PCI), average sample number (ASN), Bayesian approach

Procedia PDF Downloads 190
3254 Dimension of Water Accessibility in the Southern Part of Niger State, Nigeria

Authors: Kudu Dangana, Pai H. Halilu, Osesienemo R. Asiribo-Sallau, Garba Inuwa Kuta


The study examined the determinants of household water accessibility in the southern part of Niger State, Nigeria. Data for the study were obtained from primary and secondary sources using questionnaires, interviews, personal observation and documents. A total of 1,192 questionnaires were administered, and the sampling techniques adopted were a combination of purposive, stratified and simple random sampling. Purposive sampling was used to determine the sample frame, stratified sampling was used to determine the sample units, and simple random sampling was used in administering the questionnaires. The results were analyzed within the scope of the “WHO” water accessibility indicators using descriptive statistics. The major sources of water in the area are wells, hand-pump and electric-pump boreholes, and streams. These sources account for over 90% of household water. Average per capita water consumption in the area is 22 liters per day, while the location efficiency of facilities revealed an average of 80 people per borehole. Household water accessibility is affected mainly by distance, the time spent obtaining water, and the low income of the majority of respondents, which limits access to modern water infrastructure, and to a lesser extent by household size. It is recommended that all tiers of government intensify efforts to provide new water infrastructure and support existing facilities through budgetary provisions, and that communities organize fund-raising bazaars to raise funds for improving water infrastructure in the area.

Keywords: accessibility, determined, stratified, scope

Procedia PDF Downloads 258
3253 Methods of Variance Estimation in Two-Phase Sampling

Authors: Raghunath Arnab


Two-phase sampling, also known as double sampling, was introduced in 1938. In two-phase sampling, samples are selected in phases. In the first phase, a relatively large sample is selected by some suitable sampling design, and only information on the auxiliary variable is collected. In the second phase, a smaller sample is selected, either from the sample selected in the first phase or from the entire population, by using a suitable sampling design, and information on both the study variable and the auxiliary variable is collected. Two-phase sampling is therefore useful when the auxiliary information is relatively easy and cheap to collect compared with the study variable, and when the relationship between the study and auxiliary variables is strong. If the sample is selected in more than two phases, the resulting sampling design is called multi-phase sampling. In this article we consider how data collected in the first phase can be used in the second phase for estimation of the parameter, stratification, selection of the sample, and combinations of these, in a unified setup applicable to any sampling design and to wide classes of estimators. The problem of variance estimation is also considered. The variance of an estimator is essential for assessing the precision of survey estimates, calculating confidence intervals, determining optimal sample sizes, and testing hypotheses, amongst others. Although the variance is a non-negative quantity, its estimators may take negative values. If the estimator of the variance is negative, it cannot be used for constructing confidence intervals, testing hypotheses, or measuring sampling error. The non-negativity properties of the variance estimators are also studied in detail.
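
The two phases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both phases use simple random sampling without replacement, the second-phase sample is drawn from the first-phase sample, and the population mean is estimated with a ratio estimator. The abstract does not commit to any of these choices, and the names `two_phase_ratio_estimate`, `n_first` and `n_second` are hypothetical.

```python
import random
from statistics import mean

def two_phase_ratio_estimate(population, n_first, n_second, seed=0):
    """Estimate the mean of the study variable y by two-phase sampling.

    population: list of (x, y) pairs, with x the auxiliary variable
    (assumed positive) and y the study variable.
    Assumption: simple random sampling in both phases and a ratio
    estimator; the abstract covers more general designs.
    """
    rng = random.Random(seed)
    # Phase 1: large sample, only the auxiliary variable x is used.
    first = rng.sample(population, n_first)
    x_bar_first = mean(x for x, _ in first)
    # Phase 2: subsample of phase 1, both x and y are observed.
    second = rng.sample(first, n_second)
    x_bar_second = mean(x for x, _ in second)
    y_bar_second = mean(y for _, y in second)
    # Ratio estimator: scale the phase-2 mean of y by the phase-1 / phase-2 ratio of x.
    return y_bar_second * x_bar_first / x_bar_second
```

The estimator gains over the plain second-phase mean of y only when x and y are strongly related, which matches the condition stated in the abstract.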

Keywords: auxiliary information, two-phase sampling, varying probability sampling, unbiased estimators

Procedia PDF Downloads 495
3252 Application of GeoGebra into Teaching and Learning of Linear and Quadratic Equations amongst Senior Secondary School Students in Fagge Local Government Area of Kano State, Nigeria

Authors: Musa Auwal Mamman, S. G. Isa


This study investigated the effectiveness of GeoGebra software in the teaching and learning of linear and quadratic equations amongst senior secondary school students in Fagge Local Government Area, Kano State, Nigeria. Five items each were raised as objectives, research questions, and hypotheses. A random sampling method was used to select 398 students from a population of 2,098 SS2 students. The experimental group was taught using the GeoGebra software, while the control group was taught using the conventional teaching method. The instrument used for the study was the mathematics performance test (MPT), which was administered at the beginning and at the end of the study. The t-test was used to analyze the data obtained from the study. The results revealed that students taught with GeoGebra software (experimental group) performed better than students taught with the traditional teaching method.

Keywords: GeoGebra Software, mathematics performance, random sampling, mathematics teaching

Procedia PDF Downloads 168
3251 Churn Prediction for Savings Bank Customers: A Machine Learning Approach

Authors: Prashant Verma


Commercial banks are facing immense pressure, including financial disintermediation, interest rate volatility and digital forms of finance. Retaining an existing customer is 5 to 25 times less expensive than acquiring a new one. This paper explores customer churn prediction based on various statistical and machine learning models, and uses under-sampling to improve the predictive power of these models. The results show that, of the various machine learning models, Random Forest, which predicts churn with 78% accuracy, was the most powerful model for the scenario. Customer vintage, customer’s age, average balance, occupation code, population code, average withdrawal amount, and average number of transactions were found to be the variables with high predictive power for the churn prediction model. The model can be deployed by commercial banks to avoid customer churn so that they may retain the funds kept by savings bank (SB) customers. The article suggests that commercial banks initiate a customized campaign to avoid SB customer churn. By providing better customer satisfaction and experience, commercial banks can limit customer churn and maintain their deposits.
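
The abstract does not say which under-sampling variant was used. As a minimal sketch, assuming plain random under-sampling to a 1:1 class ratio, the majority class (customers who stay) can be reduced to the size of the minority class (customers who churn) before a model is trained. The function name `random_undersample` and the `label_of` callback are hypothetical.

```python
import random
from collections import defaultdict

def random_undersample(records, label_of, seed=0):
    """Return a class-balanced subset of records.

    Assumption: every class is reduced, by simple random sampling
    without replacement, to the size of the smallest class.
    label_of: function mapping a record to its class label.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record in records:
        by_class[label_of(record)].append(record)
    # Size of the rarest class sets the per-class quota.
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    # Shuffle so that classes are not stored in contiguous blocks.
    rng.shuffle(balanced)
    return balanced
```

Only the training data should be balanced this way; evaluating on the original class proportions keeps a reported accuracy comparable to what a deployed model would see.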

Keywords: savings bank, customer churn, customer retention, random forests, machine learning, under-sampling

Procedia PDF Downloads 48
3250 Application of Combined Cluster and Discriminant Analysis to Make the Operation of Monitoring Networks More Economical

Authors: Norbert Magyar, Jozsef Kovacs, Peter Tanos, Balazs Trasy, Tamas Garamhegyi, Istvan Gabor Hatvani


Water is one of the most important common resources, and as a result of urbanization, agriculture, and industry it is becoming more and more exposed to potential pollutants. Preventing the deterioration of water quality is a crucial task for environmental scientists. To achieve this aim, the operation of monitoring networks is necessary. In general, these networks have to meet many important requirements, such as representativeness and cost efficiency. However, existing monitoring networks often include sampling sites which are unnecessary. By eliminating these sites, the monitoring network can be optimized and can operate more economically. The aim of this study is to illustrate the applicability of CCDA (Combined Cluster and Discriminant Analysis) to the field of water quality monitoring and to optimize the monitoring networks of a river (the Danube), a wetland-lake system (Kis-Balaton & Lake Balaton), and two surface-subsurface water systems, on the watershed of Lake Neusiedl/Lake Fertő and in the Szigetköz area, over a period of approximately two decades. CCDA combines two multivariate data analysis methods: hierarchical cluster analysis and linear discriminant analysis. Its goal is to determine homogeneous groups of observations, in our case sampling sites, by comparing the goodness of preconceived classifications obtained from hierarchical cluster analysis with that of random classifications. The main idea behind CCDA is that if the ratio of correctly classified cases for a grouping is higher than at least 95% of the ratios for the random classifications, then at the given level of significance (α=0.05) the sampling sites do not form a homogeneous group. Because the sampling on Lake Neusiedl/Lake Fertő was conducted at the same time at all sampling sites, it was possible to visualize the differences between the sampling sites belonging to the same or different groups on scatterplots.
Based on the results, the monitoring network of the Danube yields redundant information over certain sections, so that of 12 sampling sites, 3 could be eliminated without loss of information. In the case of the wetland (Kis-Balaton), one pair of sampling sites out of 12 could be discarded, and in the case of Lake Balaton, 5 out of 10. For the groundwater system of the catchment area of Lake Neusiedl/Lake Fertő, all 50 monitoring wells are necessary, as there is no redundant information in the system. The number of sampling sites on Lake Neusiedl/Lake Fertő can be reduced to approximately half of the original number. Furthermore, neighbouring sampling sites were compared pairwise using CCDA, and the results were plotted on diagrams or isoline maps showing the location of the greatest differences. These results can help researchers decide where to place new sampling sites. CCDA proved to be a useful tool in the optimization of monitoring networks for different types of water bodies. Based on the results obtained, the monitoring networks can be operated more economically.
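
The decision rule at the core of CCDA (compare the ratio of correctly classified cases for a grouping with the ratios obtained for random groupings) can be illustrated with a small sketch. This is not the authors' implementation: a nearest-centroid classifier on a single variable is swapped in for linear discriminant analysis to keep the example dependency-free, and the names `correct_ratio`, `grouping_is_heterogeneous` and `n_random` are hypothetical.

```python
import math
import random

def correct_ratio(values, labels):
    """Share of observations assigned to their own group.

    Stand-in classifier (assumption): nearest group centroid on a
    single variable, used here in place of linear discriminant analysis.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    centroids = {label: sum(vs) / len(vs) for label, vs in groups.items()}
    hits = 0
    for value, label in zip(values, labels):
        predicted = min(centroids, key=lambda g: abs(value - centroids[g]))
        hits += predicted == label
    return hits / len(values)

def grouping_is_heterogeneous(values, labels, n_random=200, quantile=0.95, seed=0):
    """True if the grouping classifies better than 95% of random groupings."""
    rng = random.Random(seed)
    observed = correct_ratio(values, labels)
    shuffled = list(labels)
    random_ratios = []
    for _ in range(n_random):
        rng.shuffle(shuffled)  # random classification with the same group sizes
        random_ratios.append(correct_ratio(values, shuffled))
    random_ratios.sort()
    # Ratio that at least `quantile` of the random classifications do not exceed.
    index = min(n_random - 1, math.ceil(quantile * n_random) - 1)
    return observed > random_ratios[index]
```

When the function returns False for a set of sampling sites, the sites cannot be told apart better than chance, which is the situation in which the abstract treats sites as candidates for elimination.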

Keywords: combined cluster and discriminant analysis, cost efficiency, monitoring network optimization, water quality

Procedia PDF Downloads 258