Search results for: binomial process

14853 Moment Estimators of the Parameters of Zero-One Inflated Negative Binomial Distribution

Authors: Rafid Saeed Abdulrazak Alshkaki

Abstract:

In this paper, the zero-one inflated negative binomial distribution is considered, along with some of its structural properties, and its parameters are estimated using the method of moments. It is found that the method of moments is not a suitable approach for estimating the parameters of zero-one inflated negative binomial models and may lead to incorrect conclusions.

Keywords: zero one inflated models, negative binomial distribution, moments estimator, non negative integer sampling

Procedia PDF Downloads 253
14852 Detecting Overdispersion in AIDS Mortality Using the Zero-Inflated Negative Binomial Death Rate (ZINBDR) Model for Co-infection Patients in Kelantan

Authors: Mohd Asrul Affedi, Nyi Nyi Naing

Abstract:

Overdispersion is common in count data, and when it occurs a negative binomial (NB) model is typically used in place of the standard Poisson model. For the analysis of count events such as mortality cases, the Poisson regression model is generally appropriate; however, it becomes unsuitable when the data contain an excess of zero values, in which case the zero-inflated negative binomial model is more appropriate. In this article, we modelled mortality cases as the dependent variable, with age as a categorical covariate. The objective of this study was to determine whether overdispersion is present in the mortality data of AIDS co-infection patients in Kelantan.
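As a rough sketch of how such an overdispersion check can be carried out in practice (on simulated counts with an illustrative age covariate, not the Kelantan AIDS/TB data), one can compare a Poisson fit against a negative binomial fit and inspect the Pearson dispersion statistic:

```python
# Sketch: checking for overdispersion by comparing Poisson and negative binomial fits.
# Simulated, illustrative data only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_obs = 500
age_group = rng.integers(0, 4, n_obs)                 # hypothetical age categories
X = sm.add_constant(np.eye(4)[age_group][:, 1:])      # dummy-coded age plus intercept
mu = np.exp(0.5 + 0.3 * age_group)                    # assumed mean structure
deaths = rng.negative_binomial(2, 2 / (2 + mu))       # over-dispersed counts with mean mu

poisson_fit = sm.GLM(deaths, X, family=sm.families.Poisson()).fit()
dispersion = poisson_fit.pearson_chi2 / poisson_fit.df_resid
print(f"Poisson dispersion statistic: {dispersion:.2f} (values well above 1 suggest overdispersion)")

# Negative binomial GLM with a fixed dispersion parameter (alpha) for simplicity
nb_fit = sm.GLM(deaths, X, family=sm.families.NegativeBinomial(alpha=1.0)).fit()
print(f"AIC  Poisson: {poisson_fit.aic:.1f}   negative binomial: {nb_fit.aic:.1f}")
```

A markedly smaller AIC for the negative binomial fit, together with a Poisson dispersion statistic well above one, points to overdispersion of the kind the abstract describes.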

Keywords: negative binomial death rate, overdispersion, zero-inflation negative binomial death rate, AIDS

Procedia PDF Downloads 435
14851 Primes as Sums and Differences of Two Binomial Coefficients and Two Power Sums

Authors: Benjamin Lee Warren

Abstract:

Many problems in additive number theory ask which primes can be written as the sum of two elements of a given single-variable polynomial sequence, and most of these problems remain unattackable at present. Here, we determine solutions for a few particular sequences (certain binomial coefficients and power sums) using only elementary algebra and some algebraic factoring methods (as well as Euclid’s Lemma and Faulhaber’s Formula). In particular, we show that there are only finitely many primes expressible as sums of two of these types of elements. Several cases are fully worked out, and bounds are presented for the cases not fully worked out.
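For a concrete sense of the kind of question studied (a purely illustrative enumeration, not the authors' algebraic argument), one can list the small primes that are sums of two binomial coefficients of the form C(a,2), i.e., two triangular numbers:

```python
# Sketch: enumerate small primes expressible as C(a,2) + C(b,2).
# Illustrates the problem statement only, not the paper's proofs.
from math import comb

def is_prime(m):
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

LIMIT = 200
triangular = [comb(a, 2) for a in range(2, 25)]   # C(2,2)=1, C(3,2)=3, C(4,2)=6, ...

primes_found = set()
for i, t1 in enumerate(triangular):
    for t2 in triangular[i:]:
        s = t1 + t2
        if s <= LIMIT and is_prime(s):
            primes_found.add(s)

print(sorted(primes_found))   # e.g. 2 = 1+1, 7 = 1+6, 11 = 1+10, 13 = 3+10, ...
```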

Keywords: binomial coefficients, power sums, primes, algebra

Procedia PDF Downloads 64
14850 Bayes Estimation of Parameters of Binomial Type Rayleigh Class Software Reliability Growth Model using Non-informative Priors

Authors: Rajesh Singh, Kailash Kale

Abstract:

In this paper, a binomial-process model for the occurrence of software failures is considered, and the failure intensity is characterized by a one-parameter Rayleigh-class Software Reliability Growth Model (SRGM). The proposed SRGM is a mathematical function of two parameters: the total number of failures, η₀, and the scale parameter, η₁. It is assumed that very little or no information is available about either parameter; non-informative priors are therefore adopted, and the Bayes estimators of η₀ and η₁ are obtained under the squared error loss function. The proposed Bayes estimators are compared with the corresponding maximum likelihood estimators on the basis of risk efficiencies obtained by Monte Carlo simulation. It is concluded that both Bayes estimators, of the total number of failures and of the scale parameter, perform well for a proper choice of execution time.
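The binomial-type Rayleigh SRGM itself is not reproduced in the abstract, so the sketch below illustrates only the comparison technique on a simplified stand-in problem: Monte Carlo estimation of the risk efficiency of the Bayes estimator (non-informative prior π(θ) ∝ 1/θ, squared error loss) versus the MLE for the Rayleigh scale parameter θ = σ².

```python
# Sketch: Monte Carlo risk comparison of MLE vs. Bayes estimator for a Rayleigh
# scale parameter. A simplified stand-in, NOT the authors' two-parameter SRGM.
import numpy as np

rng = np.random.default_rng(0)
sigma2_true, n, reps = 4.0, 20, 10_000

mse_mle = mse_bayes = 0.0
for _ in range(reps):
    x = rng.rayleigh(scale=np.sqrt(sigma2_true), size=n)
    S = np.sum(x ** 2)
    theta_mle = S / (2 * n)            # maximum likelihood estimator of sigma^2
    theta_bayes = S / (2 * (n - 1))    # posterior mean under prior pi(theta) ∝ 1/theta
    mse_mle += (theta_mle - sigma2_true) ** 2
    mse_bayes += (theta_bayes - sigma2_true) ** 2

print("risk efficiency (MSE_MLE / MSE_Bayes):", mse_mle / mse_bayes)
```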

Keywords: binomial process, non-informative prior, maximum likelihood estimator (MLE), rayleigh class, software reliability growth model (SRGM)

Procedia PDF Downloads 361
14849 Application of Hyperbinomial Distribution in Developing a Modified p-Chart

Authors: Shourav Ahmed, M. Gulam Kibria, Kais Zaman

Abstract:

Control charts graphically monitor variation in quality parameters. Attribute-type control charts deal with quality parameters that can hold only two states, e.g., good or bad, yes or no. At present, the p-control chart is the most commonly used chart for attribute-type data. In constructing a p-control chart from the binomial distribution, the proportion non-conforming must be known or estimated from limited sample information. Because the hyperbinomial distribution treats the fraction non-conforming (p) as a random variable rather than as a constant, as in the binomial case, it reduces the risk of false detection. In this study, a statistical control chart based on the hyperbinomial distribution is proposed for the case where a prior estimate of the proportion non-conforming is unavailable and must be estimated from limited sample information. We develop the control limits of the proposed modified p-chart using the mean and variance of the hyperbinomial distribution. The proposed modified p-chart can also incorporate additional sample information when it is available. The study further validates the modified p-chart by comparing it with results obtained using the cumulative distribution function of the hyperbinomial distribution. The results clearly indicate that using the hyperbinomial distribution in the construction of a p-control chart yields a much more accurate estimate of the quality parameters than using the binomial distribution.
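The hyperbinomial mean and variance used for the modified limits are not given in the abstract; for orientation, the sketch below computes the conventional binomial-based 3-sigma p-chart limits that the proposed chart modifies, using illustrative data.

```python
# Sketch: conventional (binomial-based) 3-sigma p-chart limits, the baseline that
# the proposed hyperbinomial chart modifies. Data below are illustrative.
import numpy as np

defectives = np.array([4, 6, 3, 5, 7, 2, 6, 5, 4, 8])   # non-conforming per sample
n = 100                                                  # sample size per subgroup

p_bar = defectives.sum() / (n * len(defectives))         # estimated fraction non-conforming
sigma_p = np.sqrt(p_bar * (1 - p_bar) / n)

ucl = p_bar + 3 * sigma_p
lcl = max(p_bar - 3 * sigma_p, 0.0)
print(f"center line = {p_bar:.3f}, LCL = {lcl:.3f}, UCL = {ucl:.3f}")

out_of_control = np.where((defectives / n > ucl) | (defectives / n < lcl))[0]
print("out-of-control samples:", out_of_control)
```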

Keywords: binomial distribution, control charts, cumulative distribution function, hyper binomial distribution

Procedia PDF Downloads 237
14848 Mixture Statistical Modeling for Predicting Mortality in Human Immunodeficiency Virus (HIV) and Tuberculosis (TB) Co-infection Patients

Authors: Mohd Asrul Affendi Bi Abdullah, Nyi Nyi Naing

Abstract:

The purpose of this study was to compare the negative binomial death rate (NBDR) and zero-inflated negative binomial death rate (ZINBDR) models for patients who died with HIV+TB+ and HIV+TB− status. HIV and TB co-infection is a serious worldwide problem, particularly in developing countries. The data were analyzed by fitting both the NBDR and ZINBDR models to determine which model is preferable. The ZINBDR model is able to account for the disproportionately large number of zeros in the data and is shown to be a consistently better fit than the NBDR model. Hence, the ZINBDR model is superior to the NBDR model for these data and provides additional information regarding the mortality mechanisms of HIV+TB co-infection. The ZINBDR model is shown to be a useful tool for analyzing the death rate by age category.

Keywords: zero inflated negative binomial death rate, HIV and TB, AIC and BIC, death rate

Procedia PDF Downloads 401
14847 Count Data Regression Modeling: An Application to Spontaneous Abortion in India

Authors: Prashant Verma, Prafulla K. Swain, K. K. Singh, Mukti Khetan

Abstract:

Objective: In India, around 20,000 women die every year due to abortion-related complications. In the modelling of count variables, there is sometimes a preponderance of zero counts. This article concerns the estimation of various count regression models to predict the average number of spontaneous abortion among women in the Punjab state of India. It also assesses the factors associated with the number of spontaneous abortions. Materials and methods: The study included 27,173 married women of Punjab obtained from the DLHS-4 survey (2012-13). Poisson regression (PR), Negative binomial (NB) regression, zero hurdle negative binomial (ZHNB), and zero-inflated negative binomial (ZINB) models were employed to predict the average number of spontaneous abortions and to identify the determinants affecting the number of spontaneous abortions. Results: Statistical comparisons among four estimation methods revealed that the ZINB model provides the best prediction for the number of spontaneous abortions. Antenatal care (ANC) place, place of residence, total children born to a woman, woman's education and economic status were found to be the most significant factors affecting the occurrence of spontaneous abortion. Conclusions: The study offers a practical demonstration of techniques designed to handle count variables. Statistical comparisons among four estimation models revealed that the ZINB model provided the best prediction for the number of spontaneous abortions and is recommended to be used to predict the number of spontaneous abortions. The study suggests that women receive institutional Antenatal care to attain limited parity. It also advocates promoting higher education among women in Punjab, India.
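A minimal sketch of the model-comparison step (simulated zero-inflated counts with an illustrative covariate, not the DLHS-4 data) using statsmodels might look as follows; the ZINB class and options are those of statsmodels' count_model module.

```python
# Sketch: fit Poisson, negative binomial and zero-inflated negative binomial models
# and compare them by AIC, on simulated zero-inflated counts (not the DLHS-4 data).
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

rng = np.random.default_rng(5)
n = 2000
educ = rng.integers(0, 2, n)                              # illustrative covariate
mu = np.exp(-0.3 + 0.4 * educ)
nb_part = rng.negative_binomial(2, 2 / (2 + mu))          # over-dispersed count component
counts = nb_part * rng.binomial(1, 0.6, n)                # roughly 40% structural zeros

X = sm.add_constant(educ)
poisson = sm.Poisson(counts, X).fit(disp=False)
negbin = sm.NegativeBinomial(counts, X).fit(disp=False)
zinb = ZeroInflatedNegativeBinomialP(counts, X, exog_infl=np.ones((n, 1)), p=2).fit(
    disp=False, maxiter=500)

for name, res in [("Poisson", poisson), ("NB", negbin), ("ZINB", zinb)]:
    print(f"{name:8s} AIC = {res.aic:.1f}")
```

The model with the lowest AIC would be preferred, mirroring the comparison that led the study to the ZINB model.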

Keywords: count data, spontaneous abortion, Poisson model, negative binomial model, zero hurdle negative binomial, zero-inflated negative binomial, regression

Procedia PDF Downloads 125
14846 Optimal Hedging of a Portfolio of European Options in an Extended Binomial Model under Proportional Transaction Costs

Authors: Norm Josephy, Lucy Kimball, Victoria Steblovskaya

Abstract:

Hedging of a portfolio of European options under proportional transaction costs is considered. Our discrete-time financial market model extends the binomial market model with transaction costs to the case where the underlying stock price ratios are distributed over a bounded interval rather than over a two-point set. An optimal hedging strategy is chosen from a set of admissible non-self-financing hedging strategies. Our approach to optimal hedging of a portfolio of options is based on a theoretical foundation that includes the determination of a no-arbitrage option price interval, as well as on properties of the non-self-financing strategies and their residuals. A computational algorithm for optimizing an investor-relevant criterion over the set of admissible non-self-financing hedging strategies is developed. The applicability of our approach is demonstrated using both simulated data and real market data.
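The extended model (price ratios on a bounded interval, transaction costs, non-self-financing strategies) is beyond a short snippet, but the classical two-point binomial (Cox-Ross-Rubinstein) pricing scheme that it generalizes can be sketched as follows; the parameter values are illustrative.

```python
# Sketch: classical Cox-Ross-Rubinstein two-point binomial pricing of a European
# call, the baseline that the extended binomial model generalizes (no transaction
# costs, no bounded-interval price ratios).
import numpy as np

def crr_call_price(S0, K, r, sigma, T, steps):
    dt = T / steps
    u = np.exp(sigma * np.sqrt(dt))
    d = 1.0 / u
    q = (np.exp(r * dt) - d) / (u - d)          # risk-neutral up probability
    disc = np.exp(-r * dt)

    # terminal stock prices and payoffs
    j = np.arange(steps + 1)
    prices = S0 * u ** j * d ** (steps - j)
    values = np.maximum(prices - K, 0.0)

    # backward induction through the binomial tree
    for _ in range(steps):
        values = disc * (q * values[1:] + (1 - q) * values[:-1])
    return values[0]

print(round(crr_call_price(S0=100, K=100, r=0.05, sigma=0.2, T=1.0, steps=200), 4))
```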

Keywords: extended binomial model, non-self-financing hedging, optimization, proportional transaction costs

Procedia PDF Downloads 225
14845 Assessing Significance of Correlation with Binomial Distribution

Authors: Vijay Kumar Singh, Pooja Kushwaha, Prabhat Ranjan, Krishna Kumar Ojha, Jitendra Kumar

Abstract:

Present-day high-throughput genomic technologies (NGS/microarrays) are producing large volumes of data that require improved analysis methods to make sense of them. The correlation between genes and samples has regularly been used to gain insight into many biological phenomena including, but not limited to, co-expression/co-regulation, gene regulatory networks, clustering and pattern identification. However, outliers and violations of the assumptions underlying the Pearson correlation are frequent; they may distort the actual correlation between genes and lead to spurious conclusions. Here, we report a method to measure the strength of association between genes. The method assumes that the expression values of a gene are Bernoulli random variables whose outcome depends on the sample being probed. The method considers two genes uncorrelated if the number of samples with the same outcome for both genes (Ns) equals the number expected under independence (Es); the extent of correlation depends on how far Ns deviates from Es. The method does not assume normality for the parent population, is fairly unaffected by the presence of outliers, can be applied to qualitative data, and uses the binomial distribution to assess the significance of association. At this stage, we do not claim superiority over existing correlation methods; rather, our method is another way of calculating correlation in addition to the existing ones. It uses the binomial distribution, which has not been used for this purpose before, to assess the significance of association between two variables. We are evaluating the performance of our method on NGS/microarray data, which are noisy and riddled with outliers, to see whether it can differentiate between spurious and actual correlation. While working with the method, it has not escaped our notice that it could also be generalized to measure the association of more than two variables, which has proven difficult with existing methods.
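A minimal sketch of the idea on synthetic data is given below; the dichotomization rule (above/below the gene's median) and the form of the expected match count Es = N[p₁p₂ + (1−p₁)(1−p₂)] are assumptions made for illustration, not taken from the paper.

```python
# Sketch of the idea: dichotomize two genes' expression values, count samples where
# both genes give the same outcome (Ns), and assess the departure from the count
# expected under independence (Es) with a binomial test.
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(7)
N = 60
gene_a = rng.normal(size=N)
gene_b = 0.8 * gene_a + rng.normal(scale=0.6, size=N)   # correlated by construction

a = (gene_a > np.median(gene_a)).astype(int)            # Bernoulli outcome per sample
b = (gene_b > np.median(gene_b)).astype(int)

Ns = int(np.sum(a == b))                                # samples with the same outcome
p_match = a.mean() * b.mean() + (1 - a.mean()) * (1 - b.mean())
Es = N * p_match                                        # expected matches under independence

result = binomtest(Ns, n=N, p=p_match, alternative="two-sided")
print(f"Ns = {Ns}, Es = {Es:.1f}, p-value = {result.pvalue:.4g}")
```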

Keywords: binomial distribution, correlation, microarray, outliers, transcriptome

Procedia PDF Downloads 382
14844 Air Pollution and Respiratory-Related Restricted Activity Days in Tunisia

Authors: Mokhtar Kouki Inès Rekik

Abstract:

This paper focuses on the assessment of the relationship between air pollution and morbidity in Tunisia. Air pollution is measured by the ozone air concentration, and morbidity is measured by the number of respiratory-related restricted activity days during the two-week period prior to the interview. Socioeconomic data are also collected in order to adjust for any confounding covariates. Our sample is composed of 407 Tunisian respondents; 44.7% are women, the average age is 35.2, nearly 69% live in a house built after 1980, and 27.8% reported at least one day of respiratory-related restricted activity. The model consists of regressing the number of respiratory-related restricted activity days on the air quality measure and the socioeconomic covariates. In order to account for zero-inflation and heterogeneity, we estimate several models (Poisson, negative binomial, zero-inflated Poisson, Poisson hurdle, negative binomial hurdle and finite mixture Poisson models). Bootstrapping and post-stratification techniques are used in order to correct for any sample bias. According to the Akaike information criterion, the hurdle negative binomial model has the best goodness of fit. The main result indicates that, after adjusting for socioeconomic data, the ozone concentration increases the probability of a positive number of restricted activity days.

Keywords: bootstrapping, hurdle negbin model, overdispersion, ozone concentration, respiratory-related restricted activity days

Procedia PDF Downloads 229
14843 An EWMA P-Chart Based on Improved Square Root Transformation

Authors: Saowanit Sukparungsee

Abstract:

The traditional Shewhart p-chart was developed for charting binomial data. It relies on a normal approximation that is valid only for low defect levels and small to moderate sample sizes; real applications, however, often depart from these assumptions because of skewness in the exact distribution. In this paper, a modified Exponentially Weighted Moving Average (EWMA) control chart for detecting a change in binomial data is proposed by improving the square root transformation, namely the ISRT p EWMA control chart. The numerical results show that the ISRT p EWMA chart is superior to the ISRT p chart for small to moderate shifts, whereas the latter is better for large shifts.

Keywords: number of defects, exponentially weighted moving average, average run length, square root transformations

Procedia PDF Downloads 404
14842 Statistical Manufacturing Cell/Process Qualification Sample Size Optimization

Authors: Angad Arora

Abstract:

In production operations/manufacturing, a cell or line is typically a group of similar machines (computer numerical control (CNC) machines, advanced cutting, 3D printing or special-purpose machines). For qualifying a typical manufacturing line/cell/new process, we ideally need a sample of parts that can be run through the process, after which we make a judgment on the health of the line/cell. However, with huge volumes and mass-production scope, as in the mobile phone industry, for example, the number of cells or lines can run into the thousands, and qualifying each one of them with statistical confidence means using very large samples, which adds to product/manufacturing cost and creates considerable waste if the parts are not intended to be shipped to customers. To solve this, we propose a two-step statistical approach: we start with a small sample size and then objectively evaluate whether the process needs additional samples. For example, if a process is producing bad parts and we see those failures early, there is a high chance that the process will not meet the desired yield, and there is no point in adding more samples. We used this reasoning to develop a two-step binomial testing approach. Further, we show through results that we can achieve an 18-25% reduction in samples while keeping the same statistical confidence.
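The authors' actual acceptance numbers are not given in the abstract; the sketch below shows a generic two-stage binomial qualification plan with illustrative thresholds, and how its probability of qualifying a cell could be evaluated.

```python
# Sketch: a generic two-stage binomial qualification plan (illustrative thresholds,
# not the authors' plan). Stage 1 uses a small sample; only if the result is
# inconclusive is a second sample drawn, which can cut the expected sample size.
from scipy.stats import binom

n1, c1_accept, c1_reject = 20, 0, 3      # stage-1 sample size and decision limits
n2, c_total = 30, 2                      # stage-2 sample size, overall acceptance number

def accept_probability(p):
    """Probability the cell/line is qualified when the true defect rate is p."""
    prob = binom.cdf(c1_accept, n1, p)                       # accepted at stage 1
    for d1 in range(c1_accept + 1, c1_reject):               # inconclusive at stage 1
        prob += binom.pmf(d1, n1, p) * binom.cdf(c_total - d1, n2, p)
    return prob

for p in (0.01, 0.05, 0.10):
    print(f"defect rate {p:.2f}: P(qualify) = {accept_probability(p):.3f}")
```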

Keywords: statistics, data science, manufacturing process qualification, production planning

Procedia PDF Downloads 67
14841 Child Homicide Victimization and Community Context: A Research Note

Authors: Bohsiu Wu

Abstract:

Among serious crimes, child homicide is a rather rare event. However, the killing of children stirs up a type of emotion in society that makes other criminal acts pale in comparison. This study examines the relevance of three possible community-level explanations for child homicide: social deprivation, female empowerment, and social isolation. The social deprivation hypothesis posits that child homicide results from a lack of resources in communities. The female empowerment hypothesis argues that a higher female status translates into a greater capability to prevent child homicide. Finally, the social isolation hypothesis regards child homicide as a result of a lack of social connectivity. Child homicide data, aggregated by US postal ZIP codes in California from 1990 to 1999, were analyzed with a negative binomial regression. The results of the negative binomial analysis demonstrate that social deprivation is the most salient and consistent predictor of child homicide victimization at the ZIP-code level. Both social isolation and female labor force participation are weak predictors of child homicide victimization across communities. Further, results from the negative binomial regression show that communities with a higher, not lower, degree of female labor force participation are associated with a higher count of child homicide. It is possible that poor communities with a higher level of female employment have a lesser capacity to provide the necessary care and protection for children. Policies aimed at reducing social deprivation and strengthening female empowerment have the potential to reduce child homicide in the community.

Keywords: child homicide, deprivation, empowerment, isolation

Procedia PDF Downloads 164
14840 Identification and Review of Factors Affecting Relationship Marketing in the National Iranian Drilling Company from the Managers’ View

Authors: Hoda Ghorbani

Abstract:

Today, many markets have matured and face congested competition, with supply considerably greater than demand. In response to such changes, organizations must equip themselves in advance and be ready to take on their rivals. In this regard, relationship marketing tries to lower the cost of attracting new customers by establishing and maintaining long-run relationships with current customers, thereby increasing corporate profitability. Consequently, identifying relationship marketing and the factors that affect it is essential for maintaining a market position and improving a corporation's competitive potential. The present study deals with identifying the factors affecting relationship marketing in the National Iranian Drilling Company (NIDC) from the managers’ point of view. The methodology of this study is of the descriptive-survey type. In addition to an extensive review of secondary sources and interviews with experienced members of NIDC, the researcher identified the related factors and distributed a questionnaire, comprising 31 questions, among 144 participants drawn from corporate managers and first-rank principals. After gathering the information, the data were analyzed using the binomial test as well as the Analytic Hierarchy Process (AHP) with pairwise comparisons. The results showed that variables such as communication, commitment, conflict management and trust affect relationship marketing, listed in their order of preference.

Keywords: marketing relationship, trust, commitment, communication, conflict management

Procedia PDF Downloads 341
14839 Transportation Accidents Mortality Modeling in Thailand

Authors: W. Sriwattanapongse, S. Prasitwattanaseree, S. Wongtrangan

Abstract:

Transportation accident mortality is a major problem that leads to the loss of human lives and to economic losses. The objective was to identify patterns of statistical modeling for estimating mortality rates due to transportation accidents in Thailand, using data from 2000 to 2009. The data were taken from death certificates in the vital registration database. The numbers of deaths and mortality rates were computed and classified by gender, age, year and region. There were 114,790 transportation accident deaths. The highest average age-specific transport accident mortality rate was 3.11 per 100,000 per year, among males in the Southern region, and the lowest was 1.79 per 100,000 per year, among females in the North-East region. Linear, Poisson and negative binomial models were fitted, and the best model was chosen based on the analysis of deviance and AIC. The negative binomial model was clearly the most appropriate fit.

Keywords: transportation accidents, mortality, modeling, analysis of deviance

Procedia PDF Downloads 212
14838 Regression for Doubly Inflated Multivariate Poisson Distributions

Authors: Ishapathik Das, Sumen Sen, N. Rao Chaganty, Pooja Sengupta

Abstract:

Dependent multivariate count data occur in several research studies. These data can be modeled by a multivariate Poisson or negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, when the number of observations in some cells is much larger than in other cells, the copula-based multivariate Poisson (or negative binomial) distribution may not fit well and is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher than those of the other cells, and we develop a doubly inflated multivariate Poisson distribution function using a multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. To illustrate the proposed methodologies, we present a real data set containing bivariate count observations with inflation in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation of the unknown parameters of the models.

Keywords: copula, Gaussian copula, multivariate distributions, inflated distributios

Procedia PDF Downloads 131
14837 Geo-Additive Modeling of Family Size in Nigeria

Authors: Oluwayemisi O. Alaba, John O. Olaomi

Abstract:

The 2013 Nigerian Demographic and Health Survey (NDHS) data were used to investigate the determinants of family size in Nigeria using a geo-additive model. The fixed effects of categorical covariates were modelled using diffuse priors, a P-spline with a second-order random walk was used for the nonlinear effect of the continuous variable, spatial effects followed Markov random field priors, and exchangeable normal priors were used for the random effects of community and household. The negative binomial distribution was used to handle overdispersion of the dependent variable. Inference was fully Bayesian. Results showed a declining effect on family size of secondary and higher education of the mother, Yoruba tribe, Christianity, family planning, giving birth by caesarean section, and having a partner with secondary education. Large family size is positively associated with age at first birth, number of daughters in a household, being gainfully employed, being married and living with a partner, and community and household effects.

Keywords: Bayesian analysis, family size, geo-additive model, negative binomial

Procedia PDF Downloads 511
14836 A Comparison of Methods for Estimating Dichotomous Treatment Effects: A Simulation Study

Authors: Jacqueline Y. Thompson, Sam Watson, Lee Middleton, Karla Hemming

Abstract:

Introduction: The odds ratio (estimated via logistic regression) is a well-established and common approach for estimating covariate-adjusted binary treatment effects when comparing a treatment and control group with dichotomous outcomes. Its popularity is primarily because of its stability and robustness to model misspecification. However, the situation is different for the relative risk and risk difference, which are arguably easier to interpret and better suited to specific designs such as non-inferiority studies. So far, there is no equivalent, widely accepted approach for estimating an adjusted relative risk and risk difference when conducting clinical trials. This is partly due to the lack of a comprehensive evaluation of the available candidate methods. Methods/Approach: A simulation study is designed to evaluate the performance of relevant candidate methods for estimating relative risks, representing both conditional and marginal estimation approaches. We consider the log-binomial generalised linear model (GLM) with iteratively weighted least-squares (IWLS) and model-based standard errors (SEs); the log-binomial GLM with convex optimisation and model-based SEs; the log-binomial GLM with convex optimisation and permutation tests; the modified-Poisson GLM with IWLS and robust SEs; log-binomial generalised estimating equations (GEE) with robust SEs; marginal standardisation with delta-method SEs; and marginal standardisation with permutation-test SEs. Independent and identically distributed datasets are simulated from a randomised controlled trial to evaluate these candidate methods. Simulations are replicated 10,000 times for each scenario across all possible combinations of sample sizes (200, 1,000, and 5,000), outcome rates (10%, 50%, and 80%), and covariate effects (ranging from -0.05 to 0.7) representing weak, moderate or strong relationships. Treatment effects (0, -0.5, and 1 on the log scale) cover both null (H0) and alternative (H1) hypotheses to evaluate coverage and power in realistic scenarios. Performance measures (bias, mean squared error (MSE), relative efficiency, and convergence rates) are evaluated across scenarios covering a range of sample sizes, event rates, covariate prognostic strengths, and model misspecifications. Potential Results, Relevance & Impact: There are several methods for estimating unadjusted and adjusted relative risks. However, it is unclear which method(s) are the most efficient, preserve the type-I error rate, are robust to model misspecification, or are the most powerful when adjusting for non-prognostic and prognostic covariates. GEE estimates may be biased when the outcome distributions do not arise from marginal binary data. Also, it seems that marginal standardisation and convex optimisation may perform better than the IWLS log-binomial GLM.
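To make two of the listed candidates concrete, the sketch below fits a modified-Poisson GLM with robust standard errors and performs marginal standardisation of a logistic model on simulated trial data; the data-generating model and effect sizes are illustrative, not those of the study.

```python
# Sketch of two candidate relative-risk estimators on simulated trial data:
# (1) modified Poisson (Poisson GLM for a binary outcome with robust SEs),
# (2) marginal standardisation of a logistic model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 1000
treat = rng.integers(0, 2, n)
covar = rng.normal(size=n)
p_event = 1 / (1 + np.exp(-(-1.0 + 0.5 * treat + 0.7 * covar)))   # assumed data model
y = rng.binomial(1, p_event)

X = sm.add_constant(np.column_stack([treat, covar]))

# Modified Poisson: Poisson GLM with sandwich (robust) covariance
mod_poisson = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC1")
print("modified-Poisson RR:", round(float(np.exp(mod_poisson.params[1])), 3))

# Marginal standardisation: predict everyone as treated and as untreated
logit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
X1, X0 = X.copy(), X.copy()
X1[:, 1], X0[:, 1] = 1, 0
rr_marginal = logit.predict(X1).mean() / logit.predict(X0).mean()
print("marginally standardised RR:", round(rr_marginal, 3))
```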

Keywords: binary outcomes, statistical methods, clinical trials, simulation study

Procedia PDF Downloads 78
14835 A Comparative Analysis of Geometric and Exponential Laws in Modelling the Distribution of the Duration of Daily Precipitation

Authors: Mounia El Hafyani, Khalid El Himdi

Abstract:

Precipitation is one of the key variables in water resource planning, and modeling wet and dry spell durations is a crucial task in engineering hydrology. The objective of this study is to model and analyze the distribution of wet and dry durations. For this purpose, daily rainfall data from 1967 to 2017 recorded at the station of the Moroccan city of Kenitra are used. Three models are implemented for the distribution of wet and dry durations, namely the first-order Markov chain, the second-order Markov chain, and the truncated negative binomial law. The adherence of the data to the proposed models is evaluated using chi-square and Kolmogorov-Smirnov tests, and the Akaike information criterion is applied to select the most effective model. We go further and study the law of the number of wet and dry days among k consecutive days; this law is computed through an algorithm that we have implemented based on conditional laws. We complete our work by comparing the observed moments of the numbers of wet/dry days among k consecutive days to the calculated moments of the three estimated models. The study shows the effectiveness of our approach in modeling wet and dry durations of daily precipitation.
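A minimal sketch of the first-order Markov chain component on a synthetic wet/dry sequence (not the Kenitra record) is shown below; under this model, dry-spell lengths follow a geometric law with parameter P(wet | dry).

```python
# Sketch: first-order Markov chain for a daily wet/dry sequence. Synthetic data only.
import numpy as np

rng = np.random.default_rng(3)
wet = (rng.random(5000) < 0.3).astype(int)        # synthetic daily wet (1) / dry (0)

# Estimate the transition matrix P[i, j] = P(state j tomorrow | state i today)
P = np.zeros((2, 2))
for i in (0, 1):
    mask = wet[:-1] == i
    P[i, 0] = np.mean(wet[1:][mask] == 0)
    P[i, 1] = np.mean(wet[1:][mask] == 1)
print("transition matrix:\n", P.round(3))

# Dry-spell lengths and the geometric prediction implied by the chain
runs, length = [], 0
for w in wet:
    if w == 0:
        length += 1
    elif length:
        runs.append(length)
        length = 0
runs = np.array(runs)
print("observed mean dry-spell length:", round(float(runs.mean()), 2),
      " geometric prediction 1/P(dry->wet):", round(1 / P[0, 1], 2))
```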

Keywords: Markov chain, rainfall, truncated negative binomial law, wet and dry durations

Procedia PDF Downloads 87
14834 Exploration and Evaluation of the Effect of Multiple Countermeasures on Road Safety

Authors: Atheer Al-Nuaimi, Harry Evdorides

Abstract:

Every day, many people die or are disabled or injured on roads around the world, which calls for more specific treatments of transportation safety issues. The International Road Assessment Programme (iRAP) model is one of the comprehensive road safety models that account for many factors affecting road safety in a cost-effective way in low- and middle-income countries. In the iRAP model, road safety is divided into five star ratings, from 1 star (the lowest level) to 5 stars (the highest level). These star ratings are based on a star rating score calculated by the iRAP methodology from road attributes, traffic volumes and operating speeds. The outputs of the iRAP methodology are the treatments that can be used to improve road safety and reduce the numbers of fatalities and serious injuries (FSI). These countermeasures can be applied individually as a single countermeasure or combined as multiple countermeasures at a location. There is general agreement that the effectiveness of a countermeasure is subject to consistent losses when it is used in combination with other countermeasures; that is, the crash-reduction estimates of individual countermeasures cannot simply be added together. The iRAP methodology therefore uses multiple-countermeasure adjustment factors to predict the reduction in the effectiveness of road safety countermeasures when more than one countermeasure is selected. A multiple-countermeasure correction factor is computed for every 100-metre segment and for every crash type. However, limitations of this approach include a probable over-estimation of the predicted crash reduction. This study aims to adjust this correction factor by developing new models to calculate the effect of using multiple countermeasures on the number of fatalities for a location or an entire road. Regression models were used to establish relationships between crash frequencies and the factors that affect their rates. Multiple linear regression, negative binomial regression, and Poisson regression techniques were used to develop models that can address the effectiveness of using multiple countermeasures. Analyses conducted using the R Project for Statistical Computing showed that a model developed with the negative binomial regression technique gives more reliable predictions of the number of fatalities after the implementation of multiple road safety countermeasures than the iRAP model. The results also showed that the negative binomial regression approach gives more precise results than the multiple linear and Poisson regression techniques because of overdispersion and standard error issues.

Keywords: international road assessment program, negative binomial, road multiple countermeasures, road safety

Procedia PDF Downloads 211
14833 Progressive Type-I Interval Censoring with Binomial Removal: Estimation and Its Properties

Authors: Sonal Budhiraja, Biswabrata Pradhan

Abstract:

This work considers statistical inference based on progressive Type-I interval-censored data with random removal. The scheme of progressive Type-I interval censoring with random removal can be described as follows. Suppose n identical items are placed on test at time T0 = 0, with k pre-specified inspection times T1 < T2 < . . . < Tk, where Tk is the scheduled termination time of the experiment. At inspection time Ti, Ri of the remaining surviving units Si are randomly removed from the experiment. The removals follow a binomial distribution with parameters Si and pi for i = 1, . . . , k, with pk = 1. In this censoring scheme, the number of failures in the different inspection intervals and the number of randomly removed items at the pre-specified inspection times are observed. Asymptotic properties of the maximum likelihood estimators (MLEs) are established under some regularity conditions. A β-content γ-level tolerance interval (TI) is determined for the two-parameter Weibull lifetime model using the asymptotic properties of the MLEs, and the minimum sample size required to achieve the desired β-content γ-level TI is determined. The performance of the MLEs and the TI is studied via simulation.
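One run of the censoring scheme described above can be simulated as follows; the Weibull parameters, inspection times and removal probabilities are illustrative assumptions.

```python
# Sketch: simulate one run of progressive Type-I interval censoring with binomial
# removals for Weibull lifetimes. Parameter values are illustrative only.
import numpy as np

rng = np.random.default_rng(11)
n = 100
shape, scale = 1.5, 10.0                       # two-parameter Weibull lifetime model
T = [3.0, 6.0, 9.0, 12.0]                      # pre-specified inspection times T1 < ... < Tk
p = [0.1, 0.1, 0.2, 1.0]                       # removal probabilities, with p_k = 1

lifetimes = scale * rng.weibull(shape, size=n)
on_test = lifetimes.copy()
prev_t = 0.0
for i, (t, pi) in enumerate(zip(T, p), start=1):
    failed = on_test <= t                                # failures in (T_{i-1}, T_i]
    failures = int(failed.sum())
    on_test = on_test[~failed]                           # surviving units S_i
    removed = on_test.size if pi == 1.0 else rng.binomial(on_test.size, pi)
    on_test = rng.permutation(on_test)[removed:]         # randomly withdraw R_i survivors
    print(f"interval {i}: failures = {failures}, removed = {removed}, remaining = {on_test.size}")
    prev_t = t
```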

Keywords: asymptotic normality, consistency, regularity conditions, simulation study, tolerance interval

Procedia PDF Downloads 216
14832 A Generalization of Planar Pascal’s Triangle to Polynomial Expansion and Connection with Sierpinski Patterns

Authors: Wajdi Mohamed Ratemi

Abstract:

The very well-known stacked sets of numbers referred to as Pascal’s triangle present the coefficients of the binomial expansion of the form (x+y)ⁿ. This paper presents an approach (the Staircase Horizontal Vertical, SHV, method) to the generalization of the planar Pascal’s triangle for polynomial expansions of the form (x+y+z+w+r+⋯)ⁿ. The presented generalization of Pascal’s triangle is different from other generalizations of Pascal’s triangle given in the literature. The coefficients of the generalized Pascal’s triangles presented in this work are generated by inspection, using embedded Pascal’s triangles. The coefficients of the I-variable expansion are generated by horizontally laying out the Pascal’s elements of the (I-1)-variable expansion, in a staircase manner, and multiplying them with the relevant columns of vertically laid out classical Pascal’s elements, hence avoiding factorial calculations when generating the coefficients of the polynomial expansion. Furthermore, the classical Pascal’s triangle has a pattern built into it regarding its odd and even numbers, known as the Sierpinski triangle. In this study, a presentation of Sierpinski-like patterns of the generalized Pascal’s triangles is given. Applications related to these coefficients of the binomial expansion (Pascal’s triangle) or polynomial expansion (generalized Pascal’s triangles) can be found in combinatorics and probability.
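The SHV layout itself is only described verbally in the abstract; in the same spirit of avoiding factorials by reusing Pascal's triangle, the sketch below builds the coefficients of (x+y+z)ⁿ purely from products of Pascal's-triangle entries.

```python
# Sketch: coefficients of (x+y+z)^n built only from Pascal's-triangle entries
# (no factorials), in the spirit of the embedded-triangle idea; an illustrative
# reconstruction, not the authors' SHV layout itself.
def pascal_rows(n):
    rows = [[1]]
    for _ in range(n):
        prev = rows[-1]
        rows.append([1] + [prev[i] + prev[i + 1] for i in range(len(prev) - 1)] + [1])
    return rows

n = 4
C = pascal_rows(n)                       # C[m][k] = binomial(m, k)

coeffs = {}
for i in range(n + 1):                   # power of x
    for j in range(n - i + 1):           # power of y; power of z is n - i - j
        coeffs[(i, j, n - i - j)] = C[n][i] * C[n - i][j]

# e.g. the coefficient of x^2 y z in (x+y+z)^4 is 4!/(2!1!1!) = 12
print(coeffs[(2, 1, 1)])                 # -> 12
print(sum(coeffs.values()))              # -> 3**4 = 81 (sanity check)
```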

Keywords: pascal’s triangle, generalized pascal’s triangle, polynomial expansion, sierpinski’s triangle, combinatorics, probabilities

Procedia PDF Downloads 338
14831 Statistical Analysis for Overdispersed Medical Count Data

Authors: Y. N. Phang, E. F. Loh

Abstract:

Many researchers have suggested the use of zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models for modeling over-dispersed medical count data with extra variation caused by excess zeros and unobserved heterogeneity. These studies indicate that ZIP and ZINB always provide a better fit than the standard Poisson and negative binomial models for over-dispersed medical count data. In this study, we propose the use of the zero-inflated inverse trinomial (ZIIT), zero-inflated Poisson inverse Gaussian (ZIPIG) and zero-inflated strict arcsine (ZISA) models for modeling over-dispersed medical count data. These proposed models are not widely used by researchers, especially in the medical field. The results show that these three models can serve as alternatives for modeling over-dispersed medical count data, which is supported by their application to a real-life medical data set. The inverse trinomial, Poisson inverse Gaussian, and strict arcsine distributions are discrete distributions with a cubic variance function of the mean; therefore, ZIIT, ZIPIG and ZISA are able to accommodate data with excess zeros and very heavy tails. They are recommended for modeling over-dispersed medical count data when ZIP and ZINB are inadequate.

Keywords: zero inflated, inverse trinomial distribution, Poisson inverse Gaussian distribution, strict arcsine distribution, Pearson’s goodness of fit

Procedia PDF Downloads 505
14830 Optimal Design of a Step-Stress Partially Accelerated Life Test Using Multiply Censored Exponential Data with Random Removals

Authors: Showkat Ahmad Lone, Ahmadur Rahman, Ariful Islam

Abstract:

The major assumption in accelerated life tests (ALT) is that the mathematical model relating the lifetime of a test unit to the stress is known or can be assumed. In some cases, such life-stress relationships are not known and cannot be assumed, i.e., ALT data cannot be extrapolated to the use condition. In such cases, a partially accelerated life test (PALT) is a more suitable test to perform, in which the tested units are subjected to both normal and accelerated conditions. This study deals with estimating information about the failure times of items under step-stress partially accelerated life tests using progressively failure-censored hybrid data with random removals. The lifetimes of the units under test are assumed to follow the exponential distribution, and the removals from the test are assumed to follow a binomial distribution. Point and interval maximum likelihood estimates are obtained for the unknown distribution parameters and the tampering coefficient. An optimum test plan is developed using the D-optimality criterion. The performance of the resulting estimators of the model parameters is evaluated and investigated by means of a simulation algorithm.

Keywords: binomial distribution, d-optimality, multiple censoring, optimal design, partially accelerated life testing, simulation study

Procedia PDF Downloads 296
14829 A Posterior Predictive Model-Based Control Chart for Monitoring Healthcare

Authors: Yi-Fan Lin, Peter P. Howley, Frank A. Tuyl

Abstract:

Quality measurement and reporting systems are used in healthcare internationally. In Australia, the Australian Council on Healthcare Standards records and reports hundreds of clinical indicators (CIs) nationally across the healthcare system. These CIs are measures of performance in the clinical setting, and are used as a screening tool to help assess whether a standard of care is being met. Existing analysis and reporting of these CIs incorporate Bayesian methods to address sampling variation; however, such assessments are retrospective in nature, reporting upon the previous six or twelve months of data. The use of Bayesian methods within statistical process control for monitoring systems is an important pursuit to support more timely decision-making. Our research has developed and assessed a new graphical monitoring tool, similar to a control chart, based on the beta-binomial posterior predictive (BBPP) distribution to facilitate the real-time assessment of health care organizational performance via CIs. The BBPP charts have been compared with the traditional Bernoulli CUSUM (BC) chart by simulation. The more traditional “central” and “highest posterior density” (HPD) interval approaches were each considered to define the limits, and the multiple charts were compared via in-control and out-of-control average run lengths (ARLs), assuming that the parameter representing the underlying CI rate (proportion of cases with an event of interest) required estimation. Preliminary results have identified that the BBPP chart with HPD-based control limits provides better out-of-control run length performance than the central interval-based and BC charts. Further, the BC chart’s performance may be improved by using Bayesian parameter estimation of the underlying CI rate.
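The core computation suggested by the abstract can be sketched as follows: update a Beta prior with the observed CI data and read control limits off the beta-binomial posterior predictive distribution. The prior, the data and the batch size are illustrative assumptions, and a central interval is used here rather than the HPD interval that performed best in the study.

```python
# Sketch: beta-binomial posterior predictive limits for the next monitored batch.
# Central (equal-tail) limits; the paper's HPD limits need an additional search step.
from scipy.stats import betabinom

a0, b0 = 1.0, 1.0            # Beta prior on the underlying CI rate (assumed)
events, cases = 18, 400      # previously observed events of interest (illustrative)
a, b = a0 + events, b0 + cases - events   # posterior Beta parameters

n_next = 120                 # size of the next monitored batch (assumed)
predictive = betabinom(n_next, a, b)

lcl = predictive.ppf(0.005)  # 99% central posterior predictive limits
ucl = predictive.ppf(0.995)
print(f"flag the batch if the event count falls outside [{lcl:.0f}, {ucl:.0f}]")
```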

Keywords: average run length (ARL), bernoulli cusum (BC) chart, beta binomial posterior predictive (BBPP) distribution, clinical indicator (CI), healthcare organization (HCO), highest posterior density (HPD) interval

Procedia PDF Downloads 179
14828 Bayesian Borrowing Methods for Count Data: Analysis of Incontinence Episodes in Patients with Overactive Bladder

Authors: Akalu Banbeta, Emmanuel Lesaffre, Reynaldo Martina, Joost Van Rosmalen

Abstract:

Including data from previous studies (historical data) in the analysis of the current study may reduce the sample size requirement and/or increase the power of the analysis. The most common example is incorporating historical control data in the analysis of a current clinical trial. However, this only applies when the historical control data are similar enough to the current control data. Recently, several Bayesian approaches for incorporating historical data have been proposed, such as the meta-analytic-predictive (MAP) prior and the modified power prior (MPP), both for a single control as well as for multiple historical control arms. Here, we examine the performance of the MAP and the MPP approaches for the analysis of (over-dispersed) count data. To this end, we propose a computational method for the MPP approach for the Poisson and the negative binomial models. We conducted an extensive simulation study to assess the performance of the Bayesian approaches. Additionally, we illustrate our approaches on an overactive bladder data set. For similar data across the control arms, the MPP approach outperformed the MAP approach with respect to the statistical power. When the means across the control arms are different, the MPP yielded a slightly inflated type I error (TIE) rate, whereas the MAP did not. In contrast, when the dispersion parameters are different, the MAP gave an inflated TIE rate, whereas the MPP did not. We conclude that the MPP approach is more promising than the MAP approach for incorporating historical count data.

Keywords: count data, meta-analytic prior, negative binomial, poisson

Procedia PDF Downloads 87
14827 A Framework for Auditing Multilevel Models Using Explainability Methods

Authors: Debarati Bhaumik, Diptish Dey

Abstract:

Multilevel models, increasingly deployed in industries such as insurance, food production, and entertainment within functions such as marketing and supply chain management, need to be transparent and ethical. Applications usually result in binary classification within groups or hierarchies based on a set of input features. Using open-source datasets, we demonstrate that popular explainability methods, such as SHAP and LIME, consistently underperform in accuracy when interpreting these models. They fail to predict the order of feature importance, the magnitudes, and occasionally even the nature of the feature contribution (negative versus positive contribution to the outcome). Besides accuracy, the computational intractability of SHAP for binomial classification is a cause for concern. For transparent and ethical applications of these hierarchical statistical models, sound audit frameworks need to be developed. In this paper, we propose an audit framework for technical assessment of multilevel regression models focusing on three aspects: (i) model assumptions and statistical properties, (ii) model transparency using different explainability methods, and (iii) discrimination assessment. To this end, we undertake a quantitative approach and compare intrinsic model methods with SHAP and LIME. The framework comprises a shortlist of KPIs, such as PoCE (Percentage of Correct Explanations) and MDG (Mean Discriminatory Gap) per feature, for each of these three aspects. A traffic light risk assessment method is furthermore coupled to these KPIs. The audit framework will assist regulatory bodies in performing conformity assessments of AI systems using multilevel binomial classification models at businesses. It will also benefit businesses deploying multilevel models to be future-proof and aligned with the European Commission’s proposed Regulation on Artificial Intelligence.

Keywords: audit, multilevel model, model transparency, model explainability, discrimination, ethics

Procedia PDF Downloads 64
14826 Gender Estimation by Means of Quantitative Measurements of Foramen Magnum: An Analysis of CT Head Images

Authors: Thilini Hathurusinghe, Uthpalie Siriwardhana, W. M. Ediri Arachchi, Ranga Thudugala, Indeewari Herath, Gayani Senanayake

Abstract:

The foramen magnum is more likely to be protected than other skeletal structures during high-impact and severely disruptive injuries. It is therefore worthwhile to explore whether its measurements can be used to determine gender, which is vital in forensic and anthropological studies. The idea was to find out whether quantitative measurements of the foramen magnum can be used as an anatomical indicator for human gender estimation, and to evaluate the gender-dependent variations of the foramen magnum using quantitative measurements. A randomly selected sample of 113 subjects who underwent CT head scans at Sri Jayawardhanapura General Hospital of Sri Lanka within a period of six months was included in the study. The sample contained 58 males (48.76 ± 14.7 years old) and 55 females (47.04 ± 15.9 years old). The maximum length of the foramen magnum (LFM), maximum width of the foramen magnum (WFM), minimum distance between the occipital condyles (MnD) and maximum interior distance between the occipital condyles (MxID) were measured; AreaT and AreaR were also calculated. Gender was estimated using binomial logistic regression. The mean values of all explanatory variables (LFM, WFM, MnD, MxID, AreaT, and AreaR) were greater among males than among females. All explanatory variables except MnD (p = 0.669) were statistically significant (p < 0.05). Significant bivariate correlations were demonstrated between AreaT, AreaR and the explanatory variables. The results showed that WFM and MxID were the best measurements for predicting gender according to the binomial logistic regression. The estimated model was log(p/(1-p)) = 10.391 - 0.136×MxID - 0.231×WFM, where p is the probability of being female. The classification accuracy given by this model was 65.5%. Quantitative measurements of the foramen magnum can therefore be used as a reliable anatomical marker for human gender estimation in the Sri Lankan context.
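A worked example of applying the reported classification rule is given below; the measurement values are illustrative and the units are assumed to be millimetres.

```python
# Worked example of the reported classification rule
#   log(p / (1 - p)) = 10.391 - 0.136*MxID - 0.231*WFM,
# where p is the probability of being female. Measurement values are illustrative.
import math

def prob_female(MxID, WFM):
    logit = 10.391 - 0.136 * MxID - 0.231 * WFM
    return 1.0 / (1.0 + math.exp(-logit))

p = prob_female(MxID=28.0, WFM=30.0)
print(f"P(female) = {p:.2f} -> classified as {'female' if p >= 0.5 else 'male'}")
```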

Keywords: foramen magnum, forensic and anthropological studies, gender estimation, logistic regression

Procedia PDF Downloads 124
14825 Design and Application of a Model Eliciting Activity with Civil Engineering Students on Binomial Distribution to Solve a Decision Problem Based on Sample Data Involving Aspects of Randomness and Proportionality

Authors: Martha E. Aguiar-Barrera, Humberto Gutierrez-Pulido, Veronica Vargas-Alejo

Abstract:

Identifying and modeling random phenomena is a fundamental cognitive process for understanding and transforming reality. Recognizing situations governed by chance and giving them a scientific interpretation, without being carried away by beliefs or intuitions, is basic training for citizens. Hence the importance of generating teaching-learning processes, supported by technology, that pay attention to model creation rather than only to executing mathematical calculations. In order to develop students' knowledge of basic probability distributions and decision making, a model-eliciting activity (MEA) is reported in this work. The intention was to apply the Models and Modeling Perspective to design an activity related to civil engineering that would be understandable to students while involving them in its solution. Furthermore, the activity should pose a decision-making challenge based on sample data, and the use of the computer should be considered. The activity was designed following the six design principles for MEAs proposed by Lesh and collaborators: model construction, reality, self-evaluation, model documentation, shareable and reusable, and prototype. The application and refinement of the activity were carried out during three school cycles in the Probability and Statistics class for civil engineering students at the University of Guadalajara. The way in which the students sought to solve the activity was analyzed using audio and video recordings, as well as the students' individual and team reports. The information obtained was categorized according to the activity phase (individual or team) and the category of analysis (sample, linearity, probability, distributions, mechanization, and decision-making). From the results obtained with the MEA, four obstacles to understanding and applying the binomial distribution were identified: first, the students' resistance to moving from the linear to the probabilistic model; second, the difficulty of visualizing (inferring) the behavior of the population through the sample data; third, viewing the sample as an isolated event rather than as part of a random process that must be seen in the context of a probability distribution; and fourth, the difficulty of making decisions with the support of probabilistic calculations. These obstacles have also been identified in the literature on the teaching of probability and statistics. Recognizing these concepts as obstacles to understanding probability distributions, and recognizing that they do not change after an intervention, allows both the interventions and the MEA to be modified so that students may themselves identify erroneous solutions while carrying out the MEA. The MEA also proved to be democratic, since several students who had had little participation and low grades in the first units improved their participation. Regarding the use of the computer, the RStudio software was useful for several tasks, for example plotting the probability distributions and exploring different sample sizes. In conclusion, with the models created to solve the MEA, the civil engineering students improved their probabilistic knowledge and their understanding of fundamental concepts such as sample, population, and probability distribution.

Keywords: linear model, models and modeling, probability, randomness, sample

Procedia PDF Downloads 85
14824 The Latency-Amplitude Binomial of Waves Resulting from the Application of Evoked Potentials for the Diagnosis of Dyscalculia

Authors: Maria Isabel Garcia-Planas, Maria Victoria Garcia-Camba

Abstract:

Recent advances in cognitive neuroscience have allowed a step forward in understanding the processes involved in learning, from the point of view of both the acquisition of new information and the modification of existing mental content. The evoked potentials technique reveals how basic brain processes interact to achieve adequate and flexible behaviours. The objective of this work, using evoked potentials, is to study whether it is possible to distinguish if a patient suffers from a specific type of learning disorder, in order to decide which therapies to follow. The methodology used is the analysis of the dynamics of different areas of the brain during a cognitive activity, to find the relationships between the different areas analyzed and thus better understand the functioning of neural networks. The latest advances in neuroscience have also revealed the existence of distinct brain activity in the learning process that can be highlighted through the use of non-invasive, innocuous, low-cost and easily accessible techniques such as, among others, evoked potentials, which can help in the early detection of possible neuro-developmental difficulties for their subsequent assessment and treatment. From the study of the amplitudes and latencies of the evoked potentials, it is possible to detect brain alterations in the learning process, specifically in dyscalculia, and to derive specific corrective measures for the application of personalized psycho-pedagogical plans that allow an optimal integral development of the affected people.

Keywords: dyscalculia, neurodevelopment, evoked potentials, Learning disabilities, neural networks

Procedia PDF Downloads 109