Search results for: unbiased estimators
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 186

156 A Modified Estimating Equations in Derivation of the Causal Effect on the Survival Time with Time-Varying Covariates

Authors: Yemane Hailu Fissuh, Zhongzhan Zhang

Abstract:

Survival data arise from systematic observation from a defined time origin until failure or censoring. Survival analysis is a major area of interest in biostatistics and biomedical research. Causality analysis lies at the heart of most scientific and medical research inquiries. Thus, the main concern of this study is to investigate the causal effect of treatment on survival time conditional on possibly time-varying covariates. Causality often differs from simple association between the response variable and predictors. Causal estimation compares the effect between two or more experimental arms. To evaluate an average treatment effect on the survival outcome, the estimating equation was adjusted for time-varying covariates under semiparametric transformation models. The proposed model yields consistent estimators for the unknown parameters and the unspecified monotone transformation functions. The proposed method gives an unbiased estimate of the average causal effect of treatment on the survival time of interest. The modified estimating equations of semiparametric transformation models have the advantage of including the time-varying effect in the model. The finite-sample performance of the estimators was assessed through simulation and the Stanford heart transplant data. The average effect of a treatment on survival time was estimated after adjusting for biases arising from the high correlation between left-truncation and possibly time-varying covariates. The bias in covariates was corrected by estimating the density function for left-truncation. In addition, to relax the independence assumption between failure time and truncation time, the model incorporated the left-truncation variable as a covariate. Moreover, the expectation-maximization (EM) algorithm iteratively obtained the unknown parameters and unspecified monotone transformation functions. In summary, the ratio of cumulative hazard functions between the treated and untreated experimental groups can be interpreted as the average causal effect for the entire population.

Keywords: a modified estimation equation, causal effect, semiparametric transformation models, survival analysis, time-varying covariate

Procedia PDF Downloads 138
155 Proficient Estimation Procedure for a Rare Sensitive Attribute Using Poisson Distribution

Authors: S. Suman, G. N. Singh

Abstract:

The present manuscript addresses the estimation of a population parameter using the Poisson probability distribution when the characteristic under study is a rare sensitive attribute. A generalized form of the unrelated randomized response model is suggested in order to obtain truthful responses from respondents. Estimators are proposed for two situations: when the information on an unrelated rare non-sensitive characteristic is known, and when it is unknown. The properties of the proposed estimators are derived, and a measure of respondent confidentiality is also suggested. Empirical studies are carried out in support of the discussed theory.

Keywords: Poisson distribution, randomized response model, rare sensitive attribute, non-sensitive attribute

Procedia PDF Downloads 233
154 Efficient Principal Components Estimation of Large Factor Models

Authors: Rachida Ouysse

Abstract:

This paper proposes a constrained principal components (CnPC) estimator for efficient estimation of large-dimensional factor models when errors are cross-sectionally correlated and the number of cross-sections (N) may be larger than the number of observations (T). Although the principal components (PC) method is consistent for any path of the panel dimensions, it is inefficient because the errors are treated as homoskedastic and uncorrelated. The new CnPC exploits the assumption of bounded cross-sectional dependence, which defines Chamberlain and Rothschild’s (1983) approximate factor structure, as an explicit constraint and solves a constrained PC problem. The CnPC method is computationally equivalent to the PC method applied to a regularized form of the data covariance matrix. Unlike maximum likelihood type methods, the CnPC method does not require inverting a large covariance matrix and thus is valid for panels with N ≥ T. The paper derives a convergence rate and an asymptotic normality result for the CnPC estimators of the common factors. We provide feasible estimators and show in a simulation study that they are more accurate than the PC estimator, especially for panels with N larger than T, and the generalized PC type estimators, especially for panels with N almost as large as T.

Keywords: high dimensionality, unknown factors, principal components, cross-sectional correlation, shrinkage regression, regularization, pseudo-out-of-sample forecasting

Procedia PDF Downloads 121
153 Robust Shrinkage Principal Component Parameter Estimator for Combating Multicollinearity and Outliers’ Problems in a Poisson Regression Model

Authors: Arum Kingsley Chinedu, Ugwuowo Fidelis Ifeanyi, Oranye Henrietta Ebele

Abstract:

The Poisson regression model (PRM) is a nonlinear model that belongs to the exponential family of distributions. The PRM is suitable for studying count variables using appropriate covariates, but it sometimes suffers from multicollinearity in the explanatory variables and outliers in the response variable. This study aims to address the problems of multicollinearity and outliers jointly in a Poisson regression model. We developed an estimator called the robust modified jackknife PCKL parameter estimator by combining the principal component estimator, the modified jackknife KL estimator and the transformed M-estimator to address both problems in a PRM. The superiority conditions for this estimator were established, and the properties of the estimator were also derived. The estimator inherits the characteristics of the combined estimators, thereby making it efficient in addressing both problems. It should also be of immediate interest to the research community and advances this area in terms of novelty compared to other studies. The performance of the estimator (robust modified jackknife PCKL) was compared with that of other existing estimators using mean squared error (MSE) as the performance evaluation criterion, through a Monte Carlo simulation study and real-life data. The results show that the estimator outperformed the other existing estimators, having the smallest MSE across all sample sizes, levels of correlation, percentages of outliers and numbers of explanatory variables.

Keywords: jackknife modified KL, outliers, multicollinearity, principal component, transformed M-estimator

Procedia PDF Downloads 24
152 Capture-recapture to Estimate Completeness of Pulmonary Tuberculosis with Two Sources

Authors: Ratchadaporn Ungcharoen, Lily Ingsrisawang

Abstract:

Capture-recapture methods are popular techniques for indirectly estimating the size of wildlife populations and the completeness of case ascertainment in epidemiology and the social sciences. The aim of this study was to estimate the completeness of pulmonary tuberculosis cases confirmed by two sources, hospital registrations and surveillance systems, in 2013 in Nakhon Pathom province, Thailand. Several estimators of population size were considered, including the Lincoln-Petersen estimator, the Chapman estimator, Chao’s lower bound estimator and Zelterman’s estimator. We focus on the Chapman and Chao’s lower bound estimators for estimating the completeness of pulmonary tuberculosis from two sources. The pulmonary tuberculosis data retrieved from the two sources were analyzed and bootstrapped for 30 samples, with 241 observations from source 1 and 305 observations from source 2 per sample, for additional exploration of the completeness of pulmonary tuberculosis. The results from the original data show that Chapman’s estimator gave an estimated total of 360 (95% CI: 349-371) pulmonary tuberculosis cases, resulting in 57% estimated completeness. Chao’s lower bound estimator gave an estimated total of 365 (95% CI: 354-376) pulmonary tuberculosis cases, with an estimated completeness of 55.9%. For the bootstrap samples, the Chapman and Chao’s lower bound estimators gave an estimated 347 (95% CI: 309-385) and 353 (95% CI: 315-390) pulmonary tuberculosis cases, respectively. If recording systems from two sources are available, record linkage and capture-recapture analysis can be useful for estimating the completeness of different registration systems. The Chapman and Chao’s lower bound estimators produce very close estimates.

Keywords: capture-recapture, Chao, Chapman, pulmonary tuberculosis
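
As a rough illustration of the two-source estimators named in the abstract above, the sketch below computes the Lincoln-Petersen and Chapman estimates and the resulting completeness. The counts are hypothetical (the abstract does not report the overlap between the two sources), and the function names are illustrative only.

```python
def lincoln_petersen(n1, n2, m):
    """Lincoln-Petersen estimate of the total number of cases.

    n1, n2: cases recorded by source 1 and source 2; m: cases found in both.
    """
    if m == 0:
        raise ValueError("Lincoln-Petersen is undefined when the sources do not overlap")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's bias-corrected variant; defined even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def completeness(observed, estimated_total):
    """Share of the estimated total that was actually observed."""
    return observed / estimated_total


# Hypothetical counts, NOT the study's data.
n1, n2, m = 120, 150, 90
observed = n1 + n2 - m              # distinct cases seen by at least one source
n_lp = lincoln_petersen(n1, n2, m)
n_ch = chapman(n1, n2, m)
print(n_lp, n_ch, completeness(observed, n_ch))
```

Chapman's form adds one to each count, which keeps the estimate finite when the overlap is zero and reduces small-sample bias relative to Lincoln-Petersen.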

Procedia PDF Downloads 497
151 Evaluation of Sensor Pattern Noise Estimators for Source Camera Identification

Authors: Benjamin Anderson-Sackaney, Amr Abdel-Dayem

Abstract:

This paper presents a comprehensive survey of recent source camera identification (SCI) systems. Then, the performance of various sensor pattern noise (SPN) estimators was experimentally assessed, under common photo response non-uniformity (PRNU) frameworks. The experiments used 1350 natural and 900 flat-field images, captured by 18 individual cameras. Twelve different experiments, grouped into three sets, were conducted. The results were analyzed using receiver operating characteristic (ROC) curves. The experimental results demonstrated that combining the basic SPN estimator with a wavelet-based filtering scheme provides promising results. However, the phase SPN estimator fits better with both patch-based (BM3D) and anisotropic diffusion (AD) filtering schemes.

Keywords: sensor pattern noise, source camera identification, photo response non-uniformity, anisotropic diffusion, peak to correlation energy ratio

Procedia PDF Downloads 414
150 Need of Medicines Information OPD in Tertiary Health Care Settings: A Cross Sectional Study

Authors: Swanand Pathak, Kiran R. Giri, Reena R. Giri, Kamlesh Palandurkar, Sangita Totade, Rajesh Jha, S. S. Patel

Abstract:

Background: Population burden, illiteracy, and the availability of few doctors for a large population leave many questions unanswered in a patient’s mind. Incomplete information results in noncompliance, therapeutic failure, and adverse drug reactions (ADR). It is very important to establish a system that provides a noncommercial, independent, unbiased source of medicine information. Medicines Info OPD is a concept and a step towards safe and appropriate use of medicines. Objective: (1) to assess the present status of patients’ knowledge about medicines and its correlation with education; (2) to assess the medicine information dispensing modalities, their use and their sufficiency from the patients’ viewpoint; (3) to assess the overall need for a Medicines Information OPD in the present scenario. Materials and Methods: A pre-validated questionnaire-based study was conducted among 500 patients of a tertiary health care hospital. The questionnaire consisted of specific questions regarding understanding of the prescription, knowledge about adverse drug reactions, views about self-medication and opinions regarding the need for a Medicines Info OPD. Results: A significantly large proportion of patients opined that doctors do not have sufficient time in current Indian healthcare to explain the prescription, and that they are not aware of adverse drug reactions, expiry dates, or the use of package inserts. Conclusion: A clinically relevant, up-to-date, user-specific, independent, objective and unbiased Medicines Info OPD is essential for appropriate drug use and can greatly help the general public address many of the problems they face.

Keywords: information, prescription, unbiased, clinically relevant

Procedia PDF Downloads 407
149 Estimation of Rare and Clustered Population Mean Using Two Auxiliary Variables in Adaptive Cluster Sampling

Authors: Muhammad Nouman Qureshi, Muhammad Hanif

Abstract:

Adaptive cluster sampling (ACS) is specifically developed for the estimation of highly clumped populations and applied to a wide range of situations like animals of rare and endangered species, uneven minerals, HIV patients and drug users. In this paper, we proposed a generalized semi-exponential estimator with two auxiliary variables under the framework of ACS design. The expressions of approximate bias and mean square error (MSE) of the proposed estimator are derived. Theoretical comparisons of the proposed estimator have been made with existing estimators. A numerical study is conducted on real and artificial populations to demonstrate and compare the efficiencies of the proposed estimator. The results indicate that the proposed generalized semi-exponential estimator performed considerably better than all the adaptive and non-adaptive estimators considered in this paper.

Keywords: auxiliary information, adaptive cluster sampling, clustered populations, Hansen-Hurwitz estimation
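
The keywords above mention Hansen-Hurwitz estimation. As a minimal sketch of the baseline (not the generalized semi-exponential estimator proposed in the paper, whose form the abstract does not give), the modified Hansen-Hurwitz estimator in ACS averages, over the initial sample, the mean of the network each sampled unit belongs to. The six-unit population and its network structure below are hypothetical.

```python
from itertools import combinations


def acs_hh_mean(initial_sample, network_of, y):
    """Modified Hansen-Hurwitz estimate of the population mean under ACS.

    For every unit in the initial sample, take the mean y-value of the
    network containing that unit, then average those network means.
    """
    network_means = []
    for unit in initial_sample:
        members = network_of[unit]
        network_means.append(sum(y[u] for u in members) / len(members))
    return sum(network_means) / len(network_means)


# Hypothetical population: units 2 and 3 satisfy the condition (y > 0) and
# form one network; every other unit is a network of size one.
y = {0: 0, 1: 0, 2: 5, 3: 7, 4: 0, 5: 0}
network_of = {0: (0,), 1: (1,), 2: (2, 3), 3: (2, 3), 4: (4,), 5: (5,)}

estimate = acs_hh_mean([0, 2, 5], network_of, y)

# Averaging the estimator over every possible initial sample of size 2
# (simple random sampling without replacement) recovers the population mean,
# which illustrates its unbiasedness on this toy population.
all_samples = list(combinations(y, 2))
average_estimate = sum(acs_hh_mean(s, network_of, y) for s in all_samples) / len(all_samples)
population_mean = sum(y.values()) / len(y)
print(estimate, average_estimate, population_mean)
```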

Procedia PDF Downloads 201
148 Channel Estimation/Equalization with Adaptive Modulation and Coding over Multipath Faded Channels for WiMAX

Authors: B. Siva Kumar Reddy, B. Lakshmi

Abstract:

WiMAX has adopted Adaptive Modulation and Coding (AMC) in OFDM to support higher data rates and error-free transmission. AMC schemes employ the Channel State Information (CSI) to utilize the channel efficiently, maximize throughput and improve spectral efficiency. The CSI is provided to the transmitter by the channel estimators. In this paper, LSE (Least Square Error) and MMSE (Minimum Mean Square Error) estimators are suggested, and BER (Bit Error Rate) performance is analyzed. Channel equalization is also integrated with the AMC-OFDM system and presented with the Constant Modulus Algorithm (CMA) and Least Mean Square (LMS) algorithms, along with an analysis of convergence rates. Simulation results show that increasing the modulation scheme size increases throughput along with the BER value. There is a trade-off among modulation size, throughput, BER value and spectral efficiency. The results also confirm the need for channel estimation and equalization in high data rate systems.

Keywords: AMC, CSI, CMA, OFDM, OFDMA, WiMAX
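
A minimal sketch of pilot-based least-squares channel estimation per OFDM subcarrier, followed by one-tap equalization, may help make the abstract above concrete. It assumes a flat channel gain per subcarrier and known pilot symbols; the channel values, pilots and function names are hypothetical and are not taken from the paper.

```python
def ls_channel_estimate(pilots_tx, pilots_rx):
    """Least-squares estimate per subcarrier: H_k = Y_k / X_k."""
    return [y / x for x, y in zip(pilots_tx, pilots_rx)]


def one_tap_equalize(received, channel_estimate):
    """Undo the per-subcarrier channel gain by dividing it out."""
    return [y / h for y, h in zip(received, channel_estimate)]


# Hypothetical frequency-domain channel and QPSK-like pilot symbols.
true_channel = [0.8 + 0.3j, 0.5 - 0.4j, 1.1 + 0.0j, 0.2 + 0.9j]
pilots_tx = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]

# Noise-free reception of the pilots: Y_k = H_k * X_k.
pilots_rx = [h * x for h, x in zip(true_channel, pilots_tx)]
h_hat = ls_channel_estimate(pilots_tx, pilots_rx)

# Equalize a data symbol vector that went through the same channel.
data_tx = [1 - 1j, 1 + 1j, -1 - 1j, 1 + 1j]
data_rx = [h * x for h, x in zip(true_channel, data_tx)]
data_hat = one_tap_equalize(data_rx, h_hat)
print(h_hat, data_hat)
```

With noise, the LS estimate divides the noise by the pilot as well, which is the usual motivation for an MMSE estimator that also uses channel and noise statistics.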

Procedia PDF Downloads 370
147 Model Averaging in a Multiplicative Heteroscedastic Model

Authors: Alan Wan

Abstract:

In recent years, the body of literature on frequentist model averaging in statistics has grown significantly. Most of this work focuses on models with different mean structures but leaves out the variance consideration. In this paper, we consider a regression model with multiplicative heteroscedasticity and develop a model averaging method that combines maximum likelihood estimators of unknown parameters in both the mean and variance functions of the model. Our weight choice criterion is based on a minimisation of a plug-in estimator of the model average estimator's squared prediction risk. We prove that the new estimator possesses an asymptotic optimality property. Our investigation of finite-sample performance by simulations demonstrates that the new estimator frequently exhibits very favourable properties compared to some existing heteroscedasticity-robust model average estimators. The model averaging method hedges against the selection of very bad models and serves as a remedy to variance function misspecification, which often discourages practitioners from modeling heteroscedasticity altogether. The proposed model average estimator is applied to the analysis of two real data sets.

Keywords: heteroscedasticity-robust, model averaging, multiplicative heteroscedasticity, plug-in, squared prediction risk

Procedia PDF Downloads 336
146 Minimizing the Impact of Covariate Detection Limit in Logistic Regression

Authors: Shahadut Hossain, Jacek Wesolowski, Zahirul Hoque

Abstract:

In many epidemiological and environmental studies, covariate measurements are subject to a detection limit. In most applications, covariate measurements are truncated from below, which is known as left-truncation, because the measuring device fails to detect values falling below a certain threshold. In regression analyses, this causes inflated bias and inaccurate mean squared error (MSE) in the estimators. This paper suggests a response-based regression calibration method to correct the deleterious impact of the covariate detection limit on the estimators of the parameters of a simple logistic regression model. Compared to the maximum likelihood method, the proposed method is computationally simpler and hence easier to implement. It is robust to violation of the distributional assumption about the covariate of interest. The performance of the proposed method in producing correct inference, compared to other competing methods, has been investigated through extensive simulations. A real-life application of the method is also shown using data from a population-based case-control study of non-Hodgkin lymphoma.

Keywords: environmental exposure, detection limit, left truncation, bias, ad-hoc substitution

Procedia PDF Downloads 211
145 Bayesian Estimation under Different Loss Functions Using Gamma Prior for the Case of Exponential Distribution

Authors: Md. Rashidul Hasan, Atikur Rahman Baizid

Abstract:

The Bayesian estimation approach is a non-classical estimation technique in statistical inference and is very useful in real-world situations. The aim of this paper is to study the Bayes estimators of the parameter of the exponential distribution under different loss functions and to compare them with one another as well as with the classical maximum likelihood estimator (MLE). In real life, we always try to minimize loss, and we also want to gather some prior information (a distribution) about the problem in order to solve it accurately. Here, the gamma prior is used as the prior distribution for the parameter of the exponential distribution to find the Bayes estimator. In our study, we also used different symmetric and asymmetric loss functions, such as the squared error loss function, the quadratic loss function, the modified linear exponential (MLINEX) loss function and the non-linear exponential (NLINEX) loss function. Finally, the mean square errors (MSE) of the estimators are obtained and presented graphically.

Keywords: Bayes estimator, maximum likelihood estimator (MLE), modified linear exponential (MLINEX) loss function, Squared Error (SE) loss function, non-linear exponential (NLINEX) loss function
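
For the squared error loss case, the comparison described above can be sketched in a few lines. This assumes the exponential distribution is parameterized by its rate θ and the prior is Gamma(α, β) with shape α and rate β, so the posterior is Gamma(α + n, β + Σx) and the Bayes estimator is the posterior mean. The data and prior values are hypothetical; the MLINEX and NLINEX estimators are not reproduced here.

```python
def exponential_mle(data):
    """Maximum likelihood estimate of the exponential rate: n / sum(x)."""
    return len(data) / sum(data)


def bayes_rate_squared_error(data, alpha, beta):
    """Bayes estimate of the rate under squared error loss.

    With a Gamma(alpha, beta) prior (shape, rate), the posterior is
    Gamma(alpha + n, beta + sum(x)); its mean is the Bayes estimator.
    """
    return (alpha + len(data)) / (beta + sum(data))


# Hypothetical sample and prior hyperparameters.
sample = [0.5, 1.0, 1.5, 2.0]
mle = exponential_mle(sample)
bayes = bayes_rate_squared_error(sample, alpha=2.0, beta=1.0)
print(mle, bayes)
```

As the sample grows, the prior's contribution (α and β) is swamped by n and Σx, so the two estimates move together.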

Procedia PDF Downloads 355
144 Statistical Inferences for GQARCH-Itô-Jumps Model Based on the Realized Range Volatility

Authors: Fu Jinyu, Lin Jinguan

Abstract:

This paper introduces a novel approach that unifies two types of models, the continuous-time jump-diffusion used to model high-frequency data and the discrete-time GQARCH employed to model low-frequency financial data, by embedding the discrete GQARCH structure with jumps in the instantaneous volatility process. This model is named the “GQARCH-Itô-Jumps model.” We adopt the realized range-based threshold estimation for high-frequency financial data rather than the realized return-based volatility estimators, which entail the loss of intra-day information on the price movement. Meanwhile, a quasi-likelihood function for the low-frequency GQARCH structure with jumps is developed for parametric estimation. The asymptotic theory is established for the proposed estimators mainly in the case of finite-activity jumps. Moreover, simulation studies are implemented to check the finite-sample performance of the proposed methodology. We also demonstrate how the proposed approach can be used in practice on financial data.

Keywords: Itô process, GQARCH, leverage effects, threshold, realized range-based volatility estimator, quasi-maximum likelihood estimate
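
To show what a range-based volatility estimator uses that a return-based one does not, here is a sketch of the basic realized range with the Parkinson scaling 1/(4 ln 2). It is only the plain, jump-free version; the threshold adjustment for jumps described in the abstract is not reproduced, and the intraday highs and lows are hypothetical.

```python
import math


def realized_range_variance(highs, lows):
    """Sum of squared intraday log ranges, scaled by 1 / (4 ln 2).

    highs[i] and lows[i] are the highest and lowest prices observed in the
    i-th intraday interval.
    """
    scale = 1.0 / (4.0 * math.log(2.0))
    return scale * sum(math.log(h / l) ** 2 for h, l in zip(highs, lows))


# Hypothetical intraday intervals.
highs = [101.2, 101.9, 102.4, 101.8]
lows = [100.4, 101.0, 101.5, 100.9]
rrv = realized_range_variance(highs, lows)
print(rrv)
```

Each interval contributes its full high-low range, so price movement inside the interval is used even when the interval's open-to-close return is close to zero.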

Procedia PDF Downloads 126
143 A Bivariate Inverse Generalized Exponential Distribution and Its Applications in Dependent Competing Risks Model

Authors: Fatemah A. Alqallaf, Debasis Kundu

Abstract:

The aim of this paper is to introduce a bivariate inverse generalized exponential distribution which has a singular component. The proposed bivariate distribution can be used when the marginals have heavy-tailed distributions and non-monotone hazard functions. Due to the presence of the singular component, it can be used quite effectively when there are ties in the data. Since it has four parameters, it is a very flexible bivariate distribution, and it can be used quite effectively for analyzing various bivariate data sets. Several dependency properties and dependency measures have been obtained. The maximum likelihood estimators cannot be obtained in closed form, and computing them involves solving a four-dimensional optimization problem. To avoid that, we propose an EM algorithm that involves solving only one non-linear equation at each E-step. Hence, the implementation of the proposed EM algorithm is very straightforward in practice. Extensive simulation experiments and the analysis of one data set have been performed. We have observed that the proposed bivariate inverse generalized exponential distribution can be used for modeling dependent competing risks data. One data set has been analyzed to show the effectiveness of the proposed model.

Keywords: Block and Basu bivariate distributions, competing risks, EM algorithm, Marshall-Olkin bivariate exponential distribution, maximum likelihood estimators

Procedia PDF Downloads 110
142 On Estimating the Low Income Proportion with Several Auxiliary Variables

Authors: Juan F. Muñoz-Rosas, Rosa M. García-Fernández, Encarnación Álvarez-Verdejo, Pablo J. Moya-Fernández

Abstract:

Poverty measurement is a very important topic in many studies in the social sciences. One of the most important indicators when measuring poverty is the low income proportion, which gives the proportion of people in a population classified as poor. This indicator is generally unknown, and for this reason, it is estimated from survey data obtained through official surveys carried out by statistical agencies such as Eurostat. The main feature of these survey data is that they contain several variables. The variable used to estimate the low income proportion is called the variable of interest. The survey data may contain several additional variables, called auxiliary variables, that are related to the variable of interest; if so, they can be used to improve the estimation of the low income proportion. In this paper, we use Monte Carlo simulation studies to analyze numerically the performance of estimators based on several auxiliary variables. In this simulation study, we considered real data sets obtained from the 2011 European Union Survey on Income and Living Conditions. Results derived from this study indicate that the estimators based on auxiliary variables are more accurate than the naive estimator.

Keywords: inclusion probability, poverty, poverty line, survey sampling

Procedia PDF Downloads 423
141 Inference for Compound Truncated Poisson Lognormal Model with Application to Maximum Precipitation Data

Authors: M. Z. Raqab, Debasis Kundu, M. A. Meraou

Abstract:

In this paper, we analyze maximum precipitation data during a particular period of time obtained from different stations in the Global Historical Climatological Network of the USA. One important point is that some stations are shut down on certain days for one reason or another. Hence, the maximum values are recorded by excluding those readings. It is assumed that the number of stations that operate follows a zero-truncated Poisson distribution and that the daily precipitation follows a lognormal distribution. We call this model a compound truncated Poisson lognormal model. The proposed model has three unknown parameters, and it can take a variety of shapes. The maximum likelihood estimators can be obtained quite conveniently using the Expectation-Maximization (EM) algorithm. Approximate maximum likelihood estimators are also derived. The associated confidence intervals can also be obtained from the observed Fisher information matrix. Simulation experiments have been performed to check the performance of the EM algorithm, and it is observed that the EM algorithm works quite well in this case. When we analyze the precipitation data set using the proposed model, it is observed that the proposed model provides a better fit than some of the existing models.

Keywords: compound Poisson lognormal distribution, EM algorithm, maximum likelihood estimation, approximate maximum likelihood estimation, Fisher information, skew distribution
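
The distribution function implied by the model described above can be worked out in one step. The derivation below is a sketch under the assumption that the number of operating stations M is independent of the daily precipitation values X_1, ..., X_M, which are taken to be i.i.d.; the symbols λ, μ, σ are labels chosen here for the three parameters and may differ from the paper's notation.

```latex
\begin{align*}
P(M = m) &= \frac{\lambda^{m} e^{-\lambda}}{m!\,\bigl(1 - e^{-\lambda}\bigr)},
  \qquad m = 1, 2, \dots \\
F(x) &= \Phi\!\left(\frac{\ln x - \mu}{\sigma}\right), \qquad x > 0 \\
G(y) &= P\Bigl(\max_{1 \le i \le M} X_i \le y\Bigr)
      = \sum_{m=1}^{\infty} F(y)^{m}\, P(M = m)
      = \frac{e^{-\lambda}}{1 - e^{-\lambda}} \sum_{m=1}^{\infty} \frac{\bigl(\lambda F(y)\bigr)^{m}}{m!}
      = \frac{e^{\lambda F(y)} - 1}{e^{\lambda} - 1}
\end{align*}
```

The closed form follows because the sum over m is the exponential series with its m = 0 term removed.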

Procedia PDF Downloads 82
140 Kernel-Based Double Nearest Proportion Feature Extraction for Hyperspectral Image Classification

Authors: Hung-Sheng Lin, Cheng-Hsuan Li

Abstract:

Over the past few years, kernel-based algorithms have been widely used to extend linear feature extraction methods such as principal component analysis (PCA), linear discriminant analysis (LDA), and nonparametric weighted feature extraction (NWFE) to their nonlinear versions: kernel principal component analysis (KPCA), generalized discriminant analysis (GDA), and kernel nonparametric weighted feature extraction (KNWFE), respectively. These nonlinear feature extraction methods can detect nonlinear directions with the largest nonlinear variance or the largest class separability based on the given kernel function. Moreover, they have been applied to improve target detection and image classification for hyperspectral images. Double nearest proportion feature extraction (DNP) can effectively reduce the overlap effect and performs well in hyperspectral image classification. The DNP structure is an extension of the k-nearest neighbor technique. For each sample, there are two corresponding nearest proportions of samples: the self-class nearest proportion and the other-class nearest proportion. The term “nearest proportion” used here considers both the local information and other, more global information. With these settings, the effect of the overlap between the sample distributions can be reduced. Usually, the maximum likelihood estimator and the related unbiased estimator are not ideal estimators in high-dimensional inference problems, particularly in small-sample situations. Hence, an improved estimator based on shrinkage estimation (regularization) is proposed. The DNP structure includes LDA as a special case. In this paper, the kernel method is applied to extend DNP to kernel-based DNP (KDNP). In addition to retaining the advantages of DNP, KDNP surpasses DNP in the experimental results. According to the experiments on real hyperspectral image data sets, the classification performance of KDNP is better than that of PCA, LDA, NWFE, and their kernel versions, KPCA, GDA, and KNWFE.

Keywords: feature extraction, kernel method, double nearest proportion feature extraction, kernel double nearest feature extraction

Procedia PDF Downloads 299
139 Fathers' Knowledge and Attitude towards Breastfeeding: A Cross Sectional Study

Authors: Jacqueline R. Llamas, Agnes Regal

Abstract:

Objective: To determine the breastfeeding knowledge and attitudes of fathers seen at the University of Santo Tomas Hospital. Design: Cross-sectional design. Setting: University of Santo Tomas Hospital (USTH). Participants: 156 fathers who were accompanying their wives/children at the USTH. Findings: The Iowa Infant Feeding Attitude Scale showed fathers to be generally unbiased as to whether their child is fed breast milk or milk formula. About 85% agreed that breast milk is the ideal food for babies, 79% believed that breastfed babies are healthier than formula-fed babies, and 55% did not believe that breast milk lacks iron. About 80% agreed that it is easily digested, 87% were aware of its economic value and 57% agreed that it is convenient. Breastfeeding support was noted: 55% of the fathers would encourage mothers to breastfeed so as not to miss the joys of motherhood, and 91% believed that breastfeeding increases mother-infant bonding. About 57% do not feel left out when the mothers breastfeed. However, 46.6% support the decision of their wives to switch to formula feeding once they go back to work, only 42% find breastfeeding in public acceptable, and 57% would not allow mothers who drink alcohol to breastfeed. Conclusion: Although the fathers’ attitude is unbiased between breastfeeding and formula feeding, the majority of the fathers appreciate breastfeeding and its benefits. The father’s level of education, age, profession, household income and number of children also had an effect on their attitude towards breastfeeding.

Keywords: father, breastfeeding, breast milk, knowledge

Procedia PDF Downloads 391
138 The Reproducibility and Repeatability of Modified Likelihood Ratio for Forensics Handwriting Examination

Authors: O. Abiodun Adeyinka, B. Adeyemo Adesesan

Abstract:

The forensic use of handwriting depends on the analysis, comparison, and evaluation decisions made by forensic document examiners. When using biometric technology in forensic applications, it is necessary to compute the Likelihood Ratio (LR) to quantify the strength of evidence under two competing hypotheses, namely the prosecution and the defense hypotheses, for which a set of assumptions and methods is adopted for a given data set. It is therefore important to know how repeatable and reproducible the estimated LR is. This paper evaluates the accuracy and reproducibility of examiners' decisions. Confidence intervals for the estimated LR are presented, so that an incorrect estimate is not used to deliver a wrong judgment in a court of law. The LR is fundamentally a Bayesian concept, and we used two LR estimators in this paper, namely Logistic Regression (LoR) and the Kernel Density Estimator (KDE). The repeatability evaluation was carried out by repeating the initial experiment after an interval of six months to observe whether examiners would repeat their decisions for the estimated LR. The experimental results, which are based on a handwriting dataset, show that the LR has different confidence intervals, which implies that the LR cannot be estimated with the same certainty everywhere. Though LoR performed better than KDE when tested on the same dataset, the two LR estimators investigated showed a consistent region in which the LR value can be estimated confidently. These two findings advance our understanding of the LR when used to compute the strength of handwriting evidence in forensics.

Keywords: confidence interval, handwriting, kernel density estimator, KDE, logistic regression LoR, repeatability, reproducibility
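
A score-based likelihood ratio with a kernel density estimator, one of the two estimators named above, can be sketched as the ratio of two estimated densities evaluated at the observed comparison score. The Gaussian kernel, the fixed bandwidth, and the similarity scores below are all assumptions made for illustration; the paper's actual features and settings are not given in the abstract.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def likelihood_ratio(score, same_writer_scores, different_writer_scores, bandwidth=0.1):
    """LR = density of the score under the prosecution hypothesis (same writer)
    divided by its density under the defense hypothesis (different writers)."""
    numerator = gaussian_kde(same_writer_scores, bandwidth)(score)
    denominator = gaussian_kde(different_writer_scores, bandwidth)(score)
    return numerator / denominator


# Hypothetical similarity scores from known same-writer and
# different-writer comparisons.
same_writer = [0.80, 0.85, 0.90, 0.75]
different_writer = [0.20, 0.30, 0.25, 0.40]

lr_high = likelihood_ratio(0.8, same_writer, different_writer)
lr_low = likelihood_ratio(0.3, same_writer, different_writer)
print(lr_high, lr_low)
```

In regions where either density is estimated from very few nearby scores, the ratio becomes unstable, which is consistent with the abstract's remark that the LR cannot be estimated with the same certainty everywhere.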

Procedia PDF Downloads 95
137 Quasi-Photon Monte Carlo on Radiative Heat Transfer: An Importance Sampling and Learning Approach

Authors: Utkarsh A. Mishra, Ankit Bansal

Abstract:

At high temperature, radiative heat transfer is the dominant mode of heat transfer. It is governed by various phenomena such as photon emission, absorption, and scattering. Solving the governing integrodifferential equation of radiative transfer is a complex process, more so when the effects of the participating medium and wavelength properties are taken into consideration. Although a generic formulation of such radiative transport problems can be modeled for a wide variety of problems with non-gray, non-diffusive surfaces, there is always a trade-off between simplicity and accuracy. Recently, solutions of complicated mathematical problems with statistical methods based on randomization of naturally occurring phenomena have gained significant importance. Photon bundles with discrete energy can be replicated with random numbers describing the emission, absorption, and scattering processes. Photon Monte Carlo (PMC) is a simple yet powerful technique for solving radiative transfer problems in complicated geometries with an arbitrary participating medium. The method, on the one hand, increases the accuracy of estimation, and on the other hand, increases the computational cost. The participating media, generally gases such as CO₂, CO, and H₂O, present complex emission and absorption spectra. Modeling the emission/absorption accurately with random numbers requires weighted sampling, as different sections of the spectrum carry different importance. Importance sampling (IS) was implemented to sample random photons of arbitrary wavelength, and the sampled data provided unbiased training of MC estimators for better results. A better replacement for uniform random numbers is deterministic, quasi-random sequences. Halton, Sobol, and Faure low-discrepancy sequences are used in this study. They possess better space-filling performance than the uniform random number generator and give rise to low-variance, stable Quasi-Monte Carlo (QMC) estimators with faster convergence. An optimal supervised learning scheme was further considered to reduce the computation costs of the PMC simulation. A one-dimensional plane-parallel slab problem with participating media was formulated. The history of some randomly sampled photon bundles is recorded to train an Artificial Neural Network (ANN) back-propagation model. The flux was calculated using the standard quasi PMC and was considered to be the training target. Results obtained with the proposed model for the one-dimensional problem are compared with the exact analytical and PMC model with the Line by Line (LBL) spectral model. The approximate variance obtained was around 3.14%. Results were analyzed with respect to time and the total flux in both cases. A significant reduction in variance as well as a faster rate of convergence was observed in the case of the QMC method over the standard PMC method. However, the results obtained with the ANN method showed greater variance (around 25-28%) compared to the other cases. There is great scope for machine learning models to help further reduce the computation cost once trained successfully. Multiple ways of selecting the input data as well as various architectures will be tried such that the concerned environment can be fully addressed to the ANN model. Better results can be achieved in this unexplored domain.
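The idea of replacing uniform random numbers with a low-discrepancy sequence can be illustrated on a toy problem. The sketch below is not the authors' PMC code: it assumes a simple smooth integrand on the unit square (a hypothetical stand-in for a radiative quantity) and compares a pseudo-random estimate with one built on a two-dimensional Halton sequence. All function names are illustrative.

```python
import math
import random


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse of a non-negative integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points: int, bases=(2, 3)):
    """First n_points of the Halton sequence, using indices 1..n_points."""
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, n_points + 1)]


def integrand(x: float, y: float) -> float:
    """Toy integrand (an assumption); its exact integral is (1 - e^-1)^2."""
    return math.exp(-(x + y))


def estimate(points) -> float:
    """Plain sample-mean estimator of the integral over the unit square."""
    return sum(integrand(x, y) for x, y in points) / len(points)


if __name__ == "__main__":
    n = 4096
    exact = (1.0 - math.exp(-1.0)) ** 2
    rng = random.Random(0)
    mc_points = [(rng.random(), rng.random()) for _ in range(n)]
    print("pseudo-random error:", abs(estimate(mc_points) - exact))
    print("Halton error:       ", abs(estimate(halton(n)) - exact))
```

For smooth integrands in low dimension, the Halton-based estimate typically shows a smaller error at the same sample count, which is the behaviour the abstract attributes to QMC estimators.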

Keywords: radiative heat transfer, Monte Carlo Method, pseudo-random numbers, low discrepancy sequences, artificial neural networks

Procedia PDF Downloads 186
136 Finite-Sum Optimization: Adaptivity to Smoothness and Loopless Variance Reduction

Authors: Bastien Batardière, Joon Kwon

Abstract:

For finite-sum optimization, variance-reduced gradient methods (VR) compute at each iteration the gradient of a single function (or of a mini-batch), and yet achieve faster convergence than SGD thanks to a carefully crafted lower-variance stochastic gradient estimator that reuses past gradients. Another important line of research of the past decade in continuous optimization is adaptive algorithms such as AdaGrad, which dynamically adjust the (possibly coordinate-wise) learning rate to past gradients and thereby adapt to the geometry of the objective function. Variants such as RMSprop and Adam demonstrate outstanding practical performance that has contributed to the success of deep learning. In this work, we present AdaLVR, which combines the AdaGrad algorithm with loopless variance-reduced gradient estimators such as SAGA or L-SVRG, and benefits from a straightforward construction and a streamlined analysis. We show that AdaLVR inherits both the good convergence properties of VR methods and the adaptive nature of AdaGrad: in the case of L-smooth convex functions we establish a gradient complexity of O(n + (L + √(nL))/ε) without prior knowledge of L. Numerical experiments demonstrate the superiority of AdaLVR over state-of-the-art methods. Moreover, we empirically show that the RMSprop and Adam algorithms combined with variance-reduced gradient estimators achieve even faster convergence.
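As a rough illustration of a loopless variance-reduced estimator, the sketch below runs plain L-SVRG with a fixed step size on a scalar least-squares finite sum. It is not AdaLVR: the AdaGrad step-size rule from the paper is deliberately left out, and the objective, data, and parameter names are assumptions chosen to keep the example small.

```python
import random


def grad_i(x, a_i, b_i):
    """Gradient of the component f_i(x) = 0.5 * (a_i * x - b_i)^2, scalar x."""
    return a_i * (a_i * x - b_i)


def full_grad(x, a, b):
    """Gradient of the finite-sum objective f(x) = (1/n) * sum_i f_i(x)."""
    return sum(grad_i(x, ai, bi) for ai, bi in zip(a, b)) / len(a)


def l_svrg(a, b, step, p, n_iters, seed=0):
    """Loopless SVRG with a fixed step size (no adaptive step, unlike AdaLVR)."""
    rng = random.Random(seed)
    n = len(a)
    x = 0.0                    # current iterate
    w = x                      # reference point
    mu = full_grad(w, a, b)    # full gradient stored at the reference point
    for _ in range(n_iters):
        i = rng.randrange(n)
        # Unbiased estimator of the full gradient at x; its variance shrinks
        # as x and w both approach the minimizer.
        g = grad_i(x, a[i], b[i]) - grad_i(w, a[i], b[i]) + mu
        x_next = x - step * g
        # Coin flip replaces the outer loop of classical SVRG: with
        # probability p the reference point moves to the current iterate.
        if rng.random() < p:
            w = x
            mu = full_grad(w, a, b)
        x = x_next
    return x


if __name__ == "__main__":
    a = [0.5 + (i % 10) / 10.0 for i in range(50)]
    b = [2.0 * ai + ((-1) ** i) * 0.3 for i, ai in enumerate(a)]
    smoothness = max(ai * ai for ai in a)  # smoothness constant of the f_i
    x_hat = l_svrg(a, b, step=1.0 / (6.0 * smoothness), p=1.0 / len(a), n_iters=5000)
    x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
    print(x_hat, x_star)
```

Note that the fixed step here is set from the smoothness constant, which is exactly the prior knowledge the abstract says AdaLVR does not require.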

Keywords: convex optimization, variance reduction, adaptive algorithms, loopless

Procedia PDF Downloads 26
135 Estimation of a Finite Population Mean under Random Non Response Using Improved Nadaraya and Watson Kernel Weights

Authors: Nelson Bii, Christopher Ouma, John Odhiambo

Abstract:

Non-response is a potential source of errors in sample surveys. It introduces bias and large variance in the estimation of finite population parameters. Regression models have been recognized as one of the techniques for reducing bias and variance due to random non-response using auxiliary data. In this study, it is assumed that random non-response occurs in the survey variable in the second stage of cluster sampling, and that full auxiliary information is available throughout. Auxiliary information is used at the estimation stage via a regression model to address the problem of random non-response. In particular, the auxiliary information is used via an improved Nadaraya-Watson kernel regression technique to compensate for random non-response. The asymptotic bias and mean squared error of the proposed estimator are derived. In addition, a simulation study indicates that the proposed estimator has smaller bias and smaller mean squared error than existing estimators of the finite population mean. The proposed estimator is also shown to have shorter confidence interval lengths at a 95% coverage rate. The results obtained in this study are useful, for instance, in choosing efficient estimators of the finite population mean in demographic sample surveys.
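The role of kernel regression in compensating for non-response can be sketched as follows. This is the standard Nadaraya-Watson estimator with a Gaussian kernel, not the improved weights proposed in the paper, and it ignores the two-stage cluster design: missing survey values are simply imputed from the auxiliary variable before averaging. The function names and the use of `None` to mark non-respondents are assumptions.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys, with weights centred on x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


def mean_with_imputation(x_all, y_observed, bandwidth):
    """Estimate the mean of y, imputing non-respondents (None) from x."""
    xs = [x for x, y in zip(x_all, y_observed) if y is not None]
    ys = [y for y in y_observed if y is not None]
    filled = [
        y if y is not None else nadaraya_watson(x, xs, ys, bandwidth)
        for x, y in zip(x_all, y_observed)
    ]
    return sum(filled) / len(filled)


if __name__ == "__main__":
    x_all = [0.0, 0.5, 1.0]
    y_observed = [0.0, None, 2.0]  # the middle unit did not respond
    print(mean_with_imputation(x_all, y_observed, bandwidth=1.0))
```

The bandwidth controls how local the imputation is; in practice it would be chosen by a data-driven rule rather than fixed by hand.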

Keywords: mean squared error, random non-response, two-stage cluster sampling, confidence interval lengths

Procedia PDF Downloads 107
134 Stereoselective Glycosylation and Functionalization of Unbiased Site of Sweet System via Dual-Catalytic Transition Metal Systems/Wittig Reaction

Authors: Mukul R. Gupta, Rajkumar Gandhi, Rajitha Sachan, Naveen K. Khare

Abstract:

The field of glycoscience has burgeoned in the last several decades, leading to the identification of many glycosides which could serve critical roles in a wide range of biological processes. This has prompted a resurgence in synthetic interest, with a particular focus on new approaches to construct the glycosidic bond selectively. Despite the numerous elegant strategies and methods developed for the formation of glycosidic bonds, stereoselective construction of glycosides remains challenging. Here, we report novel hexafluoroisopropanol (HFIP)-catalyzed stereoselective glycosylation methods that use a KDN imidate glycosyl donor and a variety of alcohols and proceed in excellent yield. This method is broadly applicable to a wide range of substrates and gives excellent glycoside selectivity. We also report the functionalization of the unbiased side of the newly formed glycosides by dual-catalytic transition metal systems (Ru- or Fe-). We use the innovative Reverse & Catalyst strategy, i.e., a reversible activation reaction by one catalyst combined with a functionalization reaction by another catalyst, which together enable functionalization of substrates at their inherently unreactive sites. We are also targeting the synthesis of diSia derivatives by the Wittig reaction. This synthetic method works under mild conditions, benefits from the functional group tolerance of the dual-catalytic systems, and highlights the potential of the multicatalytic approach to address challenging transformations and avoid multistep procedures in carbohydrate synthesis.

Keywords: KDN, stereoselective glycosylation, dual-catalytic functionalization, Wittig reaction

Procedia PDF Downloads 163
133 An Unbiased Profiling of Immune Repertoire via Sequencing and Analyzing T-Cell Receptor Genes

Authors: Yi-Lin Chen, Sheng-Jou Hung, Tsunglin Liu

Abstract:

The adaptive immune system recognizes a wide range of antigens by expressing a large number of structurally distinct T cell and B cell receptor genes. The distinct receptor genes arise from complex rearrangements called V(D)J recombination and constitute the immune repertoire. A common method of profiling the immune repertoire is to amplify recombined receptor genes using multiple primers and high-throughput sequencing. This multiplex-PCR approach is efficient; however, the resulting repertoire can be distorted because of primer bias. To eliminate primer bias, 5' RACE is an alternative amplification approach. However, the application of the RACE approach is limited by its low efficiency (i.e., the majority of data are non-regular receptor sequences, e.g., containing intronic segments) and the lack of a convenient tool for analysis. We propose a computational tool that can correctly identify non-regular receptor sequences in RACE data by aligning receptor sequences against the whole gene instead of only the exon regions, as is done in all other tools. Using our tool, the remaining regular data allow for accurate profiling of the immune repertoire. In addition, a RACE approach is improved to yield a higher fraction of regular T-cell receptor sequences. Finally, we quantify the degree of primer bias of a multiplex-PCR approach by comparing it to the RACE approach. The results reveal significant differences in the frequency of VJ combinations between the two approaches. Together, we provide a new experimental and computational pipeline for unbiased profiling of the immune repertoire. As immune repertoire profiling has many applications, e.g., tracing bacterial and viral infection, detection of T cell lymphoma and minimal residual disease, and monitoring cancer immunotherapy, our work should benefit scientists who are interested in these applications.

Keywords: immune repertoire, T-cell receptor, 5' RACE, high-throughput sequencing, sequence alignment

Procedia PDF Downloads 161
132 Analysis of Filtering in Stochastic Systems on Continuous-Time Memory Observations in the Presence of Anomalous Noises

Authors: S. Rozhkova, O. Rozhkova, A. Harlova, V. Lasukov

Abstract:

For the mean-square optimal unbiased filter, in the case of anomalous noises acting in the observation channel with memory, we prove that the filter is insensitive to inaccurate knowledge of the anomalous noise intensity matrix and that it is equivalent to a truncated filter constructed only from the non-anomalous components of the observation vector.

Keywords: mathematical expectation, filtration, anomalous noise, memory

Procedia PDF Downloads 334
131 An Estimating Equation for Survival Data with a Possibly Time-Varying Covariates under a Semiparametric Transformation Models

Authors: Yemane Hailu Fissuh, Zhongzhan Zhang

Abstract:

An estimating equation technique is an alternative to the widely used maximum likelihood methods, and it eases some of the complexity caused by time-varying covariates. In situations where both time-varying covariates and left-truncation are considered in the model, maximum likelihood estimation procedures become much more burdensome and complex. To ease this complexity, this study proposes modified estimating equations, which have received considerable attention from many researchers, under a semiparametric transformation model. The purpose of this article is to develop the modified estimating equation under a flexible and general class of semiparametric transformation models for left-truncated and right-censored survival data with time-varying covariates. Besides the commonly applied Cox proportional hazards model, such problems can also be analyzed with a general class of semiparametric transformation models to estimate the effect of treatment on the survival time given possibly time-varying covariates. The consistency and asymptotic properties of the estimators were intuitively derived via the expectation-maximization (EM) algorithm. The finite-sample performance of the estimators for the proposed model was illustrated via simulation studies and the Stanford heart transplant real data example. To sum up the study, the bias for covariates was adjusted by estimating the density function of the truncation time variable. Then the effect of possibly time-varying covariates was evaluated in some special semiparametric transformation models.

Keywords: EM algorithm, estimating equation, semiparametric transformation models, time-to-event outcomes, time varying covariate

Procedia PDF Downloads 125
130 Asymptotic Spectral Theory for Nonlinear Random Fields

Authors: Karima Kimouche

Abstract:

In this paper, we consider the asymptotic problems in spectral analysis of stationary causal random fields. We impose conditions only involving (conditional) moments, which are easily verifiable for a variety of nonlinear random fields. Limiting distributions of periodograms and smoothed periodogram spectral density estimates are obtained and applications to the spectral domain bootstrap are given.

Keywords: spatial nonlinear processes, spectral estimators, GMC condition, bootstrap method

Procedia PDF Downloads 423
129 Transition From Economic Growth-Energy Use to Green Growth-Green Energy Towards Environmental Quality: Evidence from Africa Using Econometric Approaches

Authors: Jackson Niyongabo

Abstract:

This study addresses a notable gap in the existing literature on the relationship between energy consumption, economic growth, and CO₂ emissions, particularly within the African context. While numerous studies have explored these dynamics globally and regionally across various development levels, few have delved into the nuances of regions and income levels specific to African countries. Furthermore, the evaluation of the interplay between green growth policies, green energy technologies, and their impact on environmental quality has been underexplored. This research aims to fill these gaps by conducting a comprehensive analysis of the transition from conventional economic growth and energy consumption to a paradigm of green growth coupled with green energy utilization across the African continent from 1980 to 2018. The study is structured into three main parts: an empirical examination of the long-term effects of energy intensity, renewable energy consumption, and economic growth on CO₂ emissions across diverse African regions and income levels; an estimation of the long-term impact of green growth and green energy use on CO₂ emissions for countries implementing green policies within Africa, as well as at regional and global levels; and a comparative analysis of the impact of green growth policies on environmental degradation before and after implementation. Employing advanced econometric methods and panel estimators, the study utilizes a testing framework, panel unit tests, and various estimators to derive meaningful insights. The anticipated results and conclusions will be elucidated through causality tests, impulse response, and variance decomposition analyses, contributing valuable knowledge to the discourse on sustainable development in the African context.

Keywords: economic growth, green growth, energy consumption, CO₂ emissions, econometric models, green energy

Procedia PDF Downloads 30
128 Estimating Estimators: An Empirical Comparison of Non-Invasive Analysis Methods

Authors: Yan Torres, Fernanda Simoes, Francisco Petrucci-Fonseca, Freddie-Jeanne Richard

Abstract:

Non-invasive samples are an alternative to collecting genetic samples directly. They are collected without manipulating the animal (e.g., scats, feathers, and hairs). Nevertheless, the use of non-invasive samples has some limitations. The main issue is degraded DNA, which leads to poorer extraction efficiency and genotyping errors. Those errors delayed the widespread use of non-invasive genetic information for some years. Genotyping errors can be limited by using analysis methods that accommodate the errors and singularities of non-invasive samples. Genotype matching and population estimation algorithms can be highlighted as important analysis tools that have been adapted to deal with those errors. Despite this recent development of analysis methods, there is still a lack of empirical comparison of their performance. A comparison of methods on datasets that differ in size and structure can be useful for future studies, since non-invasive samples are a powerful tool for obtaining information, especially for endangered and rare populations. To compare the analysis methods, four different datasets obtained from the Dryad digital repository were used. Three different matching algorithms (Cervus, Colony, and Error Tolerant Likelihood Matching - ETLM) were used for matching genotypes and two different ones for population estimation (Capwire and BayesN). The three matching algorithms showed different patterns of results. ETLM produced a smaller number of unique individuals and recaptures. A similarity in the matched genotypes between Colony and Cervus was observed. That is not a surprise, given the similarity between those methods in their pairwise likelihood and clustering algorithms. The ETLM matches showed almost no similarity with the genotypes matched by the other methods. The different clustering algorithm and error model of ETLM seem to lead to a more rigorous selection, although the processing time and interface friendliness of ETLM were the worst among the compared methods. The population estimators performed differently across the datasets. There was a consensus between the different estimators for only one dataset. BayesN showed both higher and lower estimates when compared with Capwire. Unlike Capwire, BayesN does not consider the total number of recaptures, only the recapture events. This makes the estimator sensitive to data heterogeneity. Heterogeneity in this sense means different capture rates between individuals. In those examples, homogeneity seems to be crucial for BayesN to work properly. Both methods are user-friendly and have reasonable processing times. An extended analysis with simulated genotype data could clarify the sensitivity of the algorithms. The present comparison of the matching methods indicates that Colony seems to be more appropriate for general use considering a time/interface/robustness balance. The heterogeneity of the recaptures strongly affected the BayesN estimates, leading to over- and underestimation of population numbers. Capwire is therefore advisable for general use since it performs better in a wide range of situations.

Keywords: algorithms, genetics, matching, population

Procedia PDF Downloads 114
127 Quantitative Ranking Evaluation of Wine Quality

Authors: A. Brunel, A. Kernevez, F. Leclere, J. Trenteseaux

Abstract:

Today, wine quality is evaluated only by wine experts, each with their own personal tastes, even if they may agree on some common features. Producers therefore have no unbiased way to independently assess the quality of their products. A tool is proposed here to evaluate wine quality by an objective ranking based upon the variables entering wine elaboration, analysed through the principal component analysis (PCA) method. Actual climatic data are compared by measuring the relative distance between each considered wine, from which the general ranking is derived.

Keywords: wine, grape, weather conditions, rating, climate, principal component analysis, metric analysis

Procedia PDF Downloads 284