Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 862

Search results for: rank estimates

862 Robust Variogram Fitting Using Non-Linear Rank-Based Estimators

Authors: Hazem M. Al-Mofleh, John E. Daniels, Joseph W. McKean

Abstract:

In this paper numerous robust fitting procedures are considered in estimating spatial variograms. In spatial statistics, the conventional variogram fitting procedure (non-linear weighted least squares) suffers from the same outlier problem that has plagued this method from its inception. Even a 3-parameter model, like the variogram, can be adversely affected by a single outlier. This paper uses the Hogg-Type adaptive procedures to select an optimal score function for a rank-based estimator for these non-linear models. Numeric examples and simulation studies will demonstrate the robustness, utility, efficiency, and validity of these estimates.

Keywords: asymptotic relative efficiency, non-linear rank-based, rank estimates, variogram

Procedia PDF Downloads 285
861 Some Results on the Generalized Higher Rank Numerical Ranges

Authors: Mohsen Zahraei

Abstract:

In this paper, the notion of the rank-k numerical range of rectangular complex matrix polynomials is introduced. Some algebraic and geometrical properties are investigated. Moreover, for ε>0, the notion of Birkhoff-James approximate orthogonality sets for ε-higher rank numerical ranges of rectangular matrix polynomials is also introduced and studied. The proposed definitions yield a natural generalization of the standard higher rank numerical ranges.

Keywords: Rank-k numerical range, isometry, numerical range, rectangular matrix polynomials

Procedia PDF Downloads 326
860 Choosing between the Regression Correlation, the Rank Correlation, and the Correlation Curve

Authors: Roger L. Goodwin

Abstract:

This paper presents a rank correlation curve. The traditional correlation coefficient is valid for both continuous variables and for integer variables using rank statistics. Since the correlation coefficient has already been established in rank statistics by Spearman, such a calculation can be extended to the correlation curve. This paper presents two survey questions. The survey collected non-continuous variables. We will show weak to moderate correlation. Obviously, one question has a negative effect on the other. A review of the qualitative literature can answer which question and why. The rank correlation curve shows which collection of responses has a positive slope and which collection of responses has a negative slope. Such information is unavailable from the flat, "first-glance" correlation statistics.
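As a rough illustration of the Spearman rank correlation mentioned above, the following sketch ranks two ordinal survey items (with average ranks for ties) and then applies the ordinary correlation formula to the ranks. The response vectors `q1` and `q2` are invented for illustration and are not taken from the paper, and the correlation curve itself is not reproduced here.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Ordinary product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical Likert-type responses to two survey questions.
q1 = [1, 2, 2, 4, 5, 3]
q2 = [5, 4, 4, 2, 1, 3]
print(spearman(q1, q2))
```

Working on ranks rather than raw scores is what makes the coefficient usable for non-continuous responses such as these.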

Keywords: Bayesian estimation, regression model, rank statistics, correlation, correlation curve

Procedia PDF Downloads 326
859 The Effect of Non-Normality on CB-SEM and PLS-SEM Path Estimates

Authors: Z. Jannoo, B. W. Yap, N. Auchoybur, M. A. Lazim

Abstract:

The two common approaches to Structural Equation Modeling (SEM) are the Covariance-Based SEM (CB-SEM) and Partial Least Squares SEM (PLS-SEM). There is much debate on the performance of CB-SEM and PLS-SEM for small sample size and when distributions are non-normal. This study evaluates the performance of CB-SEM and PLS-SEM under normality and non-normality conditions via a simulation. Monte Carlo Simulation in R programming language was employed to generate data based on the theoretical model with one endogenous and four exogenous variables. Each latent variable has three indicators. For normal distributions, CB-SEM estimates were found to be inaccurate for small sample size while PLS-SEM could produce the path estimates. Meanwhile, for a larger sample size, CB-SEM estimates have lower variability compared to PLS-SEM. Under non-normality, CB-SEM path estimates were inaccurate for small sample size. However, CB-SEM estimates are more accurate than those of PLS-SEM for sample size of 50 and above. The PLS-SEM estimates are not accurate unless sample size is very large.

Keywords: CB-SEM, Monte Carlo simulation, normality conditions, non-normality, PLS-SEM

Procedia PDF Downloads 255
858 Developing HRCT Criterion to Predict the Risk of Pulmonary Tuberculosis

Authors: Vandna Raghuvanshi, Vikrant Thakur, Anupam Jhobta

Abstract:

Objective: To design HRCT criteria to predict the risk of pulmonary tuberculosis (PTB). Material and methods: This was a prospective study of 69 patients with clinical suspicion of pulmonary tuberculosis. We studied their medical characteristics, several individual HRCT findings, and combinations of HRCT findings to predict the risk of PTB using univariate and multivariate analysis. Provisional HRCT diagnostic criteria were designed on the basis of these results to assess the risk of PTB, and these criteria were tested on our patients. Results: The HRCT chest findings were analyzed, and a rank from 1 to 4 was assigned according to the findings. Sensitivity, specificity, positive predictive value, and negative predictive value were calculated.
Rank 1: Highly suspected PTB.
Rank 2: Probable PTB.
Rank 3: Nonspecific or difficult to differentiate from other diseases.
Rank 4: Other suspected diseases.
• Rank 1 (Highly suspected TB) was present in 22 (31.9%) patients, all of whom were finally diagnosed with pulmonary tuberculosis. The sensitivity, specificity, and negative likelihood ratio for Rank 1 on HRCT chest were 53.6%, 100%, and 0.43, respectively.
• Rank 2 (Probable TB) was present in 13 patients, of whom 12 were tubercular and 1 was non-tubercular.
• The sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio of the combination of Rank 1 and Rank 2 were 82.9%, 96.4%, 23.22, and 0.18, respectively.
• Rank 3 (Non-specific TB) was present in 25 patients, of whom 7 were tubercular and 18 were non-tubercular.
• When all 3 ranks were considered together, the sensitivity approached 100%; however, the specificity fell to 35.7%. The positive likelihood ratio and negative likelihood ratio were 1.56 and 0, respectively.
• Rank 4 (Other specific findings) was given to 9 patients, all of whom were non-tubercular.
Conclusion: HRCT is useful in selecting individuals with a greater chance of pulmonary tuberculosis.
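The accuracy measures quoted for the combination of Rank 1 and Rank 2 can be checked with a short sketch. It assumes the final diagnosis is the reference standard and that "Rank 1 or Rank 2" counts as a positive test; the 2×2 counts below are derived from the patient numbers in the abstract (34 tubercular and 1 non-tubercular in Ranks 1 and 2; 7 tubercular and 27 non-tubercular in Ranks 3 and 4). The function name `diagnostic_metrics` is illustrative only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and likelihood ratios from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # LR+ is unbounded when there are no false positives.
    lr_positive = float("inf") if fp == 0 else sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }

# Rank 1 + Rank 2 treated as "test positive" (counts derived from the abstract).
m = diagnostic_metrics(tp=34, fp=1, fn=7, tn=27)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Rounded to the precision used in the abstract, these counts reproduce the reported 82.9%, 96.4%, 23.22, and 0.18.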

Keywords: pulmonary, tuberculosis, multivariate, HRCT

Procedia PDF Downloads 41
857 Critical Accounting Estimates and Transparency in Financial Reporting: An Observation Financial Reporting under US GAAP

Authors: Ahmed Shaik

Abstract:

Estimates are very critical in accounting and Financial Reporting cannot be complete without these estimates. There is a long list of accounting estimates that are required to be made to compute Net Income and to determine the value of assets and liabilities. To name a few, valuation of inventory, depreciation, valuation of goodwill, provision for bad debts and estimated warranties, etc. require the use of different valuation models and forecasts. Different business entities under the same industry may use different approaches to measure the value of financial items being reported in Income Statement and Balance Sheet. The disclosure notes do not provide enough details of the approach used by a business entity to arrive at the value of a financial item. Lack of details in the disclosure notes makes it difficult to compare the financial performance of one business entity with the other in the same industry. This paper is an attempt to identify the lack of enough information about accounting estimates in disclosure notes, the impact of the absence of details of accounting estimates on the comparability of financial data and financial analysis. An attempt is made to suggest the detailed disclosure while taking care of the cost and benefit of making such disclosure.

Keywords: accounting estimates, disclosure notes, financial reporting, transparency

Procedia PDF Downloads 88
856 Rank of Semigroup: Generating Sets and Cases Revealing Limitations of the Concept of Independence

Authors: Zsolt Lipcsey, Sampson Marshal Imeh

Abstract:

We investigate a characterisation of the rank of a semigroup given by Howie and Ribeiro (1999) to ascertain the relevance of the concept of independence. There are cases where the concept of independence fails to be useful for this purpose. One would expect the basis to be the maximal independent subset of a given semigroup. However, we construct examples of semigroups where a finite basis exists and the basis is larger than the number of independent elements.

Keywords: generating sets, independent set, rank, cyclic semigroup, basis, commutative

Procedia PDF Downloads 61
855 Trend Detection Using Community Rank and Hawkes Process

Authors: Shashank Bhatnagar, W. Wilfred Godfrey

Abstract:

In this paper, we develop an approach to finding trending topics that considers not only the user-topic interaction but also the community to which the user belongs. This method extends the previous user-topic interaction approach to a user-community-topic interaction, with a speed-up in the range [1.1-3]. We assume that trend detection in a social network depends on two things. The first is the broadcast of messages in the social network, governed by a self-exciting point process called the Hawkes process; the second is the community rank. The influencer node links to others in the community and determines the community rank based on its PageRank and the number of users linked to that community. The community rank determines the influence of one community over another. Hence, the Hawkes process with the user-community-topic kernel determines the trending topic disseminated in the social network.
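To make the "self-exciting" behaviour concrete, here is a minimal sketch of a Hawkes conditional intensity. It assumes the textbook exponential kernel with a baseline rate `mu`, jump size `alpha`, and decay `beta`; the user-community-topic kernel proposed in the paper is not specified in the abstract and is not reproduced here.

```python
from math import exp

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity mu + sum over past events of alpha * exp(-beta * (t - t_i)).

    The exponential kernel is an assumption made for illustration only.
    """
    return mu + sum(alpha * exp(-beta * (t - ti)) for ti in event_times if ti < t)

# Hypothetical message timestamps (arbitrary time units).
posts = [0.0, 0.4, 0.5]
print(hawkes_intensity(0.6, posts, mu=0.5, alpha=0.8, beta=1.2))
print(hawkes_intensity(5.0, posts, mu=0.5, alpha=0.8, beta=1.2))
```

Each past message raises the rate of future messages for a while, so the intensity is high shortly after a burst of posts and decays back towards `mu` afterwards.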

Keywords: community detection, community rank, Hawkes process, influencer node, pagerank, trend detection

Procedia PDF Downloads 270
854 Developing a Translator Career Path: Based on the Dreyfus Model of Skills Acquisition

Authors: Noha A. Alowedi

Abstract:

This paper proposes a Translator Career Path (TCP) which is based on the Dreyfus Model of Skills Acquisition as the conceptual framework. In this qualitative study, the methodology to collect and analyze the data takes an inductive approach that draws upon the literature to form the criteria for the different steps in the TCP. This path is based on descriptors of expert translator performance and best employees’ practice documented in the literature. Each translator skill will be graded as novice, advanced beginner, competent, proficient, and expert. Consequently, five levels of translator performance are identified in the TCP as five ranks. The first rank is the intern translator, which is equivalent to the novice level; the second rank is the assistant translator, which is equivalent to the advanced beginner level; the third rank is the associate translator, which is equivalent to the competent level; the fourth rank is the translator, which is equivalent to the proficient level; finally, the fifth rank is the expert translator, which is equivalent to the expert level. The main function of this career path is to guide the processes of translator development in translation organizations. Although it is designed primarily for the need of in-house translators’ supervisors, the TCP can be used in academic settings for translation trainers and teachers.

Keywords: Dreyfus model, translation organization, translator career path, translator development, translator evaluation, translator promotion

Procedia PDF Downloads 249
853 Sparse Unmixing of Hyperspectral Data by Exploiting Joint-Sparsity and Rank-Deficiency

Authors: Fanqiang Kong, Chending Bian

Abstract:

In this work, we exploit two assumed properties of the abundances of the observed signatures (endmembers) in order to reconstruct the abundances from hyperspectral data. Joint-sparsity is the first property of the abundances, which assumes that adjacent pixels can be expressed as different linear combinations of the same materials. The second property is rank-deficiency, where the number of endmembers participating in the hyperspectral data is very small compared with the dimensionality of the spectral library, which means that the abundance matrix of the endmembers is a low-rank matrix. These assumptions lead to an optimization problem for the sparse unmixing model that requires minimizing a combined l2,p-norm and nuclear norm. We propose a variable splitting and augmented Lagrangian algorithm to solve the optimization problem. Experimental evaluation carried out on synthetic and real hyperspectral data shows that the proposed method outperforms state-of-the-art algorithms with better spectral unmixing accuracy.

Keywords: hyperspectral unmixing, joint-sparse, low-rank representation, abundance estimation

Procedia PDF Downloads 108
852 Bayesian Network and Feature Selection for Rank Deficient Inverse Problem

Authors: Kyugneun Lee, Ikjin Lee

Abstract:

Parameter estimation in inverse problems often suffers from unfavorable conditions in the real world. Useless data and many input parameters make the problem complicated or insoluble. Data refinement and reformulation of the problem can resolve these difficulties. In this research, a method to solve the rank deficient inverse problem is suggested. A multi-physics system which has rank deficiency caused by response correlation is treated. Impeditive information is removed and the problem is reformulated as sequential estimations using a Bayesian network (BN) and subset groups. First, subset grouping of the responses is performed. Feature selection with singular value decomposition (SVD) is used for the grouping. Next, BN inference is used for sequential conditional estimation according to the group hierarchy. The directed acyclic graph (DAG) structure is organized to maximize the estimation ability. The variance ratio of response to noise is used to pair the estimable parameters with each response.

Keywords: Bayesian network, feature selection, rank deficiency, statistical inverse analysis

Procedia PDF Downloads 186
851 On Pooling Different Levels of Data in Estimating Parameters of Continuous Meta-Analysis

Authors: N. R. N. Idris, S. Baharom

Abstract:

A meta-analysis may be performed using aggregate data (AD) or individual patient data (IPD). In practice, studies may be available at both the IPD and AD levels. In this situation, both the IPD and AD should be utilised in order to maximize the available information. The statistical advantages of combining studies from different levels have not been fully explored. This study aims to quantify the statistical benefits of including available IPD when conducting a conventional summary-level meta-analysis. Simulated meta-analyses were used to assess the influence of the levels of data on overall meta-analysis estimates based on IPD-only, AD-only and the combination of IPD and AD (mixed data, MD), under different study scenarios. The percentage relative bias (PRB), root mean-square-error (RMSE) and coverage probability were used to assess the efficiency of the overall estimates. The results demonstrate that available IPD should always be included in a conventional meta-analysis using summary level data, as they would significantly increase the accuracy of the estimates. On the other hand, if more than 80% of the available data are at IPD level, including the AD does not provide significant differences in terms of accuracy of the estimates. Additionally, combining the IPD and AD has moderating effects on the bias of the estimates of the treatment effects, as the IPD tends to overestimate the treatment effects, while the AD has the tendency to produce underestimated effect estimates. These results may provide some guidance in deciding if significant benefit is gained by pooling the two levels of data when conducting meta-analysis.
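The three performance measures named above can be sketched as follows. The formulas are the commonly used simulation-study definitions and are an assumption here, since the abstract does not state the exact definitions used; the function names and example values are illustrative only.

```python
from math import sqrt

def percentage_relative_bias(estimates, true_value):
    """100 * (mean estimate - true value) / true value."""
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - true_value) / true_value

def rmse(estimates, true_value):
    """Root mean-square-error of the estimates around the true value."""
    return sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))

def coverage_probability(intervals, true_value):
    """Share of (lower, upper) confidence intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)

# Hypothetical treatment-effect estimates from three simulated meta-analyses.
estimates = [1.1, 0.9, 1.3]
intervals = [(0.8, 1.4), (0.6, 0.95), (0.9, 1.7)]
print(percentage_relative_bias(estimates, true_value=1.0))
print(rmse(estimates, true_value=1.0))
print(coverage_probability(intervals, true_value=1.0))
```

In a simulation study these functions would be applied to the estimates from each scenario (IPD-only, AD-only, mixed data) so that the scenarios can be compared on the same scale.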

Keywords: aggregate data, combined-level data, individual patient data, meta-analysis

Procedia PDF Downloads 285
850 Robust Inference with a Skew T Distribution

Authors: M. Qamarul Islam, Ergun Dogan, Mehmet Yazici

Abstract:

There is a growing body of evidence that non-normal data is more prevalent in nature than normal data. Examples can be quoted from, but not restricted to, the areas of Economics, Finance and Actuarial Science. The non-normality considered here is expressed in terms of fat-tailedness and asymmetry of the relevant distribution. In this study a skew t distribution that can be used to model data that exhibit inherent non-normal behavior is considered. This distribution has tails fatter than a normal distribution and it also exhibits skewness. Although maximum likelihood estimates can be obtained by iteratively solving the likelihood equations that are non-linear in form, this can be problematic in terms of convergence and in many other respects as well. Therefore, it is preferred to use the method of modified maximum likelihood in which the likelihood estimates are derived by expressing the intractable non-linear likelihood equations in terms of standardized ordered variates and replacing the intractable terms by their linear approximations obtained from the first two terms of a Taylor series expansion about the quantiles of the distribution. These estimates, called modified maximum likelihood estimates, are obtained in closed form. Hence, they are easy to compute and to manipulate analytically. In fact the modified maximum likelihood estimates are equivalent to maximum likelihood estimates, asymptotically. Even in small samples the modified maximum likelihood estimates are found to be approximately the same as maximum likelihood estimates that are obtained iteratively. It is shown in this study that the modified maximum likelihood estimates are not only unbiased but substantially more efficient than the commonly used moment estimates or the least square estimates that are known to be biased and inefficient in such cases.
Furthermore, in conventional regression analysis, it is assumed that the error terms are distributed normally and, hence, the well-known least square method is considered to be a suitable and preferred method for making the relevant statistical inferences. However, a number of empirical studies have shown that non-normal errors are more prevalent. Even transforming and/or filtering techniques may not produce normally distributed residuals. Here, a study is done for multiple linear regression models with random errors having a non-normal pattern. Through an extensive simulation it is shown that the modified maximum likelihood estimates of regression parameters are plausibly robust to the distributional assumptions and to various data anomalies as compared to the widely used least square estimates. Relevant tests of hypothesis are developed and are explored for desirable properties in terms of their size and power. The tests based upon modified maximum likelihood estimates are found to be substantially more powerful than the tests based upon least square estimates. Several examples are provided from the areas of Economics and Finance where such distributions are interpretable in terms of efficient market hypothesis with respect to asset pricing, portfolio selection, risk measurement and capital allocation, etc.

Keywords: least square estimates, linear regression, maximum likelihood estimates, modified maximum likelihood method, non-normality, robustness

Procedia PDF Downloads 319
849 Estimating Current Suicide Rates Using Google Trends

Authors: Ladislav Kristoufek, Helen Susannah Moat, Tobias Preis

Abstract:

Data on the number of people who have committed suicide tends to be reported with a substantial time lag of around two years. We examine whether online activity measured by Google searches can help us improve estimates of the number of suicide occurrences in England before official figures are released. Specifically, we analyse how data on the number of Google searches for the terms “depression” and “suicide” relate to the number of suicides between 2004 and 2013. We find that estimates drawing on Google data are significantly better than estimates using previous suicide data alone. We show that a greater number of searches for the term “depression” is related to fewer suicides, whereas a greater number of searches for the term “suicide” is related to more suicides. Data on suicide related search behaviour can be used to improve current estimates of the number of suicide occurrences.

Keywords: nowcasting, search data, Google Trends, official statistics

Procedia PDF Downloads 254
848 Assuming the Decision of Having One (More) Child: The New Dimensions of the Post Communist Romanian Family

Authors: Horea-Serban Raluca-Ioana, Istrate Marinela

Abstract:

The first part of the paper analyzes the dynamics of the total fertility rate at both the national and regional levels, pointing out the regional disparities in the distribution of this indicator. At the same time, we also focus on the collapse of the number of live births, on the changes in the fertility rate by birth rank, as well as on the failure to attain the desired number of children. The second part of the study centres upon a survey applied to urban families with three or more offspring. The preliminary analysis highlights the fact that increased fertility (more than 3rd rank) is triggered by the parents’ above-average material condition and superior education. The current situation of Romania, which is still passing through a period of relatively rapid demographic changes, marked by numerous convulsions, requires a new approach, in compliance with the recent interpretations appropriate to a new post-transitional demographic regime.

Keywords: fertility rate, family size intention, third birth rank, regional disparities

Procedia PDF Downloads 232
847 Multidirectional Product Support System for Decision Making in Textile Industry Using Collaborative Filtering Methods

Authors: A. Senthil Kumar, V. Murali Bhaskaran

Abstract:

In the field of information technology, people use various tools and software for official and personal purposes. Nowadays, people struggle to choose data access and extraction tools when buying and selling their products. In addition, they worry about various quality factors such as price, durability, color, size, and availability of the product. The main purpose of this research study is to find solutions to these unsolved problems. The proposed Multidirectional Rank Prediction (MDRP) decision-making algorithm is intended to support effective strategic decisions at all levels of data extraction; it is applied to a real-time textile dataset, and the results are analyzed. Finally, the results are compared with existing measurement methods such as PCC, SLCF, and VSS. The accuracy of the results is higher than that of the existing rank prediction methods.
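Two of the baseline similarity measures named above, Pearson's Correlation Coefficient (PCC) and Vector Space Similarity (VSS, i.e. cosine similarity), can be sketched as below. The sketch assumes both rating vectors cover the same co-rated products; the ratings are invented, and neither MDRP nor SLCF is reproduced because the abstract does not describe them in enough detail.

```python
from math import sqrt

def pcc(u, v):
    """Pearson's correlation coefficient between two rating vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    num = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    den = sqrt(sum((a - mean_u) ** 2 for a in u)) * sqrt(sum((b - mean_v) ** 2 for b in v))
    return num / den

def vss(u, v):
    """Vector space (cosine) similarity between two rating vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return num / den

# Hypothetical ratings given by two buyers to the same four textile products.
buyer_a = [5, 3, 4, 1]
buyer_b = [4, 2, 5, 1]
print(pcc(buyer_a, buyer_b), vss(buyer_a, buyer_b))
```

PCC centres each buyer's ratings on their own mean, so it discounts buyers who rate everything high or low, whereas VSS compares the raw rating vectors directly.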

Keywords: Knowledge Discovery in Database (KDD), Multidirectional Rank Prediction (MDRP), Pearson’s Correlation Coefficient (PCC), VSS (Vector Space Similarity)

Procedia PDF Downloads 199
846 Application of Regularized Low-Rank Matrix Factorization in Personalized Targeting

Authors: Kourosh Modarresi

Abstract:

The Netflix problem has brought the topic of “Recommendation Systems” into the mainstream of computer science, mathematics, and statistics. Though much progress has been made, the available algorithms do not obtain satisfactory results. The success of these algorithms is rarely above 5%. This work is based on the belief that the main challenge is to come up with “scalable personalization” models. This paper uses an adaptive regularization of inverse singular value decomposition (SVD) that applies adaptive penalization on the singular vectors. The results show far better matching for recommender systems when compared to the ones from the state of the art models in the industry.

Keywords: convex optimization, LASSO, regression, recommender systems, singular value decomposition, low rank approximation

Procedia PDF Downloads 321
845 Estimation of Coefficients of Ridge and Principal Components Regressions with Multicollinear Data

Authors: Rajeshwar Singh

Abstract:

Multicollinearity is common when several explanatory variables are handled simultaneously, because they exhibit a linear relationship among themselves. It creates a serious problem in understanding the impact of the explanatory variables on the dependent variable, and the method of least squares estimation then gives inexact estimates. In this case, it is advisable to detect its presence before proceeding further. Ridge regression reduces the degree of its occurrence, but principal components regression gives good estimates in this situation. This paper discusses the well-known techniques of ridge and principal components regression and applies both to obtain estimates of the coefficients. In addition, this paper discusses the conflicting claims on the discovery of the method of ridge regression based on available documents.
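For reference, the standard textbook forms of the estimators and the diagnostic named in the keywords are given below. The notation is an assumption, since the abstract gives no formulas: $X$ is the matrix of explanatory variables, $y$ the response, $k \ge 0$ the ridge parameter, and $R_j^2$ the coefficient of determination from regressing the $j$-th explanatory variable on the others.

```latex
\begin{align}
  \hat{\beta}_{\mathrm{OLS}}      &= (X^{\top}X)^{-1}X^{\top}y, \\
  \hat{\beta}_{\mathrm{ridge}}(k) &= (X^{\top}X + kI)^{-1}X^{\top}y, \qquad k \ge 0, \\
  \mathrm{VIF}_j                  &= \frac{1}{1 - R_j^{2}}.
\end{align}
```

When the explanatory variables are nearly collinear, $X^{\top}X$ is close to singular and the variance inflation factors are large; adding $kI$ stabilises the inverse at the cost of some bias. Principal components regression instead regresses $y$ on a subset of the principal components of $X$, typically those with the largest eigenvalues.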

Keywords: conflicting claim on credit of discovery of ridge regression, multicollinearity, principal components and ridge regressions, variance inflation factor

Procedia PDF Downloads 304
844 Polynomially Adjusted Bivariate Density Estimates Based on the Saddlepoint Approximation

Authors: S. B. Provost, Susan Sheng

Abstract:

An alternative bivariate density estimation methodology is introduced in this presentation. The proposed approach involves estimating the density function associated with the marginal distribution of each of the two variables by means of the saddlepoint approximation technique and applying a bivariate polynomial adjustment to the product of these density estimates. Since the saddlepoint approximation is utilized in the context of density estimation, such estimates are determined from empirical cumulant-generating functions. In the univariate case, the saddlepoint density estimate is itself adjusted by a polynomial. Given a set of observations, the coefficients of the polynomial adjustments are obtained from the sample moments. Several illustrative applications of the proposed methodology shall be presented. Since this approach relies essentially on a determinate number of sample moments, it is particularly well suited for modeling massive data sets.

Keywords: density estimation, empirical cumulant-generating function, moments, saddlepoint approximation

Procedia PDF Downloads 137
843 Model Order Reduction of Continuous LTI Large Descriptor System Using LRCF-ADI and Square Root Balanced Truncation

Authors: Mohammad Sahadet Hossain, Shamsil Arifeen, Mehrab Hossian Likhon

Abstract:

In this paper, we analyze a linear time invariant (LTI) descriptor system of large dimension. Since these systems are difficult to simulate, compute and store, we attempt to reduce this large system using Low Rank Cholesky Factorized Alternating Directions Implicit (LRCF-ADI) iteration followed by Square Root Balanced Truncation. LRCF-ADI solves the dual Lyapunov equations of the large system and gives low-rank Cholesky factors of the gramians as the solution. Using these Cholesky factors, we compute the Hankel singular values via singular value decomposition. Then, by implementing square root balanced truncation, the reduced system is obtained. The Bode plots of the original and lower order systems are used to show that the magnitude and phase responses are the same for both systems.

Keywords: low-rank Cholesky factor alternating directions implicit iteration, LTI Descriptor system, Lyapunov equations, Square-root balanced truncation

Procedia PDF Downloads 314
842 Evaluation of the Impact of Information and Communications Technology (ICT) on the Accuracy of Preliminary Cost Estimates of Building Projects in Nigeria

Authors: Nofiu A. Musa, Olubola Babalola

Abstract:

The study explored the effect of ICT on the accuracy of Preliminary Cost Estimates (PCEs) prepared by quantity surveying consulting firms in Nigeria for building projects, with a view to determining the desirability of the adoption and use of the technological innovation for preliminary estimating. Data pertinent to the study were obtained through a questionnaire survey conducted on a sample of one hundred and eight (108) quantity surveying firms selected through systematic random sampling from the list of registered firms compiled by the Nigerian Institute of Quantity Surveyors (NIQS), Lagos State Chapter. The data obtained were analyzed with SPSS version 17 using Student’s t-tests at the 5% significance level. The results revealed that the mean bias and coefficient of variation of the firms' PCEs are significantly lower in the post-ICT adoption period than in the pre-ICT adoption period, F < 0.05 in each case. The paper concluded that the adoption and use of the technological innovation (ICT) has significantly improved the accuracy of the Preliminary Cost Estimates (PCEs) of building projects; hence, it is desirable.

Keywords: accepted tender price, accuracy, bias, building projects, consistency, information and communications technology, preliminary cost estimates

Procedia PDF Downloads 334
841 A Semiparametric Approach to Estimate the Mode of Continuous Multivariate Data

Authors: Tiee-Jian Wu, Chih-Yuan Hsu

Abstract:

Mode estimation is an important task, because it has applications to data from a wide variety of sources. We propose a semi-parametric approach to estimate the mode of an unknown continuous multivariate density function. Our approach is based on a weighted average of a parametric density estimate using the Box-Cox transform and a non-parametric kernel density estimate. Our semi-parametric mode estimate improves on both the parametric and non-parametric mode estimates. Specifically, our mode estimate solves the non-consistency problem of parametric mode estimates (at large sample sizes) and reduces the variability of non-parametric mode estimates (at small sample sizes). The performance of our method at practical sample sizes is demonstrated by simulation examples and two real examples from the fields of climatology and image recognition.

Keywords: Box-Cox transform, density estimation, mode seeking, semiparametric method

Procedia PDF Downloads 162
840 Rank-Based Chain-Mode Ensemble for Binary Classification

Authors: Chongya Song, Kang Yen, Alexander Pons, Jin Liu

Abstract:

In the field of machine learning, the ensemble has been employed as a common methodology to improve upon the performance of multiple base classifiers. However, the true predictions are often canceled out by the false ones during consensus due to a phenomenon called “curse of correlation”, which takes the form of strong interference among the predictions produced by the base classifiers. In addition, the existing practices are still not able to effectively mitigate the problem of imbalanced classification. Based on the analysis of our experimental results, we conclude that the two problems are caused by some inherent deficiencies in the approach of consensus. Therefore, we create an enhanced ensemble algorithm which adopts a designed rank-based chain-mode consensus to overcome the two problems. In order to evaluate the proposed ensemble algorithm, we employ a well-known benchmark data set, NSL-KDD (the improved version of the KDDCup99 dataset produced by the University of New Brunswick), to make comparisons between the proposed algorithm and 8 common ensemble algorithms. In particular, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in the improvements in accuracy and reliability over the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus is shown to be a more effective ensemble solution than the traditional consensus approach, outperforming the 8 ensemble algorithms by 20% on almost all compared metrics, which include accuracy, precision, recall, F1-score and area under the receiver operating characteristic curve.

Keywords: consensus, curse of correlation, imbalance classification, rank-based chain-mode ensemble

Procedia PDF Downloads 30
839 Weighted Rank Regression with Adaptive Penalty Function

Authors: Kang-Mo Jung

Abstract:

The use of regularization in statistical methods has become popular. The least absolute shrinkage and selection operator (LASSO) framework has become the standard tool for sparse regression. However, it is well known that the LASSO is sensitive to outliers and leverage points. We consider a new robust estimator composed of a weighted loss function of the pairwise differences of residuals and an adaptive penalty function that regulates the tuning parameter for each variable. Rank regression is resistant to regression outliers, but not to leverage points. By adopting a weighted loss function, the proposed method is robust to leverage points in the predictor variables. Furthermore, the adaptive penalty function provides good statistical properties in variable selection, such as the oracle property and consistency. We develop an efficient algorithm to compute the proposed estimator using basic functions in R. We use an optimal tuning parameter based on the Bayesian information criterion (BIC). Numerical simulation shows that the proposed estimator is effective for analyzing both real and contaminated data sets.
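The objective described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact form of the weights or of the penalty, so the product weights `w_i * w_j`, the adaptive-LASSO-style penalty scaled by an initial estimate `beta_init`, and all function names are hypothetical. Python is used here although the authors work in R.

```python
def residuals(X, y, beta):
    """Residuals y_i - x_i' beta for rows of X (no intercept term)."""
    return [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]


def weighted_rank_loss(X, y, beta, weights):
    """Sum over pairs i < j of w_i * w_j * |e_i - e_j|.

    Down-weighting observations with outlying predictors (small w_i) is what
    would give robustness to leverage points; the weight form is assumed.
    """
    e = residuals(X, y, beta)
    n = len(e)
    return sum(
        weights[i] * weights[j] * abs(e[i] - e[j])
        for i in range(n)
        for j in range(i + 1, n)
    )


def adaptive_penalty(beta, beta_init, lam, eps=1e-8):
    """Adaptive L1 penalty: each coefficient gets its own effective tuning
    parameter lam / |beta_init_k| (assumed form; eps avoids division by zero)."""
    return lam * sum(abs(b) / (abs(b0) + eps) for b, b0 in zip(beta, beta_init))


def objective(X, y, beta, weights, beta_init, lam):
    """Penalized weighted rank objective to be minimized over beta."""
    return weighted_rank_loss(X, y, beta, weights) + adaptive_penalty(
        beta, beta_init, lam
    )
```

Because the loss depends only on differences of residuals, an intercept cancels out of it and would have to be estimated separately, for example from the median of the residuals.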

Keywords: adaptive penalty function, robust penalized regression, variable selection, weighted rank regression

Procedia PDF Downloads 317
838 Investigating the Impact of Task Demand and Duration on Passage of Time Judgements and Duration Estimates

Authors: Jesika A. Walker, Mohammed Aswad, Guy Lacroix, Denis Cousineau

Abstract:

There is a fundamental disconnect between the experience of time passing and the chronometric units by which time is quantified. Specifically, there appears to be no relationship between the passage of time judgments (PoTJs) and verbal duration estimates at short durations (e.g., < 2000 milliseconds). When a duration is longer than several minutes, however, evidence suggests that a slower feeling of time passing is predictive of overestimation. Might the length of a task moderate the relation between PoTJs and duration estimates? Similarly, the estimation paradigm (prospective vs. retrospective) and the mental effort demanded of a task (task demand) have both been found to influence duration estimates. However, only a handful of experiments have investigated these effects for tasks of long durations, and the results have been mixed. Thus, might the length of a task also moderate the effects of the estimation paradigm and task demand on duration estimates? To investigate these questions, 273 participants performed either an easy or difficult visual and memory search task for either eight or 58 minutes, under prospective or retrospective instructions. Afterward, participants provided a duration estimate in minutes, followed by a PoTJ on a Likert scale (1 = very slow, 7 = very fast). A 2 (prospective vs. retrospective) × 2 (eight minutes vs. 58 minutes) × 2 (high vs. low difficulty) between-subjects ANOVA revealed a two-way interaction between task demand and task duration on PoTJs, p = .02. Specifically, time felt faster in the more challenging task, but only in the eight-minute condition, p < .01. Duration estimates were transformed into RATIOs (estimate/actual duration) to standardize estimates across durations. An ANOVA revealed a two-way interaction between estimation paradigm and task duration, p = .03. Specifically, participants overestimated the task more if they were given prospective instructions, but only in the eight-minute task. 
Surprisingly, there was no effect of task difficulty on duration estimates. Thus, the demands of a task may influence ‘feeling of time’ and ‘estimation time’ differently, lending support to the existing theory that these two forms of time judgement rely on separate underlying cognitive mechanisms. Finally, a significant main effect of task duration was found for both PoTJs and duration estimates (ps < .001). Participants underestimated the 58-minute task (m = 42.5 minutes) and overestimated the eight-minute task (m = 10.7 minutes). Yet, they reported the 58-minute task as passing significantly slower on a Likert scale (m = 2.5) compared to the eight-minute task (m = 4.1). In fact, a significant correlation was found between PoTJ and duration estimation (r = .27, p < .001). This experiment thus provides evidence for a compensatory effect at longer durations, in which people underestimate a ‘slow feeling’ condition and overestimate a ‘fast feeling’ condition. The results are discussed in relation to heuristics that might alter the relationship between these two variables when conditions range from several minutes up to almost an hour.
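The RATIO transformation described in the abstract (estimate divided by actual duration) can be applied to the two reported condition means as a small worked example. The function name is hypothetical; the input values are the means quoted above. Because the actual duration is constant within a condition, the ratio of the mean estimate equals the mean of the individual ratios.

```python
def duration_ratio(estimate_min, actual_min):
    """Ratio of estimated to actual duration; below 1 means underestimation."""
    return estimate_min / actual_min


# Condition means reported in the abstract (minutes).
long_task = duration_ratio(42.5, 58)   # 58-minute task
short_task = duration_ratio(10.7, 8)   # eight-minute task

print(round(long_task, 2), round(short_task, 2))
```

A ratio below 1 for the 58-minute task and above 1 for the eight-minute task is what puts the two durations on a common scale and makes the under- and overestimation directly comparable.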

Keywords: duration estimates, long durations, passage of time judgements, task demands

Procedia PDF Downloads 21
837 System Identification in Presence of Outliers

Authors: Chao Yu, Qing-Guo Wang, Dan Zhang

Abstract:

The outlier detection problem for dynamic systems is formulated as a matrix decomposition problem with low-rank and sparse matrices, and is further recast as a semidefinite programming (SDP) problem. A fast algorithm is presented to solve the resulting problem while preserving the structure of the solution matrix; it can greatly reduce the computational cost relative to the standard interior-point method. The computational burden is further reduced by properly constructing subsets of the raw data without violating the low-rank property of the matrix involved. The proposed method can detect outliers exactly when there is little or no noise in the output observations. In the case of significant noise, a novel approach based on under-sampling with averaging is developed to denoise the data while retaining the saliency of outliers; the data filtered in this way enable successful outlier detection with the proposed method where existing filtering methods fail. Using the “clean” data recovered by the proposed method gives much better parameter estimates than using the raw data.

Keywords: outlier detection, system identification, matrix decomposition, low-rank matrix, sparsity, semidefinite programming, interior-point methods, denoising

Procedia PDF Downloads 205
836 A Stepwise Approach to Automate the Search for Optimal Parameters in Seasonal ARIMA Models

Authors: Manisha Mukherjee, Diptarka Saha

Abstract:

Reliable forecasts of univariate time series data are often necessary in many contexts. ARIMA models are quite popular among practitioners in this regard. Choosing correct parameter values for ARIMA is therefore a challenging yet imperative task. A stepwise algorithm is introduced to provide automatic and robust estimates of the parameters (p, d, q)(P, D, Q) used in seasonal ARIMA models. This process focuses on improving the overall quality of the estimates, and it alleviates the problems caused by the unidimensional nature of currently used methods such as auto.arima. The fast and automated search of the parameter space also ensures reliable parameter estimates that possess several desirable qualities, resulting in higher test accuracy, especially for noisy data. After rigorous testing on real as well as simulated data, the algorithm not only performs better than current state-of-the-art methods but also completely obviates the need for human intervention, owing to its automated nature.

Keywords: time series, ARIMA, auto.arima, ARIMA parameters, forecast, R function

Procedia PDF Downloads 51
835 The Sequential Estimation of the Seismoacoustic Source Energy in C-OTDR Monitoring Systems

Authors: Andrey V. Timofeev, Dmitry V. Egorov

Abstract:

A practical and efficient approach is suggested for estimating the energy of seismoacoustic sources in C-OTDR monitoring systems. The approach is a sequential plan for confidence estimation of both the seismoacoustic source energy and the absorption coefficient of the soil. The sequential plan delivers non-asymptotic guaranteed accuracy of the obtained estimates in the form of non-asymptotic confidence regions of prescribed size. These confidence regions are valid for a finite sample size when the distributions of the observations are unknown. Thus, the suggested estimates are non-asymptotic and nonparametric, and they guarantee the prescribed estimation accuracy in the form of a prescribed size of the confidence regions and a prescribed value of the confidence coefficient.

Keywords: nonparametric estimation, sequential confidence estimation, multichannel monitoring systems, C-OTDR-system, non-linear regression

Procedia PDF Downloads 245
834 Use of Biomass as Co-Fuel in Briquetting of Low-Rank Coal: Strengthen the Energy Supply and Save the Environment

Authors: Mahidin, Yanna Syamsuddin, Samsul Rizal

Abstract:

To meet world energy demand, several efforts have been made to find new and renewable energy candidates to substitute for oil and gas. Biomass is one such new and renewable energy source, and it is abundant in Indonesia. Palm kernel shell (PKS) is a type of biomass discharged as waste by the palm oil industry. Jatropha curcas, which is easy to grow in Indonesia, is also a typical energy source, either for bio-diesel or as biomass. In this study, biomass was used as a co-fuel in the briquetting of low-rank coal to suppress emissions (such as CO, NOx and SOx) during coal combustion. A CaO-based desulfurizer was also added to ensure effective SOx capture. The ratios of coal to palm kernel shell (w/w) in the bio-briquettes were 50:50, 60:40, 70:30, 80:20 and 90:10, while the molar ratios of calcium to sulfur (Ca/S) were 1:1; 1.25:1; 1.5:1; 1.75:1 and 2:1. The bio-briquettes were then subjected to physical characterization and a combustion test. The results show that the maximum weight loss in the durability measurement was ±6%. In addition, the highest stove efficiency for each desulfurizer was observed at a coal/PKS ratio of 90:10 and a Ca/S ratio of 1:1 (except for the scallop shell desulfurizer, for which it occurred at two Ca/S ratios, 1.25:1 and 1.5:1), i.e. 13.8% for the lime; 15.86% for the oyster shell; 14.54% for the scallop shell and 15.84% for the green mussel shell desulfurizers.

Keywords: biomass, low-rank coal, bio-briquette, new and renewable energy, palm kernel shell

Procedia PDF Downloads 338
833 First Rank Symptoms in Mania: An Indistinct Diagnostic Strand

Authors: Afshan Channa, Sameeha Aleem, Harim Mohsin

Abstract:

First rank symptoms (FRS) are considered to be pathognomonic for schizophrenia. However, FRS are not a distinctive feature of schizophrenia. They have also been observed in affective disorder, although they are not included in its diagnostic criteria. The presence of FRS in mania leads to misdiagnosis as psychotic illness, further complicating management and delaying appropriate treatment. FRS in mania are associated with poor clinical and functional outcomes. Their presence in the first episode of bipolar disorder may predict a poor short-term outcome and a decompensating course of illness. FRS in mania have been studied in the West. However, the cultural divergence and detriments make it pertinent to study the frequency of FRS in affective disorder independently in Pakistan. Objective: To determine the frequency of first rank symptoms in manic patients under treatment at the psychiatric services of a tertiary care hospital. Method: This cross-sectional study was conducted at the psychiatric services of Aga Khan University Hospital, Karachi, Pakistan. One hundred and twenty manic patients were recruited from November 2014 to May 2015. Patients who were unable to comprehend Urdu or who had a comorbid psychiatric or organic disorder were excluded. FRS were assessed by administering the validated Urdu version of the Present State Examination (PSE) tool. Result: The mean age of the patients was 37.62 ± 12.51. The mean number of previous manic episodes was 2.17 ± 2.23. FRS were present in 11.2% of males and 30.6% of females. This association of first rank symptoms with gender in patients with mania was found to be significant, with a p-value of 0.008. Overall, 19.2% exhibited FRS in their course of illness. 43.5% had thought broadcasting, made feeling, impulses, action and somatic passivity. 39.1% had thought insertion, 30.4% had auditory perceptual distortion, and 17.4% had thought withdrawal. However, none displayed delusional perception.
Conclusion: The study confirms the presence of FRS in mania in both males and females, irrespective of the duration of the current manic illness or the number of previous manic episodes. A substantial difference was established between the two genders. Being married had no protective effect on the presence of FRS.

Keywords: first rank symptoms, Mania, psychosis, present state examination

Procedia PDF Downloads 285