Search results for: distributional%20semantics.

11 The Predictability and Abstractness of Language: A Study in Understanding and Usage of the English Language through Probabilistic Modeling and Frequency

Authors: Revanth Sai Kosaraju, Michael Ramscar, Melody Dye

Abstract:

Accounts of language acquisition differ significantly in their treatment of the role of prediction in language learning. In particular, nativist accounts posit that probabilistic learning about words and word sequences has little to do with how children come to use language. The accuracy of this claim was examined by testing whether distributional probabilities and frequency contributed to how well 3-4 year olds repeat simple word chunks. Corresponding chunks were the same length, expressed similar content, and were all grammatically acceptable, yet the results of the study showed marked differences in performance when overall distributional frequency varied. It was found that a distributional model of language predicted the empirical findings better than a number of other models, replicating earlier findings and showing that children attend to distributional probabilities in an adult corpus. This suggested that language is more prediction-and-error based, rather than on abstract rules which nativist camps suggest.

Keywords: Abstractness, child psychology, language acquisition, prediction and error.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2083

10 Contextual Distribution for Textual Alignment

Authors: Yuri Bizzoni, Marianne Reboul

Abstract:

Our program compares French and Italian translations of Homer’s Odyssey, from the XVIth to the XXth century. We focus on the third point, showing how distributional semantics systems can be used both to improve alignment between different French translations as well as between the Greek text and a French translation. Although we focus on French examples, the techniques we display are completely language independent.

Keywords: Translation studies, machine translation, computational linguistics, distributional semantics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1022

9 Characteristic Function in Estimation of Probability Distribution Moments

Authors: Vladimir S. Timofeev

Abstract:

In this article the problem of distributional moments estimation is considered. The new approach of moments estimation based on usage of the characteristic function is proposed. By statistical simulation technique author shows that new approach has some robust properties. For calculation of the derivatives of characteristic function there is used numerical differentiation. Obtained results confirmed that author’s idea has a certain working efficiency and it can be recommended for any statistical applications.

Keywords: Characteristic function, distributional moments, robustness, outlier, statistical estimation problem, statistical simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2249

8 Distributional Semantics Approach to Thai Word Sense Disambiguation

Authors: Sunee Pongpinigpinyo, Wanchai Rivepiboon

Abstract:

Word sense disambiguation is one of the most important open problems in natural language processing applications such as information retrieval and machine translation. Many approach strategies can be employed to resolve word ambiguity with a reasonable degree of accuracy. These strategies are: knowledgebased, corpus-based, and hybrid-based. This paper pays attention to the corpus-based strategy that employs an unsupervised learning method for disambiguation. We report our investigation of Latent Semantic Indexing (LSI), an information retrieval technique and unsupervised learning, to the task of Thai noun and verbal word sense disambiguation. The Latent Semantic Indexing has been shown to be efficient and effective for Information Retrieval. For the purposes of this research, we report experiments on two Thai polysemous words, namely /hua4/ and /kep1/ that are used as a representative of Thai nouns and verbs respectively. The results of these experiments demonstrate the effectiveness and indicate the potential of applying vector-based distributional information measures to semantic disambiguation.

Keywords: Distributional semantics, Latent Semantic Indexing, natural language processing, Polysemous words, unsupervisedlearning, Word Sense Disambiguation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1802

7 Some Remarks About Riemann-Liouville and Caputo Impulsive Fractional Calculus

Authors: M. De la Sen

Abstract:

This paper establishes some closed formulas for Riemann- Liouville impulsive fractional integral calculus and also for Riemann- Liouville and Caputo impulsive fractional derivatives.

Keywords: Rimann- Liouville fractional calculus, Caputofractional derivative, Dirac delta, Distributional derivatives, Highorderdistributional derivatives.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1417

6 Distributional Impacts of Changes in Value Added Tax Rates in the Czech Republic

Authors: Ondřej Bayer

Abstract:

The paper evaluates the ongoing reform of VAT in the Czech Republic in terms of impacts on individual households. The main objective is to analyse the impact of given changes on individual households. The adopted method is based on the data related to household consumption by individual household quintiles; obtained data are subjected to micro-simulation examining. Results are discussed in terms of vertical tax justice. Results of the analysis reveal that VAT behaves regressively and a sole consolidation of rates at a higher level only increases the regression of this tax in the Czech Republic.

Keywords: Consolidation of rates, household quintiles, tax impact, VAT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1846

5 A Methodology for Characterising the Tail Behaviour of a Distribution

Authors: Serge Provost, Yishan Zang

Abstract:

Following a review of various approaches that are utilized for classifying the tail behavior of a distribution, an easily implementable methodology that relies on an arctangent transformation is presented. The classification criterion is actually based on the difference between two specific quantiles of the transformed distribution. The resulting categories enable one to classify distributional tails as distinctly short, short, nearly medium, medium, extended medium and somewhat long, providing that at least two moments exist. Distributions possessing a single moment are said to be long tailed while those failing to have any finite moments are classified as having an extremely long tail. Several illustrative examples will be presented.

Keywords: Arctangent transformation, change of variables, heavy-tailed distributions, tail classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 671

4 On the Bootstrap P-Value Method in Identifying out of Control Signals in Multivariate Control Chart

Authors: O. Ikpotokin

Abstract:

In any production process, every product is aimed to attain a certain standard, but the presence of assignable cause of variability affects our process, thereby leading to low quality of product. The ability to identify and remove this type of variability reduces its overall effect, thereby improving the quality of the product. In case of a univariate control chart signal, it is easy to detect the problem and give a solution since it is related to a single quality characteristic. However, the problems involved in the use of multivariate control chart are the violation of multivariate normal assumption and the difficulty in identifying the quality characteristic(s) that resulted in the out of control signals. The purpose of this paper is to examine the use of non-parametric control chart (the bootstrap approach) for obtaining control limit to overcome the problem of multivariate distributional assumption and the p-value method for detecting out of control signals. Results from a performance study show that the proposed bootstrap method enables the setting of control limit that can enhance the detection of out of control signals when compared, while the p-value method also enhanced in identifying out of control variables.

Keywords: Bootstrap control limit, p-value method, out-of-control signals, p-value, quality characteristics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1004

3 Distributional Effects of Tax and Benefit Reforms in the Czech Republic

Authors: L. Vítek

Abstract:

The Czech Republic has over the past decade carried out two waves of tax and benefit reforms. The first one took place in 2005–2006 during the left-wing government and the second one has been carried out in 2008 by the right-wing government. Using EUSILC data for selected types of households, the paper assesses changes in the distribution of gross incomes and effects of the changes in taxes and benefits on the distribution of incomes after taxes and a provision of social benefits. The analysis is carried out on four types of households with and without children. The analysis is performed using Lorenz curves and Gini coefficients. The results show that the tax system changes the distribution of incomes less significantly than benefits. The 2006 reform reduced the differential between the Gini coefficient for the gross income and the Gini coefficient after taxes and benefits for households with active parents and one child. Reform in 2008 supported families with children and an reduced the differential between the gross income and income after taxes and benefits for different types of families.

Keywords: Czech Republic, redistribution, tax reforms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1037

2 Nonparametric Control Chart Using Density Weighted Support Vector Data Description

Authors: Myungraee Cha, Jun Seok Kim, Seung Hwan Park, Jun-Geol Baek

Abstract:

In manufacturing industries, development of measurement leads to increase the number of monitoring variables and eventually the importance of multivariate control comes to the fore. Statistical process control (SPC) is one of the most widely used as multivariate control chart. Nevertheless, SPC is restricted to apply in processes because its assumption of data as following specific distribution. Unfortunately, process data are composed by the mixture of several processes and it is hard to estimate as one certain distribution. To alternative conventional SPC, therefore, nonparametric control chart come into the picture because of the strength of nonparametric control chart, the absence of parameter estimation. SVDD based control chart is one of the nonparametric control charts having the advantage of flexible control boundary. However,basic concept of SVDD has been an oversight to the important of data characteristic, density distribution. Therefore, we proposed DW-SVDD (Density Weighted SVDD) to cover up the weakness of conventional SVDD. DW-SVDD makes a new attempt to consider dense of data as introducing the notion of density Weight. We extend as control chart using new proposed SVDD and a simulation study of various distributional data is conducted to demonstrate the improvement of performance.

Keywords: Density estimation, Multivariate control chart, Oneclass classification, Support vector data description (SVDD)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2112

1 Trimmed Mean as an Adaptive Robust Estimator of a Location Parameter for Weibull Distribution

Authors: Carolina B. Baguio

Abstract:

One of the purposes of the robust method of estimation is to reduce the influence of outliers in the data, on the estimates. The outliers arise from gross errors or contamination from distributions with long tails. The trimmed mean is a robust estimate. This means that it is not sensitive to violation of distributional assumptions of the data. It is called an adaptive estimate when the trimming proportion is determined from the data rather than being fixed a “priori-. The main objective of this study is to find out the robustness properties of the adaptive trimmed means in terms of efficiency, high breakdown point and influence function. Specifically, it seeks to find out the magnitude of the trimming proportion of the adaptive trimmed mean which will yield efficient and robust estimates of the parameter for data which follow a modified Weibull distribution with parameter λ = 1/2 , where the trimming proportion is determined by a ratio of two trimmed means defined as the tail length. Secondly, the asymptotic properties of the tail length and the trimmed means are also investigated. Finally, a comparison is made on the efficiency of the adaptive trimmed means in terms of the standard deviation for the trimming proportions and when these were fixed a “priori". The asymptotic tail lengths defined as the ratio of two trimmed means and the asymptotic variances were computed by using the formulas derived. While the values of the standard deviations for the derived tail lengths for data of size 40 simulated from a Weibull distribution were computed for 100 iterations using a computer program written in Pascal language. The findings of the study revealed that the tail lengths of the Weibull distribution increase in magnitudes as the trimming proportions increase, the measure of the tail length and the adaptive trimmed mean are asymptotically independent as the number of observations n becomes very large or approaching infinity, the tail length is asymptotically distributed as the ratio of two independent normal random variables, and the asymptotic variances decrease as the trimming proportions increase. The simulation study revealed empirically that the standard error of the adaptive trimmed mean using the ratio of tail lengths is relatively smaller for different values of trimming proportions than its counterpart when the trimming proportions were fixed a 'priori'.

Keywords: Adaptive robust estimate, asymptotic efficiency, breakdown point, influence function, L-estimates, location parameter, tail length, Weibull distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2063