On the Performance of Information Criteria in Latent Segment Models

Authors: Jaime R. S. Fonseca

Abstract:

Despite the widespread application of finite mixture models in segmentation, finite mixture model selection is still an important issue. In fact, the selection of an adequate number of segments is a key step in deriving latent segment structures, so it is desirable that the selection criteria used for this purpose are effective. In order to determine which of several information criteria best support the selection of the correct number of segments, we conduct a simulation study. In particular, this study is intended to determine which information criteria are more appropriate for mixture model selection when the data sets contain only categorical segmentation base variables. The proposed analysis is supported by the generation of mixtures of multinomial data. As a result, we establish a relationship between the level of measurement of the segmentation variables and the performance of eleven information criteria. The criterion AIC3 shows the best performance (it indicates the correct number of segments of the simulated structure most often) for mixtures of multinomial segmentation base variables.
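
For intuition, the sketch below illustrates the kind of comparison the study performs: categorical data are simulated from a latent class (multinomial mixture) model, candidate models with different numbers of segments are fitted with a basic EM algorithm, and AIC, AIC3 and BIC are computed for each candidate. This is only a minimal illustration under assumed settings; the sample size, number of variables, number of categories, mixing weights and single random EM start are illustrative choices, not the paper's actual simulation design. AIC3 is computed as -2 log L + 3 times the number of free parameters.

import numpy as np

rng = np.random.default_rng(0)

def simulate(n=600, n_vars=4, n_cats=3, weights=(0.5, 0.3, 0.2)):
    # Draw categorical data from a finite mixture of independent multinomial variables.
    K = len(weights)
    theta = rng.dirichlet(np.ones(n_cats), size=(K, n_vars))   # per-segment category probabilities
    z = rng.choice(K, size=n, p=weights)                        # latent segment memberships
    return np.array([[rng.choice(n_cats, p=theta[k, j]) for j in range(n_vars)] for k in z])

def fit_latent_class(X, K, n_cats, n_iter=200):
    # Basic EM for a latent class model; returns the maximised log-likelihood
    # and the number of free parameters.
    n, n_vars = X.shape
    onehot = np.eye(n_cats)[X]                                  # shape (n, n_vars, n_cats)
    pi = np.full(K, 1.0 / K)
    theta = rng.dirichlet(np.ones(n_cats), size=(K, n_vars))
    for _ in range(n_iter):
        log_p = np.log(pi) + np.einsum('ijc,kjc->ik', onehot, np.log(theta))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)                 # E-step: posterior segment probabilities
        pi = resp.mean(axis=0) + 1e-10
        pi /= pi.sum()                                          # M-step: mixing weights
        theta = np.einsum('ik,ijc->kjc', resp, onehot) + 1e-10
        theta /= theta.sum(axis=2, keepdims=True)               # M-step: category probabilities
    log_p = np.log(pi) + np.einsum('ijc,kjc->ik', onehot, np.log(theta))
    m = log_p.max(axis=1, keepdims=True)
    loglik = float(np.sum(m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))))
    n_params = (K - 1) + K * n_vars * (n_cats - 1)              # free parameters of the latent class model
    return loglik, n_params

X = simulate()                                                  # simulated structure has 3 segments
for K in range(1, 6):
    ll, p = fit_latent_class(X, K, n_cats=3)
    print(f"K={K}:  AIC={-2*ll + 2*p:8.1f}  AIC3={-2*ll + 3*p:8.1f}  BIC={-2*ll + p*np.log(len(X)):8.1f}")

The number of segments retained by each criterion is the value of K at which that criterion is smallest; in the paper's setting, AIC3 is the criterion that recovers the correct number of segments most often for multinomial data.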

Keywords: Quantitative Methods, Multivariate Data Analysis, Clustering, Finite Mixture Models, Information Theoretical Criteria, Simulation experiments.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1333100

