Human Action Recognition Using Variational Bayesian HMM with Dirichlet Process Mixture of Gaussian Wishart Emission Model

Authors: Wanhyun Cho, Soonja Kang, Sangkyoon Kim, Soonyoung Park


In this paper, we present a human action recognition method based on a variational Bayesian HMM with a Dirichlet process mixture (DPM) of Gaussian-Wishart emission models (GWEM). First, we define a Bayesian HMM based on the Dirichlet process, which allows an infinite number of Gaussian-Wishart components to support continuous emission observations. Second, we develop an efficient variational Bayesian inference method that derives the posterior distribution of the hidden variables and model parameters of the proposed model from training data, and we then derive the predictive distribution used to classify new actions. Third, we propose a procedure for extracting spatio-temporal feature vectors that can represent a wide range of human behaviors from input video. Finally, we conduct experiments to evaluate the performance of the proposed method. The experimental results show that the proposed method recognizes human actions more accurately than existing methods.
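The emission model described above, a Dirichlet process mixture of Gaussians with a conjugate Gaussian-Wishart prior fitted by variational inference, can be illustrated in isolation (without the HMM) using scikit-learn's `BayesianGaussianMixture`, which implements a truncated variational DP Gaussian mixture. This is a minimal sketch of that building block, not the authors' full model; the synthetic data and all parameter choices here are illustrative assumptions.

```python
# Sketch of a variational Dirichlet-process Gaussian mixture, the emission-model
# building block; scikit-learn's BayesianGaussianMixture uses a Normal-Wishart
# conjugate prior on each component and variational Bayesian inference.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for spatio-temporal feature vectors: two tight,
# well-separated 2-D clusters (illustrative data, not from the paper).
X = np.vstack([
    rng.normal(loc=-5.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=+5.0, scale=0.5, size=(100, 2)),
])

# Truncated DP mixture: up to 10 components are allowed, and variational
# inference shrinks the weights of the components the data does not need.
dpgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="full",   # full covariances, Gaussian-Wishart style prior
    random_state=0,
).fit(X)

# The posterior weight should concentrate on roughly two components,
# matching the two clusters in the synthetic data.
effective = int(np.sum(dpgmm.weights_ > 0.05))
print("effective components:", effective)
```

In the paper's setting, one such DP mixture would serve as the emission distribution of each HMM hidden state, with the variational updates for the mixture and the HMM transitions interleaved.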

Keywords: Human action recognition, Bayesian HMM, Dirichlet process mixture model, Gaussian-Wishart emission model, Variational Bayesian inference, Prior distribution and approximate posterior distribution, KTH dataset.


