Recognizing an Individual, Their Topic of Conversation, and Cultural Background from 3D Body Movement
Authors: Gheida J. Shahrour, Martin J. Russell
Abstract:
The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with the automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects and arranged them into groups according to their culture. We arranged each group into pairs, and each pair conversed about different topics. A state-of-the-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% accuracy from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, our person recognition system appears to perform best for both GMM and GMM-SVM, suggesting that intersubject differences (i.e. subjects’ personality traits) are a major source of variation. When removing these traits from the culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from the culture and topic recognition systems respectively.
Keywords: Person Recognition, Topic Recognition, Culture Recognition, 3D Body Movement Signals, Variability Compensation.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1099504
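The abstract names Nuisance Attribute Projection (NAP) as one way of removing subject-specific variation before culture or topic recognition. The following is a minimal sketch of the general NAP idea, not the authors' implementation: it estimates the dominant directions of within-class variation (treated here as the nuisance) and projects them out with P = I - U U^T. The function name `nap_projection`, the toy 3-dimensional data, and the choice of `rank=1` are assumptions made purely for illustration; the abstract does not specify the feature dimensionality or the rank used in the paper.

```python
import numpy as np


def nap_projection(X, class_labels, rank=1):
    """Return the NAP matrix P = I - U U^T (hypothetical helper).

    U holds the top `rank` principal directions of within-class scatter,
    i.e. variation among vectors that share the same target class.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(class_labels)

    # Subtract each class mean so that only within-class (nuisance) variation remains.
    centred = X.copy()
    for c in np.unique(labels):
        mask = labels == c
        centred[mask] -= X[mask].mean(axis=0)

    # Right-singular vectors of the centred data are the principal
    # directions of the within-class scatter, ordered by strength.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    U = vt[:rank].T  # shape (dim, rank), orthonormal columns

    return np.eye(X.shape[1]) - U @ U.T


# Toy data (assumed): axis 0 carries class information,
# axis 1 carries strong nuisance variation, axis 2 is small noise.
rng = np.random.default_rng(0)
n = 50
labels = np.repeat([0, 1], n)
X = np.zeros((2 * n, 3))
X[:, 0] = np.where(labels == 0, -1.0, 1.0)
X[:, 1] = rng.normal(scale=5.0, size=2 * n)
X[:, 2] = rng.normal(scale=0.1, size=2 * n)

P = nap_projection(X, labels, rank=1)
X_clean = X @ P  # P is symmetric, so this applies the projection to each row
```

In this toy setting the projection suppresses the high-variance nuisance axis while leaving the class-separating axis essentially unchanged. In a GMM-SVM pipeline the same projection would typically be applied to GMM supervectors before SVM training, with the rank chosen on held-out data.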