Study of Syntactic Errors for Deep Parsing at Machine Translation
Authors: Yukiko Sasaki Alam, Shahid Alam
Abstract:
Syntactic parsing is vital for semantic treatment in many natural language processing (NLP) applications, because form and content coincide in many cases. However, parsing has not yet reached a reliable level of performance. By manually examining and analyzing individual machine translation output errors that involve syntax as well as semantics, this study attempts to identify what is required to improve syntactic and semantic parsing.
Keywords: Machine translation, error analysis, syntactic errors, knowledge required for parsing.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1129229