End-to-End Spanish-English Sequence Learning Translation Model

Authors: Vidhu Mitha Goutham, Ruma Mukherjee


The low availability of well-trained, unlimited, dynamic-access models for specific languages makes it hard for corporate users to adopt quick translation techniques and incorporate them into product solutions. Translation tasks increasingly call for dynamic sequence learning, yet stable, cost-free open-source models remain scarce. We survey and compare current translation techniques and propose a modified sequence-to-sequence model augmented with attention. Sequence learning with an encoder-decoder model is paving the way to higher precision in translation. Building on a Convolutional Neural Network (CNN) encoder and a Recurrent Neural Network (RNN) decoder, we use Fairseq tools to produce an end-to-end, bilingually trained Spanish-English machine translation model that includes source language detection. The model, trained on a duo-lingo corpus, achieves competitive results and is intended for ready-made plug-in use on compound sentences and document translation. It serves as a decent system for large-scale organizational data translation needs. We acknowledge its shortcomings and outline its future scope, and present it as a well-optimized deep neural network solution.

Keywords: Attention, encoder-decoder, Fairseq, Seq2Seq, Spanish, translation.
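
As a rough illustration of the attention step in an encoder-decoder model, the sketch below computes attention weights and a context vector for one decoder step. It assumes plain dot-product scoring, which the abstract does not specify; the function names (`attend`, `softmax`, `dot`) and the toy vectors are hypothetical and are not taken from Fairseq or from the authors' model.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    # Inner product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))

def attend(decoder_state, encoder_states):
    # Score each encoder state against the current decoder state
    # (dot-product scoring is an assumption made for this sketch).
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context

# Toy example: three source positions, two-dimensional hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

In this toy case the first and third source positions score equally and receive more weight than the second, so the context vector leans toward them. A trained model would learn the hidden states and typically use a parameterized scoring function rather than a raw dot product.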


