Search results for: post-editing machine translation output
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5245

5245 Chinese Undergraduates’ Trust in And Usage of Machine Translation: A Survey

Authors: Bi Zhao

Abstract:

Neural network technology has greatly improved the output of machine translation in terms of both fluency and accuracy, which greatly increases its appeal for young users. The present exploratory study aims to find out how Chinese undergraduates perceive and use machine translation in their daily lives. A survey was conducted to collect data from 100 undergraduate students from multiple Chinese universities with varied academic backgrounds, including arts, business, science, engineering, and medicine. The survey questions inquire about their use of machine translation (including frequency, scenarios, purposes, and preferences) and their attitudes toward it (including trust, quality assessment, justifications, and ethics). Interviews and tasks evaluating machine translation output were also employed, in combination with the survey, on a sample of selected respondents. The results indicate that Chinese undergraduate students use machine translation on a daily basis for a wide range of purposes in academic, communicative, and entertainment scenarios. Most of them have preferred machine translation tools, but the availability of a tool within a given scenario, such as a machine translation tool embedded in a webpage, is also a determining factor in their choice. The results also reveal that, despite reporting limited trust in the accuracy of machine translation output, most students lack the ability to critically analyze and evaluate such output. Furthermore, the evidence reveals inadequate awareness among Chinese undergraduate students of their ethical responsibility as machine translation users.

Keywords: Chinese undergraduates, machine translation, trust, usage

Procedia PDF Downloads 139
5244 Perception and Implementation of Machine Translation Applications by the Iranian English Translators

Authors: Abdul Amir Hazbavi

Abstract:

The present study attempts to provide a relatively comprehensive overview of Iranian English translators’ perception of Machine Translation. Furthermore, the study tries to shed light on the status of implementation of Machine Translation among Iranian English translators. To reach these objectives, the Localization Industry Standards Association’s questionnaire for measuring perceptions with regard to the adoption of a technology innovation was adapted and used to investigate three parameters among the participants of the study, namely familiarity with Machine Translation, general perception of Machine Translation, and implementation of Machine Translation systems in translation tasks. The participants of the study were 224 final-year undergraduate Iranian students of English translation at 10 universities across the country. The study revealed a very low level of adoption and a very high level of willingness to get familiar with and learn about Machine Translation, as well as a positive perception of and attitude toward Machine Translation among Iranian English translators.

Keywords: translation technology, machine translation, perception, implementation

Procedia PDF Downloads 523
5243 Study of Syntactic Errors for Deep Parsing at Machine Translation

Authors: Yukiko Sasaki Alam, Shahid Alam

Abstract:

Syntactic parsing is vital for semantic treatment by many applications related to natural language processing (NLP), because form and content coincide in many cases. However, it has not yet reached a reliable level of performance. By manually examining and analyzing individual machine translation output errors that involve syntax as well as semantics, this study attempts to discover what is required to improve syntactic and semantic parsing.

Keywords: syntactic parsing, error analysis, machine translation, deep parsing

Procedia PDF Downloads 560
5242 The Effect of Using Computer-Assisted Translation Tools on the Translation of Collocations

Authors: Hassan Mahdi

Abstract:

The integration of computer-assisted translation (CAT) tools in translation creates several opportunities for translators. However, this integration is not useful for all types of English structures. This study aims to examine the impact of using CAT tools in translating collocations. Seventy students of English as a foreign language participated in this study. The participants were divided into three groups (i.e., a CAT tools group, a machine translation (MT) group, and a control group). A comparison of the translation output of the three groups demonstrated that translation improved with CAT tools. The results indicated that the participants who used CAT tools outscored the participants who used MT, and both groups in turn outscored the control group, which did not use any type of technology in translation. In addition, there was a significant difference in the use of CAT tools for translating different types of collocations. The results also indicated that CAT tools were more effective in translating fixed and medium-strength collocations than weak collocations. Finally, the results showed that CAT tools were effective in translating collocations in both types of languages (i.e., target language or source language). The study suggests some guidelines for translators on using CAT tools.

Keywords: machine translation, computer-assisted translation, collocations, technology

Procedia PDF Downloads 193
5241 Statistical Comparison of Machine and Manual Translation: A Corpus-Based Study of Gone with the Wind

Authors: Yanmeng Liu

Abstract:

This article analyzes and compares the linguistic differences between machine translation and manual translation through a case study of the book Gone with the Wind. As an important carrier of human feeling and thinking, literary translation poses a huge difficulty for machine translation, which is expected to exhibit translation features distinct from those of manual translation. To display linguistic features objectively, the study tentatively applies computerized, statistical evidence and quantitative methods to the systematic investigation of large-scale translation corpora. The study compiles a bilingual corpus with four Chinese translations of Gone with the Wind, namely Piao by Chunhai Fan, Piao by Huairen Huang, and the translations by Google Translation and Baidu Translation. After processing the corpus with software including Stanford Segmenter, Stanford Postagger, and AntConc, the study analyzes the linguistic data and answers the following questions: 1. How does machine translation differ from manual translation linguistically? 2. Why do these deviances happen? This paper combines translation studies with corpus linguistics and makes the divergent linguistic dimensions of translated text analysis concrete, in order to present the linguistic deviances between manual and machine translation. Consequently, this study provides a more accurate and more fine-grained understanding of machine translation products, and it also proposes several suggestions for future machine translation development.

Keywords: corpus-based analysis, linguistic deviances, machine translation, statistical evidence

Procedia PDF Downloads 144
5240 Direct Translation vs. Pivot Language Translation for Persian-Spanish Low-Resourced Statistical Machine Translation System

Authors: Benyamin Ahmadnia, Javier Serrano

Abstract:

In this paper, we compare two different approaches for translating from Persian to Spanish, a language pair with a scarce parallel corpus. The first approach involves direct transfer using a statistical machine translation system, which is available for this language pair. The second approach involves translation through English as a pivot language, which has more translation resources and more advanced translation systems available. The results show that using English as a pivot language achieves better translation quality than direct translation from Persian to Spanish. Our best result is the pivot system, which scores 1.12 BLEU points higher than direct translation.
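As a rough illustration of the pivot idea described in this abstract, the following Python sketch chains a source-to-pivot step with a pivot-to-target step. It is not the authors' system: the function names `make_pivot_translator` and `toy_translator` and the tiny romanized word tables are assumptions invented for illustration, standing in for real statistical MT systems.

```python
from typing import Callable, Dict

Translator = Callable[[str], str]


def make_pivot_translator(src_to_pivot: Translator,
                          pivot_to_tgt: Translator) -> Translator:
    """Compose two translators so the output of the first feeds the second."""
    def translate(sentence: str) -> str:
        pivot_sentence = src_to_pivot(sentence)   # e.g. Persian -> English
        return pivot_to_tgt(pivot_sentence)       # e.g. English -> Spanish
    return translate


def toy_translator(table: Dict[str, str]) -> Translator:
    """Hypothetical word-by-word lookup; unknown words pass through unchanged."""
    def translate(sentence: str) -> str:
        return " ".join(table.get(word, word) for word in sentence.split())
    return translate


# Toy tables (assumed, romanized) in place of trained translation models.
fa_to_en = toy_translator({"salam": "hello", "donya": "world"})
en_to_es = toy_translator({"hello": "hola", "world": "mundo"})

fa_to_es = make_pivot_translator(fa_to_en, en_to_es)
print(fa_to_es("salam donya"))
```

In a real setting, errors made in the first step propagate to the second, which is why pivot and direct systems are compared empirically, as the abstract does with BLEU.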

Keywords: statistical machine translation, direct translation approach, pivot language translation approach, parallel corpus

Procedia PDF Downloads 487
5239 Machine Translation Analysis of Chinese Dish Names

Authors: Xinyu Zhang, Olga Torres-Hostench

Abstract:

This article presents a comparative study evaluating the quality of machine translation (MT) output of Chinese gastronomy nomenclature. Chinese gastronomic culture is currently experiencing increased international acknowledgment. The nomenclature of Chinese gastronomy not only reflects a specific aspect of culture but is also related to other areas of society such as philosophy and traditional medicine. Chinese dish names are composed of several types of cultural references, such as ingredients, colors, flavors, culinary techniques, cooking utensils, toponyms, anthroponyms, metaphors, and historical tales, among others. These cultural references are among the biggest difficulties in translation, where the use of translation techniques is usually required. Given the lack of Chinese food-related translation studies, especially in Chinese-Spanish translation, and the current massive use of MT, the quality of the MT output of Chinese dish names is open to question. Fifty Chinese dish names with different types of cultural components were selected for this study. First, all of these dish names were translated by three different MT tools (Google Translate, Baidu Translate and Bing Translator). Second, a questionnaire was designed and completed by 12 Chinese online users (Chinese graduates of a Hispanic Philology major) in order to find out user preferences regarding the collected MT output. Finally, human translation techniques were observed and analyzed to identify which translation techniques appeared most often in the preferred MT proposals. The results reveal that the MT output of Chinese gastronomy nomenclature is not of high quality. It is recommended not to trust MT on occasions such as restaurant menus, TV culinary shows, etc. However, the MT output could be used as an aid for tourists to get a general idea of a dish (the main ingredients, for example).
Literal translation turned out to be the most frequently observed technique, followed by borrowing, generalization and adaptation, while amplification, particularization and transposition were infrequently observed. This is possibly because MT engines at present are limited to matching equivalent terms and offering literal translations without taking into account the meaning of the dish name in its whole context, which is essential to the application of the less frequently observed techniques. This could give insight into the post-editing of Chinese dish name translations. By observing and analyzing translation techniques in the proposals of the machine translators, post-editors could better decide which techniques to apply in each case so as to correct mistakes and improve the quality of the translation.

Keywords: Chinese dish names, cultural references, machine translation, translation techniques

Procedia PDF Downloads 137
5238 An Analysis of Machine Translation: Instagram Translation vs Human Translation on the Perspective Translation Quality

Authors: Aulia Fitri

Abstract:

This study seeks to identify the linguistic areas in which mistakes commonly occur in Instagram translation and in human translation. Instagram is a social media platform that is widely used around the world. Anyone with an Instagram account can view the captions and pictures shared by friends, celebrities, and public figures across countries. Instagram provides machine translation beneath the caption to help users understand posts in languages other than their own. The researcher takes samples from an Indonesian public figure whose account has many followers. The public figure tries to help her followers from other countries understand her posts by adding an English version after the Indonesian version. However, research on Instagram accounts has not yet been done, even though the platform is widely used around the world. Twenty samples are analysed from the perspective of translation quality and with linguistic tools. As an MT system, Instagram tends to give a literal translation without regard to the intended topic. The human translation, on the other hand, tends to exaggerate, which leads to a different meaning in English. This makes for an interesting study of how human nature and a robotic system each influence the translation result.

Keywords: human translation, machine translation (MT), translation quality, linguistic tool

Procedia PDF Downloads 321
5237 Improving Machine Learning Translation of Hausa Using Named Entity Recognition

Authors: Aishatu Ibrahim Birma, Aminu Tukur, Abdulkarim Abbass Gora

Abstract:

Machine translation plays a vital role in the field of Natural Language Processing (NLP), breaking down language barriers and enabling communication across diverse communities. In the context of Hausa, a language widely spoken in West Africa, mainly in Nigeria, effective translation systems are essential for enabling seamless communication and promoting cultural exchange. However, due to the unique linguistic characteristics of Hausa, accurate translation remains a challenging task. This research proposes an approach to improving the machine learning translation of Hausa by integrating Named Entity Recognition (NER) techniques. Named entities, such as person names, locations, organizations, and dates, are critical components of a language's structure and meaning. Incorporating NER into the translation process can enhance the quality and accuracy of translations by preserving the integrity of named entities, maintaining consistency in translating entities (e.g., proper names), and addressing the cultural references specific to Hausa. NER will be incorporated into Neural Machine Translation (NMT) for Hausa-to-English translation.

Keywords: machine translation, natural language processing (NLP), named entity recognition (NER), neural machine translation (NMT)

Procedia PDF Downloads 43
5236 Development of a French to Yorùbá Machine Translation System

Authors: Benjamen Nathaniel, Eludiora Safiriyu Ijiyemi, Egume Oneme Lucky

Abstract:

A review of machine translation systems shows that many computational artefacts have been developed to translate written or spoken texts from a source language into Yorùbá through Machine Translation systems. However, there is no work on a French-to-Yorùbá machine translation system; hence, the study investigated the process involved in translating French into its Yorùbá equivalent, with a view to adopting a rule-based MT approach to build a Machine Translation framework from simple sentences administered through a questionnaire. Articles and relevant textbooks were reviewed, and key speakers of both languages were interviewed, to find out the processes involved in translating French simple sentences into their Yorùbá equivalents using home domain terminologies. To achieve this, a model was formulated using phrase grammar structure, re-write rules, parse trees, and automata theory-based techniques, and was designed and implemented with unified modeling language (UML) and the Python programming language, respectively. Analysis of the results showed that the Machine Translation system performed 18.45% above the Experimental Subject Respondent and 2.7% below the Linguistics Expert when assessed on word orthography, sentence syntax, and semantic correctness of the sentences. When compared with the Google Machine Translation system, the developed system performed better on lexicons of the target language.

Keywords: machine translation (MT), rule-based, French language, Yorùbá language

Procedia PDF Downloads 77
5235 Knowledge Required for Avoiding Lexical Errors at Machine Translation

Authors: Yukiko Sasaki Alam

Abstract:

This research aims to find the causes of wrong lexical selections in machine translation (MT), rather than categorizing lexical errors, which has been the main practice in error analysis. By manually examining and analyzing lexical errors produced by an MT system, it suggests what knowledge would help the system reduce lexical errors.

Keywords: machine translation, error analysis, lexical errors, evaluation

Procedia PDF Downloads 337
5234 Fine-Tuned Transformers for Translating Multi-Dialect Texts to Modern Standard Arabic

Authors: Tahar Alimi, Rahma Boujebane, Wiem Derouich, Lamia Hadrich Belguith

Abstract:

Machine translation of low-resourced languages such as Arabic is a challenging task. Despite the appearance of sophisticated models based on the latest deep learning techniques, namely transfer learning and transformers, all models prove incapable of producing acceptable translations of content that includes Arabic Dialects (AD), because the dialects do not have official status. In this paper, we present a machine translation model designed to translate Arabic multidialectal content into Modern Standard Arabic (MSA), leveraging both new and existing parallel resources. The latter achieved the best results for both Levantine and Maghrebi dialects, with a BLEU score of 64.99.

Keywords: Arabic translation, dialect translation, fine-tune, MSA translation, transformer, translation

Procedia PDF Downloads 61
5233 Corpus-Based Neural Machine Translation: Empirical Study Multilingual Corpus for Machine Translation of Opaque Idioms - Cloud AutoML Platform

Authors: Khadija Refouh

Abstract:

Culture-bound expressions have been a bottleneck for Natural Language Processing (NLP) and comprehension, especially in the case of machine translation (MT). In the last decade, the field of machine translation has greatly advanced. Neural machine translation (NMT) has recently achieved considerable improvements in translation quality, outperforming previous traditional translation systems in many language pairs. NMT applies Artificial Intelligence (AI) and deep neural networks to language processing. Despite this development, NMT still faces serious challenges when translating culture-bound expressions, especially for low-resource language pairs such as Arabic-English and Arabic-French, which is not the case for well-established language pairs such as English-French. Machine translation of opaque idioms from English into French is likely to be more accurate than translation from English into Arabic. For example, the Google Translate application translated the sentence “What a bad weather! It runs cats and dogs.” into Arabic as “يا له من طقس سيء! تمطر القطط والكلاب”, which is an inaccurate literal translation. The translation of the same sentence into French was “Quel mauvais temps! Il pleut des cordes.”, where the Google Translate application used the accurate corresponding French idiom. This paper aims to perform NMT experiments towards better translation of opaque idioms using a high-quality, clean multilingual corpus. This corpus will be collected analytically from human-generated idiom translations. AutoML Translation, a Google neural machine translation platform, is used as a custom translation model to improve the translation of opaque idioms. The automatic evaluation of the custom model will be compared to Google NMT using the Bilingual Evaluation Understudy (BLEU) score.
BLEU is an algorithm for evaluating the quality of text that has been machine-translated from one natural language to another. Human evaluation is integrated to test the reliability of the BLEU score. The researcher will examine syntactical, lexical, and semantic features using Halliday's functional theory.
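To make the BLEU description above more concrete, here is a minimal Python sketch of a sentence-level score: the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. It assumes a single reference, whitespace tokenization, and no smoothing, so it is a simplification rather than the evaluation setup used in the study; the function names are chosen here for illustration.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed single-reference BLEU; returns 0.0 if any precision is zero."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0
        log_precision_sum += math.log(overlap / total) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_precision_sum)


reference = "quel mauvais temps il pleut des cordes"
print(sentence_bleu("quel mauvais temps il pleut des cordes", reference))
print(sentence_bleu("quel mauvais temps il pleut", reference))
```

Because the unsmoothed score drops to zero whenever a sentence shares no 4-gram with its reference, BLEU is usually reported at corpus level, and short idiomatic sentences are a case where human evaluation adds useful information.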

Keywords: multilingual corpora, natural language processing (NLP), neural machine translation (NMT), opaque idioms

Procedia PDF Downloads 149
5232 Optimizing the Use of Google Translate in Translation Teaching: A Case Study at Prince Sultan University

Authors: Saadia Elamin

Abstract:

The quasi-universal use of smartphones with an always-available internet connection makes it a reflex action for translation undergraduates, once they encounter the least translation problem, to turn to the freely available web resource Google Translate. As with other translator resources and aids, the use of Google Translate needs to be moderated in such a way that it contributes to developing translation competence. Here, instead of interfering with students’ learning by providing ready-made solutions which might not always fit the contexts of use, it can help to consolidate the skills of analysis and transfer which students have already acquired. One way to do so is by training students to adhere to the basic principles of translation work. The most important of these is that analyzing the source text for comprehension comes first and foremost, before jumping into the search for target language equivalents. Another basic principle is that certain translator aids and tools can be used for comprehension, while others are to be confined to the phase of re-expressing the meaning in the target language. The present paper reports on the experience of making measured and reasonable use of Google Translate in translation teaching at Prince Sultan University (PSU), Riyadh. First, it traces the development that has taken place in the field of translation in this age of information technology, be it in translation teaching and translator training or in the real-world practice of the profession. Second, it describes how, with the aim of reflecting this development in the way translation is taught, senior students, after being trained in post-editing machine translation output, are authorized to use Google Translate in classwork and assignments.
Third, the paper elaborates on the findings of this case study, which has demonstrated that Google Translate, if used at the appropriate levels of training, can help to enhance students’ ability to perform different translation tasks. This help extends from the search for terms and expressions to the tasks of drafting the target text, revising its content, and finally editing it. In addition, using Google Translate in this way fosters a reflexive and critical attitude towards web resources in general, thus maximizing the benefit gained from them in preparing students to meet the requirements of the modern translation job market.

Keywords: Google Translate, post-editing machine translation output, principles of translation work, translation competence, translation teaching, translator aids and tools

Procedia PDF Downloads 473
5231 An Experience of Translating an Excerpt from Sophie Adonon’s Echos de Femmes from French to English, Using Reverso.

Authors: Michael Ngongeh Mombe

Abstract:

This paper seeks to investigate an assertion made by some colleagues that there is no need to pay a human translator to translate their literary texts because software such as Reverso can be used to do the translation. The main objective of this study is to examine the veracity of this assertion by using Reverso to translate a literary text without any post-editing by a human translator. The work is based on two theories: the Skopos and Communicative theories of translation. The work is documentary research in which data were collected from published documents in libraries, on the internet, and from the translation produced by Reverso. We made a comparative text analysis of the source and target texts in a bid to highlight the weaknesses and strengths of the software. The findings of this work revealed that those who advocate the use of machine translation alone do so in ignorance of the translation mistakes usually made by the software. From a review of all 268 segments of the translation, we found that the translation produced by Reverso is fraught with errors. We therefore recommend the use of human translators either to translate literary texts or to revise the translation produced by the machine so that it conforms to the skopos of the work. This paper is based on Reverso translation. Similar works in the near future will be based on other translation software to determine their weaknesses and strengths.

Keywords: machine translation, human translator, Reverso, literary text

Procedia PDF Downloads 95
5230 A Pilot Study to Investigate the Use of Machine Translation Post-Editing Training for Foreign Language Learning

Authors: Hong Zhang

Abstract:

The main purpose of this study is to show that machine translation (MT) post-editing (PE) training can help our Chinese students learn Spanish as a second language. Our hypothesis is that they might make better use of MT by learning PE skills specific to foreign language learning. We have developed PE training materials based on the data collected in a previous study. The training materials included the error types specific to MT output and the error types that our Chinese students studying Spanish could not detect in the experiment last year. This year we performed a pilot study in order to evaluate the effectiveness of the PE training materials and the extent to which PE training helps Chinese students who study Spanish. We used screen recording to capture the sessions and made note of every action performed by the students. Participants were speakers of Chinese with intermediate knowledge of Spanish. They were divided into two groups: Group A received PE training and Group B did not. We prepared a Chinese text for both groups; participants first translated it by themselves (human translation), then used Google Translate to translate the text, and were asked to post-edit the raw MT output. Comparing the results of the PE test, Group A could identify and correct the errors faster than Group B. Group A did especially better in omission, word order, part of speech, terminology, mistranslation, official names, and formal register. From the results of this study, we can see that PE training can help Chinese students learn Spanish as a second language. In the future, we could focus on the students’ struggles during their Spanish studies and complete the PE training materials to teach Chinese students learning Spanish with machine translation.

Keywords: machine translation, post-editing, post-editing training, Chinese, Spanish, foreign language learning

Procedia PDF Downloads 144
5229 Enhancing Word Meaning Retrieval Using FastText and Natural Language Processing Techniques

Authors: Sankalp Devanand, Prateek Agasimani, Shamith V. S., Rohith Neeraje

Abstract:

Machine translation has witnessed significant advancements in recent years, but the translation of languages with distinct linguistic characteristics, such as English and Sanskrit, remains a challenging task. This research presents the development of a dedicated English-to-Sanskrit machine translation model, aiming to bridge the linguistic and cultural gap between these two languages. Using a variety of natural language processing (NLP) approaches, including FastText embeddings, this research proposes a thorough method to improve word meaning retrieval. Data preparation, part-of-speech tagging, dictionary searches, and transliteration are all included in the methodology. The study also addresses the implementation of an interpreter pattern and uses a word similarity task to assess the quality of word embeddings. The experimental outcomes show how the suggested approach may be used to enhance word meaning retrieval tasks with greater efficacy, accuracy, and adaptability. Evaluation of the model's performance is conducted through rigorous testing, comparing its output against existing machine translation systems. The assessment includes quantitative metrics such as BLEU scores, METEOR scores, Jaccard Similarity, etc.
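Of the metrics listed in this abstract, Jaccard similarity is the simplest to illustrate. The Python sketch below computes it over the sets of lowercased, whitespace-separated tokens of a system output and a reference. The tokenization and the function name are assumptions made for illustration; the abstract does not say how the authors computed the score.

```python
def jaccard_similarity(hypothesis: str, reference: str) -> float:
    """Size of the token-set intersection divided by the size of the union."""
    hyp_tokens = set(hypothesis.lower().split())
    ref_tokens = set(reference.lower().split())
    if not hyp_tokens and not ref_tokens:
        return 1.0  # convention chosen here: two empty strings are identical
    return len(hyp_tokens & ref_tokens) / len(hyp_tokens | ref_tokens)


# Two shared tokens ("the", "cat") out of four distinct tokens overall.
print(jaccard_similarity("the cat sat", "The cat slept"))
```

Unlike BLEU or METEOR, this measure ignores word order and repetition, so it works as a coarse lexical-overlap check rather than a full translation quality metric.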

Keywords: machine translation, English to Sanskrit, natural language processing, word meaning retrieval, fastText embeddings

Procedia PDF Downloads 44
5228 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel

Abstract:

Sentiment analysis and opinion mining have become emerging topics of research in recent years, but most of the work is focused on data in the English language. Comprehensive research and analysis that considers multiple languages, machine translation techniques, and different classifiers is essential. This paper presents a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two groups: the first classifies text without language translation, and the second translates the testing data into a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or whether building language-specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and the accuracy of various classifiers for multilingual sentiment analysis are also discussed in this study.

Keywords: cross-language analysis, machine learning, machine translation, sentiment analysis

Procedia PDF Downloads 713
5227 ALEF: An Enhanced Approach to Arabic-English Bilingual Translation

Authors: Abdul Muqsit Abbasi, Ibrahim Chhipa, Asad Anwer, Saad Farooq, Hassan Berry, Sonu Kumar, Sundar Ali, Muhammad Owais Mahmood, Areeb Ur Rehman, Bahram Baloch

Abstract:

Accurate translation between structurally diverse languages, such as Arabic and English, presents a critical challenge in natural language processing due to significant linguistic and cultural differences. This paper investigates the effectiveness of Facebook’s mBART model, fine-tuned specifically for sequence-to-sequence (seq2seq) translation tasks between Arabic and English, and enhanced through advanced refinement techniques. Our approach leverages the Alef Dataset, a meticulously curated parallel corpus spanning various domains to capture the linguistic richness, nuances, and contextual accuracy essential for high-quality translation. We further refine the model’s output using advanced language models such as GPT-3.5 and GPT-4, which improve fluency and coherence and correct grammatical errors in the translated texts. The fine-tuned model demonstrates substantial improvements, achieving a BLEU score of 38.97, METEOR score of 58.11, and TER score of 56.33, surpassing widely used systems such as Google Translate. These results underscore the potential of mBART, combined with refinement strategies, to bridge the translation gap between Arabic and English, providing a reliable, context-aware machine translation solution that is robust across diverse linguistic contexts.

Keywords: natural language processing, machine translation, fine-tuning, Arabic-English translation, transformer models, seq2seq translation, translation evaluation metrics, cross-linguistic communication

Procedia PDF Downloads 7
5226 English Grammatical Errors of Arabic Sentence Translations Done by Machine Translations

Authors: Muhammad Fathurridho

Abstract:

Grammar, the set of rules every language uses in order to be understood, is always related to syntax and morphology. Arabic grammar differs from the grammars of other languages: it has more rules and difficulties. This paper aims to investigate and describe the English grammatical errors made by machine translation systems in translating Arabic sentences, including declarative, exclamatory, imperative, and interrogative sentences, specifically in the year 2018, when such systems can be supported by artificial intelligence. The Arabic sample sentences, which are divided into two types, verbal and nominal sentences, are taken from several published Arabic texts and examined as the source language samples. The translations produced by several popular online machine translation systems, including Google Translate, Microsoft Bing, Babylon, Facebook, Hellotalk, Worldlingo, Yandex Translate, and Tradukka Translate, are the material objects of this research. The descriptive method adopted in this research shows the grammatical errors in the English target language and classifies them. The paper concludes that the grammatical errors in machine translation results are varied and can generally be classified into morphological, syntactic, and semantic errors across all types of Arabic words (noun, verb, and particle), and that this classification can serve as an evaluation for machine translation providers to correct these errors and make their results more understandable.

Keywords: Arabic, Arabic-English translation, machine translation, grammatical errors

Procedia PDF Downloads 155
5225 Literary Translation Human vs Machine: An Essay about Online Translation

Authors: F. L. Bernardo, R. A. S. Zacarias

Abstract:

The ways to translate are manifold since textual genres undergoing translation are diverse. In this essay, our goal is to give special attention to the literary genre and to the online translation tool Google Translate (GT), widely used by both nonprofessionals and scholars, in order to show evidence of the indispensability of human wit in a good translation. Our study is based on a literature review of prominent authors, with emphasis on translation categories. By also highlighting the issue of polysemous literary translation, we aim to shed light on the translator’s craft and the fallible nature of online translation. To better illustrate these principles, the methodology consisted of a comparative analysis of the original English text of Moll Flanders by Daniel Defoe, its online translation given by GT, and a translation into Brazilian Portuguese performed by a human. We proceeded to identify and analyze the degrees of textual equivalence according to the following categories: volume, levels, and order. The results attest to the unsuitability of a translation done by a computer connected to the World Wide Web.

Keywords: Google Translator, human translation, literary translation, Moll Flanders

Procedia PDF Downloads 651
5224 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof

Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba

Abstract:

In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level-based units. For subword-level models, using back translation proves to be slightly beneficial in low-resource (WO) to high-resource (FR) language translation for the transformer (but not for the seq2seq) models. A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied method data with back translation leads to a substantial improvement of the translation quality.
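
The two data-augmentation ideas named in this abstract, back translation and the copied corpus method, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the placeholder sentences, and the stand-in `toy_reverse_translate` are all hypothetical, and a real setup would call a trained target-to-source model instead of the stub.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def copied_corpus(mono_target: List[str]) -> List[Pair]:
    """Copied corpus method: reuse each monolingual target sentence as its own source."""
    return [(sentence, sentence) for sentence in mono_target]


def back_translate(mono_target: List[str],
                   reverse_translate: Callable[[str], str]) -> List[Pair]:
    """Back translation: pair a synthetic source with each real target sentence."""
    return [(reverse_translate(sentence), sentence) for sentence in mono_target]


def augment(parallel: List[Pair],
            mono_target: List[str],
            reverse_translate: Callable[[str], str]) -> List[Pair]:
    """Combine real parallel data with back-translated and copied pairs."""
    return (list(parallel)
            + back_translate(mono_target, reverse_translate)
            + copied_corpus(mono_target))


def toy_reverse_translate(sentence: str) -> str:
    # Hypothetical stand-in for a trained target-to-source translation model.
    return "<synthetic source for: " + sentence + ">"


# Placeholder data, not taken from the French-Wolof corpus.
parallel = [("source sentence 1", "target sentence 1")]
mono_target = ["target sentence 2", "target sentence 3"]
augmented = augment(parallel, mono_target, toy_reverse_translate)
```

In this sketch the target side of every added pair is genuine monolingual text; only the source side is synthetic (back translation) or duplicated (copied corpus).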

Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof

Procedia PDF Downloads 147
5223 Efficiency of Google Translate and Bing Translator in Translating Persian-to-English Texts

Authors: Samad Sajjadi

Abstract:

Machine translation is a relatively new technology increasingly used by academic writers, especially students and researchers whose native language is not English. Numerous studies have been conducted on machine translation, but few investigations have assessed the accuracy of machine translation from Persian to English at the lexical, semantic, and syntactic levels. Using Groves and Mundt’s (2015) model of error taxonomy, the current study evaluated Persian-to-English translations produced by two well-known online translators, Google Translate and Bing Translator. A total of 240 texts were randomly selected from different academic fields (law, literature, medicine, and mass media), with 60 texts for each domain. All texts were rendered by the two translation systems and then by four human translators. All statistical analyses were performed using SPSS. The results indicated that Google translations were more accurate than the translations produced by the Bing Translator, especially in the domains of medicine (lexis: 186 vs. 225; semantic: 44 vs. 48; syntactic: 148 vs. 264 errors) and mass media (lexis: 118 vs. 149; semantic: 25 vs. 32; syntactic: 110 vs. 220 errors). Nonetheless, both machines are reasonably accurate in Persian-to-English translation of lexicons and syntactic structures, particularly from mass media and medical texts.

Keywords: machine translations, accuracy, human translation, efficiency

Procedia PDF Downloads 78
5222 Combined Automatic Speech Recognition and Machine Translation in Business Correspondence Domain for English-Croatian

Authors: Sanja Seljan, Ivan Dunđer

Abstract:

The paper presents combined automatic speech recognition (ASR) for English and machine translation (MT) for English and Croatian in the domain of business correspondence. The first part presents results of training the commercial ASR system on two English data sets, enriched by error analysis. The second part presents results of machine translation performed by the online tool Google Translate for the English-Croatian and Croatian-English language pairs. Human evaluation in terms of usability is conducted, and internal consistency is calculated using Cronbach's alpha coefficient, enriched by error analysis. Automatic evaluation is performed using the WER (Word Error Rate) and PER (Position-independent word Error Rate) metrics, followed by an investigation of Pearson’s correlation with human evaluation.
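
As a rough illustration of the two automatic metrics named above, the sketch below computes WER as word-level edit distance divided by reference length, and PER from a bag-of-words overlap that ignores word order. The PER formula used here, (max(reference length, hypothesis length) minus matched words) divided by reference length, is one common formulation and is an assumption; the paper may use a different variant. The function names and example sentences are illustrative only.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER: word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def position_independent_error_rate(reference: str, hypothesis: str) -> float:
    """PER: like WER, but word order is ignored (bag-of-words matching)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


reference = "please send the invoice today"
reordered = "send please the invoice today"
wer = word_error_rate(reference, reordered)                  # two swapped words count as errors
per = position_independent_error_rate(reference, reordered)  # reordering is not penalized
```

The swapped word pair is penalized by WER but not by PER, which is the practical difference between the two metrics.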

Keywords: automatic machine translation, integrated language technologies, quality evaluation, speech recognition

Procedia PDF Downloads 484
5221 A Supervised Approach for Word Sense Disambiguation Based on Arabic Diacritics

Authors: Alaa Alrakaf, Sk. Md. Mizanur Rahman

Abstract:

Over the last two decades, Arabic natural language processing (ANLP) has become increasingly important. One of the key issues related to ANLP is ambiguity. In Arabic, different pronunciations of the same written word may have different meanings. Furthermore, ambiguity also has an impact on the effectiveness and efficiency of Machine Translation (MT). The issue of ambiguity has limited the usefulness and accuracy of translation from Arabic to English. The lack of Arabic resources makes the ambiguity problem more complicated. Additionally, the orthographic level of representation cannot specify the exact meaning of a word. This paper looks at the diacritics of the Arabic language and uses them to disambiguate a word. The proposed approach to word sense disambiguation uses a diacritizer application to diacritize Arabic text and then finds the most accurate sense of an ambiguous word using a Naïve Bayes classifier. Our experimental study shows that using Arabic diacritics with a Naïve Bayes classifier enhances the accuracy of choosing the appropriate sense by 23% and also decreases the ambiguity in machine translation.
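
The classification step described above can be sketched with a small Naïve Bayes classifier written from scratch. This is not the authors' system: the class name, the add-one smoothing, the feature design (a diacritized token plus context words), and the toy training data are all assumptions. The tokens "ilm" and "alam" are simplified transliterations standing in for two diacritized readings of the same undiacritized spelling.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


class NaiveBayesWSD:
    """Multinomial Naive Bayes over bag-of-token features with add-one smoothing."""

    def __init__(self) -> None:
        self.sense_counts: Counter = Counter()
        self.feature_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocab: set = set()

    def fit(self, examples: Iterable[Tuple[List[str], str]]) -> "NaiveBayesWSD":
        for features, sense in examples:
            self.sense_counts[sense] += 1
            self.feature_counts[sense].update(features)
            self.vocab.update(features)
        return self

    def predict(self, features: List[str]) -> str:
        total = sum(self.sense_counts.values())
        best_sense, best_score = "", -math.inf
        for sense, count in self.sense_counts.items():
            # log P(sense) + sum of log P(feature | sense), smoothed
            score = math.log(count / total)
            denom = sum(self.feature_counts[sense].values()) + len(self.vocab)
            for feature in features:
                score += math.log((self.feature_counts[sense][feature] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Hypothetical toy data: the first token is the diacritized (here transliterated)
# form of the ambiguous word, the rest are context words.
training = [
    (["ilm", "study", "university"], "knowledge"),
    (["ilm", "book", "teacher"], "knowledge"),
    (["alam", "country", "raise"], "flag"),
    (["alam", "wave", "pole"], "flag"),
]
model = NaiveBayesWSD().fit(training)
predicted = model.predict(["alam", "country"])
```

Including the diacritized form as a feature is what lets the classifier separate the two senses even when the context words alone are weak evidence.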

Keywords: Arabic natural language processing, machine learning, machine translation, Naive Bayes classifier, word sense disambiguation

Procedia PDF Downloads 358
5220 How Is a Machine-Translated Literary Text Organized in Coherence? An Analysis Based upon Theme-Rheme Structure

Authors: Jiang Niu, Yue Jiang

Abstract:

With the ultimate goal of automatically generating high-quality translated texts, machine translation has made tremendous improvements. However, its translations of literary works are still plagued with problems in coherence, especially in translation between distant language pairs. One cause of these problems is probably the lack of linguistic knowledge incorporated into the training of machine translation systems. In order to enable readers to better understand the problems of machine translation in coherence, to seek out the potential knowledge to be incorporated, and thus to improve the quality of machine translation products, this study applies Theme-Rheme structure to examine how a machine-translated literary text is organized and developed in terms of coherence. Theme-Rheme structure in Systemic Functional Linguistics is a useful tool for the analysis of textual coherence. Theme is the departure point of a clause and Rheme is the rest of the clause. In a text, as Themes and Rhemes may be connected with each other in meaning, they form thematic and rhematic progressions throughout the text. Based on this structure, we can look into how a text is organized and developed in terms of coherence. Methodologically, we chose Chinese and English as the language pair to be studied. Specifically, we built a comparable corpus with two modes of English translation, viz. machine translation (MT) and human translation (HT) of one Chinese literary source text. The translated texts were annotated with Themes, Rhemes, and their progressions throughout the texts. The annotated texts were analyzed in two respects: the different types of Themes functioning differently in achieving coherence, and the different types of thematic and rhematic progressions functioning differently in constructing texts.
By analyzing and contrasting the two modes of translation, it is found that, compared with the HT, 1) the MT features “pseudo-coherence”, with many ill-connected fragments of information joined by “and”; 2) the MT system produces a static and less interconnected text that reads like a list; these two points, in turn, lead to a less coherent organization and development of the MT than of the HT; 3) in contrast to previous studies, Rhemes do contribute to textual connection and coherence, though less than Themes do, and thus are worthy of notice in further studies. Hence, the findings suggest that Theme-Rheme structure be applied to measuring and assessing the coherence of machine translation and be incorporated into the training of machine translation systems, and that Rheme be taken into account when studying the textual coherence of both MT and HT.

Keywords: coherence, corpus-based, literary translation, machine translation, Theme-Rheme structure

Procedia PDF Downloads 207
5219 Degree in Translation and Years of Professional Experience: Predictors of Translation Quality

Authors: Mohsen Varzande

Abstract:

Translators’ professional and academic characteristics may directly influence their translation quality. The present study aimed at investigating whether translators’ degree in translation and years of professional experience predict their translation quality. Following a causal-comparative design, a sample of one hundred professional translators was selected using the purposive sampling method. The participants were divided into two groups: those with and those without a degree in translation. The participants were asked to translate a paragraph to assess their translation quality. For data analysis, appropriate statistical procedures including correlation and regression were used. Results showed that both degree in translation and years of professional experience significantly predict translation quality. Also, the interaction of translators’ years of professional experience and degree in translation significantly affects their translation quality. An implication could be that, besides providing translators with academic knowledge and theories, practical training in translation is necessary as a prerequisite for a competent translator.

Keywords: translation, degree in translation, translation quality, professional experience

Procedia PDF Downloads 432
5218 Design, Analysis and Construction of a 250vac 8amps Arc Welding Machine

Authors: Anthony Okechukwu Ifediniru, Austin Ikechukwu Gbasouzor, Isidore Uche Uju

Abstract:

This article is centered on the design, analysis, construction, and testing of a locally made arc welding machine that operates on 250 VAC with 8 A output taps ranging from 60 VAC to 250 VAC at a fixed frequency, which is of benefit to urban areas, while considering its cost-effectiveness, strength, portability, and mobility. The welding machine uses a power supply to create an electric arc between an electrode and the metal at the welding point. A current selector coil needed for current selection is connected to the primary winding. Electric power is supplied to the primary winding of its transformer and is transferred to the secondary winding by induction. The voltage and current output of the secondary winding are connected to the output terminal, which is used to carry out welding work. The output current of the machine ranges from 110 A for low-current welding to 250 A for high-current welding. The machine uses a step-down transformer configuration to step down the voltage in order to obtain a high current level for effective welding. The welder can adjust the output current within a certain range, which allows the output current to be set properly for the type of welding being performed. The constructed arc welding machine was tested by connecting the work piece to it. Since there was no shock or spark from the transformer’s laminated core and the machine was successfully used to join metals, the test confirmed and validated the design.

Keywords: AC current, arc welding machine, DC current, transformer, welds

Procedia PDF Downloads 181
5217 Comparing Deep Architectures for Selecting Optimal Machine Translation

Authors: Despoina Mouratidis, Katia Lida Kermanidis

Abstract:

Machine translation (MT) is a very important task in Natural Language Processing (NLP). MT evaluation is crucial in MT development, as it constitutes the means to assess the success of an MT system and also helps improve its performance. Several methods have been proposed for the evaluation of MT systems. Some of the most popular ones in automatic MT evaluation are score-based, such as the BLEU score, and others are based on lexical similarity or syntactic similarity between the MT outputs and the reference, involving higher-level information such as part-of-speech (POS) tagging. This paper presents a language-independent machine learning framework for classifying pairwise translations. This framework uses vector representations of two machine-produced translations, one from a statistical machine translation model (SMT) and one from a neural machine translation model (NMT). The vector representations consist of automatically extracted word embeddings and string-like language-independent features. These vector representations are used as input to a multi-layer neural network (NN) that models the similarity between each MT output and the reference, as well as between the two MT outputs. To evaluate the proposed approach, a professional translation and a "ground-truth" annotation are used. The parallel corpora used are English-Greek (EN-GR) and English-Italian (EN-IT), in the educational domain and of informal genres (video lecture subtitles, course forum text, etc.) that are difficult to translate reliably. We tested three basic deep learning (DL) architectures in this schema: (i) fully-connected dense, (ii) Convolutional Neural Network (CNN), and (iii) Long Short-Term Memory (LSTM). Experiments show that all tested architectures achieved better results than some of the well-known basic approaches, such as Random Forest (RF) and Support Vector Machine (SVM).
Better accuracy results are obtained when LSTM layers are used in our schema. In terms of a balance between the results, better accuracy results are obtained when dense layers are used. The reason for this is that the model correctly classifies more sentences of the minority class (SMT). For a more integrated analysis of the accuracy results, a qualitative linguistic analysis is carried out. In this context, problems have been identified with some figures of speech, such as metaphors, and with certain linguistic phenomena, such as paronyms. It remains an interesting question why all the classifiers led to worse accuracy results in Italian than in Greek, given that the linguistic features employed are language independent.

Keywords: machine learning, machine translation evaluation, neural network architecture, pairwise classification

Procedia PDF Downloads 132
5216 Overview of Resources and Tools to Bridge Language Barriers Provided by the European Union

Authors: Barbara Heinisch, Mikael Snaprud

Abstract:

A common, well-understood language is crucial in critical situations like landing a plane. For e-Government solutions, a clear and common language is needed to allow users to successfully complete transactions online. Misunderstandings here may not risk a safe landing but can cause delays and resubmissions and drive up costs. This also holds true for higher education, where misunderstandings can arise due to inconsistent use of terminology. Thus, language barriers are a societal challenge that needs to be tackled. The major means to bridge language barriers is translation. However, achieving high-quality translation and making texts understandable and accessible require certain framework conditions. Therefore, the EU and individual projects take (strategic) actions. These actions include the identification, collection, processing, re-use, and development of language resources. These language resources may be used for the development of machine translation systems and the provision of (public) services, including higher education. This paper outlines some of the existing resources and indicates directions for further development to increase the quality and usage of these resources.

Keywords: language resources, machine translation, terminology, translation

Procedia PDF Downloads 319