19/09/2022

Where are we at with Neural Machine Translation?

What is Neural Machine Translation?

Neural Machine Translation (NMT) is the new approach to machine translation. NMT works with an end-to-end architecture that aims to train all the components simultaneously to maximize its performance. The architecture takes into account the full sentence as a context, which enables it to achieve a fluent translation.

Content

1. What is Neural Machine Translation?

2. The Rise of Neural Machine Translation

3. How does neural machine translation work?

4. Which Type of Machine Translation do I need?

5. Has Neural Machine Translation Achieved Human Parity?

6. Is neural machine translation useful to translate literary texts?

7. The quality of neural machine translation

8. The main advantages of Neural Machine Translation

9. Technological advances in Neural Machine Translation

The Rise of Neural Machine Translation

The evolution of neural machine translation (NMT) from its origins to today: How did NMT become a game changer for doing business globally?

In 1954, during the Georgetown-IBM experiment, New York witnessed the first demonstration of an automatic language translation machine, which converted brief statements about fields such as politics, law, chemistry, and military affairs from Russian into English.

Machine translation (MT) has changed greatly over the course of its development, from systems that required hours or days of computing time to produce a translation of dubious quality, to the current neural machine translation (NMT) systems that can process the same content in mere seconds and with much more accuracy.

How does neural machine translation work?

Neural machine translation uses neural networks to translate a text from the source language to the target language. These networks can handle very large data sets and require little supervision. There are two types of neural networks in translation systems: an encoder network and a decoder network.

What do we mean by neural network?

A neural network is an interconnected series of nodes loosely modeled on the human brain: incoming information passes through the nodes and comes out again in transformed form. In translation, the encoder and decoder networks are combined in a structure called a "sequence-to-sequence neural network" (Seq2Seq). It works by reading a sentence in the source language and producing the corresponding sentence in the target language.
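To make the encoder and decoder idea more concrete, here is a minimal sketch of a sequence-to-sequence forward pass in plain Python. Everything in it is an assumption made for illustration: the tiny vocabularies, the hidden size, the simple recurrent step and the random, untrained weights. It only shows how information flows (the encoder folds the source sentence into a context vector, and the decoder emits target words one by one), so the printed words are not a meaningful translation.

```python
import math
import random

random.seed(0)  # fixed seed so the toy run is repeatable

HIDDEN = 8  # size of the hidden state (arbitrary choice for this sketch)
SRC_VOCAB = ["<eos>", "the", "cat", "sleeps"]   # hypothetical source vocabulary
TGT_VOCAB = ["<eos>", "el", "gato", "duerme"]   # hypothetical target vocabulary


def rand_matrix(rows, cols):
    """Small random weights; a real system would learn these from parallel text."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_hid, x, h):
    """One recurrent step: new_state = tanh(W_in * x + W_hid * h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_hid, h))]


src_emb = rand_matrix(len(SRC_VOCAB), HIDDEN)   # one vector per source word
tgt_emb = rand_matrix(len(TGT_VOCAB), HIDDEN)   # one vector per target word
enc_in, enc_hid = rand_matrix(HIDDEN, HIDDEN), rand_matrix(HIDDEN, HIDDEN)
dec_in, dec_hid = rand_matrix(HIDDEN, HIDDEN), rand_matrix(HIDDEN, HIDDEN)
out_proj = rand_matrix(len(TGT_VOCAB), HIDDEN)  # hidden state -> score per target word


def encode(tokens):
    """Encoder: read the source sentence word by word into one context vector."""
    h = [0.0] * HIDDEN
    for tok in tokens:
        h = rnn_step(enc_in, enc_hid, src_emb[SRC_VOCAB.index(tok)], h)
    return h


def decode(context, max_len=5):
    """Decoder: start from the context vector and greedily pick one word per step."""
    h, prev, output = context, 0, []  # index 0 (<eos>) doubles as the start symbol
    for _ in range(max_len):
        h = rnn_step(dec_in, dec_hid, tgt_emb[prev], h)
        scores = matvec(out_proj, h)
        prev = scores.index(max(scores))
        if TGT_VOCAB[prev] == "<eos>":
            break
        output.append(TGT_VOCAB[prev])
    return output


translation = decode(encode(["the", "cat", "sleeps"]))
print(translation)
```

In a trained system the weights would be learned from large amounts of parallel text, and production engines generally use more elaborate architectures than this single recurrent step, but the division of labor between encoder and decoder is the same.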

What kind of neural machine translation do I need?

Neural machine translation quality can be improved by having human translators proofread and edit the output after the first step of computer processing. This post-editing step is what makes your final translation accurate and reliable, with a human touch that a machine cannot provide.

We offer two types of editing services for our machine translation projects:

Customized machine translation: If you need to translate a large amount of data and you are going to use a lot of machine translation, we recommend training the neural engines from scratch. We can extract data from your field to create a specialized engine covering many linguistic areas.
Deep Adaptive machine translation: Part of any of our general engines that cover most areas of expertise. This type of translation is highly recommended for language service providers and corporate clients at peak production times or on a long-term basis.

Has Neural Machine Translation Achieved Human Parity?

Recently, Google, Microsoft, and SDL have argued that Neural Machine Translation (NMT) has achieved human translation parity, in “Google’s Neural Machine Translation System: Bridging the gap between human and machine translation”, “Achieving human parity on automatic Chinese to English news translation” and “SDL cracks Russian-to-English translation”, respectively.

A recent paper accepted at the EMNLP 2018 conference reports experiments comparing neural machine translations with human translations. The task consists of ranking 55 documents and 120 sentences from the WMT 2017 Chinese–English test set. The documents and sentences are evaluated in monolingual (only target language text) and bilingual (both source and target language text) conditions. The raters are professional translators with at least three years of experience and positive client reviews.

For the monolingual condition, the authors recruited 5 translators who were native speakers of English. For the bilingual condition, they recruited 2 translators native in Chinese, 1 translator native in English and 1 translator native in both English and Chinese. In the monolingual condition, translators preferred the human-produced text over the machine-produced text for both sentences and documents. In the bilingual condition, the translators' ratings demonstrated a significant preference for human translation over machine translation when evaluating documents.

However, when evaluating isolated sentences, the translators showed no preference, so machine translation achieves parity with human translation at that level. This is undoubtedly a good finding. NMT quality is impressive, but there are two important aspects to consider.

The first is that the authors are cautious about concluding too much: the results could suggest that MT performs better in adequacy than in fluency. In addition, MT evaluation is probably more favorable when the majority of translators are native speakers of the source language.

The second is that evaluating at sentence level can be insufficient, because the textual, cultural and other contexts are unknown, and these elements have to be taken into account in order to really understand the translation. These findings confirm the need to continue researching at document level, as recent works have done.
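The difference between a "significant preference" and "no preference" can be made concrete with a small calculation. One common way to test paired preferences like these is a two-sided sign test. The sketch below assumes that method for illustration, and the counts are made up for the example, not taken from the study.

```python
from math import comb


def sign_test_p_value(prefer_human, prefer_machine):
    """Two-sided exact sign test; ties are assumed to have been dropped beforehand."""
    n = prefer_human + prefer_machine
    k = min(prefer_human, prefer_machine)
    # Probability of a split at least this lopsided if raters had no real
    # preference, i.e. each judgment is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, for illustration only.
p_lopsided = sign_test_p_value(prefer_human=30, prefer_machine=10)
p_even = sign_test_p_value(prefer_human=50, prefer_machine=50)

print(f"30 vs 10 judgments: p = {p_lopsided:.4f}")
print(f"50 vs 50 judgments: p = {p_even:.4f}")
```

A small p-value (conventionally below 0.05) means the split is unlikely to be chance, which is what a significant preference refers to. An even split gives a p-value of 1, which is the "no preference" situation described for isolated sentences.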

By extending the context to document level, machine translation will be able to improve the coherence and cohesion of the translated text. Document-level NMT can avoid some errors that are impossible to detect at sentence level, such as gender agreement across sentences.
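One simple way to give a sentence-level engine some document context is to prepend the preceding sentences to each input. The sketch below shows only that preprocessing step. The window size and the `<sep>` separator are assumptions chosen for illustration, and real document-level systems may handle context quite differently.

```python
def with_document_context(sentences, window=2, sep=" <sep> "):
    """Pair each sentence with up to `window` preceding sentences as context."""
    inputs = []
    for i, sentence in enumerate(sentences):
        context = sentences[max(0, i - window):i]  # earlier sentences, if any
        inputs.append(sep.join(context + [sentence]))
    return inputs


document = [
    "The doctor arrived late.",
    "She apologized to the patients.",
]

for model_input in with_document_context(document):
    print(model_input)
```

Translated on its own, the first sentence gives no clue about the doctor's gender, which matters in languages where "doctor" is gendered. With the context attached, the second input carries both sentences, so a system has the information it needs to keep the gender consistent.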

Is neural machine translation useful to translate literary texts?

The literary translation market is growing due to the use of electronic books. In recent years, the sales of electronic books have doubled worldwide. Nowadays, it is easier to read a book on any device or even listen to audiobooks.

Translation is obviously growing in this market as well. However, translating literary texts requires creativity that machines cannot provide, for example when dealing with untranslatability, metaphors or idioms. This is the most challenging scenario for machine translation. Although Neural Machine Translation (NMT) has improved translation performance by taking the whole sentence into account as context, literary texts are still difficult to translate automatically.

To find out how far machine translation can go in the literary domain, Dr. Antonio Toral and Prof. Andy Way presented a study in which 12 novels were translated from English to Catalan with NMT systems:

Auster’s Sunset Park (2010)
Collins’ Hunger Games #3 (2010)
Golding’s Lord of the Flies (1954)
Hemingway’s The Old Man and the Sea (1952)
Highsmith’s Ripley Under Water (1991)
Hosseini’s A Thousand Splendid Suns (2007)
Joyce’s Ulysses (1922)
Kerouac’s On the Road (1957)
Orwell’s 1984 (1949)
Rowling’s Harry Potter #7 (2007)
Salinger’s The Catcher in the Rye (1951)
Tolkien’s The Lord of the Rings #3 (1955)

English and Catalan, which come from different language families, were chosen in order to make the task more challenging. Also, Catalan is a mid-size European language, which means that there are resources available to train a system, but not as many as for major European languages like Spanish, French, German or Italian.

The NMT system was trained on 133 novels translated from English to Catalan and 1,000 books written in Catalan. The translations of 3 books were manually ranked by native Catalan speakers, who compared the human translation with the NMT output. For 2 books, the NMT system obtained quality equivalent to the human translations in around a third of the cases.

The quality of neural machine translation

The quality of neural machine translation depends on a large number of factors, regardless of the tool chosen. The language pair, the amount of training data, and even the volume and type of texts to be translated must all be taken into account. The more translations a model has been trained on for a specific domain and language, the better the quality of the final translations.

Over the years, machine translation has increased in popularity and, as a consequence, it has been necessary to research and improve the technology. Being aware of the available tools and knowing which one is the best for the type of translation you need is essential in order to achieve optimal quality.

At Pangeanic, we have near-human quality machine translation for different uses. Our years of experience in the translation industry have provided us with sufficient training data to enable our engines to deliver quality translations of large quantities of documents in record time.

The main advantages of neural machine translation

Neural machine translation offers many advantages for businesses, as it allows large amounts of text to be translated into different languages in a reduced time frame, which is essential in the digital age of immediacy.

The first machine translation tools revolutionized the market, but with the arrival of neural machine translation, the field of computer-assisted translation has been completely transformed, giving rise to a more accurate and interesting tool for those who require it.

Some of the benefits of this tool include:

Accurate translations: Neural machine translation engines are trained on increasingly large data sets and, by using linguistic modeling, are able to contextualize words and phrases to produce accurate and fluent translations.

Fast learning: Neural networks can be trained quickly using automated processes.

Easy and flexible integration: One benefit of using this tool is that it can be integrated via API and SDK into any software and can be applied to many content file formats.

It is customizable: Depending on the content to be translated, the model can be updated to adapt it to consumer demand through terminology databases, specific glossaries and other data sources to improve results.

Cost-effective: Human translation is time-consuming and can be costly, especially in projects involving large numbers of words and languages. Neural machine translation makes it possible to produce translations at a fraction of the cost, and you can always rely on human translators to take care of machine translation post-editing.

It is scalable: When translation processes need to be expanded, neural machine translation makes it possible to meet the increased demand, easily and quickly.
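The customization point above mentions terminology databases and glossaries. A simple quality check built on a glossary can be sketched as follows. The function name, the glossary entries and the plain substring matching are all assumptions made for illustration, and a real workflow would need to handle inflected forms and word boundaries.

```python
def missing_glossary_terms(source, translation, glossary):
    """Return (source_term, required_target_term) pairs not honored in the translation."""
    src, tgt = source.lower(), translation.lower()
    return [
        (s_term, t_term)
        for s_term, t_term in glossary.items()
        # Flag a term only if it occurs in the source and its required
        # rendering is absent from the translation.
        if s_term.lower() in src and t_term.lower() not in tgt
    ]


# Hypothetical English-to-Spanish client glossary.
glossary = {
    "invoice": "factura",
    "purchase order": "orden de compra",
}

issues = missing_glossary_terms(
    source="Please attach the invoice to the purchase order.",
    translation="Adjunte la factura al pedido.",
    glossary=glossary,
)
print(issues)
```

Here the check flags "purchase order", because the translation uses "pedido" instead of the client's preferred term. Flagged segments could then be routed to a human post-editor.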

Technological advances in Neural Machine Translation

Technology has improved machine translation performance in the literary domain, but the success rate is still low, so a great deal of human review is required, as mentioned in a previous post. The authors are planning to investigate whether NMT can be useful to assist human translators in the translation of literary texts, measuring both effort and quality.

New approaches and data collection will improve these results. There is a lot of research going on to achieve a competitive rate in the literature domain. One day, machine translation will be ready for that, but it will take some time.

Amando Estela