What Are the Main Advantages and Disadvantages of Machine Translation | Pangeanic

Written by Manuel Herranz | 08/23/09

What is machine translation?

Machine translation refers to the use of intelligent software and technology to translate texts from one language to another without the need for human intervention. Originally, these systems were based on rules that used grammatical structures and bilingual dictionaries.

Over time, and in order to achieve a faster and more accurate process, statistical systems were developed. These were based on the analysis of enormous amounts of human translations and the subsequent use of statistical methods to generate a translation result.

Today, neural machine translation systems rely on neural networks and deep learning to achieve an automated, continuous learning model and very high-quality results.

In fact, a study published last month by CSA Research predicts that current trends in artificial intelligence development will result in machine translation that can fully understand and adapt translations in a receptive way based on context, metadata and usage scenarios.



Keep reading to find out what the advantages and disadvantages of machine translation are!

Advantages and disadvantages of machine translation

In this article, we analyze the advantages and disadvantages of machine translation, as well as its place in the current and future localization market.

The advantages of machine translation

Accuracy: For well-defined, repetitive content, machine translation can be more consistent than human translation. This is because the text is processed by an algorithm that was specifically designed to translate and is not affected by a translator’s mood or varying language proficiency. With a well-trained engine, the result is a translation that stays close to the original meaning of the source text.

Speed: Machine translation can process large amounts of text in a very short amount of time, making it ideal for translating documents such as websites, articles and books.

For example, Google Translate translates between 103 languages in less than 0.1 seconds (the same speed as speaking) and Facebook uses MT technology to translate its 300 million words of posts every day into more than 100 languages at once.

Versatility: While many machine translation engines have problems with context or vocabulary usage, they do have advantages when translating between languages that don’t share common roots or are not commonly spoken today (such as ancient texts).

This is because MT engines are not constrained by human limitations – they can process words and grammar that they have not yet learned or used. They also have the ability to adapt to changes in usage over time, which means they can improve their translations as more input data is provided.

For example, Google Translate uses a recurrent neural network model that can “learn” new words as they are encountered in different contexts. This means it can keep up with the ever-changing nature of language, which is important for translating ancient texts or dialects that aren’t commonly spoken today.

Privacy: Remote working due to COVID-19 has increased the costs related to security breaches: according to an IBM report published this year, breaches in which remote work was a factor cost USD 1.07 million more on average.

An automated translation process involves fewer contact points and a shorter transit time, so fewer people gain access to private data. As a result, machine translation gives organizations greater control over data governance.

In addition, some machine translation features, such as anonymization, allow sensitive or confidential content to be translated in a way that protects personal data and safeguards the company's privacy and reputation.

Flexibility and customization: Analysis tools can supply machine translation engines with previously translated phrases and concepts, taking advantage of a “translation memory” that can be customized for each specific user.

Depending on the exact content or its priority, machine translation can perform different quality controls, such as glossary compliance or numerical consistency checks.

These options are just examples of the flexibility and customization that machine translation provides for each individual organization.
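
The two quality controls mentioned above, glossary compliance and numerical consistency, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the glossary entries, the function names and the example sentences are hypothetical, and a production check would need proper tokenization and locale-aware number handling.

```python
import re

# Hypothetical glossary: source term -> required target term (illustrative only).
GLOSSARY = {"torque wrench": "llave dinamométrica", "gasket": "junta"}

# Matches integers and decimals written with either "." or "," as separator.
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)?")


def glossary_violations(source, target, glossary):
    """Return source terms whose required target term is missing from the translation."""
    src, tgt = source.lower(), target.lower()
    return [term for term, required in glossary.items()
            if term in src and required.lower() not in tgt]


def numbers_consistent(source, target):
    """Check that both segments contain the same numbers.

    Separators are normalized so that "2.5" and "2,5" compare equal.
    """
    def extract(text):
        return sorted(n.replace(",", ".") for n in NUMBER_RE.findall(text))
    return extract(source) == extract(target)


source = "Tighten the gasket bolts to 2.5 Nm with a torque wrench."
good = "Apriete los pernos de la junta a 2,5 Nm con una llave dinamométrica."
bad = "Apriete los pernos de la junta a 25 Nm con una llave."

print(glossary_violations(source, good, GLOSSARY), numbers_consistent(source, good))
print(glossary_violations(source, bad, GLOSSARY), numbers_consistent(source, bad))
```

The second candidate translation fails both checks: the required term for "torque wrench" is missing and the torque value has changed. Checks of this kind flag segments for human review rather than correct them.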

Want to know more about the advantages and disadvantages of machine translation? Keep reading!

 

The disadvantages of machine translation

While it's true that machine translation has come a long way and is getting better, it still has its limitations. For example:


  • Machine translation can be pretty bad at translating poetry.

  • Machine translation can be pretty bad at translating jokes, especially if the joke requires knowledge of local customs or history.

  • Machine translation can be pretty bad at translating idioms, because an idiom is an expression whose meaning depends on its use within the language system of which it is a part (e.g., "spill your guts"). Idioms often have no direct equivalent in another language, so there's no way for a computer program to know what they mean or how to translate them, at least not yet! But even when an idiom does have an equivalent in another language, computers still have trouble putting together a coherent sentence, because they can't yet understand figurative meanings or connotations such as sarcasm or irony very well.


Real cases demonstrating the advantages and disadvantages of machine translation

We spoke with Manuel Herranz in order to observe the advantages and disadvantages of machine translation in a real-life situation.

The problem

I recently received an email from a company I am trying to introduce to the advantages of machine translation. They deal mostly with a closed environment and the source language is English (though poor at times). They are a perfect candidate for automation, as they deal mostly with user manuals and controlled documentation.

The comment in question was:


“I understand the need to share TMs for translation, however, I still have doubts about machine translation as the quality cannot yet apply to real jobs when dealing with the Japanese language. Even though the technology is highly developed among European languages, people like us who do not understand European languages still worry about the quality, as we cannot judge it. This is my honest impression. It sounds like a simple question, but it is a very important one!”

 

The solution

Creating a solution that works for everything is out of the question (for now). Many have tried to climb that mountain only to fall in the attempt. Recently, Google has attempted such a solution with its Google Translator Toolkit, and it works to a certain extent. It is particularly useful for general information and gisting. I do know it has become a reference tool for many linguists (novices who lack knowledge and experts whose brains are too full of information or who just can't remember). It is much quicker to ask GT than to check terminology on the EU's official IATE website, for example.


The expert’s opinion

GT works quite well if you are trying to translate an EULA, for example. The quality is rather good, as it has plenty of aligned material from companies' websites. For many other areas, the results vary from "good enough" (i.e. usable with some post-editing) to "gisting" quality to purely unacceptable. GT has been extremely valuable to me when I have needed to know what was being said in certain documents in Polish, Chinese, Russian or Japanese. It wasn't a professional translation, but hey, it was free, and most importantly it was there when I needed it and served an invaluable gisting purpose. It was the difference between not knowing and knowing, even if the quality was poor or mechanically formulated.





Serious machine translation is a rather different concept. I favor statistical over rule-based systems for many reasons. SMT (statistical machine translation) is based on logic and maths. Given that a language normally has between 10,000 basic words (as in German, where the rest are compound words) and around 30,000 (the vast majority of languages), one might guess that with a 2M-word corpus everything that language has to say has already been said. This is not so, as there are numerous repetitions, changes in meaning, technical words, set expressions, etc. One can reach 2M words without ever conjugating every verb in all its forms. But a large chunk will be there; at least the words we use for 90% of our daily communications. You can build on this to create a model with which a machine can compute matches.



 


 

If you reduce the scope of your expectations (I only want electronics/automotive/legal/agricultural/physics domains), then you can be more precise. You need texts that deal with each domain and can, for example, disregard words and texts that deal with "butter," "international relations," "motorbike instructions," "coffee" or "fishing rights" when constructing a model for electronics. "Motorbike instructions" would be fine for a computer-based model dealing with engineering, for example. Moses, the best state-of-the-art open-source engine, estimates how likely a given source word is to be translated as target word X by applying a set of equations that match word occurrence (the number of times a foreign word X appears every time source word Y does). It works wonderfully. And the more source material is used during training, the better.
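
The word-occurrence idea can be illustrated with a deliberately simplified sketch. This is not how Moses is implemented (it relies on iterative word-alignment models and phrase tables); the function names and the three-sentence corpus below are made up for illustration. The sketch only counts, for each source word, the fraction of sentence pairs containing it that also contain a given target word.

```python
from collections import Counter, defaultdict


def cooccurrence_scores(parallel_corpus):
    """Score target words by how often they appear alongside each source word.

    score[s][t] = (pairs containing both s and t) / (pairs containing s)
    """
    source_counts = Counter()
    pair_counts = defaultdict(Counter)
    for source_sentence, target_sentence in parallel_corpus:
        source_words = set(source_sentence.lower().split())
        target_words = set(target_sentence.lower().split())
        for s in source_words:
            source_counts[s] += 1
            pair_counts[s].update(target_words)
    return {
        s: {t: c / source_counts[s] for t, c in targets.items()}
        for s, targets in pair_counts.items()
    }


def most_likely_target(scores, source_word):
    """Return the target word with the highest co-occurrence score."""
    candidates = scores[source_word.lower()]
    return max(candidates, key=candidates.get)


# Toy, made-up electronics corpus (English -> Spanish), for illustration only.
corpus = [
    ("the resistor", "la resistencia"),
    ("a resistor", "una resistencia"),
    ("the board", "la placa"),
]
scores = cooccurrence_scores(corpus)
print(most_likely_target(scores, "resistor"))  # resistencia
```

With only three sentence pairs, "board" is still ambiguous between "la" and "placa", which hints at why more in-domain training material gives better results.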





My colleague (not new to translation, but fairly uncomfortable with the concept of machine translation) mentioned regulating (controlling) the way the source material behaves. This can work well, and it is true for rule-based models (such as Systran), which work on the basis that A is always 1, B is always 2, C is always 3, and therefore AB must be 12 and CB 32, etc. Controlling the input helps with statistical models, but it is not a necessity, as millions of words are computed every time, in every sentence, i.e., the "correspondence" equations are applied to each sentence throughout the 2M-3M-4M-5M-word corpus. This is how Google Translate behaves and how our machine is learning to behave within specific language domains.
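
The rule-based analogy (A is always 1, B is always 2, C is always 3) can be written as a toy lookup. This is a minimal sketch of the analogy only, with hypothetical names; real rule-based engines apply grammatical rules and dictionaries, not single-symbol tables.

```python
# Hypothetical symbol-level rules mirroring the A = 1, B = 2, C = 3 analogy.
RULES = {"A": "1", "B": "2", "C": "3"}


def rule_based_translate(text, rules):
    """Translate symbol by symbol; fail loudly on input the rules do not cover."""
    try:
        return "".join(rules[symbol] for symbol in text)
    except KeyError as missing:
        raise ValueError(f"no rule for symbol {missing}") from None


print(rule_based_translate("AB", RULES))  # 12

try:
    rule_based_translate("AD", RULES)
except ValueError as error:
    print(error)  # the rules have no entry for "D"
```

The failure on "AD" is the point: a fixed rule set only works when the input stays within what the rules anticipate, which is why controlling the source material matters so much for rule-based systems.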

Advantages and disadvantages of machine translation: conclusions

In short, the trade-off between the advantages and disadvantages of machine translation is resolved by incorporating a human point of contact into the translation and localization workflow. Translators are essential when it comes to the cultural nuances, brand cohesion and grammatical errors that machines cannot handle. They add that touch of localization that will help companies maintain the consistency and integrity of their brand while expanding around the world.

Pangeanic's approach combines the highest levels of flexibility, control, customization and customer service with state-of-the-art technology to provide accurate, near-human machine translation. Want to try our technology? Talk to us or, better yet, request a demo.