TAUS Tokyo Summit: improvements in Japanese neural machine translation are real

Not that business plans are written in stone any longer, but insight from experts is always welcome. The TAUS Tokyo Summit delivered a long-awaited set of good news about improvements, as perceived by human evaluators, in neural machine translation for Japanese. English-Japanese was a notoriously difficult language pair for rule-based machine translation, and statistical machine translation gave many Japanese audiences a really awful experience. It has historically been one of the hardest language combinations to automate. It seems that neural machine translation may be the answer.

Day 1 – Where is the translation industry heading?

Jaap began by summarizing the latest meeting of thought leaders in Amsterdam, who had gathered to brainstorm a potential landscape and priorities for the language industry in the next five years. If machine translation hype was at its peak five years ago with statistical machine translation and all sorts of hybrids, we are now beginning to experience the neural MT hype. But adopters and developers are much wiser. If data was king some years ago, it seems we may not need so much of it in the future.

Datafication was a process that started some years ago after an article called “The Unreasonable Effectiveness of Data” (Alon Halevy, Peter Norvig, Fernando Pereira, 2009, Google). The article argued that, if the aim was to collect data to train machine translation engines and models, the more data the better. The more data we had to teach the algorithms to decide what was best, the better a statistical system would translate. The problem has always been the lack of clarity about copyright issues with translation data. For example, the law on translation ownership differs between the US and Europe. TAUS has been focusing on the development of tools and practical services for the translation industry it serves, such as:

Machine Learning
Quality Dashboard
Machine Translation
Intelligent TM
Interoperability, etc.

The set of services and tools (such as DQF) may soon become industry standards, and they can be used to benchmark and measure productivity in-house and also against other (anonymized) players. DQF is now available as an API and can collect data in real time as translators work, without disturbing them. It is a transparent model, and its reports and statistics can be tracked and benchmarked against other translators.

Jaap mentioned that Europeans are very worried that Google and Microsoft will “fix the problem” and that Europe will be left out of the language technology race, referring to one of his previous articles, “The Brains but not the Guts”. Europe is exporting talent to the US, an army of language scientists who are helping those two giants overcome the language barrier. On the other hand, machine translation has been accepted and it is becoming an API. On a daily basis, output from machines is 500 times bigger than the output from all professional translators put together.

The translation industry is growing but also changing radically. What companies do nowadays is not pure translation any longer but telemanagement, post-editing, transcreation services, project management, crowdsourcing, telemarketing, etc. Translation is datafied. We want to know everything that happens in a translator’s environment so we can accurately measure how many segments are translated, or how many words per hour. Eye movement tracking and word suggestions have been around in academia for some time, but they have now crossed the barrier into commercial MT services. We even track translators’ social graphs, how the weather or the news affects the translator, third-party applications, and how much leverage from previous translations was used. All that information can help us to automate more of project management and improve resource allocation. We are moving to a future where project management will also be automated.
An interesting parallel was drawn between industries when Jaap mentioned that food delivery people do not have a boss, they have an app. All they are interested in is where to pick up the food and where to deliver it. And that is a kind of post-editor. Translation buyers are finding that some vendors send their jobs out to the internet, where freelance translators run them through generic machine translation and post-edit the result. “I only had to do some minor fixes”, said one PM from a leading translation company. The fear is “how long until my client finds out he can do the same?”, that is, how long until translation buyers find out they can post jobs on the internet (via an app, maybe) and pay post-editing rates to cut out project management fees? In short, will everything be handled by robots in the near future? Pay-as-you-go models may change, and users will become more active in the management of terminology, labelling, etc.

The representative from Athena Parthenos created some controversy by stating that creativity will help the industry survive, as creativity is the realm of humans. Mark Seligman agreed, saying that what machine translation cannot do is convey human emotions, which is what marketing is all about. Chris Wendt, from Microsoft, disagreed: “I have seen very creative neural translations”. Another possibility, according to Jaap, was that post-editing will no longer be needed: there will be people behind dashboards and people doing the creative jobs.

Day 2 – Neural machine translation has cracked the language barrier in Japanese

But the juicy news came on Day 2. Presentations from Systran, Pangeanic and Google provided news about the development of neural networks applied to machine translation, with a particular accent on improvements in neural machine translation in Japanese, and Human Science reported on post-editing output from Google’s NMT API. The consensus was that neural machine translation produces more natural and fluent output than phrase-based MT. However, there are problems, too. Neural machine translation can produce unreliable output when confronted with unusual input or when a strictly literal rendering is desired. On the plus side, neural machine translation seems to be highly adaptable and has the potential to be applied to other natural language tasks.

SDL presented UpLift, a technique similar to their old concordance check which combines words and small subsegment units, and which reminds me a lot of an old technique used by Déjà Vu and Transit in the past. The difference now is that it is automatic, it is applied to all the words in a sentence, and it shows the translation. The underlying technology is the creation of a glossary “behind” the TM. This is done by creating an index (at the end of the day, when the PC is not in use, according to their own recommendation). This is combined with syntactic analysis for Asian languages. The new version “repairs” fuzzy matches automatically if the difference is only a word or two (a feature also offered by our own ActivaTM). I found it striking to learn that, according to SDL, people do not bother to re-use and re-train their own engines once created. Its automated training system has not been so successful (perhaps because of data privacy issues, since SDL is, at the end of the day, another LSP).

Mark Seligman gave an overview of speech-to-speech translation, particularly from a Japanese (and, in general, Asian) perspective, from the first speech-to-speech product by LinguaTec (currently Lingenio) up to 2017. Most of these products were ahead of their time.
NEC had a Japanese-English system, but the real watershed came with the Google Translate app, which gave birth to speech translation. Jibbigo was happening in Europe at the time, too. Sony had a phrase-based app, and the Phraselator was used by the US military. Mark gave an impressive live demonstration of Japanese-English speech-to-speech translation with his app, SpeechTrans, and stated that “Google-type glasses” with subtitles or similar technology would be available in 3 years, not 300.

Systran's presentation provided a lot of information about their OpenNMT initiative and how they have created a community à la Moses. I would like to write more about the value of this worthy initiative and how it may become a very significant force in a post-Moses world, although SMT systems will still be around for some time. The better output provided by neural machine translation in Japanese has prompted a kind of fever and much higher acceptance levels, since phrase-based systems only behaved with a high degree of predictability with close language pairs. Morphologically rich languages such as the Slavic family also proved notoriously hard to automate.

Our presentation offered our first results on engines built with identical datasets in French, German, Italian, Spanish, Portuguese, Russian and Japanese, once as an SMT system and once as a neural network, with astonishing results. With identical data but a different approach (statistical versus NMT), the output was ranked as “human quality” or “almost human quality” in 80%-90% of the 250 sentences tested, including Russian. The improvements in neural machine translation in Japanese are real.

A copy of our presentation and results is available on SlideShare: https://www.slideshare.net/manuelherranz/pangeanic-coractivatmneural-machine-translation-taus-tokyo-2017

As Mark had previously done with a speech-to-speech system, Microsoft’s Chris Wendt provided a live test of his speech translator, starting with the Star Trek sample (an alien and a human speaking to each other, each using a different device). The audience had to keep quiet so that noise did not have an impact on the translation. Speech translation had been inspired by science fiction, yes, but it was now a reality (the same happened with Jules Verne's submarines, Around the World in 80 Days, etc.). Microsoft’s neural network can accept accented English from non-native speakers as input. It works with Indian, French or Spanish accents, but it is not so good with strong German or Russian accents. He introduced TrueText, which handles hesitant speech by rendering what you are trying to say without the hesitations, stops, etc., so that the input is better suited to machine learning.

There are many potential uses of multilingual speech-to-speech technology: multilingual meetings, schools in the US and situations where there is one speaker and many listeners. I wonder if this may create an audience of “lazy” language learners? People asked Chris questions verbally in Japanese, Italian and Chinese, and Chris replied in English, which was shown in each language on the monitors. He then switched to his native German (changing the language settings in the device) and the translation was provided as written text on the monitors. He still received questions in Singaporean Chinese, but now the system was translating from his German into Japanese and Chinese. The system slowed down a little, but it remained impressive, with a lot of people asking questions. Chris stated that English-Spanish is the best-working combination, as the two languages are syntactically similar and there is also a lot of training material.

The last presentation was from Google’s Macduff Hughes, who began by addressing an audience already convinced of the superiority of neural networks for Japanese-English translation. “Last year NMT was a rumor, 6 months ago it was the beginning, and now it is here”. Hughes took Spanish as an example of one of the best language pairs and analyzed how much better and more fluent neural machine translation was in comparison to phrase-based. In several instances SMT got gender wrong because of sentence length, but as a neural system absorbs the whole sentence, it fixes a lot of the small, annoying errors in Spanish, though not all the time. GNMT is not ready to handle tags yet (in fact, no neural system can yet). Moderate amounts of in-domain data can adapt a model. The challenges are that output can be hard to evaluate, and that training, stopping and scoring are hard to automate.
So in this respect, there is a lot of good work already done on statistical systems that cannot be imported into neural networks so easily, a conundrum faced by all MT developers. Interestingly, Hughes pointed to experiments showing that source sentences meaning more or less the same thing can produce similar results, which suggests that a kind of interlingua has been developed. Knowledge can be transferred to chat or other neural network language understanding tasks. But interlingua is another story...