But the juicy news came on Day 2. Presentations from Systran, Pangeanic and Google provided news about the development of neural networks applied to machine translation, with a particular emphasis on improvements in neural machine translation in Japanese, and with Human Science reporting on post-editing of output from Google’s NMT API. The consensus was that neural machine translation produces more natural and fluent output than phrase-based MT. However, there are problems, too. Neural machine translation can produce unreliable output when confronted with unusual input or when a strictly literal rendering is desired. On the plus side, neural machine translation seems to be highly adaptable and has the potential to be applied to other natural language tasks.

SDL presented UpLift, a technique similar to their old concordance check that combines words and small subsegment units, and which reminds me a lot of a technique offered by Déjà Vu and Transit in the past. The difference now is that it is automatic, it is applied to all words in a sentence and it shows the translation. The underlying technology is the creation of a glossary “behind” the TM. This is done by building an index (at the end of the day, when the PC is not in use, according to their own recommendation). This is combined with syntactic analysis for Asian languages. The new version “repairs” fuzzy matches automatically if the difference is only a word or two (a feature also offered by our own ActivaTM; a rough sketch of the idea follows below). I found it striking to learn that, according to SDL, people do not bother to re-use and re-train their own engines once created. Its automated training system has not been so successful (perhaps because of data privacy issues, since SDL is, at the end of the day, another LSP).

Mark Seligman gave an overview of speech-to-speech translation, particularly from a Japanese (and generally Asian) perspective, from the first speech-to-speech product by LinguaTec (currently Lingenio) up to 2017. Most of these products were ahead of their time. NEC had a Japanese-English product, but the real watershed came with the Google Translate app, which brought speech translation to the general public. Jibbigo was happening in Europe at the time, too. Sony had a phrase-based app, and there was Phraselator, used by the US military. Mark gave an impressive live Japanese-English speech-to-speech demonstration with his app, SpeechTrans, and stated that “Google-type glasses” with subtitles or similar technology would be available in 3 years, not 300.

Systran's presentation provided a lot of information about their OpenNMT initiative and how they have created a community à la Moses. I would like to write more about the value of this worthy initiative and how it may become a very significant force in a post-Moses world, although SMT systems will have life in them for some time.

The better output produced by neural machine translation in Japanese has prompted a kind of fever and much higher acceptance levels, since phrase-based systems only behaved with a reasonable degree of predictability between close language pairs. Morphologically rich languages such as the Slavic family also proved notoriously hard to automate. Our presentation reported our first results from engines built with identical datasets in French, German, Italian, Spanish, Portuguese, Russian and Japanese, trained both as an SMT system and as a neural network, with astonishing results. Systems built with identical data but in different ways (statistical versus NMT) achieved rankings of "human quality" or "almost human quality" in 80%-90% of the 250 sentences tested, including Russian.
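To make the fuzzy-match “repair” idea mentioned above a little more concrete, here is a minimal sketch of the general technique, not SDL’s or ActivaTM’s actual implementation: when a fuzzy TM match differs from the new source sentence by a single word, the old and new terms are looked up in a subsegment glossary and the stored translation is patched accordingly. The glossary, the sentences and the repair function are all invented for illustration.

```python
# Minimal sketch of fuzzy-match "repair" via a subsegment glossary.
# Illustration of the general idea only, not SDL's or ActivaTM's implementation;
# the glossary and example sentences are invented.
import difflib

GLOSSARY = {"cat": "gato", "dog": "perro"}  # tiny hypothetical EN->ES subsegment glossary

def repair(new_source, tm_source, tm_target):
    """Patch a fuzzy TM match when it differs from the new source by a word or two."""
    src_old, src_new = tm_source.split(), new_source.split()
    repaired = tm_target
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, src_old, src_new).get_opcodes():
        if tag == "replace" and (i2 - i1) == 1 and (j2 - j1) == 1:
            old_term, new_term = src_old[i1], src_new[j1]
            if old_term in GLOSSARY and new_term in GLOSSARY:
                # swap the translation of the old term for that of the new term
                repaired = repaired.replace(GLOSSARY[old_term], GLOSSARY[new_term])
    return repaired

print(repair("I have a black dog", "I have a black cat", "Tengo un gato negro"))
# -> Tengo un perro negro
```

A production system would of course mine the subsegment correspondences from the TM itself (that is what the overnight index is for) rather than rely on a hand-made glossary, but the principle is the same.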
The improvements in neural machine translation in Japanese are real.
A copy of our presentation and results is available on SlideShare: https://www.slideshare.net/manuelherranz/pangeanic-coractivatmneural-machine-translation-taus-tokyo-2017

As Mark had previously done with a speech-to-speech system, Microsoft’s Chris Wendt provided a live test of his speech translator, starting with the Star Trek sample (an alien and a human speaking to each other, each with a different device). The audience had to keep quiet so noise did not have an impact on the translation. Speech translation had been inspired by science fiction, yes, but it was now a reality (the same happened with Jules Verne’s submarines, Around the World in 80 Days, etc.). Microsoft’s neural network can accept English from non-native speakers as input. It works with Indian, French or Spanish accents, but it is not so good with strong German or Russian accents. He introduced TrueText, which deals with hesitations by rendering what you are trying to say without the hesitations, stops, etc., so that the input is more amenable to machine learning.
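To give a feel for the kind of clean-up TrueText performs, here is a toy, rule-based sketch of disfluency removal. Microsoft’s system is learned from data, so this is only an illustration of the idea; the filler list, the clean_transcript function and the example sentence are my own.

```python
# Toy rule-based clean-up of a speech transcript before machine translation.
# Microsoft's TrueText is a learned model; this sketch only illustrates the idea.
# The filler list and example sentence are invented.
import re

FILLERS = r"\b(um+|uh+|er+|you know|I mean)\b"

def clean_transcript(text):
    text = re.sub(FILLERS, "", text, flags=re.IGNORECASE)                # drop filler words
    text = re.sub(r"\b(\w+)( \1\b)+", r"\1", text, flags=re.IGNORECASE)  # collapse repeated words
    return re.sub(r"\s+", " ", text).strip()                             # tidy whitespace

print(clean_transcript("um I I want to uh book a a flight"))
# -> I want to book a flight
```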
There are many potential uses of multilingual speech-to-speech technology: multilingual meetings, schools in the US, and situations where there is one speaker and many listeners. I wonder if this may create an audience of “lazy” language learners? People asked Chris questions in Japanese, Italian and Chinese (verbally) and Chris replied in English, with his answers shown in each language on the monitor. He then switched to his native German (changing the language settings in the device) and the translation was provided as written text on the monitors. He still received questions in Singaporean Chinese, but now the system was translating from his German into Japanese and Chinese. The system slowed down a little, but the leap was still impressive, with a lot of people asking questions. Chris stated that English-Spanish is the best-working combination, as they are syntactically similar languages and there is also a lot of training material.

The last presentation was from Google’s Macduff Hughes, who began by addressing an audience that had already been convinced of the superiority of neural networks for Japanese-English translation. “Last year NMT was a rumor, 6 months ago it was the beginning, and now it is here.” Hughes took Spanish as an example of one of the best language pairs and analyzed how much better and more fluent neural machine translation is in comparison to phrase-based. In several instances SMT got gender wrong because of sentence length, but as a neural system absorbs the whole sentence, it fixes a lot of the small, annoying errors in Spanish, though not all the time. GNMT is not ready to handle tags yet (in fact, no neural system can yet). Moderate amounts of in-domain data can adapt a model; the challenge is that adaptation can be hard to evaluate, as can automating training, stopping and scoring. So in this respect, there is a lot of good work that has already been done in statistical systems that cannot be imported into neural networks so easily, a conundrum faced by all MT developers. Interestingly, Hughes pointed to experiments showing that source sentences meaning more or less the same thing produce similar results, which suggests that a kind of interlingua has been developed. Knowledge can be transferred to chat or other neural network tasks. But interlingua is another story...