Post-editing machine translation: the views of the experts

Ramping up to his role as moderator of the forthcoming Proz post-editing debate on 24th September, Jeff Allen (Engineering Tools Integration Expert) from SAP exchanged views with Pangeanic's Manuel Herranz as practitioners and implementors of machine translation solutions. Jeff has been a champion of machine translation and post-editing for decades, with hands-on experience of practically every MT technology: rule-based, knowledge-based and statistical machine translation. His practical approach to language technologies and his interest in humanitarian causes led him to deploy a first publicly available Creole MT solution during the Haiti crisis in 2010. (A video, "The successes and challenges of making low-data languages available in online automatic translation portals and software", shows how Jeff was able to create a basic machine translation system for aid relief even with little data.)

It looks like MT has become a "must-have" technology for all language companies. Some, like Pangeanic, decided to use open-source Moses to develop their own flexible and modular systems. Looking back at the years of development, out-of-the-box solutions and customized solutions, Jeff's views are clarifying in several ways. "Systems have now become ready for mass consumption. In the public-facing arena, we dealt with non-customized systems for decades and this, in part, gave post-editing and the whole machine translation experience a bad press."

Regarding the buzz, frenzy and hype about machine translation nowadays, Jeff thinks that "perhaps we started too early marketing it, whatever technology we look at, SMT, rule-based MT, post-editing, building dictionaries ... we just could not wait for the market to mature with the need but for a moment in time. Now, the technology has become visible, its strengths and viability can be proven."
However, the same danger and the same mistake seem to be recurring now with post-editing as they once did with machine translation, and even with translation memory a decade before: lack of customization and preparation. "Google Translate has become the reference for post-editing, and MS Word the tool. That's it. And that's terrible. Both lack the functionalities that can make the whole MT experience successful. The engine is not customized but a generalist (a mistake that builds hopes high with 'ready-made' systems). There is no chance of pre-processing formats or tags, nor of making my own terminology prevail. The same goes for Word: little can be done to spot errors in consistency or terminology, to move words around, etc., apart from search and replace. Thus, translators keep referring to the cheap post-editing jobs being offered in marketplaces, which make things sound like 'do it as before but cheaper'. The problem is that there are few specialists capable of fully customizing systems.

My advice is that the same logic that applies to a translation job offer should apply to a post-editing job offer. Just ask the same questions you would ask an LSP offering you a translation job and you will soon know if the person or company knows what they are doing. If they know the client and the text and have prepared good TMs, glossaries, etc., the project manager will soon give a clear and quick answer and the information you need. The same goes for a post-editing job: if the company has trained the engines, done the homework of customizing dictionaries and applying terminology, and is giving you clear post-editing instructions, then you know they are applying machine translation technology well and the post-editing effort and compensation will be fair."

In short, and with years of translation industry experience behind them, Jeff Allen and Manuel Herranz have a perspective on translator resistance to the technology.
It is not so dissimilar to the resistance towards Translation Memory systems in the 1990s. Eventually, the technological change brought about by machine translation will benefit users and translation consumers as a whole, making translation more and more ubiquitous. What we need are clear guidelines, scoring systems and more experts, as happened with TM systems.
