Understanding Machine Translation Customization and DIY MT

by Manuel Herranz

The mistake that many translation agencies, translation companies and language service providers once made is now being made by aggressive machine translation companies: “My (machine) translation is better than yours”, “my machine translation system works, everybody else’s doesn’t”. Translation companies have learnt that they cannot sell translation services on “translation quality claims” or “I am better than you because…” alone, but it seems that some machine translation companies have yet to learn the same lesson. I am referring particularly to those with risky levels of investment or venture capital to repay and without the testing ground of in-house native speakers or a real translation department in which to test their technologies and MT before release. At times, such companies obtained their “high quality clean data” by bombarding Google Translate and applying cleaning cycles which included manual revision by local, non-native graduates. Many LSPs fall for the big marketing campaigns and strong wording; the limelight is always very attractive. Translation Memory technologies are good proof of that.

Bad-mouthing the competition is the worst marketing tool there is, and I would not recommend it to anybody in sales, marketing or representing a company. Talk about your strengths. Acknowledge what you cannot do, and explain what you can do to solve the problem. If you cannot match some offerings from the competition, saying that they don’t work is a terrible policy. There are dozens of use cases, applications, conferences and presentations to prove that DIY MT, for example, works and is in good health, being used at LSPs, institutions and corporations. As far as I know, automated retraining and Moses packaging are part of at least two EU-funded programs. As organizations such as Gala provide an excellent platform for machine translation webinars, monopolistic attitudes become more and more aggressive.

But I want to minimize self-promotion. What Kirti Vashee seems to forget in his virulent blog entries is that no company will release a tool that doesn’t work, or install a product that cannot do what it claims it can. I was an industrial engineer for enough years to learn at least the difference between what works and what doesn’t. When it comes to hardware tools, quality may be easy to spot. When it comes to services (and in machine translation this is clear: “my output”, “my clients”, “my productivity” and “my technological independence”), quality is what works best for me. Claiming that in 2013 MT is so complex that only one company fully understands it is presumptuous, to say the least.

Let me name some translation agencies (the term Language Service Provider being unknown to the majority of people outside the language industry). They are not big companies; they are what economists call small and medium-sized enterprises.

Tilde, Apsic, Lexcelera, Pangeanic. I am sure at least four others could make it onto this list. What do these companies have in common? All of them were or are translation companies that have transformed themselves into higher-value solution providers, either by developing software that solved particular problems in translation or by integrating customized technology into their processes. With the help of EU funds and a clear vision to fill a market need, Tilde led R&D projects aimed at developing machine translation for less-resourced languages. Automated engine creation and re-training were part of the initial EU-funded project.

Apsic is the developer of one of the best consistency-checking tools (XBench), which is a must for any company wanting to ensure terminology consistency and error-free deliveries across hundreds of files.

Pangeanic has developed a management system on top of Moses which manages training sets, automatically cleans some data, trains engines and creates new ones, with a variety of other customizable features.
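To give a rough idea of what "automatically cleans some data" can mean in practice, here is a minimal sketch of a parallel-corpus filter. It is not Pangeanic's actual implementation; the function name `clean_parallel_corpus` and the thresholds are assumptions chosen purely for illustration. It drops empty or overly long segments, pairs whose lengths differ too much (often a sign of misalignment) and exact duplicates, which are common checks before training a statistical engine.

```python
def clean_parallel_corpus(pairs, min_len=1, max_len=80, max_ratio=3.0):
    """Filter (source, target) sentence pairs with basic sanity checks.

    Hypothetical helper: the name and default thresholds are illustrative
    assumptions, not taken from any specific MT product.
    """
    seen = set()
    cleaned = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        src_len = len(source.split())
        tgt_len = len(target.split())

        # Drop empty or overly long segments on either side.
        if not (min_len <= src_len <= max_len and min_len <= tgt_len <= max_len):
            continue

        # Drop pairs whose token counts differ too much (likely misaligned).
        shorter = max(1, min(src_len, tgt_len))  # guard against division by zero
        if max(src_len, tgt_len) / shorter > max_ratio:
            continue

        # Drop exact duplicate pairs, keeping the first occurrence.
        if (source, target) in seen:
            continue
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


if __name__ == "__main__":
    sample = [
        ("Hello world", "Hola mundo"),
        ("Hello world", "Hola mundo"),         # duplicate
        ("", "vacío"),                         # empty source
        ("One", "uno dos tres cuatro cinco"),  # length ratio too high
        ("  Good morning ", "Buenos días"),
    ]
    for src, tgt in clean_parallel_corpus(sample):
        print(src, "|||", tgt)
```

A real system would add language-specific checks (encoding, tags, language identification) on top of filters like these; the point is only that such steps are repetitive and therefore easy to automate.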

As MT customizers, we know that initially some settings, parameters, weights and features need to be configured carefully to get a good start. But I do not know of any company in the software business that insists on manual processes and cannot automate what it has to do repetitively.



One thought on “Understanding Machine Translation Customization and DIY MT”

  1. K Vashee (@kvashee)

    My primary point in my blog posting is that expertise, long-term experience and a real understanding of how the technology works are necessary and critical to get the best results. Most DIY users do not have these characteristics and thus are very likely to get much lower quality results. Pointing this out is, to my mind, not equivalent to “bad mouthing competition”; I am simply comparing approaches and pointing out the value implications.

    Also while I claim that expertise does matter, I do not suggest that Asia Online is the only company with this expertise. There are several other MT experts including RbMT developers like Systran and a specialist like Tayou in Spain.

    I do believe that MT technology is complex enough that it does require specialization, and that developing real competence with MT is difficult enough that it is unlikely to be successfully done by a company whose primary business is being a translation agency. It is clear that you disagree. I am also pointing out that the value received by a customer is very likely to be lower for a DIY user. I can understand that you may have a different opinion to mine and assure you that my observations are not born of virulence.

    Historically we saw many LSPs develop their own TMS systems too, but most people in the industry would concede that the best TMS systems have come from companies that focus and specialize in the development of these tools, e.g. MemoQ, Memsource, Across and XTM. We have also seen the SDL acquisitions of software companies like Idiom, LW and Trados result in what most perceive as reduced customer responsiveness, quality and commitment to these products. Buying critical production infrastructure from a competitor generally does not make sense in any industry, and thus we have seen the momentum slow down on all the SDL software acquisitions.

    Anyway, I wish you peace and health.


