PangeaMT Webinar on Translation Customization and DIY MT

by Manuel Herranz

Machine translation is a hot topic, and will remain one for some years to come. But it is more than a hot topic surrounded by mystifying hype. Elia Yuste, Andreas Thömel and (I hope) myself came a little closer to clearing up doubts about what a customized machine translation engine is and what DIY machine translation means. Our Gala Webinar aimed to clarify some misconceptions about language companies building their own tools and applying them successfully in the language market. The language industry is very varied, with a few technology players that dominate the landscape and a myriad of smaller tools that fit many purposes, large and small. PangeaMT was born as the technology division serving the needs of a translation company. It now has a life of its own and is a well-respected, mature technology. The presentation is available already […]

Translation Technologies at LocWorld (Part 1: Moses)

by Manuel Herranz

I attended Localization World London both as a guest speaker, on what I call an upsurge in machine translation, almost a “transition frenzy” towards post-editing “future stability”, within the EU-sponsored MosesCore project, organized by TAUS, and as an exhibitor of PangeaMT’s DIY SMT technologies. The session formed part of the Pre-Conference Day and was lively, with plenty of Q&A from attendees, reflecting the high interest MT has stirred among translation users and practitioners. Prof. Hieu Hoang provided a general introduction to what an SMT system is as a translation technology, and to what translation and language models are. He explained the distinction between a translation model and a language model, which uses probabilities of phrases to judge whether the output sentence is grammatically correct, properly re-ordered, etc. Prof. Hoang related the story of how he originally developed Moses to replace Pharaoh and now only maintains it, as […]

Moses is not the new Messiah

by Manuel Herranz

If you run a translation company or translation department, or have some sort of connection with the translation industry, you have without a doubt noticed that MT (or automatic translation) is the flavor of the year in 2010… and will be for many years to come. It has changed and will keep changing the way we do things in this industry. Several factors have contributed: an unstoppable increase in the globalization of services and support, smaller budgets from buyers, an increase in international trading of services, and the need for more content in more languages. As of May 2009, there were 487 billion gigabytes of data, a volume increasing 50% a year (Oracle) or doubling every 11 hours (IBM). There are both exogenous and endogenous factors for things to reach maturity level now and not earlier or later. Among the latter factors we may include the fact that the bases did already […]

How to measure machine-translation quality

Many people have asked me how they can reliably measure or benchmark the quality of their translation system (rule-based, example-based or statistical). They have bought some commercial rule-based software and are trying it out, building dictionaries and normalization rules, or they are having a first try at working with a Moses engine. There are two free systems that can be used on an input/output basis and that will give you an idea of how your system is scoring. Some people use them to test their system against Google Translate (GT), raw MT output or other texts. You can use them, for example, to check how your system is doing in comparison with free GT, Systran online tools, BabelFish, etc. This may give you an idea of your progress as you customize your own tool for a particular application, taking generalist online tools as a basic reference. The tests […]
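
The excerpt above does not name the two scoring tools, so the following is only a rough illustration of how automatic scoring against a reference translation can work. It is a minimal sketch of a simplified, single-reference, BLEU-style score (clipped n-gram precision up to bigrams, plus a brevity penalty). The function names `ngrams` and `simple_bleu` are hypothetical, the whitespace tokenization is a simplifying assumption, and this is not necessarily the metric the post goes on to describe.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU-style score in [0, 1] (illustrative only)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            # No smoothing in this sketch: any empty precision gives 0.
            return 0.0
        log_precisions.append(math.log(matched / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    outputs = {
        "my engine": "the cat sat on mat",
        "generalist tool": "a cat is sitting on the mat",
    }
    # Score each system's output against the same human reference.
    for name, hyp in outputs.items():
        print(name, round(simple_bleu(hyp, reference), 3))
```

Scoring your own engine and a generalist online tool against the same reference translations, as in the usage lines at the bottom, gives a relative comparison rather than an absolute measure of quality. Scores from a single sentence are noisy, so in practice a test set of many sentences is scored together.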