
Translation Technologies at LocWorld

I attended Localization World London in two roles: as a guest speaker in the session organized by TAUS within the EU-sponsored MosesCore project, and as an exhibitor of PangeaMT’s DIY SMT (statistical machine translation) technologies. My talk covered what I call an upsurge in machine translation, almost a “transition frenzy” towards post-editing “future stability”. The session formed part of the Pre-Conference Day and was lively, with plenty of Q&A from attendees, reflecting the high interest MT has stirred among translation users and practitioners.

Prof. Hieu Hoang provided a general introduction to what an SMT system is as a translation technology, and to what translation models and language models are. He explained the distinction between a translation model and a language model, which uses probabilities of phrases to judge whether the output sentence is grammatically correct, properly re-ordered, etc. Prof. Hoang related the story of how he originally developed Moses to replace Pharaoh and now only maintains it, as it has become a community-based project in which he stopped being the largest contributor long ago. He cleared up some misconceptions about Moses:

  • “Only runs on Linux”: NO. Mostly, but it also runs on Windows 7 (32-bit) with Cygwin 6.1 and on Mac OS X 10.7 with MacPorts. On the Linux side: Ubuntu 12.10 (32- and 64-bit), Debian, Fedora and openSUSE.

  • “Difficult to use”: NO. It is now easier to compile and install with Boost bjam, and no installation is required if you use the binaries, which are available for Linux, Mac and Windows (Cygwin). As in the past, it comes with ready-made models trained on Europarl.

  • “Unreliable”: Absolutely not! The community monitors check-ins, there are more and more regression tests, and nightly tests run end-to-end training (see statmt.org/moses/cruise). Moses has been tested on all major OSes. It offers models already trained on Europarl in 8 languages, and they work.

  • “Only phrase-based”: NO. From the beginning, it has been an extension of Pharaoh, not merely a replacement.

  • “Developed by one person”: NO. It is developed by a community!

  • “Slow”: Really? It is fast enough and surpasses previous engines; we are talking about milliseconds to produce translations. Thanks to Ken Heafield, it is now multithreaded, and disk I/O and disk space requirements have been reduced.
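
To make the translation model / language model distinction from Prof. Hoang's introduction more concrete, here is a minimal toy sketch in Python. It is not Moses code and does not use any Moses API: the source phrase ("la casa blanca"), the candidate outputs, all probabilities and the names `TM_PROBS`, `BIGRAM_PROBS`, `lm_logprob` and `score` are invented for illustration. A real SMT decoder combines many more weighted features; this only shows how a language model can overrule a translation model on word order.

```python
import math

# Toy translation model: P(candidate | "la casa blanca").
# The numbers are invented; here the word-for-word order scores higher.
TM_PROBS = {
    "the white house": 0.4,
    "the house white": 0.6,
}

# Toy bigram language model: P(word | previous word). Numbers are invented.
BIGRAM_PROBS = {
    ("<s>", "the"): 0.5,
    ("the", "white"): 0.2,
    ("white", "house"): 0.3,
    ("the", "house"): 0.3,
    ("house", "white"): 0.01,
}
UNSEEN = 1e-4  # crude floor for bigrams the toy model has never seen


def lm_logprob(sentence):
    """Log-probability of the sentence under the toy bigram model."""
    words = ["<s>"] + sentence.split()
    return sum(
        math.log(BIGRAM_PROBS.get(pair, UNSEEN))
        for pair in zip(words, words[1:])
    )


def score(candidate):
    """Translation-model log-probability plus language-model log-probability."""
    return math.log(TM_PROBS[candidate]) + lm_logprob(candidate)


def best_translation():
    """Pick the candidate with the highest combined score."""
    return max(TM_PROBS, key=score)


if __name__ == "__main__":
    print("Translation model alone prefers:", max(TM_PROBS, key=TM_PROBS.get))
    print("With the language model added:  ", best_translation())
```

In this sketch the translation model on its own favours the literal word order, while the language model penalises the unlikely bigram "house white", so the combined score selects the properly re-ordered sentence.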