TMX filter

Hybrid Machine Translation Use Case at TAUS Tokyo Forum

by Hirokazu Suzuki and Manuel Herranz

The second TAUS Tokyo Forum was held on April 19th and 20th, 2012 at the Aoyama Centre, hosted by Oracle Japan. The Forum had to be cancelled last year as a result of the earthquake disaster that hit Japan on March 11th, 2011; otherwise, this year would have been its third edition. All of the participants from Japan were delighted to be able to take part in the forum again. The main topic of the TAUS Executive Forum was use cases of MT technologies and innovative business models. Dieu Tran (Cisco) and Alan Chung (SDL), who received a TAUS award for the best use case this time, talked about their integrated MT/TM system, which makes effective use of SMT. Suguru Sakanishi (Yaraku, Inc.) and Miori Sagara (Baobab) presented their collaborative translation platforms, which combine MT technology and human resources. Crowdsourcing translation with MT technologies […]

Pangeanic’s participation in TAUS Copenhagen 2010

by Elia Yuste

For the last year or so, TAUS has been tracking the experiences of companies pioneering a radically new space in MT engine training. Pangeanic is one of the most outstanding cases, and earlier this year we were presented as the first LSP to create a new business stream with TAUS Data Association (TDA) data. Then PangeaMT, Pangeanic's technological division devoted to customized MT solutions and consulting, was invited to take part in the proof of concept of the TAUS MT Trainer and to present its results at the TAUS Executive Forum in Copenhagen in late May 2010. The idea behind this MT Trainer, a web-based facility from TAUS TDA that will materialize within the current year, is twofold: first, to foster proactive adoption of TDA data for MT engine training; and second, to connect MT service commissioners and providers under the TAUS umbrella, whereby the former may submit their data files (reference files for engine training […]

Comment to SDL’s “Sharing Data between Companies – is it the Holy Grail?”

by Manuel Herranz

Eye-openers about data sharing (or data mixing) abound nowadays. The kick-start for TM leveraging, automation and faster solutions has come from outside our beloved language industry, in the shape of two things: algorithms that create language (SMT), together with their application and commercial use by players inside and outside the industry (from Google Translate to new MT entrants and offshoots); and a credit crunch and financial crisis that is leading companies to rethink the unthinkable. On a few exceptional occasions, language professionals have joined forces to actually innovate and come up with something really new, mostly crowdsourcing, in translation, in frameworks, in workflows. Even so, busy people seldom have the time to innovate. It takes a shot from outside a particular industry to shake its foundations or to force things to change. (Let's assume, from a positivist point of view, that change is for the better.) […]

How to measure machine-translation quality

Many people have asked me how they can reliably measure or benchmark the quality of their translation system (rule-based, example-based or statistical). They have bought some commercial rule-based software and are trying it out, building dictionaries and normalization rules, or they are having a first try at what it means to deal with a Moses engine. There are two free systems which can be used as input/output and that will give you an idea of how your system is scoring. Some people use them to test their system against Google Translate, raw MT output or other texts. You can use them, for example, to check how your system is doing in comparison with free GT, Systran online tools, BabelFish, etc. This may give you an idea of your progress as you customize your own tool for a particular application, taking generalist online tools as a basic reference. The tests […]
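The excerpt is cut off before it names the two free systems, so the sketch below does not reproduce either of them. It is only a minimal illustration, under our own assumptions, of the kind of automatic score such benchmarks typically report: a simplified BLEU-style n-gram overlap between a system's output and a human reference translation. The function names (`ngrams`, `modified_precision`, `simple_bleu`) and the example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Share of candidate n-grams also found in the reference (counts clipped)."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def simple_bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU: one reference, no smoothing (assumption)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    precisions = [modified_precision(cand, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    # Geometric mean of the n-gram precisions.
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_avg)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    own_engine = "the cat sat on a mat"      # hypothetical customized engine output
    generic_tool = "a cat is on a mat"       # hypothetical generic online tool output
    print("own engine:  ", round(simple_bleu(own_engine, reference), 3))
    print("generic tool:", round(simple_bleu(generic_tool, reference), 3))
```

Scoring your own engine and a generic online tool against the same reference translations, as in the usage lines at the bottom, gives the kind of side-by-side comparison the post describes. Real evaluations use a full test set and an established implementation rather than a hand-rolled score like this one.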