Sponsoring EAMT virtual conference 2020

Due to travel restrictions, the conference was held online

The 2020 edition was held virtually from Lisbon

Pangeanic sponsored the European Association for Machine Translation (EAMT) 2020 conference. Due to travel restrictions, the conference was held online, having been postponed from its original summer date in Lisbon to 3-5 November 2020. The conference included a research track, a translator track, a user track and a project-products track. The different tracks allowed the exchange of ideas between researchers, practitioners, machine translation technology developers and the translation industry in general. Pangeanic and its technological division PangeaMT presented three posters:

One poster in the user track, “A User Study of the Incremental Learning in NMT”, in collaboration with the PRHLT research group of the Polytechnic University of Valencia. This industry-academic collaboration carried out a study with professional translators in which engines were trained online, that is, updated as human feedback was provided. The study also reports user opinions on engine improvement. Our research concludes that users of machine translation systems are much happier and produce more words per hour when engines are adapted with their own output, because neural machine translation engines quickly learn to mimic their style.
A poster in the project-products track about our MAPA CEF project, in which an anonymization toolkit is being developed. MAPA stands for Multilingual Anonymization for Public Administrations, a project that Pangeanic leads together with European data organizations and the French National Research Center, among others.
Pangeanic’s last poster, also in the project-products track, described the NTEU CEF project. NTEU (Neural Translation for the EU) will provide the largest-ever farm of direct 1-to-1 neural machine translation engines (not pivoting through English). It covers all EU official language combinations (506 language arcs), including low-resource languages. The consortium, led by Pangeanic, will also provide a large data set of around 15M perfectly clean sentences to European data organizations and repositories such as ELRC under license.