Pangeanic professionals are collaborating in this new edition by carrying out the human translation of the Spanish, Catalan and Portuguese development and test files for the machine translation competition on Similar Language Translation for the
WMT 2020 conference. Their work is aimed at evaluating machine translation systems. Part of the conference, which has been held annually since 2006, consists of competitions between machine translation systems developed by universities and companies on tasks that are challenging for these systems. This well-known international competition also includes tasks on automatic post-editing, quality estimation and parallel corpus filtering. In addition, the WMT 2020 conference is an important venue for publishing scientific papers and descriptions of the machine translation systems that have competed.
[caption id="attachment_15135" align="aligncenter" width="417"]Portrait EMNLP2020 conference. Image credit: NASA[/caption]
The event will take place online on the 19th and 20th of November, and is one of the events within the EMNLP 2020 conference, one of the most important conferences on natural language processing worldwide (it is ranked CORE A). Given the community's interest in leveraging the similarity between languages to meet quality goals in machine translation, the WMT 2020 conference will include, for the second time, the shared task on "Similar Language Translation" to assess the performance of cutting-edge translation systems between pairs of languages from the same family. This year there are five similar language pairs from three different language families:
- Translations of Indo-Aryan languages: Hindi-Marathi.
- Translations of Romance languages: Spanish-Catalan and Spanish-Portuguese.
- Translations of South-Slavic languages: Slovenian-Croatian and Slovenian-Serbian.
Translations will be evaluated in both directions (e.g. from Spanish to Catalan and from Catalan to Spanish). The EMNLP conference includes workshops, tutorials, posters, demos and specialized sessions for the presentation of scientific papers on Machine Learning, Semantics, Dialog, Sentiment Analysis, Information Retrieval, Summarization, Speech, Machine Translation and other topics. The international health alert and the confinement measures in force in many countries have not deterred the organizers of major events, which will take place virtually in 2020.
Pangeanic continues to collaborate actively, as in previous years, in the research and development of machine translation and natural language processing through the consolidated technology team of its PangeaMT division. Although this year we cannot attend lectures in person, we will present five articles online.