05/06/2014

Pangeanic machine translation surpasses Google in English-Korean in the technical domain

PangeaMT, Pangeanic's machine translation division, has contributed to a research article within the Developing Talent project of TAUS, which will be published shortly. The article analyzes a number of changes made to the version of Moses on which PangeaMT is based, in order to measure how much they improve translation between English and Korean in the software and electronics domains, using data supplied by TAUS. The engines built for the article are then compared with translations obtained from Google Translate.

The analysis shows that the Moses version enhanced in PangeaMT clearly surpasses Google Translate in both target domains, as well as in a third data set that combines all the data. One reason is that Google uses generalist engines intended for purely informational translation, or for extracting the meaning of texts that would otherwise not be understood. For more specialized translations, such engines are not necessarily the most suitable. The advantage of custom-built engines is that they contain the terminology and follow the style of their owner or target field (technical, legal, software, medicine, marketing, etc.), so they can obtain better results.
  • EN->KO
    • domain ELECTRONICS: +17.02 over Google
    • domain SOFTWARE: +15.77 over Google
    • domain ELECTRONICS+SOFTWARE: +15.26 over Google
  • KO->EN
    • domain ELECTRONICS: +18.83 over Google
    • domain SOFTWARE: +15.44 over Google
    • domain ELECTRONICS+SOFTWARE: +14.61 over Google
 
 
