Author Archives: Pangeanic

Disintermediation – The Uber of Translation and iADAATPA open source multi-MT platform

by Manuel Herranz

Speaking at two conferences in two very different settings, and to two very different audiences, gives you a precious opportunity to get a taste of what the market thinks, fears and wishes for. By market I mean people who represent businesses or companies, or who represent only themselves but influence what others in their profession think. In my case, this happened recently in Athens at the Elia Together conference and at Gala Boston. Although the presentations were targeted at two very different groups, they shared some common ground. Both audiences contained professional translators and linguists as well as representatives of translation companies. The presentation in Athens dealt more with the future of translation as a profession, some marketing development tips and a short summary of our iADAATPA open source multi-MT platform project, which inevitably led to the question “How does neural machine translation work?”. The presentation in Boston was dedicated solely to the iADAATPA platform and how it will become an open source MT platform with ready-tested plugins and APIs to CAT tools and MT services, with a little overlap on how neural machine translation works. Let me share with you a few things I learnt from talking to professional translators and translation businesses over the last two months.

Elia Together Athens

The Uber of translation

Full house. More than 100 people sat down to take away some lessons about how the business model for small and medium-sized translation agencies (and even for the large ones) must change and embrace B2C. It’s not that the typical cascade of buying and selling will disappear, but many companies and even freelancers have not yet realized how often people now use Google to search for services… including translation services. And it doesn’t matter if you are in Boston, Athens, Osaka, Guadalajara City or Lyon. Well-delivered services can come from anywhere as long as they deliver what they promise. Look at Amazon. I hammered home the idea that few translation companies actually do what they say they do (translate their own website into 10 languages, for example). They don’t speed up and solve communication problems (they actually create a layer of administrative work).

Who will be the Uber of the Translation Industry - a tweet by Marina Oreskovic

The point about disintermediation seemed to interest companies more than freelancers. As large text loads head towards higher-quality neural machine translation (which is beginning to be almost indistinguishable from human translation), the idea of microservices and hourly rates begins to settle in the minds of translation managers and translators alike. Given the mixed audience (half freelancers, half companies), I was surprised to see the interest in the iADAATPA open source multi-MT platform, which will not only provide free machine translation to public administrations via eTranslation from the European Commission, but will also be free for use as a multi-MT vendor platform by private individuals and industry. Having a single hub where all your connectors and APIs are already in place, so that you can use the best-performing engines for each domain and language pair from several MT vendors, is attractive not only for the European Union. It is attractive to anybody who would like an on-premise machine translation platform.

In 2006, Google ditched a rule-based system (Systran) to develop its own platform. In 2010, the first statistical machine translation system with re-training features was released as PangeaMT. Many followed, mostly as SaaS and cloud services. But no full MT management platform has ever become open source. APIs and connectors have remained fiercely proprietary. iADAATPA will make it much easier to run MT and to change from one provider to another, even at company level.

A version of my presentation can be seen on SlideShare.
Disintermediation presentation Elia together Athens 2018 Manuel Herranz

Gala Boston 2018

The Language of Business. The Business of Language

Language conferences are a thing to see. They really are different from most types of conferences: the interaction, the almost incestuous in-selling… you don’t see rookies, middle management or small practices at machine tool exhibitions or life sciences conferences, not even at toy fairs. At language conferences you can speak to a tool developer who doubles up as a member of the executive board, to representatives of think tanks and to partners in a small business. My talk this time was for a smaller audience and more focused on iADAATPA as an EU project (part of the Connecting Europe Facility initiative). I had kept some common slides from Greece to briefly explain how neural machine translation works, but I practically never got to them. I invited the audience to ask questions at any time as I introduced the platform, and a very inquisitive audience did just that. No more than 7 minutes into the presentation, the audience was asking the right questions: “When will we be able to get our hands on it?” “Is this the beginning of a dynamic benchmark?” “A real marketplace of machine translation where the system decides which engine produces the best paragraph, or sentence?” “How will you decide intelligently that one system is better than the other?” “What happens if you re-train?” “Will it be able to re-train its own engines?” “Can iADAATPA be deployed on-premise?”. All I could think was that audiences, or the market, are very eager not just to adopt the technology in passive mode, but to own it, play with it, customize it and be able to run their own machine translation hub for their clients. Linking up to the Elia presentation in Athens, I sense that savvy businesses have already realized that there is a huge change in the business model in the pipeline and that MT is no longer the realm of the top ten companies in the world or of large multinational organizations. It’s likely to become the CAT tool and the service everyone provides in the near future.

I couldn’t attend all the talks, but I’d like to summarize a very worthy initiative: TAPICC, from Gala itself. The current API landscape is a wild west, with a lot of unnecessary variation and continuous reinvention of the wheel. This wastes money for clients, LSPs and tool vendors, and leads to a loss of operational freedom. The aim of TAPICC is to unify the API scenario, working with industry and academics. It is in a similar vein to what iADAATPA is doing by bringing together several machine translation players to offer a single access point and common APIs that connect to many CAT tools and MT deployment scenarios.

A picture of Gala attendees

TAPICC has created several working groups and agreed upon a useful framework for metadata, use cases, best practices and classes. It is now looking for quickly implementable classes and use cases, and iADAATPA could well be one of them by deploying its standard as a single access point for machine translation services. This would reduce the cost of integration and allow quick onboarding of new clients, systems, LSPs, etc., finally making it easy to embed L10N in content processes and enterprises. A second objective of TAPICC is interaction between different systems, so that two TM tools can talk to each other. This is already happening at a proprietary level, but the implementations are all different and TAPICC wants to support everything. A good use scenario could be the semantic enrichment of units, terminology, TM and MT, for example in an XLIFF file that asks for terminology results and has a layout that is “good enough”. In short, it can be as simple as having a file that needs to be sent to a machine translation engine while keeping the same format. As TAPICC is already looking for use cases, the API specification can start to be tested and shaped during the development stage.

A version of this presentation can also be seen on SlideShare.

iADAATPA open source multi-MT presentation at GALA Boston

Towards a multilingual world: the future of the English language in the 21st Century

by Carolina Herranz-Carr 

Carolina is an Account Executive at East Creative Agency, a voice-over artist in English and Spanish and a graduate from Brunel University. Her passion for languages comes from her bilingual background.

Sharing a common language and a cultural aptitude are well-known determinants of trade, a belief long held by successful businesses in their international marketing campaigns and trade negotiations. With the far-reaching postcolonial spread of the English language and US dominance in the Western hemisphere, international trade, global politics and cultural interactions soon adopted English as the common ‘lingua franca’, whilst Russian dominated Eastern Europe and French most of Western Africa. This phenomenon accelerated the globalised world of today, creating several cultural spheres of influence where none existed before and strengthening older links in some cases. However, in the face of unsettled post-Brexit anxiety, a perceived void in US global leadership, and China’s entry into the big league of global markets, many are beginning to contemplate the socio-economic consequences of this global shift in direction. What will be the impact on culture, language and business if China moves towards centre stage in the world?

A multilingual world and the English language

China’s narrative, led by Xi Jinping, sets out a vision for the years ahead. In his speech at the 19th National Congress of the Chinese Communist Party, he pronounced a “new era” for China, pledging further liberalisation of its markets (while simultaneously calling for stronger state firms). China’s confirmation of its commitment to free trade, and its insistence that it mustn’t “hide its light under a bushel and be a modest player abroad”, could mean a golden age for globalisation, generating new opportunities for business and trade. However, China’s integration into the international economy is built upon a socialist market, and carries with it a considerably different cultural approach in comparison to its Western trade partners. Mutual understanding and linguistic and cultural connection will be the key drivers of a successful 21st Century. Additionally, as China’s presence on the global stage rises, the popularity of the Chinese language abroad could also increase. Spanish, Portuguese, French and English spread from their home bases in Europe as a result of commercial expansion from the 15th century onwards.

An initiative carried out by the National Security Education Program (NSEP) at the U.S. Department of Defense highlighted the importance of language skills for international business. In its Language Flagship report, it noted how the state of Washington reported a loss of revenue due to inefficient translation of training contracts and curricular materials into Chinese and other languages. The NSEP concluded that the loss of revenue and relationships caused by mistranslation should be countered by cultural awareness within businesses and by a more linguistically skilled workforce. This idea could prove exceptionally important for Europe and Asia in the years ahead, where an increase in trade is expected from Mr. Xi’s ambitious $900 billion New Silk Road Initiative, which involves a hefty investment in infrastructure in more than 60 countries. If successful, this activity would centre Eurasian trade on China. Likewise, a post-Brexit UK could also see a shift in trade intensity with China in an effort to connect with fresh markets overseas. If China and the UK sign an FTA, a rise in competition between the EU and Britain could further shape China’s role in the world.

As the divorce date looms closer, the UK must consider the fate of one of its essential, perhaps overlooked assets: the English language as an economic resource (see more in our “The demise of English as an international language?”). Britain has long enjoyed the lucrative benefits of its language dominating the vocabulary of diplomacy, cross-cultural communication and trade, leaving no real incentive to learn another language. However, the possibility of an EU without Britain has left many reassessing the relevance of English as an official language within the union. The Republic of Ireland registered Irish (Gaelic), not English, as its language with the EU, and Malta, where the majority of people speak English after a period of colonisation, registered Maltese, the only Semitic language among the EU’s official languages. Some speculate there will be increased pressure on EPSO, the EU’s recruitment agency, to discontinue English as one of the three required languages, leaving French and German as likely substitutes.

Brexit - United Kingdom and European Union

Nonetheless, while the replacement of English as the cross-cultural language of choice may be too bold an assumption, with 38% of Europeans currently speaking English as a second language, it is conceivable that China’s economic rise could translate into more Mandarin speakers across the globe. Thus, we may slowly be progressing towards a multilingual world where the English language can play a role, but not the dominant role. In order to prosper and secure economic stability, Britain should not rely solely on its own language to nurture the development and advancement of trade overseas. As stated on the official GOV.UK website, UK businesses “should not assume Chinese firms will have English-speaking staff”. Rather, a strong understanding of economic, linguistic and cultural differences is crucial if the UK intends to connect with foreign markets.

Both Brexit and a Trump-led US have provided food for thought about the future political landscape, leading many to ponder their own role in world affairs. Canada’s Prime Minister Justin Trudeau described Canada’s significant allies and trading partners, the U.S. and the U.K., as “turning inward”, urging that Canada should take advantage of their isolationist approach to seize new opportunities abroad. In the same vein, Mr. Xi championed an “enlightened new socialism” amid “crises and chaos” in Western liberal democracies. The future of trade and business will depend on mutual trust, improved language skills in business and a strong understanding of China’s cultural and political landscape as it prepares to further open its markets. Without linguistic connection through a common language, efficient translation of services and the cultivation of strategic language policies, the barriers of mistranslation could gradually take their toll.

Our Nordic Translation Industry Forum Blog diary!

by Garth Hedenskog

From Wednesday the 22nd until Friday the 24th of November, Pangeanic traveled north to Helsinki to attend our first Nordic Translation Industry Forum! And what an amazing event it was.

View of Helsinki City Center in the early evening with tram

Let’s start off by saying that Helsinki is a stunning location for business or pleasure. The team was greeted by light snow and a high of 1ºC for most of the 4-day stay! Garth Hedenskog (our sales director) and Alex Helle (our chief researcher and developer) were lucky enough to go this year. This was naturally a very popular event and destination, with a lot of staff at Pangeanic very eager to go! Here are Garth and Alex trying to look busy at the interpreting challenge. They didn’t fool anyone!

Garth Hedenskog and Alex Helle

Garth and Alex of course didn’t just go for the beautiful scenery, adventure and crisp fresh Nordic air. They went to showcase Pangeanic’s technology, see what the Nordic translation industry had to offer and of course let their hair down and mingle with our amazing localization colleagues. And this is how the 3 days unfolded in their own words…..

The reception was apparently wonderful, but our flight unfortunately got in a little late so we missed the whole thing! Chatting with our colleagues the next morning, we were sorry to have missed it!

Day 1: We got in nice and early to set up our stand. The people at NTIF were great and made the whole experience truly unforgettable. Everything functioned like clockwork, and all the support and assistance made the whole setup a really simple and fun experience.

Pangea Machine Translation banner

Throughout day 1 we were treated to some really inspirational and unforgettable speeches. Here are some of them…

Klaus Fleischman from Kaleidoscope spoke about the power of words, or how we never get a second chance to make a first impression
Gábor Bessenyei from MorphoLogic Localisation addressed how custom neural MT engines are at your fingertips
Lara Millmow from Elia gave an interesting presentation about taking a broad view of your business
Ádám Marjai from memoQ spoke about their translation project management solution

We were treated to many other very interesting presentations. That night we were treated to an exquisite dinner followed by some questionable dancing! We had reindeer for dinner, a first for many. As they say, when in Rome… The location was absolutely beautiful, right by the water of the Baltic Sea. We had a scrumptious 3-course meal, drinks, dancing and fascinating conversations. What more could we ask for?

Alex Helle, Garth Hedenskog and GALA Board Director Tea C. Dietterich

The night absolutely flew by, which no doubt proved how successful and fun it was.

Day 2: Day 2 started with our gracious hosts laying out a spread of healthy juices, snacks (there may or may not have been miniature bottles of vodka available) and, most importantly, some headache tablets for the weary party animals. Alex and I were of course fresh as daisies (kind of)…. Day 2 was also sadly our last day, so we really tried to speak to all the visitors and exhibitors we hadn’t had time to meet during the previous day. Our neighbors (Memsource) had one of those virtual reality games, which was a great energy boost when we needed it! Thanks Memsource! Again the talent presenting was incredible, so we snuck away as often as we could to learn as much as possible about trends and best practices.

With Jaba Translations CEO Joaquim Alves

Again, one of the standout presentations for us was Salvo Giammarresi from PayPal explaining what Globalization is and how LSPs (traditionally Language Service Providers) should strive to be more like Language Service Partners. We were amazed by the talent and expertise on show at the Interpreting Software Challenge. It was really inspirational stuff, and the companies’ founders came from some really disadvantaged backgrounds. I’d like to mention them all and encourage you to check their companies out:

Tulka
Interprefy
Interactio
Tikktalk
Youpret
Kudo

We had a delicious lunch kindly sponsored by AAC Global, where we had an opportunity to see some amazing friends.

Our booth was very busy with students and LSPs learning about the latest developments in the Neural Machine Translation field, in particular about Pangeanic’s PangeaMT solution. We demonstrated how it is completely compatible with our centralized translation memory system, ActivaTM. Cor (our web-based translation management system with built-in crawler) was also very popular, and we couldn’t have been happier telling everyone who would listen all about it. We could’ve easily stayed another few days, but we sadly had to pack up that afternoon at about 5pm, though not before we took part in the prize giving and farewells, where we gave away an amazing Fitbit Alta HR. The lucky winner was …..drum roll please….. Karel Mostek from Moravia! Thank you Helsinki and NTIF, it was amazing! We look forward to being back next year!

french bulldog with French beret hat and French flag behind a laptop

How can a global brand be more local? 3 tips to reach more customers

If you have been working hard to develop and set up an eCommerce site, you know that it is only the first step in a long journey. Now you need clients to find your site. Adjusting your eCommerce site to the look and feel of local markets is essential because it reassures customers that you have invested enough time and consideration to make your brand more local to them. You are considering their language, their local customs and traditions.

Three things to think about:

1) Not all images convey the same message

The majority of eCommerce sites are all about the visual and the price. Giving a little thought to images that will not suit the palate of some audiences is paramount. Think about the right kind of images for your eCommerce website. For example, high-quality product images are neutral and will often come from the product manufacturers themselves. However, if you are using your own images for your products, our recommendation as internationalization experts is to give them a little thought and see if they are appropriate for the countries and markets you are addressing. eCommerce sites sell everything, from books to furniture, drinks and even translation services… But fashion sites tend to be very popular and are probably the most obvious example. Time after time, the winners are the ones who carefully select and showcase their products, because local markets react positively or negatively to whatever you put in front of them.

cartoon lady in short jeans

Clearly the sense of fashion differs from country to country. Dress codes can be radically different from one culture to another. Ties are popular in the Western world, China, Japan and Korea, but there are differences in use. Some items can be plainly offensive to some local customers. Tight-fitting clothes are not the thing to market in India, nor are above-the-knee short jeans or low necklines in Morocco, Algeria or Egypt.

The Middle East is experiencing a great burst in eCommerce. The eCommerce sector in MENA (Middle East and North African countries) reached the US$10 billion mark in 2016 and is set ‘to grow tenfold’ by 2020. A pan-Arab government body headquartered in Cairo is preparing to release a five-year strategy. Comprising representatives from 14 governments across the MENA region, it claims the region’s e-commerce sector will leap from $20 billion in 2017 to $200 billion beyond 2020, and international fashion brands that feature images which offend customers on religious or cultural grounds will fail.

2) Colors

Your website design should also take into account that the choice of colors affects the subconscious when visitors first land on your site. Take red for instance: the color of passion in most Western countries, red is associated with mourning in South Africa. The section of red in the country’s flag symbolizes violence and sacrifices that were made during the struggle for independence.

Shanghai Temple

In China, red represents celebration and happiness. It is everywhere in China: buildings, roofs, clothes, websites. The Chinese flag is red. Ask any Chinese translator about red: it is meant to bring luck, prosperity, happiness, and a long life to the people. And how about yellow? No bullfighter will dare to come out dressed in yellow in Spain and Latin American countries. It brings bad luck! In China and Japan, yellow is associated with dirty sex or pornography. Chinese and Japanese speakers use the terms “yellow film”, “yellow joke” or “yellow book” to mean pornographic films, jokes or books.

Spaniards would call them “green”! In France yellow stands for jealousy, betrayal, weakness, and contradiction. In the Middle Ages, people painted the doors of traitors and criminals yellow. Yellow is reserved for people of high rank in many African nations, because it is easily associated with gold, and gold is money, quality, success. And in Germany, yellow symbolizes jealousy. Want to know more? If you are reading this blog from the US or Europe, you will agree blue is a strange color. It is the traditional color for boys (pink for girls). It is also a color that stands for trust, security, and authority (many conservative parties use light shades of blue), and is used by banks and institutions (Citibank and Bank of America). In China, blue is considered a feminine color. In Judaism, blue is the shade of holiness and divinity, and in Catholic countries the Virgin Mary is depicted in blue.

In Hinduism, blue is the color of Krishna, the most highly worshipped Hindu god, who embodies love and joy, and destroys pain and sin. Talking about pink, this color translates as “foreign color” in Chinese, as it was unrecognized until it emerged into the culture through increasing Western influence. Orange evokes positive feelings almost everywhere: in many Western cultures, orange is considered a fun and edgy color, and represents curiosity, a thirst for the new, for creativity. In Japan and China, orange is also positive, being linked to good health, courage, happiness and love. And in India, it’s symbolic of fire. The orange-colored spice, saffron, is considered to be lucky and sacred. The exception is in many Middle Eastern countries, such as Egypt, where orange is associated with mourning.

3) Logo and Tagline

We would not recommend changing the logo unless there is something terribly wrong with it in a particular country or culture. The tagline is something else. It may be culturally attached to the country it comes from. We have dealt with some mistakes in that kind of translation which have become part of translation folklore. Follow this link to read and have a good time. Translations are the most cost-effective way to help your website appear in the search result pages of international markets. Translating an eCommerce site is the way not to depend on a single home market. 73% of customers prefer to buy goods in their local language. eCommerce customers move quickly (at the click of a button!) when it comes to deciding if they will buy from a website. But please remember, a properly localized website makes it a whole lot easier to beat the competition.

The Pangeanic neural translation project

The last few months have been extraordinarily busy at Pangeanic. We have focused on the application of neural networks to machine translation (neural machine translation), with tests into 7 languages (Japanese, Russian, Portuguese, French, Italian, German, Spanish). We have completed a national R&D project (Cor technology as a platform for translation companies, offering an integrated way of analyzing and managing website translation and document analysis). We have integrated our CAT-agnostic translation memory system ActivaTM into Cor and our neural engines. And the European Union’s CEF (Connecting Europe Facility) has awarded us the largest digital infrastructure project to build secure connectors to commercial MT vendors and the EU’s own machine translation service (MT@EC) for public administrations across Europe. Leading machine translation developers such as KantanMT, Prompsit, Tilde and our PangeaMT are joining forces with consulting company Everis to build iADAATPA, a system that will intelligently work on domain adaptation and the selection of the most appropriate engines through secure connectors for Public Administrations in the EU.
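To make the idea of “selecting the most appropriate engine” concrete, here is a minimal sketch in Python. It is not iADAATPA’s actual selection logic, which is not described here: the `EngineScore` record, the `pick_engine` function, the vendor names and the quality scores are all hypothetical, and the rule (prefer in-domain engines for the language pair, then take the highest score) is just one plausible assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class EngineScore:
    """Hypothetical benchmark record for one engine (not an iADAATPA type)."""
    vendor: str
    source_lang: str
    target_lang: str
    domain: str
    score: float  # assumed quality score, higher is better


def pick_engine(scores: Iterable[EngineScore], source_lang: str,
                target_lang: str, domain: str) -> Optional[str]:
    """Return the vendor with the best score for the language pair.

    In-domain engines are preferred; if none exist for the pair, fall back
    to the best engine for that pair in any domain. Returns None when no
    engine covers the language pair at all.
    """
    candidates = [s for s in scores
                  if s.source_lang == source_lang and s.target_lang == target_lang]
    if not candidates:
        return None
    in_domain = [s for s in candidates if s.domain == domain]
    pool = in_domain or candidates
    return max(pool, key=lambda s: s.score).vendor


# Illustrative data only: vendors and scores are made up.
SCORES = [
    EngineScore("vendor_a", "en", "es", "legal", 0.71),
    EngineScore("vendor_b", "en", "es", "legal", 0.78),
    EngineScore("vendor_a", "en", "es", "medical", 0.80),
    EngineScore("vendor_c", "en", "ja", "it", 0.64),
]
```

With this data, `pick_engine(SCORES, "en", "es", "legal")` picks the better of the two legal engines, while a request for an unseen domain falls back to the best engine for the language pair.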

So, time to recap and describe our experience with neural machine translation and how Pangeanic has decided to shift all its efforts into neural networks and leave the statistical approach as a support technology for hybridization.

The Pangeanic neural translation project

We selected training sets from our existing SMT engines as clean data and used them to train neural engines, so that we could run a parallel human evaluation of the output of each system: the existing statistical machine translation engines and the new engines produced by neural systems. We are aware that if data cleaning was very important in a statistical system, it is even more so with neural networks. We did not add any additional material because we wanted to be certain that we were comparing exactly the same data trained with two different approaches.

A small percentage of bad or dirty data can have a detrimental effect on SMT systems, but if it is small enough, statistics will take care of it and won’t let it feed through the system (although it can also have a far worse side effect, which is lowering statistics all over certain n-grams).

Visual sample of statistical candidates with best candidate proposed in a statistical machine translation system

We selected the same training data for languages which we knew were performing very well in SMT (French, Spanish, Portuguese) as well as those that have been known to researchers and practitioners as “the hard lot”: Russian, as the example of a morphologically very rich language, and Japanese, as a language with a radically different grammatical structure where re-ordering (which is what hybrid systems have done) has proven to be the only way to improve.

Japanese neural translation tests

Let’s concentrate first on the neural translation results in Japanese, as they represent the quantum leap in machine translation we have all been waiting for. These results were presented at TAUS Tokyo last April. (See our previous post TAUS Tokyo Summit: improvements in neural machine translation in Japanese are real).

Japanese neural translation engine for the electronics and IT field

Tokenizer.perl and Mecab were used for English and Japanese tokenization respectively.

We used a large training corpus of 4.6 million sentences (that is nearly 60 million running words in English and 76 million in Japanese). In vocabulary terms, that meant 491,600 English words and 283,800 character-words in Japanese. Yes, our brains are able to “compute” all that and even more, if we add all types of conjugations, verb tenses, cases, etc. For testing purposes, we did what one is supposed to do to avoid inflating percentage scores and took out 2,000 sentences before training started. This is standard practice in all customization: a small sample is held out so that the resulting engine is tested on the kind of text it is likely to encounter, but has never seen. Any developer including the test corpus in the training set is likely to achieve very high scores (and will boast about it). But BLEU scores have always been about checking domain engines within MT systems, not across systems (among other things because the training sets have always been different, so a corpus containing many repetitions of the same or similar sentences will obviously produce higher scores). We also made sure that no sentences were repeated, and even similar sentences were stripped out of the training corpus in order to achieve as much variety as possible. This may produce lower scores compared to other systems, but the results are cleaner and progress can be monitored very easily. This has been the way in academic competitions and has ensured good-quality engines over the years.
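The hold-out procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our production pipeline: the function names are hypothetical, and the sketch only removes exact repeats (after lowercasing and whitespace normalization), whereas stripping merely similar sentences would need a fuzzy-matching step that is not shown here.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def normalize(sentence: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(sentence.lower().split())


def dedupe_pairs(pairs: List[Pair]) -> List[Pair]:
    """Keep the first occurrence of each source sentence, drop exact repeats."""
    seen = set()
    unique = []
    for src, tgt in pairs:
        key = normalize(src)
        if key not in seen:
            seen.add(key)
            unique.append((src, tgt))
    return unique


def split_holdout(pairs: List[Pair], test_size: int = 2000,
                  seed: int = 13) -> Tuple[List[Pair], List[Pair]]:
    """Deduplicate, then set aside `test_size` pairs before any training."""
    unique = dedupe_pairs(pairs)
    if test_size >= len(unique):
        raise ValueError("test_size must be smaller than the deduplicated corpus")
    indices = list(range(len(unique)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    test_idx = set(indices[:test_size])
    train = [p for i, p in enumerate(unique) if i not in test_idx]
    test = [p for i, p in enumerate(unique) if i in test_idx]
    return train, test
```

Deduplicating before the split matters: if a repeated sentence landed in both sets, the test score would be inflated in exactly the way the paragraph above warns against.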

The standard automatic metric in SMT did not detect much difference between the output in NMT and the output in SMT.

BLEU does not detect the huge difference in perceived quality - WER is a better indicator

However, WER was showing a new and distinct tendency.
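For readers unfamiliar with WER (word error rate), it is the word-level edit distance between the MT output and the reference translation, divided by the length of the reference. The sketch below is a generic textbook implementation, not the scoring tool we used; it assumes both sentences are already tokenized and whitespace-separated (for Japanese, after a tokenizer such as MeCab has been applied).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Counts the minimum number of word substitutions, insertions and
    deletions needed to turn the hypothesis into the reference.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the distance between the first i-1 reference words
    # and the first j hypothesis words; only two rows are kept in memory.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Lower is better: 0.0 means the output matches the reference word for word, and values above 1.0 are possible when the output is much longer than the reference.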

NMT versus SMT results in Japanese

NMT shows better results in longer sentences in Japanese. SMT seems to be more certain in shorter sentences (trained as a 5-gram system)

And this new, distinct tendency is what we picked up when the output was evaluated by human linguists. We used the Japanese LSP Business Interactive Japan to rank the output from a conservative point of view, from A to D: A being human-quality translation, B a very good output that only requires a very small percentage of post-editing, C an average output where some meaning can be extracted but serious post-editing is required, and D a very low-quality translation with no meaning. Interestingly, our trained statistical MT systems performed better than the neural systems in sentences shorter than 10 words. We can assume that statistical systems are more certain in these cases, when they are only dealing with simple sentences with enough n-grams giving evidence of a good matching pattern.

We created an Excel sheet (below) for human evaluators with the original English to the left and the reference translation. The neural translation followed. Two columns were provided for the ranking and then the statistical output was provided.

A table showing original English and Japanese reference translation

Neural-SMT ENJP ranking comparison showing the original English and the reference translation, with the neural ranking to the left and the statistical system to the right

German, French, Spanish, Portuguese and Russian neural translation results

The shocking improvement came from the human evaluators themselves. The trend pointed to 90% of sentences being classed as A (perfect, naturally flowing translations) or B (containing all the meaning, with only minor post-editing required). The shift is remarkable in all language pairs, including Japanese, moving from an “OK experience” to a remarkable acceptance. In fact, only 6% of sentences were classed as D (“incomprehensible / unintelligible”) in Russian, 1% in French and 2% in German. Portuguese was independently evaluated by translation company Jaba Translations.

Human evaluation of neural translation in German, French, Russian

Human evaluation of neural translation in German, French, Spanish, Portuguese, Italian, Russian

This trend is not particular to Pangeanic. Several presenters at TAUS Tokyo pointed to ratings around 90% for Japanese using off-the-shelf neural systems compared to carefully crafted hybrid systems. Systran, for one, confirmed that they are focusing only on neural research/artificial intelligence and throwing away years of rule-based work, statistical and hybrid efforts.

Systran’s position is meritorious and very forward thinking. Current papers and some MT providers still resist the fact that, despite all the work we have done over the years, Multimodal Pattern Recognition has gained the upper hand. It was only the lack of computing power, before GPUs were used for training, that was holding it back. The above article at PangeaMT provides some information about what is changing in the automated translation landscape as we speak, and an example of the first neural papers from the 1990s, which have guided much of our own R&D.

Neural networks: Are we heading towards the embedment of artificial intelligence in the translation business?

BLEU may not be the best indication of what is happening with the new neural machine translation systems, but it is an indicator. We were aware of other experiments and results by other companies pointing in a similar direction. Still, although the initial results may have made us think that there was no use for it, BLEU is a useful indicator – and in any case, it was always an indicator of an engine’s behavior, not a true measure of one overall system versus another. (See the Wikipedia article https://en.wikipedia.org/wiki/Evaluation_of_machine_translation).

Machine translation companies and developers face a dilemma, as they have to do without the existing research, connectors, plugins and automatic measuring techniques and build new ones. Building connectors and plugins is not so difficult. Changing the core from Moses to a neural system is another matter. NMT produces amazing translations, but it is still pretty much a black box. Our results show that some kind of hybrid system using the best features of an SMT system is highly desirable, and academic research is moving in that direction already – as happened with SMT itself some years ago.

I brought back some useful tips from my attendance at SlatorCon in London. One is that translation buyers are still in dire need of affordable translation solutions that can centralize assets and workflows. Another is that neural MT is taking center stage as the technology that can truly change the game. The most important one, I would say, is that venture capital money is pouring into the translation industry because it sees strong similarities with other industries (advertising, for one) that were disrupted years ago and produced something new.

“There was not a lot of technical innovation in the advertising industry until the late 1990s,” observed Marcus Polke, Investment Director from Acton Capital Partners. “And then came the Internet, which bypassed and marginalized ad agencies as online and offline advertising transformed into a complex landscape.”

Yes, the translation industry is at the peak of the neural networks hype. But looking at the whole picture and at how artificial intelligence (pattern recognition) is being applied in several other areas in order to produce intelligent reports, trends and data, NMT is here to stay – and it will change the game for many, as more content needs to be produced cheaply with post-editing, or at light speed when good machine translation is good enough. Amazon and Aliexpress are not investing millions in MT for nothing – they want to reach people in their language with a high degree of accuracy and at a speed human translators cannot match.

TAUS Tokyo Summit: improvements in neural machine translation in Japanese are real

Not that business plans are written in stone any longer, but efforts by experts to provide insight are always welcome. The TAUS Tokyo Summit provided a much-awaited set of good news about improvements, as perceived by humans, in neural machine translation in Japanese. English-Japanese was a well-known difficult language pair for rule-based machine translation, and statistical machine translation provided a really awful experience for many Japanese audiences. It has historically been one of the hardest language combinations to automate. It seems that neural machine translation may be the answer.

Day 1 – Where is the translation industry heading?

Jaap began by summarizing the latest meeting of thought leaders, who met in Amsterdam in order to brainstorm a potential landscape and priorities for the language industry in the next five years. If machine translation hype was at its peak five years ago with statistical machine translation and all sorts of hybrids, we are now beginning to experience the neural MT hype. But adopters and developers are much wiser. If data was king some years ago, it seems we may not need so much in the future. Datafication was a process started some years ago after an article called “The Unreasonable Effectiveness of Data” (Alon Halevy, Peter Norvig, Fernando Pereira, 2009, Google). The article said that the more data the better, if our aim was to collect data to train machine translation engines and models. The more data we had to teach the algorithms to decide what was best, the better a statistical system would translate. The problem has always been the lack of clarity about copyright issues with translation data. For example, the law differs between the US and Europe with regard to translation ownership.

TAUS has been focusing on the development of tools and practical services for the translation industry it serves, such as

  • Machine Learning
  • Quality Dashboard
  • Machine Translation
  • Intelligent TM
  • Interoperability, etc.

The set of services and tools (such as DQF) may soon become industry standards, and they can be used to benchmark and measure productivity in-house and also against other (anonymized) players. DQF is now available as an API and can collect data in real time as translators work, without disturbing them. It is a transparent model: reports and statistics can be tracked and benchmarked against other translators.

Jaap mentioned that Europeans are very worried that Google and Microsoft will “fix the problem” and that Europe will be left out of the language technology race, referring to one of his previous articles, “The Brains but not the Guts”. Europe is exporting talent to the US, an army of language scientists who are helping those two giants overcome the language barrier. On the other hand, machine translation has been accepted; it is becoming an API. On a daily basis, output from machines is 500 times bigger than the output from all professional translators put together. The translation industry is growing but also changing radically. What companies do nowadays is not pure translation any longer but telemanagement, post-editing, transcreation services, project management crowdsourcing, telemarketing, etc.

Translation is datafied. We want to know everything happening in a translator’s environment so we can accurately measure how many segments are translated, or words per hour. Eye movement tracking and word suggestions have been around academia for some time but they have now crossed the barrier to commercial MT services. We even track translators’ social graphs, how the weather or news affect the translator, third party applications, how much leveraging from previous translations was used. All that information can help us to automate project management more and improve resource allocation. We are moving to a future where project management will also be automated.

An interesting parallel was drawn between industries when Jaap mentioned that food delivery people do not have a boss, they have an app. All they are interested in is where to pick up the food and where to deliver it. And that’s a kind of post-editor. Translation buyers are finding that some vendors send their jobs out to the internet, and freelance translators run them through general machine translation and post-edit the result. “I only had to do some minor fixes”, said one PM from a leading translation company. The fear is “how long until my client finds out he can do the same?”, that is, how long until translation buyers find out they can post jobs on the internet (via an app, maybe) and pay post-editing rates to cut out project management fees? In short, will everything be handled by robots in the near future? Pay-as-you-go models may change and users will become more active in the management of terminology, labelling, etc.

The representative from Athena Parthenos created some controversy by stating that creativity will help the industry survive, as creativity is the realm of humans. Mark Seligman agreed, saying that what machine translation cannot do is convey the emotions of humans, which is what marketing is all about. Chris Wendt, from Microsoft, disagreed: “I have seen very creative neural translations”. Another possibility, according to Jaap, was that post-editing will no longer be needed: there will be people behind dashboards and people doing the creative jobs.

Day 2 – Neural machine translation has cracked the language barrier in Japanese

But the juicy news came on Day 2. Presentations from Systran, Pangeanic and Google provided news about the development of neural networks applied to machine translation, with a particular accent on improvements in neural machine translation in Japanese, and Human Science reported on post-editing output from Google’s NMT API. The consensus was that neural machine translation produces more natural and fluent output than phrase-based MT. However, there are problems, too. Neural machine translation can produce unreliable output when confronted with unusual input or when a strictly literal rendering is desired. On the plus side, neural machine translation seems to be highly adaptable and it has the potential to be applied to other natural language tasks.

SDL presented UpLift, a technique similar to their old concordance check, which combines words and small subsegment units and reminds me a lot of a technique used by Déjà Vu and Transit in the past. The difference now is that it is automatic, it is applied to all words in a sentence and it shows the translation. The underlying technology is the creation of a glossary “behind” the TM. This is done by creating an index (at the end of the day, when the PC is not in use, according to their own recommendation). This is combined with syntactic analysis for Asian languages. The new version “repairs” fuzzy matches automatically if the difference is only a word or two (a feature also offered by our own ActivaTM). I found it striking to learn that, according to SDL, people do not bother to re-use and re-train their own engines once created. Its automated training system has not been so successful (perhaps because of data privacy issues, since SDL is, at the end of the day, another LSP).
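To give an idea of what “repairing” a fuzzy match can look like, here is a toy Python sketch. It is emphatically not SDL’s or ActivaTM’s implementation: the function `repair_fuzzy_match`, the word-level `glossary` dictionary and the English-Spanish example are all hypothetical. It only handles the simplest case, where the new sentence differs from the stored TM source by exactly one word and both words have known translations.

```python
from difflib import SequenceMatcher


def repair_fuzzy_match(new_src, tm_src, tm_tgt, glossary):
    """Patch a TM target when the new source differs from the TM source by one word.

    Returns the repaired target, or None if the match cannot be repaired safely.
    `glossary` is a hypothetical word-to-word dictionary (source -> target).
    """
    tm_tokens, new_tokens = tm_src.split(), new_src.split()
    matcher = SequenceMatcher(None, tm_tokens, new_tokens, autojunk=False)
    changes = [op for op in matcher.get_opcodes() if op[0] != "equal"]
    if len(changes) != 1:
        return None                      # more than one edit: leave to a human
    tag, i1, i2, j1, j2 = changes[0]
    if tag != "replace" or i2 - i1 != 1 or j2 - j1 != 1:
        return None                      # only single-word substitutions handled
    old_tgt = glossary.get(tm_tokens[i1])
    new_tgt = glossary.get(new_tokens[j1])
    if not old_tgt or not new_tgt or tm_tgt.count(old_tgt) != 1:
        return None                      # unknown or ambiguous word in the target
    return tm_tgt.replace(old_tgt, new_tgt)


glossary = {"red": "rojo", "green": "verde"}
repaired = repair_fuzzy_match(
    "Press the green button",            # new sentence to translate
    "Press the red button",              # closest TM source (fuzzy match)
    "Pulse el botón rojo",               # its stored translation
    glossary,
)
```

A real system would also have to deal with inflection, word order and multi-word terms, which is why the sketch refuses to repair anything beyond a clean one-word swap.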

Mark Seligman gave an overview of speech-to-speech translation, particularly from a Japanese (and in general Asian) perspective, from the first speech-to-speech product by LinguaTec (currently Lingenio) to 2017. Most of these products were ahead of their time. NEC had a Japanese-English system, but the real watershed came with the Google Translate app, which gave birth to speech translation. Jibbigo was happening in Europe at the time, too. Sony had a phrase-based app, and Phraselator was used by the US military. Mark gave an impressive live demonstration of Japanese-English speech-to-speech translation with his app, SpeechTrans, and stated that “Google-type glasses” with subtitles or similar technology would be available in 3 years, not 300.

Systran’s presentation provided a lot of information about their OpenNMT initiative and how they have created a community à la Moses. I would like to write more about the value of this worthy initiative and how it may become a very significant force in a post-Moses world, although SMT systems will live on for some time. The better outputs provided by neural machine translation in Japanese have prompted a kind of fever and much higher acceptance levels, since phrase-based systems behaved with a higher degree of predictability with close language pairs than with distant ones such as English-Japanese. Morphologically rich languages such as the Slavic family also proved notoriously hard to automate.

Our presentation offered information on our first results with engines built from identical datasets in French, German, Italian, Spanish, Portuguese, Russian and Japanese, one set using an SMT system and the other a neural network, with astonishing results. Systems built with identical data but in a different way (statistical versus NMT) provided rankings of “human quality” or “almost human quality” in 80%-90% of the 250 sentences tested, including Russian. The improvements in neural machine translation in Japanese are real.

NMT provides a better translation than the original translation

A copy of our presentation and results is available on SlideShare.

As Mark had previously done with a speech-to-speech system, Microsoft’s Chris Wendt provided a live test of his speech translator, starting with the Star Trek sample (an alien and a human speaking to each other with a different device). The audience had to keep quiet so that noise did not have an impact on the translation. Speech translation had been inspired by science fiction, yes, but it was now a reality (the same happened with Jules Verne’s submarines, Around the World in 80 Days, etc.). Microsoft’s neural network can accept English from non-native speakers as input. It works with Indian, French or Spanish accents, but it is not so good with strong German or Russian accents. He introduced TRUETEXT for cases where there are hesitations: it renders what you are trying to say without the hesitations, stops, etc., so that the input is better suited to machine learning.

Microsoft speech to speech translation demo

Transcribed Japanese text in Microsoft live demo

There are many potential uses of multilingual speech-to-speech technology: multilingual meetings, schools in the US and situations where there is one speaker and many listeners. I wonder if this may create an audience of “lazy” language learners? People asked Chris questions verbally in Japanese, Italian and Chinese, and Chris replied in English, which was shown in each language on the monitor. He then switched to his native German (switching the language settings in the device) and translation was provided as written text on the monitors. He still received questions in Singaporean Chinese, but now the system was translating from his German into Japanese and Chinese. The system slowed down a little bit, but the leap was also great, with a lot of people asking questions. Chris stated that English-Spanish is the best working combination, as they are syntactically similar languages and there is also a lot of training material.

The last presentation was from Google’s Macduff Hughes, who began by addressing an audience that had already been convinced of the superiority of neural networks for Japanese-English translation. “Last year NMT was a rumor, 6 months ago it was the beginning, and now it is here”. Hughes took Spanish as an example of one of the best language pairs and analyzed how much better and more fluent neural machine translation was in comparison to phrase-based. In SMT, gender was wrong in several instances because of sentence length, but as neural absorbs the whole sentence, it fixes a lot of the small annoying errors in Spanish, though not all the time.

GNMT is not ready to handle tags yet (in fact no neural system can yet). Moderate amounts of in-domain data can adapt a model. The challenges are that the output can be hard to evaluate, and that automatic training, stopping and scoring are still difficult. So in this respect, there is a lot of good work already done in statistical systems that cannot be imported into neural networks so easily – a conundrum faced by all MT developers.

Interestingly, Hughes pointed to experiments showing that source sentences meaning more or less the same thing can produce similar results, which suggests that a kind of interlingua has been developed. Knowledge can be transferred to chat or other neural network applications.

But interlingua is another story…

web and spider crawling down

A web of problems: Why Google Translate and website translation can’t marry

It is not news that machine translated websites are penalized by search engines. Google has developed its technologies on the back of reliable bilingual website crawling and freely available public data. After ditching rule-based engines (Systran) back in 2006, it embarked on a mission to use statistical machine translation (SMT) as a byproduct of its own data analysis. Websites that use machine translation to inform users are crawled and aligned, but those alignments provide data that adds dirt (read: uncertainty) which worsens the probabilities and hence the output (read: the translation). That is why Google Translate and website translation can’t marry.

A machine translated website will be penalized by Google, for it is dirty. It is also proof of laziness on the part of those responsible. The search giant wants to analyze natural, human data. We recently bumped into an article on Slator.com that got our feathers all aflutter. In short, it proved the above point, which has been a known issue for translation companies and those offering proxy translation, often with the economical machine translation option.

Nowadays, even e-commerce platforms (see the Magento help section on multilingual sites) do not recommend using machine translation if you want professional results and better ranking. It may sound ironic, but search engines (read: Google) will penalize websites that use Google Translate for their multilingual versions.

Pangeanic has been a diehard advocate of quality website translation, developing Cor as a crawling and translation assistance technology that neither interferes with any of the code nor provides machine translated output to Google’s algorithms. It checks your content, extracts the text and sends it out for translation. Whenever we hear that a website or company will use raw proxy translation or simply Google Translate, we feel so sad. It is business lost: the cost of wasted time and wasted investment, having to face the fact that the wrong option was chosen some time ago, lost credibility… and lost business and customers when the intention was to win.

Google’s violation guidelines

Google clearly bans automatically generated content (in order to avoid black hat SEO and similar techniques), including “Text translated by an automated tool without human review or curation before publishing”. Look for it in its violation guidelines. Thus, raw machine translation with unnatural results (and it is not difficult to detect that a text has been produced by software) will bury your website deep in a web of penalization.

This kind of careless publication is viewed as spam or, worse still, as copied or duplicated content.

You will find it hard to make up for it, unless you are prepared to do properly what should have been done well in the first place. Follow this link to learn more about the dangers of duplicate content.

But Pangeanic develops machine translation technologies, doesn’t it?

Yes, we do. We are a well-known developer of machine translation technologies and language technologies. We use them in order to automate processes, and machine translation is particularly useful in controlled language situations, like instruction manuals and documentation for the automotive industry. It is extremely useful for gisting, to get a quick idea at light speed of what a text in a foreign language says. It also helps translators in certain situations to pre-translate and post-edit the content, which always needs a final verification in order to ensure it flows as natural language.

If your website is rather big (a large e-commerce site, for example, can contain tens of millions of words) and you decide to translate sections of your website using raw MT, there is a quality option to consider. We can offer machine translation engines that are trained with your previous translations (aligned as reliable “translation memories”), which will speak your language in your style and will contain your terminology, specific to your products, services and industry. Creating engines with your own data, or customizing our own engines with your data and terminology, will produce better quality translations than general, online machine translation tools. Our expert translators can post-edit the content to make sure it conveys the message as it should.

Your website MUST PROVIDE VALUE

This is surely one of the most difficult things to do, but it is extremely important to search engines. Your content must be informative and engaging. Bounce rates are an indication of how visitors interact with your site, but a high bounce rate is not necessarily an indication of a bad website. Some of your pages may offer exactly the information the visitor was looking for, so the visitor leaves without interacting because he/she found it. Maybe the person spent a minute, two, three or more reading it. Check this informative post by Yoast on why a high bounce rate is not necessarily a bad thing for your website. A machine translated website simply does not offer the quality content or the value website visitors want.

Multilingual SEO strategy

Keywords cannot be machine translated: people search for different things in different places.

A simple keyword like “sneakers” can serve as an example (follow this article for a list of top ten disagreements between US and British English). It is widely used in the US, although more profusely in some areas than others. British English uses “trainers” (from “training shoes”). People looking for this kind of item will not land on your page if you are using a different keyword – and the same happens across languages. Machine translated keywords just won’t work in other languages.

Pangeanic solves this challenge with specialist translators who have a flair for marketing and are aware of such issues. They use our website analysis and SEO tools (SEMRush, Google AdWords, etc.) in order to check the popular options in each country/region, so you can make an informed decision about how to market your products from your website rather than use a general or direct translation.

5 tips to translate a website in many languages and embed it in your business strategy

by Manuel Herranz

Large enterprises and even SMEs around the world are realizing how important it is to translate a webpage into many languages.

1. A free website translator simply isn’t enough.

It may do the job fairly well if you just need to understand a website in another language, but that kind of automatic translation is not good enough when you are looking to attract customers.

2. Free website translations published as good content send the wrong message to your potential audience.

Google can be quoted as the best example. The search giant is very aware that it is the search engine of choice used around the world and it needs to be available to everyone. Since there are still billions of people who can’t read English or understand it, Google provides the option of translating websites and search results into the language they are familiar with – but this is a quick, on-the-fly HTML conversion for information purposes only.
If you want to establish a solid business presence in many countries around the world, then you need professional website translations as well.

3. Thanks to a multilingual website containing website translations of your original product descriptions in other languages, your target audience is much wider.

You have been targeting a particular audience since the inception of your business. Translating your web content into several languages has many benefits and one could literally write a book on them. The number one, of course, is that if your website has always been monolingual, then you were only communicating with people who understood your main language. With website translation you can rank in search engines and carry your message to people who don’t understand your website’s first language. It will actually make sense to them when they visit your website, click on their language button or tab and are able to read everything that is written on it.

website translations increase SEO visibility

Brand image is the most important thing for any business in the world and brand image is not seen by looking at the size of a business or the quality of its products. Brand image is measured by looking at the people’s attitude towards a business.

A business can change people’s attitude towards it through its marketing efforts. Your brand image starts to build up as you start winning a place in the hearts of your customers. That’s done through intelligent marketing, through advertisements that connect with them and by personalizing your messages for them. When you translate a website into Spanish, you are opening up to an audience speaking the 2nd most spoken language in the world (500 million). Spanish has a strong presence not only in Europe and Latin America, but also in the United States – and brands are learning the power of marketing in US Spanish.

Pangeanic has a long relationship with Japanese companies. If you are Japanese and you decide to translate a website from Japanese to English, you will understand that it’s an amazing way of starting a connection with people from different corners of the world.

4. Better SEO and marketing results.

Start introducing new content on your website in multiple languages and you will see an increase in traffic and conversions – that is almost guaranteed. There are several strategies to do so, either with a multisite strategy or with a multilingual site.

Related Content – Learn more about multisite and multilingual sites for SEO:
3 Tips on translating a website and website localization

The more languages you add to a website, the more keywords search engines will detect on your site. OK, there is work to do in Analytics, regular publishing, geo-localization, website hosting speed, etc. But do not even think twice: the more languages a website contains, the higher the chances of being in the top spot in Google.

But rankings are just one objective. The point about Inbound Marketing is that your website will act as a point of reference, as a result of the knowledge it provides to its visitors. When you convert your customers into loyal customers, you can rest assured that your customers are not going anywhere else for many years to come. They will also review your services and provide testimonials. Customer loyalty is the added collateral benefit when you translate your website into different languages. You will benefit from a greater online reputation. Remember younger generations were born with the Internet. Reviews and comments, plus the corporate information you may add in their language are more relevant than marketing material in many cases.

With website translations (and not an automatic “translate website button” or a “webpage translator”), you make your brand a part of people’s lives by connecting with them culturally.

5. Establish a long-term relationship with a translation company.

Your website is most probably already published and content is up. And quite likely, the only place containing all the text that needs translating is… your website. It is typical for a website to develop over time. Pangeanic technologies can crawl your website and extract the text in a bilingual format for you to publish immediately, keeping a bilingual copy of all your linguistic assets.

If you publish content regularly, our Cor technology will make it even easier for you to keep track of your publications automatically. You publish and Cor detects your new content, extracts it and sends it to a project manager or translator so it can be processed at the regular interval you require. Watch the video below to see Pangeanic’s crawler in action, keeping track of our own publications.
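The “publish, detect, extract” loop described above can be pictured with a very small Python sketch. This is not how Cor is implemented (its internals are not described here); it is a generic illustration of change detection by hashing paragraphs, and the function `new_segments` and the sample page text are hypothetical.

```python
import hashlib


def new_segments(page_text, seen_hashes):
    """Return the paragraphs not seen on earlier crawls and remember them.

    `seen_hashes` is a set of SHA-256 digests kept between crawls; paragraphs
    are assumed to be separated by blank lines.
    """
    fresh = []
    for paragraph in page_text.split("\n\n"):
        paragraph = " ".join(paragraph.split())   # normalise whitespace
        if not paragraph:
            continue
        digest = hashlib.sha256(paragraph.encode("utf-8")).hexdigest()
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            fresh.append(paragraph)                # would be queued for translation
    return fresh


seen = set()
first_crawl = new_segments("Welcome to our shop.\n\nFree delivery in Spain.", seen)
second_crawl = new_segments(
    "Welcome to our shop.\n\nFree delivery in Spain.\n\nNew: summer catalogue.", seen
)
```

On the second crawl only the added paragraph is returned, so only new content would be sent out for translation.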

Lastly, if your content is highly confidential and you need to translate a website while keeping it confidential before public release, our Client Portal makes it easy for you to upload content in a completely secure manner thanks to our encrypted solutions.

Machine translation: Can it be used to translate travel industry content?

by Manuel Herranz
There have been strong opinions for and against machine translation over the last few years. Whilst the general public has become a keen user of free online services, professional translators have poured bitter criticism on the technology. Understandably so, because the language industry is a small industry compared with other sectors where automation took place years ago (automotive industry, printing, telecommunications, to name a few). The Internet, and in general any industry based on electronic communications, has added to the increase in demand for multilingual websites, which means more translation for eCommerce sites and website translations. There are many supporters of machine translation technology because of its many advantages and the problems it has solved where a translator could not be at hand and human translation was not an option. See the video celebrating Google Translate’s 10 years. But it has also gained something of a bad press, particularly because of the various free online translators (and I stress the free) on the web.

If you read the articles in this blog often enough, you know by now that Pangeanic is a developer of a machine translation platform. We build engines for particular applications and clients. Our research team’s work and our collaboration over the last 8 years with Valencia’s Polytechnic, the Computer Science Institute in Valencia, the EU’s Expert Project and Spain’s Center for Technological Innovation (CDTI) have borne fruit in our Pangea version 3: an improved, state-of-the-art platform that not only automates the engine training and retraining process, but also incorporates search engine capabilities in a hybrid translation memory + machine translation approach. Even so, we advise companies to be cautious when applying machine translation solutions as if it were all as easy as copying and pasting into a Google Translate panel. Free comes at a cost, and sometimes a very expensive cost.

Read more: Google is 10 years old

Although some free solutions, like Google Translate, can produce reasonable results in certain language combinations, mostly with English as the source or target language, some content types and some language pairs simply will not lend themselves to machine translation under any circumstances. Japanese and English is a language combination known to produce unintelligible results. We learnt a lot from our 2-year collaboration with Toshiba. Nevertheless, it is important to make a distinction between these free “translate any language” online tools and a custom-built enterprise machine translation solution.

The risk of no post-editing in travel sector translations

“Safety is important for us here at Novabikes” becomes “Safety above everything is important for Novabikes United States” – Google believes “us” is meant as “U.S.”

In a custom-built solution, industry-specific content and very often the client’s own terminology are used to build a tool for a specific purpose. MT engines are trained with large quantities of language data, so they learn how the client (user) wants to translate. In other cases, hybrid approaches ensure further customization with specific client or language rules and translation memories.

Benefits of enterprise-level machine translation solutions

Nowadays, businesses can use machine translation for a wide range of purposes, including:

  • translating online help,
  • translating knowledge bases,
  • data collection,
  • customer services (multilingual chat systems),
  • reading emails in a different language to understand what a potential client is asking for (lead generation) – although articulating a marketing message with machine translation is another matter,
  • internal communications among multilingual staff (read more about how IBM used feedback from its own staff to improve its own MT solution),
  • and, in general, low-value content for which there was no translation budget in the past.

Under the right conditions and processes, and with a limited level of human input, MT can now deliver translation almost in real time at a more than acceptable quality, akin to that of human translators. Companies trading internationally can considerably improve their translation productivity and language reach and, through it, their operations. A typical machine translation system such as PangeaMT can turn around several tens of thousands of words per hour, either for immediate use or for light or heavy post-editing by human post-editors. Machine translation is now an option for content which was previously considered to have no ROI, for which there was never any budget, or which was overlooked because of time constraints.

Pangeanic’s Use Case: Translation services for the hospitality and tourism sector Ona Hotel Group

Translate travel industry content – The Machine Translation challenge

But are machine translation technologies good for the travel sector? The travel industry can throw up one hurdle after another from a translator’s point of view. However, scalability and response times are by far the number one challenge. Let’s take Booking.com as an example. They have well over half a million hotels in their inventory. Even with a single short 50-word review for each (and most hotels have several reviews), we would be looking at over 27 million words of content in several languages. With several reviews per hotel and more arriving over time, this very conservative figure can double every year. Does it make sense for Booking.com to use human translation services for all this content? Surely the translation into any language of an exquisite New Year’s hotel menu is not what we have in mind when we think of machine translation in the travel industry. We know this only too well at Pangeanic: never publish as raw MT output any brand-level document that has to face the public and convey the idea of professionalism and service. Nevertheless, there are countless other documents which are well suited to machine translation. They are not so client-facing or, again, are low-value because they are ephemeral content. Common Sense Advisory is a Boston-based research center for the translation industry. If you follow our blog, you are probably familiar with their “Can’t read Won’t buy” report, which says that 88% of people are more likely to make a purchase if the information they see on the web is in their own language. In Europe, the EU published a report with similar conclusions: Europeans choose websites in their mother tongue and most do not feel comfortable making a purchasing decision in any language but their own. For many, understanding well what they were about to buy was more important than price. Positive or negative comments about a hotel or establishment have a deeper impact in a person’s native tongue. They boost engagement, even if the translation is not perfect.
It is one of the mantras of selling: buyers are put off when they feel the seller has more information than they do during the sales process. When the seller provides information and knowledge in order to put the buyer on an equal footing, buyers are more likely to buy because they handle the same information as the company they are buying from, or even more. Transparency is key, and machine translation is often key in providing such information: it is immediate and it is neutral, as it is the product of an algorithm in translation software. Thus, machine translation services enable the travel industry to reach new potential clients anywhere at a fraction of the cost of setting up a representative office. The web also centralizes and scales up their business. User-generated content can serve the purpose of regular updates which, linked to a clever keyword strategy, can bring several benefits, including traffic and conversions. It is hardly news that around 80% of tourists and business travelers weigh reviews by fellow travelers before making a booking decision. And as machine translation can only get better with new techniques (neural networks or deep learning), MT is here to stay and to become embedded in all travel companies’ websites.

But… Shouldn’t all my multilingual web content have the highest translation quality?

No, not really. The traditional, client-facing publications that convey the image of your company as a brand should, for sure. This includes marketing brochures, hotel descriptions, reports, menus, magazines and newsletters, promotional material, in-flight entertainment, user-generated reviews, and of course social media posts. But user-generated content is mostly important as feedback and as collateral information to other users. Some comments may be very relevant; the majority of users’ comments may not. And this is the reason why machine translation solutions go hand in hand with human post-editing solutions. It depends on the final quality expected. Light post-editing will make the content fairly acceptable to humans, but never to the level of human, professional translation services. Full or heavy post-editing involves a deep revision in order to make the document indistinguishable from what would have been a human translation. This often requires the work of a first post-editor and a final proofreader. Heavy post-editing is ideal for content like resort and hotel descriptions.

The Pangeanic difference

Here at Pangeanic, we appreciate that it’s vital for travel industry insiders to be able to translate languages fast and in an uncomplicated way. That is why our international hubs and online translation management platform cover translation needs internationally across a variety of media channels and cater to the needs of travelers at every point in the buyer journey. We offer cutting-edge translation technology and in-country linguists who can cover more than 150 languages, from Portuguese to Russian in Europe, from Japanese to Gujarati, Pashto and Indonesian in Asia. Our Chinese office can cover Simplified Chinese, Traditional Chinese and Taiwanese. As one of the best US translation companies, with a UK translation company and offices in Spain, Japan and China, Pangeanic specializes in localizing marketing material, hospitality and hotel websites, ecommerce, travel apps, video, travel reviews and much more, so give us a call or email us today to find out what we could do for you. – See more at “Translations for the tourism industry”

Next time you think languages, think Pangeanic


Medical Translations: Quality Matters

by Manuel Herranz

When you think about two different jobs, doctors and translators do not come to mind as two related professions.

But the fields of life science and medicine and translation services do share at least one important feature: you never call the doctor until you need one. Likewise, you never search for translation services until you really need a translator. But the same could be said about the legal profession and legal translations or the engineering industry and technical translations. The translation industry is a multifaceted industry and professional translators are supposed to be experts or knowledgeable about many fields. It is not a small industry either, and myths about the translation industry are disappearing as technology has been able to automate many processes. Ideally, experienced translators working in any of those particular areas of knowledge will help improve the conditions of each industry, and in particular the medical industry.

All translation agencies / translation companies offer different levels of service

These depend largely on the target audience, that is, on the intended final use and the final readers. We offer 4 general translation levels at Pangeanic, plus others which may involve machine translation and lighter or heavier post-editing (revision by a human translator). Generally speaking, the more critical the application, the more eyes are going to read the document. The more serious the consequences in case of an error, the higher the translation level and the more verifications, quality controls and review stages a document will have. We could classify translation levels as follows:

  1. Fast Translation (one linguist): This is purely a translation service for internal materials or when one translator and his/her own proofreading suffice. No serious publication is expected. The buyer of translation services can put up with some errors as speed of delivery is more important than the beauty of the expression.
  2. Standard Translation: Translators with experience in a given field produce a first, quality translation. They are quite familiar with the subject area and are able to verify their own translated document. This is later checked by an expert Project Manager and returned to the translator for final approval.
  3. Premium or High Quality Translation Services: This translation service requires an expert translator to produce a highly technical, medical or legal translation. His/her version is then carefully checked and proofread by a different translator for terminology, style, expressions and accuracy. The proofread version is returned to a Project Manager who also has experience in the field and to the first translator for final approval. This process is rather lengthy, but it is required not only for the technical areas above, but also for marketing translations where a transcreation of the original is needed.
  4. Proofreading Only: If the person writing has a high command of the target language, he or she may only require a native speaker to read his or her first version.

Clearly, all medical translations and life science translations would fall under the category of “Premium” or “High Quality Translation Services”. But why? While the different levels of expected translation “quality” are perfectly understandable, the medical and life science industry deals with something that concerns us all: our health, and sometimes life and death treatments. Clinical trials are often undertaken under strict regulations and they have to be carried out in several countries, by different institutions or laboratories, in order to be approved at a national level. The EU, for example, has several directives for clinical trials in the case of medicinal products for human use. Clinical trials have to be authorized, they have to comply with certain levels of transparency and they have to be reported. If clinical trials are conducted outside the European Union, but submitted in an application for marketing authorization for countries in the Union, they still must follow principles which are equivalent to the provisions of the Clinical Trials Directive. Expect this to be the same in most countries – and in different languages. This is one of the reasons why translation for the medical industry has to satisfy higher expectations. There is not only the potential failure of the trial itself: any small translation error or misunderstanding, any medical terminology error, can have devastating consequences for budgets and time-to-market, not to mention misuse of a drug, pharmaceutical or medical device. Quality matters in life science and medical translations if only because a large number of people, from doctors to nurses to consumers, in fact the entire supply chain, are placing their faith in the accuracy of the translation. It may sound like a commonplace, but quality matters in medical translations because it is life-critical.

Translator selection process for medical translations

The requirements placed on the recruitment of medical translation experts are quite high compared to other disciplines and usual levels of stringency. Proven experience is a must-have for any life science and medical translator, and “being a doctor” is not usually enough. Few doctors have spare time outside their busy schedules to spend translating. This used to be a selling point for some translation companies years ago, but the Internet has changed accessibility and working habits. Doctors and medical personnel seldom have enough time to catch up with the latest developments in the medical field. Considering the sheer diversity of clinical specializations, it is simply not realistic to expect every medical translation to be carried out by a linguistically skilled doctor. Expert medical translators are, precisely, expert translators. They either have previous experience in translation and have specialized in the medical field, or they gained experience during their Translation Studies degree at university and have proved themselves in the market for several years as in-house staff or freelance translators.

PANGEANIC’S USE CASE: Medical Translation for Clinical Trials

Quality Processes in medical translation services

The “experience” is not just time spent resolving word equivalents. It includes the whole translation process and familiarity with several translation tools, terminology databanks and QA tools that ensure that terminology has been adhered to and respected. When medical translation projects are big enough, the translation company must put a team to work. This means that the translation company must have a structure in place and a good Internet connection, plus confidentiality agreements and the means to ensure that information is treated confidentially. Translators working remotely cannot keep a copy of the work. Only translation agencies with enough resources and technology can quickly assemble an efficient team for translation, proofreading, and terminology management and checking.

Did you enjoy reading this blog? Read about our experience with medical devices and medical trials at: Life Science and Medical Translations