03/11/2009

Crowdsourcing translation service

An excellent article on crowdsourcing by Jost Zetzsche (Issue 9-10-152), which also covers the use of Twitter in our industry (Pangeanic was present at TAUS Portland when the "no Twitter allowed" message was issued, and we understand both sides of the argument). I reproduce it below, skipping the ads, from http://www.internationalwriters.com/toolkit/current.html. The "current" page will probably be updated next month, so the link may then point to a newer issue.

A computer newsletter for translation professionals

Busted!
Did you ever realize how much truth-telling there can be in machine translation? I just found out when I tested the great hyperbolic promise "you never have to translate the same sentence twice" in Translation Party. I have rarely chortled such deep belly laughs.

Other great computer-related news this week? Not really. Oh, yes, a new operating system! More on that and what it means to language professionals in the coming weeks.
1. About Missed Opportunities
The following is a (slightly modified) article that I wrote for next month's ATA Chronicle. Since the discussions on crowdsourcing just seem to be getting more and more heated, I thought it would be good to premiere the article in this week's newsletter. At the ATA conference in New York today (which I sadly have to miss this year), there will also be an ad-hoc session on this. Maybe this article will be provocative (?) enough to help those discussions along as well. Here goes:

Crowdsourcing . . . a term so terrible, it makes the professional translator do one of two things: her knees go weak and her heart shudders, or he cries out (in a slightly hysterical tone), "Terrible translation quality! Terrible!"

Our industry has been talking about this phenomenon for about two years now. Common Sense Advisory has even come up with a new acronym for it: CT3, or "community, crowdsourced, and collaborative translation;" I'm so glad they didn't propose C11g (remember L10n for Localization?)!

In my opinion, it's high time to stop going in circles in our deliberations.

Let me start discussing this with an example we are all familiar with and which I hope will illustrate my point: When translation environment tools (TEnTs) first hit the market in the early 1990s, they were relatively quickly adopted by technologically savvy end-clients and LSPs, but the majority of translators fell into a kind of shock mode. Many saw it as a real threat to the translation business as we knew it (and they were right) and, as such, highly undesirable (here they were wrong). Essentially, this new technology made the professional translator do one of two things: his knees went weak and his heart shuddered, or she cried out (in a slightly hysterical tone), "Terrible translation quality! Terrible!"

The next step was that these tools were increasingly modified for their paying user group.
Terminology management was not implemented from a translators' perspective but from that of the corporate or academic terminologist; new features were focused on things like project management rather than linguistic features such as morphology or improved translation memory searches; and the price of these tools was forbiddingly high. While this was not true for each of the TEnTs that was available, it was certainly true for the market-leading tools, some of which we still see today. Imagine for a moment what would have happened if we had embraced some or all of the technology offered by TEnTs (translation memory, terminology management, advanced text extraction, quality assurance, etc.) from the get-go. I would stick my neck out and say that we would have a different technology landscape today. Terminology management and integration into workflows would be easier (or would be used much more because it would have been easier at a much earlier point), morphological and syntactical features for a wide variety of languages would long have been implemented in both terminology and translation memory searches, and relatively recent developments like subsegment searches would long have been a natural part of our processes. And who knows what else we would have at this point.

To me, the moral of this story is that we always have a choice not to become part of new developments in technology or processes, but our decisions carry consequences that might influence the way this technology or this process develops in the future -- almost independently of whether we embrace it at some later point.

Let's switch back to crowdsourcing. While there has been a lot of attention recently on the crowdsourcing attempts by Facebook or the botched attempt by LinkedIn, some form of crowdsourcing has been around for a long time. Think of the translation of open-source software or the volunteer-translation of many shareware or freeware programs.
Think of Microsoft's attempts to discuss terminology with its user base, dotSUB's translation of subtitles in videos, or even projects like the crowdsourced translation of Harry Potter novels. All of these and many other past and ongoing projects have felt non-threatening to us, so why are we now so up in arms about a concept that only seems new? I think it is because we are scared to lose something that we think is ours. We feel that our industry has some kind of inherent right to the translation of applications and websites of multi-billion-dollar companies like Facebook. After all, we are set up for it; we have the tools, the expertise, and the processes in place that could successfully accomplish these projects.

The only problem is that Facebook apparently didn't think so. The Facebook management team thought it could create the "perfect" translation if its volunteer users translated it, making it just the way they wanted it, building the already strong relationship with Facebook into an even stronger one by giving them a sense of ownership (after all, they "created" their Spanish or German Facebooks through translation), and making them the best ambassadors imaginable for the site. While we all know that there is no "perfect" translation, Facebook might have just come pretty close to it, despite the many "mistakes" that our critical eyes might find in the translated versions of Facebook. Facebook created a value-added translation. And, by the way, it didn't do this on the cheap. Facebook invested a lot of money and research into creating an application that allows for the translation and voting system that is now in place for its own site, and has just been released for free for any partner site of Facebook.

Is Facebook's example transferrable? I think in some cases it is -- for community-oriented products, for instance -- but for most other projects it is not. At least not in the way that Facebook does it.
To harness the energy and knowledge of an enthusiastic crowd, which in turn makes it even more enthusiastic, you will obviously need to be able to start out with a certain kind of enthusiasm that most products and services cannot claim for themselves. (And even if that is potentially the case, as it was with LinkedIn, its undiplomatic and heavy-handed attempt at crowdsourcing and the subsequent formation of the LinkedIn group Translators Against Crowdsourcing for Commercial Business shows that there needs to be more than just an enthusiastic crowd.)

So, it seems that two questions remain: Are projects like Facebook completely closed to professional translators, and is crowdsourcing applicable to other kinds of projects (and as an extension of that, would that be desirable)? I think the answer in both (or even all three) cases is yes. Community-based Facebook-like projects will stay closed to us only if we allow that to happen. Again, Facebook did not attempt this project as a money-saver, but because it saw the added value. Would projects like this benefit from professional experience in areas like translation techniques, terminology management, translation memory maintenance, and the various other skills that are part of our trade? Are you kidding? Of course they would! And it's up to us to offer that in a palatable way (and "palatable" is not a synonym for "free" or "cheap"!). What we have is unmatched expertise, a good track record, and a right to be treated in a professional manner (something that LinkedIn did not do). But what we do not have is the right to translate anyone's application, website, or product without engaging in a professional relationship with them.

In what other areas is crowdsourcing applicable? The sudden emergence (or re-branding) of tools like Lingotek, Welocalize's CrowdSight (an add-on to GlobalSight), or Google Translation Center shows that there is an obvious need that goes beyond the niche market of social networking.
Certain LSPs have shown us that some elements of crowdsourcing can be used in a professional setting. For example, one Austin LSP works on large, ongoing projects with almost immediate turnaround times. Rather than scheduling and organizing translators in a traditional one-translator-for-one-project way, they publish their projects on an internal site for large pools of translators, many of whom are guaranteed a certain amount of work and income every month provided that they check in on a regular basis. While this approach is not the same as the crowdsourcing that companies like Facebook offer -- only pre-qualified, professional translators are used (and paid for their work), there is no voting system for translations like Facebook offers, and it's a more strictly controlled system -- it uses elements of crowdsourcing and "translates" it into a strictly professional environment. Companies like Facebook might still decide to use their own methods and tools, but it might also be attractive to go with translation providers like the one mentioned above who can offer very-large-volume translations with extremely quick turnaround times. (After all, Facebook's highly publicized 24-hour turnaround-time for the translation into French was the starting point for this current wave of crowdsourcing.)

There is much more that can be said about crowdsourcing, but hopefully this illustrates that we don't do well by flat-out rejecting "new" ideas and concepts within our industry. We need to take on leadership roles, and we can learn from new ideas and implement them into our own workflows. We mustn't forget the many opportunities we missed to play an important role in the early development of TEnTs. Let's not make the same mistake twice.

So much for the article. In the meantime, Twitter has naturally announced its own crowdsourcing plans. But here is what struck me most about that.
We are so focused on the translation aspects of crowdsourcing (and rightly so -- after all, that's what we're here for, right?), but Twitter's concept of crowdsourcing goes a lot further. Is there anyone who can point me to the forums where software developers scream about the evils of that kind of crowdsourcing?
2. We Are the World, We Are the Children . . .
I had a fabulous, and slightly scary, encounter with Twitter last week -- actually, I don't tweet; Jeromobot does.

You see, Jeromobot and I are "conferenced out" for the year. It's been fun and interesting, but it's also good to be home every once in a while (and make some money). Still, there are a lot of interesting meetings happening these days. Of course, the ATA is meeting this week, the German tekom is next week, a TAUS conference is taking place this week in Portland (and how often do we have language conferences in my home state?), and last week was a particularly engaging and interesting Localization World conference in Silicon Valley. How do I know this about LocWorld? I was able to follow it on Twitter! A number of eager Twitter users (in particular Renato Beninatto, Kirti Vashee, and John Yunker) brought all the news worth following (and some that wasn't) in real time right to Winchester Bay, Oregon.

The LocWorld talk that may have stirred the most interest was a presentation by Jeff Chin and Mike Galvez from Google about the Google Translator Toolkit (GTT). Here are some of the things I learned; for other information you can, of course, read earlier Tool Kit newsletters. (Notice the similarity in the naming of those products. My wife asked me the other day whether I had already sued Google. I told her I'm still holding out on that.)
  • GTT has added support for the translation of AdWords through the AdWords portal -- some commentators said that this will be a way to finance GTT. (Yeah, right, they really need that!) There is also some information on this right here.
  • Google apparently believes that the target users of GTT are primarily amateurs. For instance, they pointed out that GTT is not as "good as professional tools."
  • Marco Trombetti (of MyMemory on Translated.net -- the site that collects translation memories and allows you to download TM data) was involved in the evaluation of the GTT interface. As a side note, there is an interesting translation conundrum on Marco's site. The English slogan "Get a better translation with 216.893.775 human contributions" is translated into German as "Bessere Übersetzungen durch die Mitwirkung von 216.893.775 Menschen." That's a lot of people helping!
  • Coming developments for GTT include an integration with more sites than Wikipedia and Knol to directly translate web content, the ability to download your own TMs and glossaries, and the support of OpenOffice formats.
See, I (and now you) just saved a few thousand dollars by not having to go in person.

Oh, and if you still don't think this proves that Twitter can play an important role, consider this: This week's Portland TAUS conference, whose closing remarks will be made as you receive this newsletter, expressly forbade its participants to tweet during the conference -- not because it would be distracting, but because the conference organizers were afraid that information would be dispersed too freely. . . .
3. The Announcement of Things to Come (Really this Time?)
Lionbridge has been talking a lot about its plans to release some part of the Freeway offering, which includes their own proprietary TEnT Logoport, to the rest of the industry. My feeling is that they regret having talked so much about it, because so far all that talk has not really been followed up by much action. Not only has this not looked particularly good to the rest of our industry, but it has also put Lionbridge into a somewhat awkward position with competitors such as Welocalize who have been more low-key on the one hand and yet much more successful at rolling out their solutions as an open-source solution to the general public. Anyway, this week another announcement was made by Lionbridge stating that . . . there is going to be something next year. Say what you want, but I think it takes a healthy dose of chutzpah to once again make a big deal about something that's not quite ready!

Still, it seems that they are much more confident about this one and seem to have a clear idea of what it is going to be. I sat down with Clove Lynch from Lionbridge to talk about it.