Companies that create and manage big data (and big data very often means multilingual data, too) sooner or later realize it is in their interest to have direct access to machine translation technology rather than depending on external technology or 3rd party plugins.
Why? I'm not talking about owning a translation company, but about the fact that employing another company's machine translation technology signals you have given up technological independence in a core business area. A core activity that generates income, traffic and visibility then depends on the will of another company. This can have serious consequences for your business, and there are plenty of examples of companies having a bad time after doing so.
The question, then, is whether Big Data companies want to own machine translation. eBay's acquisition of AppTek made absolute sense. eBay did not want to rely on 3rd parties for international business. Multilingual data, as generated by its users, is core to its business. In Korea, Samsung financed the acquisition of Systran via CSLI, until then a small MT player that provided Samsung with Korean/Chinese/Japanese/English machine translation.
Systran is the biggest machine translation company in the world, and the acquisition gives Samsung fast and efficient access to a larger number of languages for which its own MT development would have been too slow or costly. Facebook has been using Bing Translator for some time, but it has also acquired another European machine translation start-up.
Other machine translation companies can brace themselves to be on someone's shopping list if they can demonstrate sufficiently solid technology. Who will be next?
Let us take what has happened between Twitter and Bing Translator from Microsoft as an example. News came this week that Twitter quietly stopped offering users the ability to instantly translate tweets using Bing’s machine translation feature.
One year ago, the company started using Microsoft's technology, a general-purpose online translator. Tweets are particularly difficult to translate, as they often contain abbreviations to make messages fit in 140 characters. Users of the automated service began noticing the absence of the machine translation feature earlier this week, though Twitter has not specified when it stopped offering the service, nor why it took this decision. Users who want tweets translated from a foreign language will need to "go back to the past" and copy and paste them into their own online or offline translation service.
Perhaps this will not be much of a problem for monolingual users. But even they may want to find out what a foreign soccer star, singer, or basketball player has tweeted. And because of the nature of tweets and their character restrictions, Bing Translator provided translations that tended to range from slightly flawed to incomprehensible. Microsoft neither customized its MT engines nor did any particular work for Twitter.
It is clear that some users will miss the translation capabilities. However, there has not been a massive outcry around the web, which suggests that machine translation is still more useful for knowledge gathering than for direct communication.
Nevertheless, this is an interesting move by Twitter, because Yelp added Bing Translator to its iPhone app this same week in order to provide translations of reviews. It is entirely possible that Twitter has decided to drop Bing Translator and grow its own in-house solution, as eBay once did. Maybe it wants to evaluate a different machine translation product. But for now, only one thing is clear: tweets will not be translated until further notice, and we are still wondering why there have been so many acquisitions, and whether big data companies want to own machine translation.