Google’s image-to-text translation in Barcelona

by Manuel Herranz

Something is moving in the convergence of handheld devices with integrated, on-demand technology. Social networking has quickly made its way onto the mobile phone, and Facebook, Twitter and other cool apps are almost omnipresent, even on mid-range handsets. After all, transferring a 21st-century technology (social networking) to a 20th-century invention (the mobile phone) may present some technical difficulties, but as long as we are dealing with digital technologies, someone will find a way.

Language, however, is a different matter. Some bilingual apps for mobile phones have been available for some months (such as Sakhr’s impressive Arabic app for the iPhone with speech recognition, or Jibbigo’s English-Spanish app). Some are based on the old dictionary idea. Others, such as Toshiba’s (which we reported on in December 2009), aim to integrate real-time translation. Now Google is also having a go at it.

At this week’s Mobile World Congress in Barcelona, Google successfully demonstrated a prototype of a new image recognition technology that can capture an image with a mobile phone, scan it and translate non-English text into English. The following is a quote from Andrew Gomez’s Google blog post (17th Feb). Andrew Gomez is an associate product-marketing manager at Google.

Imagine being in a foreign country staring at a restaurant menu you can’t understand, a waiter impatiently tapping his foot at your tableside. You, a vegetarian, have no idea whether you’re about to order spaghetti with meatballs or veggie pesto. What would you do? Well, eventually you might be able to take out your mobile phone, snap a photo with Google Goggles, and instantly view that menu translated into your language. Of course, that’s not possible today – but yesterday at the Mobile World Congress we demonstrated a prototype of Google Goggles that has the power to do just that. It’s still in an extremely early stage, but we thought we’d share this demo with you because it shows just how powerful a smartphone can be when it’s connected to our translation technologies.

Google’s chairman and CEO, Eric Schmidt, and company scientist Hartmut Neven demonstrated in Barcelona a prototype of the Google Goggles recognition software combined with Google’s machine translation technologies. Schmidt laid out a vision for the future of mobile computing that could be distilled into a single phrase: mobile first. He pointed out that within three years sales of smartphones will surpass sales of PCs. He also noted that in developing countries such as India, where mobile phones set the entry threshold to technology and communication, Google searches were more likely to be made on a mobile phone than on a desktop computer. He highlighted the rescue stories from the aftermath of the Haitian earthquake and called the mobile technology that enabled some of them fundamental to human existence. “This is all part of the same view that information is fundamental, and the joint view that mobile communication is ‘it’,” he said. During the demonstration, Schmidt took a picture of a German menu on a mobile phone and it was instantly translated from German into English.

“Today’s generation doesn’t call it a mobile phone; they call it a phone.” – Eric Schmidt

Neven said in the company blog post that this prototype connects the phone’s camera to an optical character recognition (OCR) engine, recognizes the text in the image and then translates that text into English with Google Translate. He said that the technology currently works only for German-to-English translation and is not yet ready for prime time. On-demand translation was cited as one of the needs that mobile cloud computing can address. “The basic message is pretty simple. The confluence of these three factors (computing, connectivity and the cloud) means your phone is your alter ego, an extension of everything we do,” Schmidt concluded. “Here, right now, we understand the new rule is ‘mobile first’ in everything. Perhaps the phrase should be ‘mobile first’ simply because it’s time to be proud of what we have built together. Our job is to make mobile be the answer to everything.”

Google plans to eventually release a version of Google Goggles that can translate all 52 languages currently supported by Google Translate.

Next time you think languages, think Pangeanic
Your Machine Translation Customization Solutions

   

3 thoughts on “Google’s image-to-text translation in Barcelona”

  1. Jordi B.

    Small typo: “Schmidt took a picture” > “Neven took a picture”.

    As Mr. Schmidt pointed out, I think it makes sense to use the cloud for processor-hungry services like OCR, speech recognition or MT. Even the higher end smartphones have neither enough hard drive nor processor power to process all this offline.

    Now it’s time for carriers to wake up and offer affordable data roaming. As things are, I would never even think of firing up Google Goggles to translate a restaurant menu if I’m abroad.

    Mr. Schmidt was pestered during the Q&A at MWC by carriers asking for a share of the revenues Google makes using their data pipes. Even the EU is asking Google for made-up taxes. Google is doing all the R&D, and they are behaving like shameless beggars.

  2. Pingback: Now the real-time photo-to-translation video from MWC « Pangeanic Translation Technologies & News
