Tag Archives: english

Help translation project to protect global internet freedom

The following statement was written by Ellery Biddle in the Advocacy section of Global Voices. We urge all our readers to share it and to understand the serious Internet governance issue at stake. If the Internet can be controlled in the way proposed and its openness constrained, our global rights as netizens will be compromised. Ellery’s call follows, as it appears on the blog.

“Over the next seven days, Global Voices Lingua volunteers will be translating a public online petition that supports the protection of human rights online and urges government members of the International Telecommunication Union (ITU) to preserve Internet openness at the upcoming conference of the ITU.

Open for sign-on by any individual or civil society organization, the Protect Global Internet Freedom statement reads as follows:

On December 3rd, the world’s governments will meet to update a key treaty of a UN agency called the International Telecommunication Union (ITU). Some governments are proposing to extend ITU authority to Internet governance in ways that could threaten Internet openness and innovation, increase access costs, and erode human rights online. We call on civil society organizations and citizens of all nations to sign the following Statement to Protect Global Internet Freedom:

Internet governance decisions should be made in a transparent manner with genuine multistakeholder participation from civil society, governments, and the private sector. We call on the ITU and its member states to embrace transparency and reject any proposals that might expand ITU authority to areas of Internet governance that threaten the exercise of human rights online.

To sign the petition, visit the Protect Global Internet Freedom website. To sign, enter your first name, last name, email address, organization name (if you are signing on behalf of a civil society organization), organization URL, and select your country.

All translations will also be posted on the petition site, which is hosted by OpenMedia, a Canada-based digital rights group.

As translations appear (see above), please feel encouraged to share links on social networks and with friends!”

Next time you think languages, think Pangeanic
Machine Translation Engines from PangeaMT


I want you to speak English or get out

EU reduces translation budget – Machine Translation and Post-editing, one future

by Manuel Herranz

On 21st November 2012, lawmakers approved a report by Stanimir Ilchev, a Bulgarian Liberal MEP, that will change the procedural rules for recording plenary debates. This decision could be a godsend for machine translation and language technology developers, as the EU plans to increase translation productivity (or improve turnaround times) by 25%, a target in current R&D Language Technology Funding Calls.

Starting from the next plenary, on 10th December, the European Parliament will no longer be required to translate the session into all 23 official languages of the EU. Over the years, this requirement has proved quite costly, and the translations can take up to four months. However, many have pointed to a bias towards the English language. For example, Jean Quatremer, a renowned political journalist from the French daily Libération, complained about the official press statements containing the Commission’s economic recommendations to member states, published on 30th May 2012. These statements had been eagerly awaited by the press because of the euro debt crisis, but were initially made available to journalists only in English. The translations into other languages followed a few hours later that day. Mr. Quatremer said that the initial monolingual release gave the Anglo-Saxon press an “incredible competitive advantage” and threw the institutions’ democratic legitimacy into doubt, making his position very clear in a strongly worded blog entry.

From December 2012, the EU legislature will only record proceedings in the original language of the speaker. Nevertheless, the proceedings will still have to be translated into a particular language if a member state requests it. However, in the European Parliament many official press statements are currently published only in English, and a very limited number of them are translated into other languages – despite the huge effort and money invested in translation services and, increasingly, in machine translation technology.

“This is one of our struggles – that the press releases and all publications and communications with society (tenders, contracts, etc.) are translated,” said Miguel Angel Martinez Martinez, the Parliament’s Vice-President in charge of multilingualism.

The numbers speak for themselves: 72% of all EU documents are drafted in English, with French coming a distant second with 12%. Only 3% are originally drafted in German. On the other hand, 88% of the users of the Commission’s Europa website speak English. In reality, “providing documents in English, French, German, Spanish and Italian would cover close to 100% of all the EU’s linguistic needs”, said DG Translation Director-General Lönnroth, speaking at a debate hosted by the Centre for European Policy Studies on 22nd February. The Union “will just have to cope” with increasing linguistic pressures brought on by future enlargements because “no decision-maker would dare to touch the main principles” of the EU’s language policy.

Mr Ilchev rejected proposals to translate the sessions only into English, as it would “appear linguistically unjust”. In the current EU, having 23 official languages means 506 translation and interpreting combinations, said Translation Director-General Lönnroth, a figure that could increase significantly if Croatia, Serbia and even Turkey join in the foreseeable future.
Acknowledging he is not a “language fanatic”, the director-general said he thinks “about how to reduce the workload every day” as it was “not in the taxpayer’s interest” to provide every language combination. Lönnroth said back in February that “it would be easier if everybody accepted that English and French were the main EU languages”. This is (partially) what is going to happen, although Mr. Ilchev gives assurances that the initiative will not harm multilingualism, a principle enshrined in EU treaties: “of course this principle is not in question and everyone can listen to our debates in plenary in their own language” – through interpretation. Some of the EU’s research funding actually goes into technology solutions and research. For example, the SUMAT project aims at creating an online service for subtitling by machine translation.
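
The 506 combinations quoted above follow from simple counting: each of the 23 official languages can be translated into any of the other 22, and the direction matters (French into German is a different job from German into French). As a rough illustration only, here is a minimal Python sketch of that arithmetic; the function name is our own and the extra language counts are hypothetical enlargement scenarios, not official figures.

```python
def language_combinations(n_languages: int) -> int:
    """Count ordered (source, target) pairs among n_languages.

    Each language can be translated into every other language,
    so the count is n * (n - 1).
    """
    return n_languages * (n_languages - 1)


# 23 is the current number of official EU languages; 24 and 25 are
# hypothetical scenarios with one or two additional official languages.
for n in (23, 24, 25):
    print(n, "languages ->", language_combinations(n), "combinations")
```

Because the count grows with the square of the number of languages, each additional official language adds more combinations than the one before it.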


Facebook to add automated translation services for posts

With over 750 million accounts, Facebook users span nearly every country in the world, and it has been ranked as the 7th most populated country in the world. It was recently valued at US$65 billion. Not bad for a company that, at its most basic level, lets people chat and share pictures and, at its highest, connects them.

Its success has cast a shadow over other online companies such as Yahoo!, which had to fire its MD Carol Bartz by phone, as the company struggles to keep up with other online giants.

Facebook faces a “problem” common to many large countries: multilingualism (if you consider that a problem) or, rather, the fact that it hosts communities that do not interact with each other because of a language barrier (which is a problem in both the real world and the digital world).

However, according to an Inside Facebook post on 2nd September, the social media site has started to experiment with an automated translation service to help bridge the communication gap between its communities. Facebook has already crowdsourced the translation of its site into several languages, connecting millions of people to each other around the world in new and sometimes unexpected ways.

The new “Translate” button sits next to the “Like” button and apparently does a good job of translating not only standard words but also slang phrases. No details about the engine or technology behind the tool have been disclosed, although the company, given its “crowdsource” philosophy, may well have adopted open-source technologies rather than developing an engine from scratch.

The picture below (courtesy of Inside Facebook) shows a translation of the phrase, “Totally cool” from Hebrew to English.

As with other live translation services, once a post has been translated, the button changes its status to “Original” so that users can see the source text as originally entered.

This machine translation functionality will undoubtedly be particularly useful for multinational organizations. Their Facebook pages receive comments from all over the world, in many different languages. Gathering this wealth of real-life user feedback is the realm of sentiment analysis firms. Twitter and blogs are open-web resources that let commercial firms learn what customers think and how they react to products, services and events. Facebook, however, is a closed-web environment that cannot be crawled or data-mined so easily. Until now, the only solution has been to cut and paste comments into a free web translation service such as Firefox’s ImTranslator or, for more sophisticated or corporate users who require a private translation process, to build their own. Facebook’s embedded machine translation feature would therefore be a timesaver not just for users, but also for commercial applications that need to know closed-web or community-based opinions. Facebook’s translation currently supports only Spanish, French, Hebrew, Chinese, and English.

The feature is currently only available on Facebook Pages and not on profiles or apps. Nevertheless, if it works as Facebook plans, we can expect to see it rolled out for the entire site in the near future.

Audience Growth on Facebook: Top 25 Country Markets

All displayed data current as of July 1, 2011.  (Courtesy of Inside Facebook)

Pangeanic’s participation in TAUS Copenhagen 2010

by Elia Yuste

For the last year or so, TAUS has been tracking the exciting experiences of companies pioneering a radically new MT engine training space. Pangeanic is one of the most outstanding cases, and earlier this year we were announced as the first LSP to create a new business stream with TAUS Data Association (TDA) data. Then PangeaMT, Pangeanic’s technological division geared towards customized MT solutions and consulting, was invited to take part in the proof-of-concept of TAUS MT Trainer and to present its results at the TAUS Executive Forum in Copenhagen in late May 2010.

The idea behind this MT Trainer, a web-based facility from TAUS TDA that will materialise within the current year, is twofold: first, to foster pro-active adoption of TDA data for MT engine training; and second, to connect MT service commissioners and providers under the TAUS umbrella, whereby the former may submit their data files (reference files for engine training and files for translation) and the latter would turn around the MT output in a short time. The MT Trainer has a counterpart facility called MT Evaluator, which lets the commissioner or client evaluate the uploaded MT output by means of standard metrics-based figures.

To test the viability of this twofold initiative, the so-called MT Trainer pilot was discussed among the selected partners and then launched about two weeks before the Copenhagen meeting. Would it be possible to automate the workflow for MT customization using client data and data from TDA? On the one hand, Adobe, eBay and McAfee were the three prospective MT commissioners seeking trained engines and metrics to measure the quality of output. On the other, Languagelens, PangeaMT, and Tilde were the three selected MT companies. All of us could turn around customized MT engines in 24 hours or less, and the output was measured for quality using BLEU scores. In the specific case of Pangeanic, the challenges of speed and acceptable quality could be met without any problem.
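
For readers unfamiliar with BLEU, the metric mentioned above, the idea is to count how many word sequences (n-grams) of the MT output also appear in a human reference translation, and to penalize output that is too short. The following is only a simplified, single-reference, sentence-level sketch in Python; the function names are ours, whitespace tokenization is an assumption, and it is not the tooling or configuration used in the pilot, which the source does not describe.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple of tokens) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against a single reference.

    Assumes whitespace tokenization and applies no smoothing, so any
    n-gram order with zero matches yields a score of 0.0.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


print(simple_bleu("the cat sat on the mat", "the cat sat on a mat"))
```

Production evaluations normally compute BLEU over a whole test corpus, often with several references and with smoothing, so scores from a sketch like this are not comparable with published figures.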

If these two TDA service offerings, the MT Trainer and Evaluator, are well accepted and regularly deployed by members, this will encourage more data uploads/downloads and reinforce the usefulness and applicability of relevant, domain-specific data sharing for MT training. This should also lead to a much-desired increase in memberships and overall member pro-activity within TAUS. For Pangeanic it will mean more visibility in the MT arena, quicker access to high-calibre clients, whose content and domain specificities are, incidentally, already familiar to us, and a controlled workspace in which to offer our MT services.

Apart from the MT Trainer & Evaluator proof-of-concept, the Copenhagen event gave rise to lots of fruitful discussions among MT practitioners and newcomers. In our case, apart from describing the ins and outs of our engine training experience for eBay under the MT Trainer pilot scenario, we engaged in interesting conversations about how PangeaMT has been able to overcome Moses shortcomings. Our TMX filter and inline mark-up parser were acclaimed as features that are much needed in our industry and have made us stand out from the (S)MT crowd.

Other takeaways of the TAUS Copenhagen event were the convergence of MT, open platforms and contexts of application (e.g. in corporate support), learning more about TAUS TDA member experiences, and gathering collective wisdom from future-projecting table discussions on a number of hot language industry topics. A full report about the event can be found here and also downloaded from the TAUS website.


For the Advancement of Arabic/English Machine-Translation Technology (and others): IBM and KACST

The Saudi Arabian National Research and Development Organization announced yesterday a multi-year agreement to collaborate on, amongst other areas, the advancement of Machine-Translation technologies.

Under terms of the agreement, King Abdulaziz City for Science and Technology (KACST) will purchase an IBM Blue Gene supercomputer that will enable its researchers to perform complex simulations and computational modelling.

The software giant will provide training services to KACST researchers on the functionality and features of Statistical Machine Translation technology. IBM’s Research and Development team will be in charge of building the machine translation system with the initial basic system capabilities, which will be trained with several million words of data – the basis of the translation training process.

IBM will commit researchers and business consultants, who will work together with KACST scientists to further enhance the IBM Machine Translation Engine into a powerful engine for translating Arabic into other languages. This project deals with natural language analysis and computational methods for language translation. Technologies used for machine translation, such as syntactic parsing and word sense disambiguation, are commonly used in other applications of natural language processing.

The agreement is one of several joint research projects undertaken by the two organizations.

Help will also be provided on Intellectual Property Development management, so that IBM’s expertise is used to help KACST develop tools and processes that turn its inventions into patents.

The agreement also includes collaboration to create the National Center for Women Engineers.

The full story has been reported on several sites: zawya.com, ameinfo.com, and the US-Saudi Arabian Business Council.


Canada Government launches Language Portal, free access to Termium terminology database

The Government of Canada has launched the Language Portal of Canada and free access to its Termium terminology database. Termium is available in English, French and Spanish, although most of the content is available only in the former two languages.

Provincial Secretary and Minister responsible for Francophone Affairs June Draude said: “Enhancing French-language services will not only benefit our French-speaking citizens here at home, but will extend a warm welcome to other French speakers from Canada and across the world.”

The site has been designed to be a showcase for the French and English languages in Canada and also to promote the use of official languages in the country. The Language Portal features sections on dictionary resources, language use in Canada (official and other), language professions (from translation and interpretation to writing and editing), language usage tips, and a special section on “la Francophonie in Canada” and internationally.

The site also provides information on everything from grammar and spelling, to proverbs, punctuation and typography, as well as links to a Clear and Effective Communication section on the Translation Bureau’s main site.
