The Pangeanic Artificial Intelligence Project

Written by Carla Estefanía | 11/29/01

2024 Update

Since the start of the Pangeanic Artificial Intelligence Project, the company has made significant advances in the field of Artificial Intelligence (AI), particularly in Natural Language Processing (NLP) and Machine Translation. Its AI accomplishments include the development of the ECO platform, which interprets and structures data from unstructured sources, enabling seamless data processing in various systems. This platform supports data, document, and email classification technologies, breaking down language barriers and data challenges. Pangeanic's AI models empower organizations to process information efficiently, enabling informed decision-making based on accurate and relevant data. Moreover, Pangeanic offers a range of AI services such as Deep Adaptive Machine Translation, Text and Data Classification, Data Masking and Anonymization, PII/Personal Data Discovery, and Integrated Summarization. Its well-known data annotation tool is called PECAT.

Pangeanic's AI chatbot is called ECOChat. It integrates seamlessly with existing websites and uses our custom LLM to interpret and structure data from unstructured sources. A chatbot can be built from website data in just one hour, ensuring a quick and efficient setup process without the need for data transfer to third parties. By injecting private documents and reports, Pangeanic's AI chatbots become full AI assistants that respect privacy and save significant time managing internal and external information searches, client inquiries, and staff queries. Moreover, Pangeanic's AI chatbots use public data to build a baseline chatbot, and users control which documents and PDFs the chatbot interacts with. This integration enables organizations to communicate with users, employees, and clients in multiple languages using AI machine translation, saving thousands of hours in manual interactions. Additionally, Pangeanic offers customized solutions for multilingual knowledge management and translation, leveraging the generative properties of Large Language Models (LLMs) in complete privacy.

For many, Google Translate is an absolute problem-solver: copy, paste, and translations appear as if by magic.

However, naive use and misuse can make free online translation applications a bumpy ride.

Free online tools are just a demo version of what the technology can do and an excellent way to gather monolingual text. Moreover, their direct translating ability is limited to a select few languages, after which they pivot through English (for example, translating from Japanese to German requires “bridging”: an initial translation from Japanese into English, and then a second one from English into German). That's the black magic behind all those language combinations.
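The pivoting described above can be sketched in a few lines of Python. This is only a toy illustration, not how any real service is implemented: the two word-level dictionaries and the function names `translate` and `pivot_translate` are hypothetical stand-ins for full translation engines.

```python
# Toy word-level "engines" (hypothetical data, for illustration only).
JA_TO_EN = {"猫": "cat", "水": "water"}
EN_TO_DE = {"cat": "Katze", "water": "Wasser"}


def translate(word, table):
    """Look up a word in a toy engine; return None if it is not covered."""
    return table.get(word)


def pivot_translate(word, source_to_pivot, pivot_to_target):
    """Translate through an intermediate (pivot) language, here English."""
    pivot = translate(word, source_to_pivot)
    if pivot is None:
        # The first hop failed, so there is nothing to pass to the second hop.
        return None
    return translate(pivot, pivot_to_target)


print(pivot_translate("猫", JA_TO_EN, EN_TO_DE))  # Katze
```

Because the second hop only ever sees the English output, anything lost or distorted in the first hop is carried into the final translation.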

In some languages, the output is only comprehensible to a certain extent. For business owners looking to translate a website or needing massive amounts of data translated, the “free” result might cost more than just a low grade in a foreign language class. More often than not, there is nothing more regrettable for an executive or representative running a presentation than to have his or her official documents translated poorly or, worse still, to give the impression of total reliance on machine-translation output without human editing.

The result can be a broken business deal or an inappropriate solution. Pangeanic started its neural machine translation project in 2017 to mitigate this risk. It has developed into an entire Artificial Intelligence project that addresses the limitations of Natural Language Processing technologies by applying AI.

Pangeanic is headquartered in Valencia (Spain) and has offices in Madrid and London. It was incorporated in the US in 2019, with head offices in Boston and San Francisco, plus sales representation in Chicago, Miami, and Houston. Pangeanic's founder, Manuel Herranz, understood that as the world becomes a social community, language transfer cannot be solved by the large generic solutions (Google Translate, Amazon, Bing Translator) because of privacy issues.

Much user-generated text content feeds back to content generators, affecting brand reputation, contracts, and international litigation and documentation between companies, social groups, networks, and nations. The lack of private, lightning-fast language transfer is the main roadblock remaining. Pangeanic's mission through its technology division, PangeaMT, is to develop, implement, and utilize cutting-edge innovation, and that is what we call the Pangeanic Artificial Intelligence Project. The company has set out to decipher massive amounts of data, extensive legal records, Big Data, and large amounts of information. PangeaMT frequently supplies projects and digital infrastructures, like iADAATPA and NEC TM, to the European Commission for implementation in Member States.

What is the Pangeanic Artificial Intelligence Project?

When one of the Pangeanic engines finishes the processing (language transfer), the output is improved and humanized following ISO standards by qualified linguists with the right skills to adapt neural machine output (translation) into human-quality language ready for publication. We call this process post-editing, which can happen in two different ways: light post-editing or deep post-editing. In light post-editing, a human corrects only the critical errors that would make a text incomprehensible to human readers. In contrast, deep post-editing is a critical reading that corrects all machine errors and eliminates all traces of "literal" or "machine translation."

So, where does the AI come in? Pangeanic machines learn from human input, adapting on the fly to human preferences. Two academic articles have resulted from our R&D.

In short, each system creates a large, generic translation engine based on parallel corpora for MT systems, to which deep learning is applied. As the human corrects machine errors or makes changes, the engine learns the human's preferences and creates a "mirror" where all those preferences are stored. The large, generic engine can thus specialize in many subjects, each mirror containing its user's preferences. Other areas where Pangeanic combines language processing and artificial intelligence with similar approaches are anonymization, key-data extraction, automatic categorization of texts, and summarization. All can be combined in particular projects and settings, and all are available from our team!
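As a rough illustration of the "mirror" idea, the sketch below layers a per-user store of corrections on top of a shared generic engine. It is a deliberately simplified assumption, not Pangeanic's implementation: the real engines adapt with deep learning rather than a lookup table, and the class names `GenericEngine` and `PreferenceMirror` are hypothetical.

```python
class GenericEngine:
    """Stand-in for the large generic engine: a fixed phrase table."""

    def __init__(self, table):
        self.table = table

    def translate(self, source):
        # Fall back to the source text when the toy engine has no entry.
        return self.table.get(source, source)


class PreferenceMirror:
    """Per-user layer that stores post-editor corrections and applies them first."""

    def __init__(self, engine):
        self.engine = engine
        self.preferences = {}

    def translate(self, source):
        if source in self.preferences:
            return self.preferences[source]
        return self.engine.translate(source)

    def learn(self, source, corrected):
        # Store only real corrections, i.e. output the human actually changed.
        if corrected != self.engine.translate(source):
            self.preferences[source] = corrected


engine = GenericEngine({"acuerdo": "agreement", "contrato": "contract"})
legal = PreferenceMirror(engine)
legal.learn("acuerdo", "settlement")   # post-editor's preferred term
print(legal.translate("acuerdo"))      # settlement
print(engine.translate("acuerdo"))     # agreement
```

The generic engine is left untouched, so several mirrors can share it while each keeps its own user's preferences.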
