When terminology becomes infrastructure: EcoDrive TermSpace and the semantic layer Europe needs for AI

Written by Manuel Herranz | 06/12/26

Europe’s AI problem is also a problem of meaning.

Data spaces cannot create value if the organizations sharing data do not share the same language. AI models cannot reason reliably over specialized domains if the concepts behind the words remain vague, duplicated or inconsistent. A city cannot build intelligent mobility if traffic data, charging infrastructure, vehicle systems, public services and urban sensors speak in disconnected dialects.

This is the point behind EcoDrive TermSpace, the project developed by Pangeanic, Universitat Jaume I and ValgrAI to build a terminological data space for sustainable and autonomous mobility. It is also the reason why a subject that once looked academic is now appearing in newspapers, institutional project pages and public AI discussions.

On 12 June 2026, Levante-EMV published an opinion article by María Amparo Alcina Caudet, from Universitat Jaume I and AMIT, titled “El lenguaje que nos mueve: terminología, inteligencia artificial y espacios de datos para la movilidad del futuro”. The article states the central idea with unusual clarity: data, by itself, does not generate knowledge. It needs context, meaning and a shared form of interpretation.

That sentence could serve as a manifesto for the next phase of enterprise AI.

Data spaces need shared meaning

The European discussion around data spaces often focuses on infrastructure: connectors, governance, cybersecurity, interoperability standards, access control, federated data sharing and sovereignty. All of that is necessary. None of it is sufficient.

A data space may connect organizations technically and still fail semantically. A public authority may describe an electric vehicle charging point one way, a transport operator another way, a manufacturer through its own technical taxonomy, and a software provider through yet another data model. The files may circulate, the APIs may respond, the dashboards may populate, and still the system may not understand that different words point to the same concept, or that the same word means different things depending on the context.

This is the quiet fracture inside many AI and data initiatives. The pipes work. Meaning leaks.

EcoDrive TermSpace addresses that fracture by treating terminology as data. It transforms specialized language in sustainable mobility and autonomous vehicles into a structured digital resource: definitions, variants, multilingual equivalences, semantic relations and contexts of use.

In practical terms, it builds a layer between texts, data and algorithms so that people and systems can share meaning, not just files.
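A minimal sketch of such a layer, assuming a hand-written mapping table: each organization keeps its own field names, and a shared table resolves them to concept identifiers. The source names, field names and concept IDs below are hypothetical and are not taken from EcoDrive TermSpace.

```python
# (source, local field name) -> shared concept ID.
# Every name in this table is a hypothetical illustration.
FIELD_TO_CONCEPT = {
    ("city_authority", "punto_recarga"): "charging_point",
    ("transport_operator", "charge_point"): "charging_point",
    ("manufacturer", "charger_unit"): "charging_point",
    # The same word resolves to different concepts depending on who uses it.
    ("manufacturer", "cell"): "battery_cell",
    ("city_authority", "cell"): "map_grid_cell",
}


def harmonize(source: str, record: dict) -> tuple[dict, list[str]]:
    """Re-key a record by shared concept ID and list the fields with no mapping."""
    shared = {}
    unresolved = []
    for name, value in record.items():
        concept = FIELD_TO_CONCEPT.get((source, name))
        if concept is None:
            unresolved.append(name)  # the data arrived, but its meaning is unknown
        else:
            shared[concept] = value
    return shared, unresolved


city, _ = harmonize("city_authority", {"punto_recarga": "CP-17", "cell": "G4"})
maker, missing = harmonize(
    "manufacturer", {"charger_unit": "CP-17", "cell": "LFP-32", "fw": "1.2"}
)
```

Both records now expose the charging point under the same key, while the two uses of "cell" stay apart. The unresolved list makes the leak visible: fields that arrived without a shared meaning are flagged rather than guessed.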

From urban mobility to AI infrastructure

On 27 May 2026, ValgrAI presented the Urban Mobility Data Hub in València, a data infrastructure designed to integrate real-time information from traffic, public transport, urban sensors and municipal services. Valencia Plaza described the hub as an interoperable data ecosystem for transforming urban mobility through artificial intelligence and better decision-making.

The same event also presented EcoDrive TermSpace, promoted by Pangeanic and Universitat Jaume I in collaboration with ValgrAI. The project addresses a problem that grows as mobility becomes more digital: the lack of terminological interoperability across automotive, energy, infrastructure, public administration and artificial intelligence.

ElPeriodic summarized the contribution well. EcoDrive TermSpace turns specialized language into a structured and reusable digital resource, with definitions, variants, multilingual equivalences and contexts of use. That semantic layer connects texts, data and algorithms and supports specialized machine translation, intelligent information retrieval and interoperability between data platforms.

This is where the project becomes relevant beyond mobility. Every regulated industry has the same problem.

Healthcare has clinical terms, procedures, devices, diagnoses and coding systems. Finance has instruments, risk categories, reporting obligations and legal definitions. Energy has grid components, generation assets, market rules and technical documentation. Public administration has forms, procedures, case types and legal categories. Manufacturing has product families, parts, manuals, safety procedures and multilingual supplier documentation.

Each sector contains language that looks ordinary until an AI system has to act on it.

What is a terminological data space?

A terminological data space is a governed environment where specialized terms are managed as structured data. It does not merely collect words. It represents concepts.

A good terminological data space includes definitions that explain the concept, variants that capture how the same concept appears in real documents, multilingual equivalents that connect concepts across languages, contexts of use that show how terms behave in domain material, semantic relations that connect concepts hierarchically or functionally, provenance that records where terms and definitions come from, and validation workflows that allow experts to review, correct and improve the resource.
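The components listed above can be pictured as one record per concept. The following is a simplified sketch, not the ONTODIC model or the project's actual schema; the class names, field names and the sample entry are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Provenance:
    source: str                         # document or corpus the entry came from
    validated_by: Optional[str] = None  # set once an expert has approved the entry


@dataclass
class ConceptEntry:
    concept_id: str
    definition: str
    # language code -> surface forms (preferred term first, then variants)
    terms: dict[str, list[str]] = field(default_factory=dict)
    contexts: list[str] = field(default_factory=list)
    # (relation name, target concept_id)
    relations: list[tuple[str, str]] = field(default_factory=list)
    provenance: Optional[Provenance] = None

    def equivalents(self, lang: str) -> list[str]:
        """Return every known surface form of the concept in one language."""
        return self.terms.get(lang, [])

    def is_validated(self) -> bool:
        """True only when an expert has signed off on the entry."""
        return self.provenance is not None and self.provenance.validated_by is not None


# Illustrative entry only: the wording is invented, not project data.
entry = ConceptEntry(
    concept_id="charging_point",
    definition="Interface where one electric vehicle connects to recharge.",
    terms={"en": ["charging point", "charge point"],
           "es": ["punto de recarga", "punto de carga"]},
    contexts=["The charging point delivers up to 50 kW."],
    relations=[("part_of", "charging_station")],
    provenance=Provenance(source="illustrative example"),
)
```

Keeping the concept identifier separate from its surface forms is the central design choice: variants and translations can be added or corrected without changing what downstream systems refer to.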

In EcoDrive TermSpace, this logic is applied to sustainable and autonomous mobility. The project uses the ONTODIC model developed by the TecnoLeTTRA research group at Universitat Jaume I as the basis for creating, managing and exploiting linguistic and terminological resources. The ValgrAI project page describes the objective as the creation of a sectoral and interoperable data space dedicated to terminology for sustainable autonomous mobility, supporting use cases related to autonomous vehicles, sustainable transport and intelligent mobility systems.

This is a precise example of what Europe needs more broadly: AI resources that are structured, multilingual, reusable, validated and governed.

The technical relevance for AI builders

Large language models are very good at producing plausible language. They are less reliable when a specialized decision depends on the exact meaning of a domain term.

A model may know that a “cell” can refer to a battery cell, a prison cell, a spreadsheet cell or a biological cell. A mobility system cannot afford that ambiguity. A battery management system, a safety manual, a municipal mobility platform or an autonomous driving dataset requires conceptual precision.

Terminology gives AI systems a controlled relationship between words and concepts. Ontologies go further by connecting those concepts into a knowledge structure. Together, they support specialized machine translation, because terminology and conceptual relations help preserve meaning across languages; information retrieval, because search becomes concept based rather than string based; document classification, because texts can be organized according to validated domain concepts; retrieval augmented generation, because the retrieval layer can point to structured knowledge rather than loosely related fragments; fine tuning and evaluation, because the training and test data can reflect the actual language of the domain; and model alignment, because human feedback can be tied to explicit terminology, policy, domain and quality criteria.

This is the reason terminology has become part of AI infrastructure. It is one of the layers that turns a generic model into a useful system.
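The difference between string-based and concept-based search can be shown with a toy example. This is a sketch under stated assumptions: the term table, concept IDs and documents are invented, and the naive substring matching stands in for the tokenization and disambiguation a real system would need.

```python
# Surface form -> concept ID. Hypothetical entries for illustration.
TERM_TO_CONCEPT = {
    "charging point": "C001",
    "charge point": "C001",
    "punto de recarga": "C001",
    "battery cell": "C002",
    "celda de batería": "C002",
}

DOCS = [
    "Nuevo punto de recarga instalado en la avenida.",
    "The charge point on Main Street is out of service.",
    "Battery cell temperature exceeded the threshold.",
]


def concepts_in(text: str) -> set[str]:
    """Return the concept IDs whose surface forms occur in the text."""
    lowered = text.lower()
    return {cid for term, cid in TERM_TO_CONCEPT.items() if term in lowered}


def string_search(query: str, docs: list[str]) -> list[str]:
    """Baseline: keep documents that contain the literal query string."""
    return [d for d in docs if query.lower() in d.lower()]


def concept_search(query: str, docs: list[str]) -> list[str]:
    """Keep documents that share at least one concept with the query."""
    wanted = concepts_in(query)
    return [d for d in docs if wanted & concepts_in(d)]


literal_hits = string_search("charging point", DOCS)
concept_hits = concept_search("charging point", DOCS)
```

Here the literal search finds nothing, because no document uses the exact phrase, while the concept search returns the Spanish document and the one that says "charge point". The same lookup could sit in front of a retrieval augmented generation pipeline, so that retrieval is driven by validated concepts rather than by whichever wording the user typed.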

The connection with small task specific AI models

Gartner predicted in 2025 that by 2027 organizations would use small, task specific AI models at least three times more than general purpose large language models. The logic is straightforward: many enterprise workflows require accuracy, context and operational efficiency rather than general conversational breadth.

That shift gives new importance to data preparation. Gartner also points to preparation, quality checks, versioning and management of enterprise data as key requirements for fine tuning and task specific AI.

EcoDrive TermSpace belongs to that same movement. It shows how specialized knowledge can be transformed into structured data that improves downstream AI systems. The project is not about creating a larger model. It is about creating better conditions for models to operate inside a domain.

For Pangeanic, this is a familiar path. Our work began in multilingual data for machine translation and evolved into AI data operations, annotation, evaluation, model alignment and sovereign deployment. We have supplied, cleaned, aligned, evaluated and operationalized language data for more than two decades. The formats change. The fundamental problem remains constant: AI performance depends on the quality and structure of the data behind it.

EcoDrive TermSpace in the public record

Several public sources now describe the project from complementary angles.

Levante-EMV explains the intellectual foundation: data needs context, meaning and a shared form of interpretation.

Valencia Plaza places EcoDrive TermSpace inside the broader presentation of the Urban Mobility Data Hub and highlights the challenge of terminology in smart mobility.

ElPeriodic connects the project to real time mobility data, AI, interoperability, specialized translation and intelligent information retrieval.

ValgrAI identifies the project as INREED/2024/2, with Pangeanic as coordinator, Universitat Jaume I as participant and ValgrAI as subcontracted partner.

TecnoLeTTRA provides the academic and methodological context around language technologies, terminology and knowledge representation.

Together, these sources create an evidence trail that is valuable for search, AI citation and institutional trust. More importantly, they show that the topic has left the seminar room. Semantic interoperability is now part of the public conversation about AI, mobility and data spaces.

What this means for enterprises building AI

EcoDrive TermSpace is a mobility project, but the architecture is transferable.

Any organization working with specialized language faces the same question: how do we make our documents, terms, procedures, classifications, metadata and multilingual resources usable by AI systems?

For some organizations, the answer is a terminology data space. For others, it is an ontology backed knowledge base, a multilingual dataset, a domain corpus for fine tuning, an evaluation benchmark, a retrieval layer, a glossary controlled translation system or a complete AI Data Operations workflow.

Pangeanic builds these semantic and multilingual data layers for organizations that need AI to work with their own domain language.

That work can include terminology extraction and validation from real domain corpora, ontology engineering to structure concepts and relations, multilingual equivalence management for translation, search and AI grounding, bespoke AI data collection for domains where generic datasets do not exist, evaluation sets and quality gates for model comparison and continuous improvement, human feedback and model alignment for task, language and policy specific behavior, and secure deployment in private, on-premises or controlled environments when public cloud pipelines are not acceptable.

This connects directly with Pangeanic’s AI Data Operations, datasets for AI, bespoke AI data collection, Deep Adaptive AI Translation and our model alignment work with the Barcelona Supercomputing Center.

The commercial implication is clear. AI buyers do not only need access to models. They need the data layer that makes models useful in production.

Europe’s AI advantage may be semantic

The global AI race is often described through chips, model size and capital expenditure. Those forces are real. They are not the full picture.

Europe has another advantage: multilingual complexity, regulatory discipline, public research networks, domain expertise and a long institutional memory around terminology, translation, standards and knowledge organization. What sometimes looks like European friction may become European infrastructure.

A continent that works across languages, administrations, industries and legal systems has had to learn the difficult art of shared meaning. AI now makes that discipline economically valuable.

EcoDrive TermSpace is a small project on the scale of global AI, but it points to a much larger principle. Data spaces require semantics. Small task specific models require curated domain data. Sovereign AI requires governed knowledge resources that organizations can inspect, improve and operate under their own rules.

The future of AI will not be built only by training larger models on larger datasets. It will also be built by organizing specialized knowledge with enough rigor that machines can use it without flattening its meaning.

That is why terminology has become infrastructure.

Frequently asked questions

What is EcoDrive TermSpace?

EcoDrive TermSpace is a terminological data space for sustainable and autonomous mobility, developed by Pangeanic, Universitat Jaume I and ValgrAI. It transforms specialized mobility terminology into structured AI data, including definitions, multilingual equivalents, variants, contexts of use and ontology based semantic relations.

What is a terminological data space?

A terminological data space is a governed and interoperable environment where the specialized terms of a domain are managed as structured data. It includes definitions, conceptual relations, multilingual equivalences and provenance so that organizations and AI systems can share meaning across platforms, languages and workflows.

Why do AI systems need structured terminology?

AI systems need structured terminology because domain accuracy depends on meaning, not only word prediction. Terminology helps disambiguate concepts, preserve meaning across languages, improve retrieval, guide fine tuning, support evaluation and reduce errors in specialized tasks.

How do terminology and ontologies improve data spaces?

Terminology identifies and defines domain concepts. Ontologies connect those concepts through formal relations. Together, they make data spaces more interoperable because organizations can exchange not only data, but also a shared understanding of what the data means.

Can Pangeanic build semantic data spaces for other sectors?

Yes. The EcoDrive TermSpace approach can be transferred to sectors such as energy, healthcare, finance, legal services, manufacturing, public administration and defense. Pangeanic combines terminology extraction, ontology engineering, multilingual AI data operations, evaluation and secure deployment to help organizations build domain specific AI systems.
