From Small Models to Sovereign AI: Pangeanic Returns to the ValgrAI Scientific Council Forum

Written by Ainara García | 07/01/26

Sovereign AI · ValgrAI Scientific Council

Manuel Herranz will take part in VSCF 2026 with a presentation on sovereign AI, ontologies, small models and the data supply chain behind the generative economy.

July 2, 2026 · Universitat Politècnica de València

Sovereign AI begins long before the data center. It begins with the information an organization controls, the ontologies through which it structures institutional knowledge and the specialized models it can evaluate, adapt and operate under its own technical, legal and economic conditions.

Manuel Herranz, founder and CEO of Pangeanic, will take part in the ValgrAI Scientific Council Forum 2026 on Thursday, July 2, in the main auditorium of the Universitat Politècnica de València.

His presentation, entitled “Sovereign AI: Ontologies, Small Models and Tokens Are the New Coal”, will form part of an international program examining the evolution of artificial intelligence and its application across scientific, industrial and social environments.

The program includes contributions from Virginia Dignum, Hiroaki Kitano, Tom Dietterich, José María Azorín and Ramón López de Mántaras, among other international specialists. Pangeanic is also supporting the forum, contributing a perspective built through language technology, data for AI, model alignment and the deployment of AI systems in enterprise and institutional environments.

Technology continuity

A trajectory connecting models, data and knowledge

The 2026 presentation continues a line of work that Pangeanic began in language technology and has progressively extended toward specialized, governable AI systems adapted to the knowledge of individual organizations.

VSCF 2025

Small models for specific applications

Manuel Herranz examined the rise of smaller specialized models as a suitable architecture for clearly defined enterprise tasks, with lower computational requirements and greater operational control.

SAVIA project

Research and technology transfer

Collaboration with ValgrAI and Universitat Jaume I transferred this vision into applied research and helped connect academic knowledge with enterprise use.

VSCF 2026

Data, ontologies and operational sovereignty

The new presentation expands the discussion toward data control, structured knowledge and the ability to operate AI systems under an organization’s own criteria.

Specialized models

Small models for specific enterprise tasks

During the 2025 edition of the forum, Manuel Herranz discussed the growing role of small models as an architecture particularly well suited to specific enterprise applications.

Small language models and other specialized models can offer significant advantages when the problem is clearly defined: lower computational consumption, faster response times, easier adaptation and more controllable deployment within private infrastructure.

Their usefulness increases when they are trained, fine-tuned or connected to representative domain data and evaluated against precise operational criteria. A model designed to interpret legal documentation, classify files, translate technical content or assist an industrial process needs its own terminology, documents, instructions, examples, constraints and quality standards.

Pangeanic develops this layer through AI Data Operations, covering data preparation, annotation, human review, preference data, evaluation, alignment, governance and the continuous improvement of AI systems.

A shift now recognized by the market

Gartner predicts that by 2027, organizations will use small, task-specific AI models at least three times more often than general-purpose large language models.

The reasons include greater contextual precision, faster responses, lower computational costs and better adaptation to proprietary enterprise data.

Structured knowledge

Ontologies as the architecture of institutional knowledge

Organizations accumulate large volumes of documents, databases, manuals, case files, terminology and tacit knowledge. Without a structure connecting these elements, information remains fragmented and difficult for an intelligent system to use coherently.

Ontologies provide that conceptual map. They define the entities relevant to an organization, the relationships between them, their categories, properties, hierarchies, permissions and interpretation rules.

Concepts and terminology

Precise representation of legal, technical, administrative and sector-specific terminology.

Relationships and hierarchies

Connections between entities, documents, procedures, responsibilities and levels of authority.

Permissions and context

Rules governing which information can be used, who can access it and under what circumstances.

When this structure is combined with knowledge retrieval, specialized models and properly prepared data, AI can operate on a much more faithful representation of institutional reality. Sovereignty then acquires an intellectual dimension because the organization retains control over how its own knowledge is represented, classified and used.
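
As a rough illustration of the three elements above (concepts, hierarchies and permissions), an ontology can be sketched as a small data structure. This is a minimal sketch under stated assumptions, not a description of any Pangeanic product: the class names, the entity names ("Document", "Contract", "Manual") and the role names are all hypothetical, and the rule that an entity inherits permissions from its nearest ancestor is one possible design among many.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    """A concept in the ontology, with an optional parent category."""
    name: str
    parent: Optional[str] = None
    allowed_roles: set = field(default_factory=set)


class Ontology:
    """Toy ontology: entities, a hierarchy, named relations and access rules."""

    def __init__(self) -> None:
        self.entities = {}
        self.relations = []  # (subject, predicate, object) triples

    def add_entity(self, name, parent=None, allowed_roles=()):
        self.entities[name] = Entity(name, parent, set(allowed_roles))

    def relate(self, subject, predicate, obj):
        self.relations.append((subject, predicate, obj))

    def ancestors(self, name):
        """Return the chain of parent categories, nearest first."""
        chain = []
        current = self.entities[name].parent
        while current is not None:
            chain.append(current)
            current = self.entities[current].parent
        return chain

    def can_access(self, role, name):
        """Assumed rule: the nearest entity that declares roles decides."""
        for candidate in [name, *self.ancestors(name)]:
            roles = self.entities[candidate].allowed_roles
            if roles:
                return role in roles
        return False  # nothing declared anywhere in the chain: deny


# Hypothetical example data
onto = Ontology()
onto.add_entity("Document", allowed_roles={"staff", "legal"})
onto.add_entity("Contract", parent="Document", allowed_roles={"legal"})
onto.add_entity("Manual", parent="Document")
onto.add_entity("Department")
onto.relate("Contract", "owned_by", "Department")
```

In this sketch a manual inherits the general document permissions, while a contract narrows them to the legal role, which is the kind of interpretation rule a retrieval layer could check before passing content to a model.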

The AI supply chain

“Tokens are the new coal”: the fuel of the generative economy

The expression “tokens are the new coal” presents tokens as the consumable unit of the generative economy. The comparison is deliberately provocative: both fuel productive infrastructure, depend on a supply chain and concentrate value around those who control extraction, transformation and distribution.

Behind every useful token lies prior material made up of documents, conversations, images, recordings, annotations, human decisions and expert knowledge.

Data must be collected or licensed, cleaned, normalized, classified, annotated, anonymized, evaluated and connected to a specific task. Volume remains useful, although provenance, representativeness and quality largely determine the model’s subsequent behavior.

This supply chain places datasets for AI and data operations at the center of technological autonomy.
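
The preparation steps listed above can be pictured as a chain of small transformations that also records where each item came from. The following is a simplified sketch, assuming a hypothetical `Record` structure, a hypothetical source name and a deliberately naive email pattern; a production pipeline would need far more robust anonymization, classification and evaluation stages.

```python
import re
import unicodedata
from dataclasses import dataclass, field


@dataclass
class Record:
    """One item in the data supply chain, carrying its provenance."""
    text: str
    source: str    # where the data came from (hypothetical field)
    license: str   # rights of use (hypothetical field)
    steps: list = field(default_factory=list)  # audit trail of applied steps


def clean(record):
    # Collapse runs of whitespace and trim the ends
    record.text = " ".join(record.text.split())
    record.steps.append("clean")
    return record


def normalize(record):
    # Apply Unicode NFC so equivalent characters share one representation
    record.text = unicodedata.normalize("NFC", record.text)
    record.steps.append("normalize")
    return record


# Naive illustrative pattern; real anonymization covers many more entity types
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def anonymize(record):
    record.text = EMAIL.sub("[EMAIL]", record.text)
    record.steps.append("anonymize")
    return record


def run_pipeline(record, steps=(clean, normalize, anonymize)):
    """Apply each step in order and return the processed record."""
    for step in steps:
        record = step(record)
    return record


raw = Record("  Contact   ana@example.com  today ",
             source="hypothetical-crm-export", license="internal")
prepared = run_pipeline(raw)
print(prepared.text)
print(prepared.steps)
```

Keeping `source`, `license` and `steps` on every record is the design point: provenance and rights of use travel with the data instead of being reconstructed afterwards.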

Data for AI

Preparation and transformation

Collection, licensing, cleaning, normalization, structuring and preparation of data for models and AI systems.

Datasets

Training and evaluation assets

Multilingual data across text, speech, audio, image, video, OCR, evaluation and domain-specific knowledge.

AI Data Operations

Evaluation, alignment and improvement

Annotation, human oversight, preference data, evaluation, alignment and improvement cycles for production AI systems.

History and evolution

From multilingual data to the alignment of European AI models

Pangeanic began its journey by collecting, aligning and processing data for machine translation systems. That experience produced large linguistic repositories and an industrial capacity to work with multilingual data at scale.

Over time, this work expanded into datasets covering text, speech, audio, image, video, documents, instructions, evaluations and human preference signals. The technical continuity remains clear: turning dispersed information into reliable material for training, adapting, evaluating and governing intelligent systems.

Collaboration with Barcelona Supercomputing Center

Pangeanic has collaborated with Barcelona Supercomputing Center on AI data, annotation, evaluation, RLHF and the alignment of European language models including Salamandra and ALIA.

The collaboration illustrates the shift from corpus collection toward a more demanding task: defining how a model should behave across different languages, domains and situations.

Multilingual model alignment

Alignment requires instructions, examples, human preferences, evaluations, error taxonomies and expert review.

It also requires proper representation of languages and cultural contexts that remain underrepresented in large general-purpose datasets.

Technology transfer

From SAVIA to enterprise deployment

The SAVIA project connects research, linguistic knowledge and technology transfer, showing how collaboration between universities and companies can move scientific advances toward practical tools and processes.

VSCF 2025: the future of small models

Manuel Herranz’s previous presentation anticipated the growing importance of smaller specialized models for enterprise applications.

SAVIA: research and technology transfer

Collaboration with ValgrAI and Universitat Jaume I transferred the discussion of models and knowledge into applied research.

Technological autonomy

Sovereignty as operational capacity

The physical location of servers forms part of sovereignty, although the concept extends across a much broader operational chain.

An organization gains greater autonomy when it can decide which data it uses, which knowledge it incorporates, which models it selects, how results are evaluated and under what conditions each component is deployed.

Capabilities supporting sovereign AI

Control data provenance and rights of use.
Decide which knowledge enters the system.
Select the right model for each task.
Apply internal evaluation and alignment criteria.
Protect personal, sensitive and confidential information.
Deploy components within private infrastructure.
Supervise behavior through human oversight.
Replace models or providers without losing institutional knowledge.
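
Two of the capabilities above, selecting the right model for each task and replacing a provider without losing institutional knowledge, are easier to reason about when application code depends on a narrow interface instead of a specific vendor. The sketch below is an assumption-laden illustration: `TextModel`, `ModelRouter` and both stand-in model classes are hypothetical names, and the models simply echo their input so the example stays self-contained.

```python
from typing import Protocol


class TextModel(Protocol):
    """Minimal interface that every model must satisfy (hypothetical)."""

    def generate(self, prompt: str) -> str: ...


class ExternalModel:
    """Stand-in for a model reached through a third-party provider."""

    def generate(self, prompt: str) -> str:
        return f"[external] {prompt}"


class LocalSmallModel:
    """Stand-in for a small model running in private infrastructure."""

    def generate(self, prompt: str) -> str:
        return f"[local] {prompt}"


class ModelRouter:
    """Maps each task name to whichever model currently serves it."""

    def __init__(self) -> None:
        self._by_task = {}

    def register(self, task: str, model: TextModel) -> None:
        # Registering again under the same task swaps the model in place
        self._by_task[task] = model

    def run(self, task: str, prompt: str) -> str:
        return self._by_task[task].generate(prompt)


router = ModelRouter()
router.register("classify", ExternalModel())
before = router.run("classify", "case file 17")

# Swap the provider; calling code and stored knowledge stay unchanged
router.register("classify", LocalSmallModel())
after = router.run("classify", "case file 17")
```

Because callers only know the task name, the ontology, the prepared data and the evaluation criteria remain with the organization when the underlying model changes.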

Pangeanic brings these capabilities together within its sovereign AI systems offering, combining domain-specific data, specialized models, language technology, evaluation and controlled deployment options.

The ECO Intelligence Platform adds an orchestration layer for translation, knowledge retrieval, anonymization, quality evaluation, APIs and human review within governed enterprise workflows.

VSCF 2026

A European conversation about control and specialization

Manuel Herranz will bring a clear proposition to the forum: the next stage of enterprise AI will depend less on accumulating generative capacity and more on organizing knowledge, preparing data and deploying models suited to specific tasks.

Small models provide efficiency and specialization. Ontologies provide structure. Data determines what the system can learn. Evaluation and human oversight make its behavior governable.

The combination of these layers provides a more mature basis for technological sovereignty, particularly for European enterprises, public administrations and regulated sectors.

Date: Thursday, July 2, 2026

Location: Main auditorium, Universitat Politècnica de València

Presentation: Sovereign AI, ontologies, small models and tokens

AI built on your own knowledge

Build an AI strategy around your own data, models and institutional knowledge

Pangeanic helps enterprises, institutions and public administrations prepare data, adapt specialized models, structure knowledge, evaluate systems and deploy multilingual AI workflows under conditions of greater control.
