EU Commission Releases Template for Mandatory AI Training Data Disclosure Under AI Act

The European Union has mandated that providers of Artificial Intelligence (AI) models disclose detailed summaries of their training data, with the European Commission recently unveiling a new template to facilitate compliance. This move, part of the comprehensive EU AI Act, aims to enhance transparency in the rapidly evolving AI landscape. The decision, as noted by HackerNoon, has "spark[ed] debate on bureaucracy, company secrets, and enforcement."

The EU AI Act, which officially entered into force on August 1, 2024, stands as the world's first legal framework regulating AI, employing a risk-based approach. It introduces specific transparency obligations for General-Purpose AI (GPAI) models, which are capable of performing a wide range of tasks and serve as the foundation for many AI systems. The Act seeks to ensure that AI systems deployed within the EU are safe, transparent, traceable, non-discriminatory, and environmentally friendly.

Under Article 53(1)(d) of the Act, providers of GPAI models are now required to make publicly available a "sufficiently detailed summary" of the data utilized for training their models. This includes listing the main data collections or sets that contributed to the training process, encompassing both large private and public databases. The new template, presented by the Commission on July 24, 2025, is designed to guide providers through the entire data lifecycle, from pre-training to fine-tuning.

This new mandate has ignited considerable discussion, particularly concerning the delicate balance between fostering transparency and safeguarding proprietary trade secrets. Critics point to potential bureaucratic hurdles and challenges in effective enforcement, while proponents underscore the need for accountability and the ability to scrutinize how training data shapes model behavior. In finalizing the template, the EU's AI Office acknowledged the complexity of balancing these diverse stakeholder interests.

This regulatory step signifies a pivotal shift towards greater accountability for AI developers and providers operating within the EU market. While high-risk AI systems have a longer compliance timeline, the transparency obligations for GPAI models began applying on August 2, 2025. The initiative is expected to reshape how AI models are developed and deployed, ultimately fostering a more transparent and trustworthy AI ecosystem across Europe.