Strengthening research TDM for the AI era

This policy paper analyses the practical functioning of Article 3 of the CDSM Directive, which grants research organisations (ROs) and cultural heritage institutions (CHIs) a mandatory exception permitting text-and-data mining (TDM) for scientific research. Using doctrinal analysis and qualitative interviews with ROs, CHIs, publishers, and experts, the study evaluates how the exception operates in practice and identifies barriers relevant to the Directive’s 2026 revision.

Despite Article 3’s goal of facilitating innovation, key concepts remain unclear, above all the meaning of “lawful access.” Stakeholders report uncertainty regarding whether subscription-based access, freely available online content, or licensed digital collections meet this requirement. This ambiguity, coupled with inconsistent national implementation, weakens the exception’s harmonising purpose and discourages researchers from undertaking TDM.

Interviews reveal broad support for the research exception but highlight systemic obstacles. Contractual practices, often shaped by unequal bargaining power, frequently restrict or indirectly impede TDM despite Article 7’s prohibition on contractual override. These include explicit TDM bans, technical access limitations, and requirements to delete datasets needed for research reproducibility. A general lack of understanding of the legal framework among both beneficiaries and rightholders compounds these barriers.

The rise of generative AI intensifies tensions: publishers fear Article 3 enables unremunerated use of copyrighted works for AI training. The paper finds these concerns largely misplaced, as Article 3 applies only to non-commercial scientific research by narrowly defined beneficiaries and does not permit downstream commercial exploitation. Uses relevant to commercial AI development instead fall under Article 4 or general copyright rules. Misconceptions nonetheless influence restrictive licensing behaviour.

The study concludes that Article 3’s limitations stem primarily from legal uncertainty and inconsistent practice rather than from the exception’s substantive design. To ensure effective, legally secure research TDM, the paper recommends: (1) clarifying the definition of lawful access; (2) strengthening enforcement mechanisms against contractual circumvention; (3) issuing guidance clarifying the boundary between Articles 3 and 4, especially for AI-related research; and (4) establishing transparent, ongoing consultation structures involving ROs, CHIs, researchers, and technical experts.

These reforms would enhance legal certainty, support responsible AI-era research, and reinforce the Directive’s role in enabling scientific progress across the EU.