
Data Governance for AI

The AI Act's section on 'Data and data governance' requires specific provisions for managing training data, validation data, and test data in AI systems. This concept is distinct from general data protection and deserves its own topic to capture AI-specific data governance requirements including data quality, documentation, and management practices.

Keywords: data governance, data management, data quality, data provenance, data lineage, data documentation, data standards, data policies

Overview

Legal Framework

The AI Act's data governance obligations are established in Recitals 67 and 69. These recitals frame the legal principles, with detailed requirements expected in the corresponding operative articles of the Act. Recital 67 mandates that high-risk AI systems, particularly those using training models, must be developed using high-quality data sets for training, validation, and testing. This requires providers to implement appropriate data governance and management practices. The objective is to ensure the system performs safely and as intended, and to prevent outcomes that constitute discrimination prohibited by Union law.

Recital 69 explicitly integrates data protection principles, stating that the right to privacy and data protection must be guaranteed throughout the AI system's lifecycle. It confirms the applicability of data minimisation and data protection by design and by default under the GDPR when personal data is processed.

Practical Application

The AI Act's data governance concept is distinct from general data protection; it focuses specifically on the quality and integrity of data used to build and evaluate AI models. As indicated by the commentary from Tekst & Commentaar on the GDPR, the protection of fundamental rights like privacy is a foundational principle. The Peter Puškár case reinforces that all processing must comply with core data quality principles. For AI governance, this means organizations must implement technical and organizational measures to ensure training, validation, and test data sets are relevant, representative, and free of errors that could lead to biased or unsafe outcomes. Recital 69 suggests measures like anonymisation, encryption, and algorithmic transparency tools can be part of a compliant governance framework, aligning with the GDPR's requirement for a lawful basis and a balancing of interests as seen in the Google Spain ruling.
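The measures Recital 69 points to can be made concrete with a small sketch. The snippet below shows one hypothetical way to combine data minimisation and pseudonymisation before a record enters a training dataset; the field names, the choice of a keyed hash, and the key handling are illustrative assumptions, not anything prescribed by the AI Act or the GDPR.

```python
import hashlib
import hmac

# Hypothetical field classification; these names are illustrative only.
DIRECT_IDENTIFIERS = {"name", "email", "iban"}   # pseudonymise before use
UNNEEDED_FIELDS = {"phone", "address"}           # drop entirely (data minimisation)

def pseudonymise(value: str, secret_key: bytes) -> str:
    """Keyed hash: the mapping cannot be reversed without the secret key."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def prepare_training_record(record: dict, secret_key: bytes) -> dict:
    """Minimise and pseudonymise a record before it joins a training set."""
    out = {}
    for field, value in record.items():
        if field in UNNEEDED_FIELDS:
            continue  # never store what the model does not need
        if field in DIRECT_IDENTIFIERS:
            out[field] = pseudonymise(str(value), secret_key)
        else:
            out[field] = value
    return out
```

In practice the secret key would live in a key-management system, and the field classification would come from a documented data inventory rather than hard-coded sets.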

Key Considerations

  • Establish AI-Specific Data Protocols: Implement documented procedures for data collection, labelling, cleaning, and segmentation specifically for AI development datasets (training, validation, testing), going beyond general data protection compliance.
  • Integrate Data Protection by Design: From the initial design phase, apply data minimisation (e.g., using synthetic data or anonymised datasets where possible) and build in technical measures to preserve privacy throughout the AI lifecycle, as mandated by the interplay of the AI Act and GDPR.
  • Document Data Provenance and Quality: Maintain records demonstrating the relevance, representativeness, and quality of your datasets to evidence compliance with the AI Act's requirement for "high-quality data" and to support any required conformity assessments.
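The documentation duty in the last bullet can be supported by a lightweight, machine-readable provenance record per dataset split. The sketch below is one possible structure; the schema and field names are our own assumptions, as the AI Act does not prescribe a particular format.

```python
from dataclasses import dataclass, field
from datetime import date

# Illustrative provenance record; the AI Act does not mandate this schema.
@dataclass
class DatasetRecord:
    name: str
    split: str                     # "training", "validation", or "test"
    source: str                    # where the data was collected from
    collected_on: date
    labelling_procedure: str       # how labels were produced and reviewed
    known_gaps: list = field(default_factory=list)   # documented representativeness gaps
    quality_checks: dict = field(default_factory=dict)

    def is_release_ready(self) -> bool:
        """Ready only if at least one quality check was run and all passed."""
        return bool(self.quality_checks) and all(self.quality_checks.values())

# Usage: document a hypothetical validation split and its checks.
record = DatasetRecord(
    name="claims-2024",
    split="validation",
    source="internal claims database (anonymised extract)",
    collected_on=date(2024, 3, 1),
    labelling_procedure="dual annotation with adjudication",
    known_gaps=["under-represents customers aged 18-25"],
    quality_checks={"duplicates_removed": True, "label_audit_passed": True},
)
```

Keeping such records alongside the datasets themselves gives auditors and conformity assessors a traceable account of relevance, representativeness, and known limitations.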

Laws (10)

Case Law (6)

GC and Others v CNIL

C-136/17 (GC and Others)

Conditions for delisting sensitive data from search results.

Peter Nowak v Data Protection Commissioner

C-434/16 (Nowak)

Examination scripts constitute personal data of the candidate.

Peter Puškár v Finančné riaditeľstvo Slovenskej republiky and Kriminálny úrad finančnej správy

Puškár

Lawful basis (in general): Subject to the exceptions permitted under Article 13 of the Data Protection Directive, all processing of personal data must comply, first, with the principles relating to data quality in Article 6 of that directive and, second, have a lawful basis by satisfying one of the criteria for making data processing legitimate listed in Article 7 of that directive (see Bara). The list of lawful bases in Article 7 is an exhaustive and restrictive list of the cases in which the processing of personal data can be regarded as lawful.

Google Spain SL v. AEPD (the DPA) & Mario Costeja González, 13 May 2014 ("Google v. Spain")

Google Spain

Legitimate interest balancing test: Legitimate interest requires balancing the interests of the controller and third parties against the interests of the data subject. In this particular case, having regard to the sensitivity, for the data subject's private life, of the information contained in the announcements, and to the fact that the initial publication had occurred 16 years earlier, the data subject established that the links should be removed. (¶¶ 70–75, 80–81, 98)

Google Spain SL and Google Inc. v AEPD and Mario Costeja González

C-131/12 (Google Spain)

Established the right to be forgotten (delisting). Search engines are data controllers.

Digital Rights Ireland Ltd v Minister for Communications

C-293/12 (Digital Rights Ireland)

Invalidated Data Retention Directive as incompatible with fundamental rights.

Guidance (10)



Guidelines 05/2020 on consent under Regulation 2016/679

Guidelines on consent

Guidelines 03/2022 on Deceptive design patterns in social media platform interfaces: how to recognise and avoid them

Guidelines on deceptive design patterns in social media platform interfaces: how to recognise and avoid them

These Guidelines offer practical recommendations to social media providers as controllers of social media, designers and users of social media platforms on how to assess and avoid so-called 'deceptive design patterns' in social media interfaces that infringe on GDPR requirements. To this end, the EDPB recommends that controllers make use of interdisciplinary teams, consisting, among others, of designers, data protection officers and decision-makers. It is important to note ...

Guidelines 8/2020 on the targeting of social media users

Guidelines on the targeting of social media users

Guidelines 05/2022 on the use of facial recognition technology in the area of law enforcement

Guidelines on the use of facial recognition technology in the area of law enforcement

More and more law enforcement authorities (LEAs) apply or intend to apply facial recognition technology (FRT). It may be used to authenticate or to identify a person and can be applied on videos (e.g. CCTV) or photographs. It may be used for various purposes, including to search for persons in police watch lists or to monitor a person's movements in the public space. FRT is built on the processing of biometric data, therefore, it encompasses the processing of special categories ...

Guidelines 3/2019 on the processing of personal data through video devices

guidelines on camera surveillance

Guidelines 07/2020 on the concepts of 'controller' and 'processor' in the GDPR

guidelines on the concepts of 'controller' and 'processor' in the GDPR

The concepts of 'controller', 'joint controller' and 'processor' play a crucial role in the application of the General Data Protection Regulation (GDPR, Regulation (EU) 2016/679), since they determine who is responsible for compliance with the various data protection rules and how data subjects can exercise their rights in practice. The precise meaning of these concepts and the criteria for their correct ...


Guidelines 01/2021

Enforcement (2)

GENERALI ESPAÑA, SOCIEDAD ANONIMA DE SEGUROS Y REASEGUROS: Insufficient technical and organisational measures to ensure information security

€4,000,000 fine - Spanish Data Protection Authority (aepd)

The Spanish DPA has imposed a fine on GENERALI ESPAÑA, SOCIEDAD ANONIMA DE SEGUROS Y REASEGUROS. The controller had suffered a data breach where unknown third parties gained access to the customer data management system using credentials of a broker which allowed them to access customer data such as name, IBAN, personal identification number. The incident affected approximately 1.5 million individuals. During its investigation, the DPA found, in particular, that the controller had failed to impl

Digi Távközlési Szolgáltató Kft. ('Digi') (electronic communication service provider): Insufficient technical and organisational measures to ensure information security

€288,000 fine - Hungarian National Authority for Data Protection and the Freedom of Information (NAIH)

The company had infringed the principles of purpose limitation and storage restriction because its database contained a large amount of customer data which were no longer relevant for the actual purpose of collection and for which no retention period had been set. Furthermore, the NAIH pointed out that the defendant had not taken proportionate measures to reduce the risks in the area of data management and data security, arguing, inter alia, that it had not used encryption mechanisms.

News (9)

Is the AI Act caging ChatGPT and other General Purpose Artificial Intelligence systems?

> The growth of generative artificial intelligence systems has led EU lawmakers to focus on General Purpose AI in drafting the AI Act, which will set the framework governing artificial intelligence in the European Union. As previously reported, the EU Parliament has already broadened the definition of artificial intelligence for the purposes of the AI Act…

Hunton summarises two articles from the new SCCs: the 'local laws and government access' section

Under Clause 14 of the Data Transfer SCCs, the data importer must carry out a transfer risk assessment to verify whether the laws and practices of the receiving third country could prevent the data importer from complying with the Data Transfer SCCs. If the risk assessment shows that the Data Transfer SCCs alone will not ensure an essentially equivalent level of protection for the personal data in the receiving third country, supplementary safeguards will need to be implemented, such as end-to-end encryption.


Self-Sovereignty for Refugees? The Contested Horizons of Digital Identity

Self-sovereign identity is an embryonic technology with uncertain benefits for refugees. It may help to empower them, but it also risks reinforcing the power of states and corporations over them, according to this article.



DLA Piper: Who’s who under the DMA, DSA, DGA and Data Act?

> As part of its data strategy, the European Commission has presented a number of legislative instruments, including the Digital Markets Act (DMA), the Digital Services Act (DSA), the Data Governance Act (DGA) and the Data Act. Our article analyses these four new instruments in more detail.

EU data governance legislation definitively adopted

The new data governance regulation sets out the conditions for the reuse of certain government data. In addition, the regulation provides a notification and oversight framework for the provision of data mediation services. Furthermore, the regulation contains a framework for the voluntary registration of entities that collect and process data made available for altruistic purposes. The rules will apply from September 2023.

Data Protection Officer or Chief Privacy Officer? The rise of the Data Protection Officer

> Do we need a Chief Privacy Officer, a Data Protection Officer, or do we need both? In the following article, I will examine the benefits of both roles, but I will also look at some of the challenges associated with each role and why these have impelled both Data Protection Officers and organisations to question what the ideal setup is for them.