The European Data Protection Board (‘EDPB’) coordinates the work of regulators investigating ChatGPT’s compliance with the GDPR across Europe, via the ‘ChatGPT Taskforce’ set up in April 2023. Most dramatically, the Italian regulator temporarily banned access to ChatGPT for breaches of the GDPR, before reinstating access in mid-2023 once the company had implemented changes to satisfy that regulator (such as a form enabling a data subject to request removal of their personal data). Its investigation nevertheless continued, concluding in early 2024 that “the available evidence pointed to the existence of GDPR breaches” in relation to the mass collection of users’ data to train the bot’s algorithm and the exposure of younger users to inappropriate content generated by the bot.
The EDPB’s Taskforce has now published its ‘preliminary views’ on certain aspects of the regulators’ continuing investigations. The report’s consideration of the thorny issue of the ‘legal basis’ for ‘scraping’ (mass collection of personal data) from the Internet is particularly important for anyone interested in the direction of travel for AI.
OpenAI (the company behind ChatGPT) put forward Article 6(1)(f) GDPR (‘legitimate interests’) as a legal basis for scraping, which involves balancing the controller’s interests against the fundamental rights and freedoms of data subjects. The Taskforce opined that the provision of ‘adequate safeguards’ may reduce the undue impact on data subjects, tipping the balance in the controller’s favour. In the context of ChatGPT, such safeguards could include “defining precise collection criteria and ensuring that certain data categories are not collected or that certain sources (such as public social media profiles) are excluded from data collection. Furthermore, measures should be in place to delete or anonymise personal data that has been collected via web scraping before the training stage” (para 17). The safeguards could involve automated filtering of ‘special category’ personal data, which is subject to a higher bar for establishing a legal basis. The Taskforce made clear that the mere fact that someone’s personal data is ‘publicly accessible’ does not imply that “the data subject has manifestly made such data public” (as required by Article 9(2)(e), a basis for processing special category data).
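For technically minded readers, the kinds of safeguard the Taskforce describes (excluding certain sources, and removing likely special category data before the training stage) can be pictured as a filtering step in a scraping pipeline. The sketch below, in Python, is purely illustrative: the domain list and keyword patterns are hypothetical, and nothing this crude would satisfy a regulator on its own.

```python
import re

# Hypothetical exclusion list, standing in for "certain sources
# (such as public social media profiles)" -- illustrative only.
EXCLUDED_DOMAINS = {"facebook.com", "instagram.com"}

# Crude, hypothetical markers for Article 9 'special category' data
# (health, religion, trade union membership, etc.).
SPECIAL_CATEGORY_PATTERNS = [
    re.compile(r"\b(diagnos(is|ed)|medical record)\b", re.IGNORECASE),
    re.compile(r"\b(religio(n|us)|trade union)\b", re.IGNORECASE),
]

def passes_safeguards(url: str, text: str) -> bool:
    """Return True only if a scraped record clears both safeguards."""
    domain = url.split("/")[2] if "://" in url else url
    if any(domain.endswith(d) for d in EXCLUDED_DOMAINS):
        return False  # excluded source: do not collect at all
    if any(p.search(text) for p in SPECIAL_CATEGORY_PATTERNS):
        return False  # likely special category data: drop before training
    return True

# Filter a batch of scraped records before the training stage.
records = [
    ("https://example.com/article", "A review of open-source compilers."),
    ("https://facebook.com/someone", "Holiday photos and personal updates."),
    ("https://example.org/post", "The author was diagnosed with asthma."),
]
kept = [(url, text) for url, text in records if passes_safeguards(url, text)]
# Only the first record survives: the second fails the source check,
# the third the special-category check.
```

The point of the sketch is the ordering the Taskforce implies: exclusion and deletion happen before the data ever reaches the training stage, rather than being remedied afterwards.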
The report has also effectively ruled out AI companies shifting responsibility for avoiding misleading, discriminatory or unnecessarily detrimental bot output onto users as data subjects. A crucial aspect of the fairness duty is that there should be no “risk transfer” to data subjects, for example via terms and conditions stating that users are responsible for their chat inputs (from which the bot will then learn and return results to other users).
ChatGPT and other AI bots dependent on ‘scraping’ are increasingly embedded in industry and everyday life. It is not surprising that the EDPB is (albeit ‘preliminarily’) creating a route by which AI companies can operate within the GDPR framework, by encouraging them to implement greater safeguards, rather than simply declaring that there is no adequate legal basis for scraping at all. The EDPB will be mindful that the AI Act, while some way off being enforceable, will take some of the risk-management burden off the GDPR’s shoulders in the EU (but, crucially, not in the UK). The Taskforce’s final analysis, once released, will not be binding on the UK regulator or courts, but will carry persuasive weight, given that the UK GDPR mirrors the European data protection framework.
A monthly data protection bulletin from the barristers at 5 Essex Chambers
The Data Brief is edited by Francesca Whitelaw KC, Aaron Moss and John Goss, barristers at 5 Essex Chambers, with contributions from the whole information law, data protection and AI Team.