Artificial intelligence: guidance from the Privacy Guarantor on protecting personal data against web scraping


The Privacy Guarantor has published guidance on protecting personal data that public and private entities publish online as data controllers against web scraping, that is, the indiscriminate collection of personal data on the Internet by third parties for the purpose of training generative artificial intelligence models. The document takes into account the contributions the Authority received during the fact-finding inquiry approved last December.

Pending a ruling on the legality of web scraping of personal data based on legitimate interest, which will follow the outcome of several investigations already under way, including the one against OpenAI, the Authority deemed it necessary to give those who publish personal data online as data controllers some initial guidance on assessing whether to adopt appropriate measures to prevent, or at least hinder, web scraping.

In the document, the Authority proposes some concrete measures: creating reserved areas, accessible only after registration, to remove data from public availability; including anti-scraping clauses in the sites' terms of service; monitoring traffic to web pages to identify any abnormal flows of incoming and outgoing data; and specific interventions on bots, including the technological solutions made available by the same companies responsible for web scraping (for example, editing the robots.txt file).
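Two of these measures, the robots.txt intervention and traffic monitoring, can be illustrated with a minimal Python sketch. It is not taken from the Authority's document: the robots.txt content, the example.org URLs, the helper names `is_allowed` and `flag_heavy_clients`, and the request threshold are all assumptions chosen for illustration, and "GPTBot" stands in for whatever user-agent token a given crawler operator documents. Compliance with robots.txt is voluntary on the crawler's side, so this kind of rule can hinder scraping but not guarantee against it.

```python
from collections import Counter
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named crawler everywhere and keeps
# every other bot out of an assumed reserved area under /members/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /members/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def flag_heavy_clients(client_ids, threshold):
    """Return the clients that appear more than `threshold` times.

    `client_ids` is one entry per logged request (for example an IP
    address); the threshold is an assumption to be tuned per site.
    """
    counts = Counter(client_ids)
    return sorted(client for client, n in counts.items() if n > threshold)


if __name__ == "__main__":
    print(is_allowed("GPTBot", "https://example.org/articles/1"))
    print(is_allowed("Mozilla/5.0", "https://example.org/members/profile"))
    print(flag_heavy_clients(["10.0.0.1"] * 5 + ["10.0.0.2"], threshold=3))
```

A check like `is_allowed` only verifies what the published rules say; the traffic count is a deliberately crude stand-in for the monitoring of abnormal flows, which in practice would also consider time windows and request patterns.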

These measures are not mandatory. Data controllers will have to assess, based on the principle of accountability, whether to implement them to selectively prevent or mitigate the effects of web scraping, taking into account factors such as the state of the art in technology and implementation costs, especially for SMEs.

The provision, available on the website www.gpdp.it, is published in the Official Journal.

Source: Privacy Guarantor


