An overview of the EU Proposal for an Artificial Intelligence Act
The European Commission recently released its proposal for a regulation laying down harmonized rules on Artificial Intelligence. After reviewing it, I have tried to summarize its contents: what is considered an AI system, how systems are classified, which measures apply to them, and finally my opinion on the proposal.
First we need to start with a definition:
‘artificial intelligence system’ (AI system) means software that is developed with one or more of the techniques and approaches listed in Annex I and can, for a given set of human-defined objectives, generate outputs such as content, predictions, recommendations, or decisions influencing the environments they interact with;
– Title I, Article 3
This includes not only machine learning but also other, more "traditional" techniques:
- Machine learning approaches, including supervised, unsupervised and reinforcement learning, using a wide variety of methods including deep learning
- Logic- and knowledge-based approaches, including knowledge representation, inductive (logic) programming, knowledge bases, inference and deductive engines, (symbolic) reasoning and expert systems
- Statistical approaches, Bayesian estimation, search and optimization methods
Regulated AI practices
The regulation divides AI systems, depending on their purpose and practices, into three categories:
- Prohibited
- High Risk
- Others
Prohibited
Title II, Article 5 specifies in its first paragraph the list of forbidden artificial intelligence practices.
The first two depend on whether the system can "materially distort a person’s behaviour in a manner that causes or is likely to cause that person or another person physical or psychological harm", either by deploying subliminal techniques beyond a person's consciousness or by exploiting any of the vulnerabilities of a specific group of persons. This prohibition applies to all players in the EU market.
On the other hand, the prohibition on the last two practices is limited to public authorities and law enforcement:
- Social scoring systems that can result in a detrimental or unfavorable treatment of persons or groups
- "Real-time" remote biometric identification systems in publicly accessible spaces unless there is "a prior authorization granted by a judicial authority or by an independent administrative authority"
High-risk AI systems
Article 6 in Chapter 1 of Title III specifies that AI systems intended to be used as a safety component of a product, as well as stand-alone AI systems in the areas included in Annex III of the regulation, are considered high-risk AI systems. Paragraph 24 of the proposal limits the scope to those that have "a significant harmful impact on the health, safety and fundamental rights of persons in the Union".
The list, which can be updated in the future, covers the following areas:
- Biometric identification and categorization of natural persons
- Management and operation of critical infrastructure: road traffic, water, gas, heating and electricity
- Education and vocational training: determining access and assessing students
- Employment, workers management and access to self-employment: recruitment and making decisions on promotion and termination of employees
- Access to and enjoyment of essential private services and public services and benefits: public assistance benefits, credit scoring (except for small providers) and dispatching of emergency first response services
- Law enforcement: risk assessment and profiling of persons, and pattern detection in crime analytics
- Migration, asylum and border control management: verification of the authenticity of travel documents, examination of applications for asylum, visas or residence permits, and risk assessment of individuals
- Administration of justice and democratic processes: assistance to a judicial authority in researching and interpreting facts and the law and in applying the law to a concrete set of facts
Keep in mind that this regulation considers biometric data as:
[...] personal data resulting from specific technical processing relating to the physical, physiological or behavioural characteristics of a natural person, which allow or confirm the unique identification of that natural person, such as facial images or dactyloscopic data.
This is a broad definition that could also include browsing history (1), voice profiles, tattoos, manner of walking (2), and data obtained through smartwatches and health apps.
Measures for high risk AI systems
One of the first requirements stated in Title III, Chapter 2, Article 9 is the need to have a risk management system in place that contains updated information about the known and foreseeable risks the system might pose under normal use and under misuse, as well as the mitigations implemented or proposed to eliminate or reduce them. The same article also puts emphasis on the testing procedures.
On the other hand, Article 10 focuses on the importance of data and its governance: from design choices of features to the examination of possible biases. One of the key aspects is that training, validation and testing data shall be relevant and representative as regards the persons or groups on which the system is intended to be used, and shall take into account the characteristics particular to them. For the specific purpose of bias monitoring, the providers of those AI systems may process special categories of personal data.
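The proposal does not prescribe how such an examination should be performed, so the following is only a minimal sketch of what a per-group bias check could look like; the metric, data and group labels are my own illustrative assumptions, not anything mandated by the regulation.

```python
# Minimal sketch of a per-group error-rate check for bias monitoring.
# All data and group labels below are illustrative; the regulation does not
# prescribe any particular metric or implementation.
from collections import defaultdict

def per_group_error_rate(y_true, y_pred, groups):
    """Return the error rate of the predictions for each group label."""
    errors, totals = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        if truth != pred:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Toy example: a large gap between groups would be a signal to re-examine
# the training data, the chosen features and the labelling process.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(per_group_error_rate(y_true, y_pred, groups))  # {'a': 0.25, 'b': 0.5}
```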
The following articles address the need for traceability and logging capabilities (although the details are quite vague), interpretability of the system's output, and instructions for use. Special attention is needed for cybersecurity and the monitoring of feedback loops to avoid practices like (training) data poisoning.
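Because the proposal leaves the logging details open, here is only a sketch of one possible approach to traceability: keeping a structured record per prediction (timestamp, model version, inputs and output) so that individual decisions can be reconstructed later. The model identifier and fields are hypothetical.

```python
# Sketch of per-prediction audit logging for traceability; the record fields
# and the model identifier are hypothetical, not mandated by the regulation.
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(filename="predictions.log", level=logging.INFO,
                    format="%(message)s")

def log_prediction(model_version, features, output):
    """Append one structured, timestamped record per prediction."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "features": features,
        "output": output,
    }
    logging.info(json.dumps(record))

log_prediction("credit-scoring-1.4.2", {"income": 32000, "age": 41}, "approved")
```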
Finally, Chapter 2 addresses the need for human oversight of all high-risk AI systems to prevent or minimize the impact of the risks. The interface tools shall allow individuals to:
- Monitor its operations so that signs of anomalies can be detected and addressed as soon as possible
- Remain aware of the tendency to over-rely on the output produced by a high-risk AI system ('automation bias')
- Correctly interpret its output
- Decide not to use the AI system and be able to override or reverse its output
- Interrupt the operation of the system through a "stop button"
Additionally, for AI systems used to identify natural persons (biometrics), two individuals must be involved in taking any action or decision on the basis of the identification resulting from the system. The system can thus be used as an aid for identification, but any decision needs to be made by two separate persons who take responsibility for it.
To comply with the Chapter 2 requirements, harmonized standards shall be published in the Official Journal of the European Union, but they are not in place yet as far as I have seen. Part of those standards will be technical documentation containing at least (Annex IV):
- The methods and steps performed for the development
- Design specifications of the system, key design choices and assumptions made
- Description of the system's architecture and overall integration with other systems
- Provenance and characteristics of the data sets, and how they were obtained, selected, labeled and cleaned
- Oversight measures needed
- Pre-determined routine changes
- Validation and testing procedures used, including the metrics used to measure accuracy, robustness and cybersecurity
- Detailed information about the monitoring, functioning and control of the AI system
- A description of any change made to the system through its life cycle
- A post-market monitoring system and plan to monitor and evaluate performance in this phase (Title VIII, Article 61)
So as not to hinder the development of innovative AI systems, the proposal establishes AI regulatory sandboxes (Title V) that facilitate development, testing and validation for a limited time. All systems participating in a sandbox are placed under the direct supervision and guidance of the competent authorities to ensure compliance with the requirements of the regulation, since the sandbox allows further processing of personal data for developing certain AI systems in the public interest.
Approved systems will obtain the CE marking and will be published in a public EU database for stand-alone high-risk AI systems (Title VII, Article 60), containing information such as the details of the provider, a description of the system's purpose, status information and electronic instructions for use (see Annex VIII).
Finally, it expresses the need to process special categories of personal data in order to enhance the capability to monitor, detect and correct bias in AI systems. Although it might seem counter-intuitive, researchers and companies have seen the need to have and use sensitive data related to gender, ethnicity and other sensitive categories in order to uncover proxy variables, problems in datasets, algorithms, assumptions and all the other forms of algorithmic bias.
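As an illustration of why that sensitive data is needed, here is a minimal sketch (with hypothetical column names and data) showing that an apparently neutral feature can only be identified as a proxy for a sensitive attribute when that attribute is actually available:

```python
# Sketch of a proxy-variable check: how strongly does each value of an
# apparently neutral feature concentrate one sensitive group? Column names
# and data are hypothetical.
from collections import Counter, defaultdict

def group_shares(feature_values, sensitive_values):
    """For each feature value, return the share of each sensitive group."""
    counts = defaultdict(Counter)
    for feature, sensitive in zip(feature_values, sensitive_values):
        counts[feature][sensitive] += 1
    return {feature: {group: n / sum(counter.values())
                      for group, n in counter.items()}
            for feature, counter in counts.items()}

postal_code_group = ["north", "north", "north", "south", "south", "south"]
gender            = ["f",     "f",     "f",     "m",     "m",     "f"]
print(group_shares(postal_code_group, gender))
# {'north': {'f': 1.0}, 'south': {'m': 0.66..., 'f': 0.33...}}
# "north" maps almost perfectly onto one group, so the feature is a likely
# proxy and deserves closer examination; this is impossible to detect
# without access to the sensitive attribute itself.
```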
Fines for not complying with those requirements can reach up to 30 000 000 EUR or 6% of the offender's total worldwide annual turnover, whichever is higher.
All the other algorithms
For all the other AI systems, neither prohibited nor classified as high risk, the aforementioned rules do not apply, although the regulation encourages the creation of codes of conduct intended to voluntarily adopt the requirements set out in Title III, Chapter 2 as a standard within companies.
The only exception is certain AI systems such as deep fakes, chatbots and emotion recognition systems, regulated in Title IV, Article 52, with extra measures related to informing natural persons that they are interacting with an AI system.
Opinion
While the narrow scope of forbidden AI practices might make sense in a traditional structure in which states are the most powerful entities, it falls short in a new era in which big corporations are de facto more powerful and years ahead of traditional governing bodies. Well known is not only the monopolistic nature of those companies, but also their tactics of seeking financial incentives when deciding where to install their factories or warehouses, playing states against each other to see which one offers the more beneficial deal.
As sociologist Shoshana Zuboff argues in her book "The Age of Surveillance Capitalism" (6), these are the companies that conquer the rights that were previously ours and declare that our experience is now their possession. As part of an ever-expanding strategy, they implement scoring of individuals and real-time biometric identification that can be deployed in public spaces: for example, a Ring camera installed by an individual in front of their door facing the street, in their car as a dash camera, or even inside a grocery store. These practices are already happening, companies want to make them seamless to us (as part of the IoT), and they are the ones that can affect our lives in the future and need to be regulated (5).
It is irrelevant whether this regulation forbids states and their bodies from using real-time biometric surveillance in publicly accessible spaces if private companies can then sell them the data they gathered around our homes, which is even more sensitive, abusive and useful to them. So this limitation does not even seem useful to limit and avoid damage done to individuals by states, since they end up with an even more powerful, uncontrolled and unregulated tool.
Focusing on one particular identification method misconstrues the nature of the surveillance society we’re in the process of building. Ubiquitous mass surveillance is increasingly the norm. In countries like China, a surveillance infrastructure is being built by the government for social control. In countries like the United States, it’s being built by corporations in order to influence our buying behavior, and is incidentally used by the government.
– Bruce Schneier, (5)
It is unclear what happens to algorithms that are part of a broader system, for example a heart-attack detection algorithm shipped with a smartwatch. Is it a safety component of the device? Or is it independent from it? It seems to me that the answer to both questions is no. Thus, it might not be regulated under this proposal and simply remain in a legal limbo.
Furthermore, one of the criteria used when evaluating whether an AI system is classified as high risk takes into account the feasibility of an individual opting out from the outcome of the system. While not a lot can be done about systems where you cannot opt out for legal reasons, the inclusion of "practical reasons" in that sentence is worrisome. With this, a narrative is set in which AI systems are inevitable and you are always in. The only right you may have is to opt out, but if companies state that opting out is too difficult, then you lose that right. This paves the way for the conquest pattern: companies make merely cosmetic adjustments to their systems, but fundamentally nothing changes.
In a human-centric view, and following the example of the GDPR, humans should always be presented with the possibility to opt in rather than opt out. With the recent experience of Apple's privacy notifications, we have seen that when given the choice to opt in and a simple explanation, users tend not to accept most of the things some companies want us to take for granted. For example, fewer than 11% of Facebook and Instagram users have allowed the company to track their behavior when using iOS, according to several sources (3)(4).
The voluntary character of the measures for non-high-risk AI systems should also be addressed. Although the proposal lays down the reasoning behind that decision, the fact that the classification of a system is done via self-assessment by the provider may mean that algorithms used within an organization and not sold independently, despite having big impacts on individuals or sensitive groups, would not be registered at all, with the lack of transparency that entails. Furthermore, many companies would likely put off adopting the measures laid down in the regulation, which I consider beneficial and helpful for a fair and thoughtful development of AI systems.
Finally, I need to mention the lack of a general appeal process for decisions made by an AI system: for example, those moderating content on online platforms like Twitch or YouTube, which are, de facto, the employer of many content creators who depend on them for a living.
References
0. Proposal for a Regulation laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) – European Commission, Digital Strategy
1. Replication: Why We Still Can’t Browse in Peace: On the Uniqueness and Reidentifiability of Web Browsing Histories – Sarah Bird, Ilana Segall, and Martin Lopatka, Mozilla
2. Gait recognition – Watrix.ai (version stored in the Internet Archive)
3. The end of Facebook surveillance: 96% of iPhone users disabled app tracking on iOS 14.5 – TechTheLead (https://techthelead.com/the-end-of-facebook-surveillance-96-of-iphone-users-disabled-app-tracking-on-ios-14-5/)
4. 96% of US users opt out of app tracking in iOS 14.5, analytics find – Ars Technica
5. Modern Mass Surveillance: Identify, Correlate, Discriminate – Schneier on Security
6. The Age of Surveillance Capitalism – Shoshana Zuboff