Original blog post by Yannis Tevissen, PhD Student & AI Researcher @Newsbridge
Some of you have probably already used, or perhaps use daily, applications whose algorithms rely on artificial intelligence, in particular machine and deep learning. In the field of voice recognition we can mention the very famous Siri, Alexa or Google Assistant.
But if you use these voice assistants on a regular basis, you have probably realized their limitations. For example, if you try to talk to them in a car or in a noisy room, they are often inaccurate or even completely off the mark…
From an algorithmic point of view, this is called a robustness problem.
Many speech recognition algorithms are unfortunately, and despite many years of research, still not very robust to noise or to the phenomenon of overlapping speech (when two or more people speak simultaneously).
However, these robustness problems are not exclusive to speech recognition but are common to many algorithms, especially those based on statistical approaches.
Depending on the target application and the algorithm used, this lack of robustness can be explained in several ways (non-exhaustive list):
Indeed, when one wishes to train a neural network, it is necessary to choose a balanced dataset that reflects all the cases one wishes to handle. An infamous example is that of the algorithms of the American giants Facebook and Google, which were both criticized for mislabeling photos of Black people as monkeys. Although no detailed technical explanation was given, it is easy to imagine that the neural networks of the two firms had not been trained on enough photographs of Black people.
For example, one can try to denoise an audio signal that has been degraded by noise. Beware, however: some of these methods also damage the useful information and hurt the performance of the next steps in the processing chain. This is the case with speech recognition, where applying too much pre-processing can seriously degrade recognition performance.
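This trade-off can be made concrete with a minimal sketch of a crude noise gate (a toy signal and hypothetical thresholds, not a real denoiser): set too aggressively, the gate removes the noise but also the quiet speech that the recognizer needs.

```python
import numpy as np

# Toy "speech" signal: a quiet tone plus a small amount of random noise.
t = np.linspace(0, 1, 8000)
speech = 0.3 * np.sin(2 * np.pi * 200 * t)
noise = 0.05 * np.random.default_rng(0).normal(size=t.size)
signal = speech + noise

def noise_gate(x, threshold):
    # Crude denoiser: zero every sample whose amplitude is below the threshold.
    return np.where(np.abs(x) > threshold, x, 0.0)

gentle = noise_gate(signal, 0.06)      # threshold just above the noise floor
aggressive = noise_gate(signal, 0.6)   # threshold above the speech amplitude

# The aggressive gate silences the noise, but it silences the speech with it:
# almost no samples survive, so there is nothing left to recognize.
print(np.count_nonzero(gentle), np.count_nonzero(aggressive))
```

The point is not this particular gate (real systems use spectral methods) but the shape of the trade-off: every unit of pre-processing that removes noise risks removing signal too.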
In this case, new modalities can sometimes be investigated to refine the analysis and make it more robust. For instance, to analyse the video of a conversation, say a broadcast debate, we can study not only the audio but also the image, by looking at the lip movements of the various speakers.
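A minimal sketch of this idea, using made-up per-speaker scores and a simple late-fusion rule (the speakers, scores and weights are illustrative assumptions, not values from any real system):

```python
# Each model outputs a confidence per speaker for "who is talking right now".
audio_scores = {"speaker_1": 0.55, "speaker_2": 0.45}   # noisy audio: ambiguous
visual_scores = {"speaker_1": 0.90, "speaker_2": 0.10}  # lips clearly moving

def fuse(audio, visual, w_audio=0.5, w_visual=0.5):
    # Late fusion: a weighted average of the two modalities' scores.
    return {s: w_audio * audio[s] + w_visual * visual[s] for s in audio}

fused = fuse(audio_scores, visual_scores)
print(max(fused, key=fused.get))  # the visual cue resolves the audio ambiguity
```

Here the audio alone is nearly a coin flip, but combined with the lip-movement evidence the fused score clearly favours speaker_1; that is the sense in which an extra modality buys robustness.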
To overcome the problems of robustness, it is often a question of identifying the scenarios in which the performance drops and then adopting the right algorithmic strategy to correct them.
In fact, the notion of robustness is closely linked to that of algorithmic bias, the importance of which is well known. Making an algorithm robust means understanding and removing some of its biases.
“Explainability and robustness are largely intertwined: understanding the mechanisms of a system is a standard approach to guarantee its reliability.”
Extract from a report by the JRC (Joint Research Centre) of the European Union.
In general, biases are tracked in the results obtained on a test set, but often only at the end of the analysis pipeline. A good practice is to track biases throughout the entire algorithmic architecture, from the construction of the dataset to the training of the neural networks and their inference.
Finally, when we talk about the robustness of neural networks, we also think of robustness to so-called “adversarial attacks”, which consist of deceiving a neural network by presenting it with very slightly altered information in such a way that it is interpreted as the attacker wishes. By knowing or inferring certain characteristics of the model used, the attacker can determine a combination that deceives the statistical rules on which the model is based.
In the case of an image classification network, this would mean, for example, adding a few carefully chosen pixels that are invisible to the naked eye so that the network detects a flowerpot in an image containing a traffic sign. These attacks are very powerful and can cause serious problems when used to fool biometric systems or the driving systems in autonomous vehicles.
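The mechanics can be illustrated on a toy linear "classifier" (the random weights and the fast-gradient-sign-style step below are illustrative assumptions, not an attack on any real model). For a linear score the gradient with respect to the input is just the weight vector, so perturbing each coordinate by a tiny amount in the direction of the sign of the gradient shifts the score as much as the budget allows:

```python
import numpy as np

# Toy linear "classifier": positive score -> class A, negative -> class B.
# Weights and input are random stand-ins for a trained model and an image.
rng = np.random.default_rng(0)
w = rng.normal(size=16)
x = rng.normal(size=16)

def score(v):
    return float(w @ v)

# The gradient of the score w.r.t. the input of a linear model is w itself,
# so x - eps * sign(w) moves each coordinate by at most eps (imperceptible)
# while pushing the score down as fast as possible.
eps = 0.1
x_adv = x - eps * np.sign(w)

print(score(x), score(x_adv))  # the perturbed score is strictly lower
```

Deep networks are not linear, but the same one-step recipe applied to their input gradient is surprisingly effective, which is why defences such as adversarial training are an active research topic.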
In a 2020 publication, the European Union stresses the need to make machine learning-based algorithms more explainable and robust, particularly with regard to the protection of personal data and the security of autonomous systems.
To achieve this objective, it recommends, among other things, the development of standardized tests and tools to assess the robustness of artificial intelligence algorithms.
Aurélie Jean, a French scientist with a doctorate in computational sciences and a member of the Société informatique de France, suggests in her book “Les Algorithmes font-ils la loi ?” that we should go even further and, following the example of the GDPR on the protection of personal data, impose a legal obligation of testing and explainability on certain algorithm makers.
In conclusion, tackling AI robustness and explainability issues is rapidly becoming a necessity for trustworthy AI providers seeking a certain standard of quality and performance. Building unbiased AI pipelines often requires a dedicated, well-trained team that understands the fundamentals of machine learning and can track biases throughout the algorithm.
Aurélie Jean, Les Algorithmes font-ils la loi ?, Éditions de l’Observatoire, 2021.
Wu, F., Xiao, L., Yang, W. et al., “Defense against adversarial attacks in traffic sign images identification based on 5G”, J Wireless Com Network, 2020.