
Interpretability in Machine Learning: what is it and why is it so important?

by Francesco Capuani, Data Scientist at Dataskills

Anyone reading this article is probably already familiar with the terms “Artificial Intelligence” and “Machine Learning”, as these technologies are increasingly widespread in everyday life. For non-experts, this familiarity often stops at the awareness that these algorithms can, not without a margin of error, replicate the decision-making and pattern-recognition abilities of the human mind, without it always being clear which processes and calculations lead to a certain output. But what if I told you that even the top AI experts aren’t always able to explain how a model makes its decisions?

In fact, an expert can be perfectly aware of how a particular model based on a Neural Network works, but if the network is complex and structured enough, it is far from obvious how to explain why the model prefers one output over another.

This topic, i.e. the interpretability of models, has recently been much discussed and is extremely relevant in the Artificial Intelligence community. There is in fact a class of models, the so-called “Black-box” models, whose internal decision-making process is not immediately interpretable.

WHAT IS MEANT BY INTERPRETABLE MODELS

It may seem absurd to state that the internal processes of a Machine Learning model cannot always be interpreted by an expert, considering that such a model is a product of the human mind and that the type of operations the algorithm performs internally is perfectly well known.

While building a model, data scientists have a high degree of control over it, being able to choose the input variables, the error-calculation algorithms, the number of nodes, and so on. However, looking at the sheer scale of some Machine Learning models makes it possible to understand the motivation behind the concept of “black box”.

For example, ChatGPT, the most popular intelligent chatbot of the moment, is based on a Deep Neural Network with no fewer than 175 billion parameters, a number that makes an interpretative analysis by a human being practically impossible.

It should perhaps be noted that the model behind GPT-3.5 (the engine powering ChatGPT) is probably among the most complex neural networks in history, but even if the parameters were “only” 1 billion, the situation would not change.

In fact, a neural network works by associating weights and biases with its nodes, passing the weighted inputs through an activation function to produce an output, and then adjusting those weights and biases according to the algorithm’s performance on a training set.
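
As a concrete illustration, the following minimal sketch (in Python with NumPy, a purely illustrative example rather than any specific production model) shows a single artificial neuron: inputs multiplied by weights, a bias added, an activation applied, and one gradient-style adjustment based on the error on a training example.

import numpy as np

# A single artificial neuron: weights and a bias, a sigmoid activation,
# and one gradient-descent update on a single illustrative training example.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
weights = rng.normal(size=3)    # one weight per input feature
bias = 0.0

x = np.array([0.5, -1.2, 3.0])  # illustrative input features
y = 1.0                         # illustrative target label

# Forward pass: weighted sum of inputs plus bias, passed through the activation.
output = sigmoid(np.dot(weights, x) + bias)

# Backward pass: adjust weights and bias in proportion to the error
# (gradient of the squared error for this single example).
error = output - y
gradient = error * output * (1.0 - output)
learning_rate = 0.1
weights -= learning_rate * gradient * x
bias -= learning_rate * gradient

print(f"output before update: {output:.3f}")
print(f"output after update:  {sigmoid(np.dot(weights, x) + bias):.3f}")

With three weights one can still reason about why the output moved in a given direction; with billions of such parameters spread over dozens of layers, that kind of step-by-step reasoning is no longer feasible.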

As long as the number of nodes remains relatively low, it is possible to understand why the algorithm has chosen one output rather than another. But when the nodes number in the millions, if not billions, and the neural network spans dozens of layers, it becomes impossible for a human mind to interpret the myriad interactions and calculations taking place inside the algorithm.

In other words, the fundamental question is: if we cannot determine the rules the machine is following, should we trust its evaluations?

The problem of “black-box” algorithms is extremely relevant for several reasons. There are cases where it is enough to know the input and the output, and everything in between can be blithely glossed over.

Very often in business (although this is not always the case) what matters is the result, and that it be as precise and coherent as possible, while the various intermediate decision-making steps can be ignored as superfluous.

If, on the other hand, we want to use complex algorithms in fields such as policy making or medical research and diagnosis, the situation changes drastically.

In the first case, it has been noted that algorithms can make decisions with intrinsic racial or gender biases. This does not mean that an algorithm can be racist, but that it can reproduce racially biased patterns present in the underlying training data and potentially lead to discrimination. For example, an algorithm called COMPAS has been used in the United States for over a decade to calculate the probability of recidivism, that is, how likely it is that an offender will commit another crime. It should be emphasized that in this case the final verdict is always issued by a judge and the model has only an advisory function. However, the way this algorithm has been used is questionable to say the least: it was found that, in the vast majority of cases, African Americans were twice as likely as whites to be labeled as potential repeat offenders.

Considering that the COMPAS system uses a black-box algorithm, difficult for a human being to interpret, and that it influences decisions with a strong impact on many people’s lives, it is quite evident that this is very problematic, especially looking ahead. If, in a not too distant future, we wanted to use artificial intelligence at the level of policy making as well, being able to interpret these black boxes becomes fundamentally important in order to make them readable and thus give these algorithms further legitimacy.

MODEL INTERPRETATION TECHNIQUES

There are mainly two approaches to the interpretability of Black Boxes: the first is the use of tools that can make them more readable, the second is the use of “White Box” algorithms. An example of the first approach is SHAP (SHapley Additive exPlanations), a method based on game theory (Shapley values) which aims precisely to explain the output of a Machine Learning model by attributing each prediction to its input features. The first approach seems to be the most solid, even if the resulting interpretability sometimes remains rather limited, while the second approach proposes using simpler, easily interpretable algorithms, which however tend to perform worse in terms of accuracy.
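
As an illustration of the first approach, the following sketch shows how the shap Python package can be used to explain a scikit-learn model. This is a hypothetical minimal example, not the setup used in any specific project, and the exact API may vary between shap versions.

import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Train an (otherwise hard to interpret) random forest on a toy dataset.
data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values for tree-based models: each value
# estimates how much a feature pushed a single prediction up or down
# relative to the average model output.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:50])

# Summarize the per-feature contributions across the explained samples.
shap.summary_plot(shap_values, X[:50], feature_names=data.feature_names)

The resulting plot highlights which features contributed most to the model’s predictions, even though the model itself remains a black box.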

There is no simple and clear solution to this problem. It can certainly help to choose between black-box and white-box algorithms more deliberately, depending on the type of problem and data to be analysed. There are many situations in which a simple algorithm can lead to excellent results, even if slightly inferior to those of an extremely complex neural network: in such cases, one should at least consider sacrificing part of the model’s accuracy in favor of greater interpretability. There are also situations where the use of a black-box algorithm is a necessity, such as in Image Analysis; in that case it would be good practice, first of all, to keep the number of variables, parameters, nodes and layers to a minimum, and secondly to use explanatory algorithms (such as the aforementioned SHAP) to investigate the decision-making process of the model.
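
To make the white-box alternative concrete, here is a brief illustrative sketch (again hypothetical, using scikit-learn) of a model whose decision process can be printed and read directly as a set of rules.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# A shallow decision tree: less flexible than a deep neural network,
# but its entire decision-making process fits in a few readable rules.
data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# export_text prints the learned if/else structure of the tree.
print(export_text(tree, feature_names=list(data.feature_names)))

Every prediction the tree makes can be traced back to one of these branches, which is exactly the kind of transparency a billion-parameter network cannot offer.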
