Explainable AI (XAI)
The Basic Idea
Think about how you interact with generative AI programs like ChatGPT: you give it input, and it produces thoughtful, relevant responses based on its training.
But sometimes, how AI reaches its output is ambiguous. In certain cases, it’s actually impossible to understand (sometimes even for those who designed the AI program). This is known as the “black box” tendency of machine learning. So, how can we trust that AI has produced accurate results if we cannot see the reasoning behind them? Explainable AI, otherwise known as XAI, aims to create transparency in this opaque aspect of AI.
Image source: What is a Black Box Model? Definition, Uses and Examples, Investopedia (2024)
XAI is the practice of making AI decision-making processes understandable and accessible to human users.1 It involves a combination of choosing interpretable models and applying explanation techniques to complex models. Explanation techniques include decision trees, model visualization, feature importance, and rule extraction. Together, these processes aim to make AI decisions transparent, understandable, and trustworthy for end-users.
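To make these techniques a little more concrete, here is a minimal sketch of a glass box model with feature importance. The library (scikit-learn), dataset, and model settings are illustrative assumptions rather than anything prescribed by XAI itself: a shallow decision tree is trained, and then asked which inputs mattered most to its predictions.

```python
# A minimal sketch of an interpretable ("glass box") model with feature
# importance, using scikit-learn and its built-in breast cancer dataset.
# The dataset and library are illustrative choices, not part of the article.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0
)

# A shallow tree keeps the model small enough for a human to inspect.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

print(f"Test accuracy: {tree.score(X_test, y_test):.2f}")

# Feature importance: which inputs most influenced the model's decisions?
importances = sorted(
    zip(data.feature_names, tree.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in importances[:5]:
    print(f"{name}: {score:.3f}")
```

Because the tree is only a few levels deep, a person can trace every prediction from the root to a leaf, which is exactly the kind of transparency XAI is after.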
Building trust and confidence in AI-powered decision-making is especially important because one of the key issues limiting AI's uptake is the public's apprehension towards it.2,3 Not understanding how an output was determined creates suspicion and concern among users.
When people cannot understand the reasoning behind an AI system’s decision, they are more likely to doubt its accuracy and reliability. This lack of transparency can hinder the adoption of AI technologies in various fields, from healthcare to finance. By addressing these concerns through improved explainability, we can foster greater acceptance and utilization of AI, ultimately unlocking its full potential.
XAI can help to propel research forward for both developers and users. By making the decision-making process more transparent, AI designers can debug and improve the performance of AI models more efficiently. Additionally, with a clearer understanding of the reasoning behind an output, users can employ the results more effectively. Think about how a better understanding of how a drug works leads to it being prescribed in a way that yields the greatest benefits. This enhanced comprehension not only improves trust but also ensures that AI outputs are applied most beneficially.
Making transparent the black box of machine learning is viewed as an example of the Right to Explanation being put into practice.4 The Right to Explanation refers to the legal and ethical requirement that individuals impacted by automated decision-making systems be provided with an explanation of how those decisions were made. This principle is particularly relevant for jurisdictions with strict data protection and privacy laws, such as the General Data Protection Regulation (GDPR) in the European Union.
Explainability is one thing; interpreting it rightly (for the good of society), is another.
— Murat Durmus, CEO of AISOMA
Key Terms
Machine Learning: A branch of AI that focuses on developing algorithms and techniques that allow computers to learn from and make predictions based on data without being explicitly programmed. Essentially, machine learning permits computers to find patterns and relationships in data and use that information to improve their performance over time.
Black box: An AI model whose operation is neither visible nor understandable to human users. The logic behind the predictions made by the model remains untraceable and unknowable. An example of a black box AI model is a deep neural network (DNN), where the complexity and interconnectivity of the layers make it difficult to interpret the decision-making process.
Glass box (or white box): An AI model that is transparent about its decision-making process and the logic that led to its predictions. Humans can examine the model's reasoning and understand why a particular prediction was made. This transparency helps in debugging, trust-building, and regulatory compliance.
Algorithm: A set of instructions or a sequence of computational steps designed to perform a specific task.
Right to Explanation: A regulation that grants individuals the right to be provided with an explanation of the output generated by a computer algorithm. This right ensures that users can understand how and why a particular decision was made, which is crucial for accountability and trust in automated systems.
Decision trees: Supervised machine learning models used for both classification and regression tasks. They model decisions and their possible consequences as a tree-like structure.
Model visualization: Creating visual representations of machine learning models and their operations to help understand how decisions are made. This can include visualizing the structure of a model (like a neural network's layers), the weights and activations in the model, or even the transformations that data undergoes as it passes through the model.
Feature importance: This technique identifies the importance of each input variable in the output of a predictive model. This shows users what was most influential in determining the AI system’s response.
Rule extraction: A method used to derive human-understandable rules from complex models. These rules are intended to approximate the decision-making process of the model in a simpler, more interpretable form.
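As a hedged illustration of rule extraction, the sketch below uses a common approach sometimes called a global surrogate: a complex model is approximated by a shallow decision tree trained on the complex model's own predictions, and that tree is printed as human-readable rules. scikit-learn and the iris dataset are assumed here purely for demonstration.

```python
# A minimal sketch of rule extraction via a "global surrogate": a complex
# model (random forest) is approximated by a shallow decision tree trained
# on the complex model's own predictions, and the tree is printed as rules.
# Library and dataset choices are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
X, y = data.data, data.target

# The "black box": an ensemble of hundreds of trees is hard to read directly.
black_box = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

# The surrogate learns to imitate the black box's predictions, not the labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# How faithful is the approximation?
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"Surrogate agrees with the black box on {fidelity:.0%} of inputs")

# Human-readable rules approximating the black box's behaviour.
print(export_text(surrogate, feature_names=list(data.feature_names)))
```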
History
The idea that computers should be able to explain their results has been around for as long as AI systems have been researched. It especially took hold in the 1980s, when Terry Winograd and Fernando Flores discussed the problems with explanations and transparency in computer systems.5 During this period, the AI technology of the time could explain its reasoning for an output, whether for diagnostic, instructional, or educational purposes.
Like many things, AI has advanced with time and expertise. The technology has become increasingly sophisticated with the introduction of machine learning and deep learning, two subfields of AI. These models use algorithms to estimate outcomes without being explicitly programmed to do so. In other words, they can learn from the given data and produce an output with limited human intervention and expertise.
Given this, the output of sophisticated AI technology is often unexplainable, whether by the AI itself, an external source, or the system’s developer.6 This has put the output of AI in a gray area when it comes to trustworthiness. These concerns grow as AI becomes more widespread despite its potential to be biased, leading to demands for transparency.
The field has devised various methods to tackle transparency concerns in AI, ranging from establishing associations dedicated to research and advocacy to, most notably, the advancement of Explainable AI.
XAI aims to shift AI from a black box model to a glass box, and it does so through two approaches. The first is transparency design, where the system explains how the AI model functions from the developer's perspective, using tools like decision trees and regression analyses. The second is post-hoc explanation, where the system explains why a particular result was inferred from the perspective of a user, through data visualizations and statements of analyses. Together, these efforts aim to increase overall trust in AI technology.
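For a concrete (and assumed) example of the post-hoc approach, the sketch below applies permutation importance: after a black box model is trained, each input feature is shuffled in turn, and the drop in held-out accuracy tells the user what the model was relying on. The choice of scikit-learn, the wine dataset, and gradient boosting is illustrative, not part of the XAI framework described above.

```python
# A minimal sketch of a post-hoc explanation: permutation importance measures
# how much a trained black-box model's accuracy drops when each feature is
# shuffled, giving the user a ranking of what the prediction relied on.
# scikit-learn and the wine dataset are illustrative assumptions.
from sklearn.datasets import load_wine
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0
)

# Train a model whose internals are hard to inspect directly.
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Explain it after the fact, from the user's side: shuffle each feature and
# see how much held-out accuracy suffers.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranked = sorted(
    zip(data.feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, drop in ranked[:5]:
    print(f"{name}: accuracy drop {drop:.3f}")
```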
Consequences
As businesses continue to tap into AI’s potential for various applications, XAI shows tremendous promise. Something as simple as being able to explain its processes enhances the potential of AI. Here are some examples:
1. Increase trust and credibility
Despite widespread usage of AI technology, public trust in it seems to be following the opposite trajectory.7 Little trust in AI stalls our ability to maximize the powers of this technology. But because XAI can demystify the process behind AI-generated results, it can counter this effect. Being able to understand how something works contributes to people’s ability to trust it.
2. Mitigate bias
AI technology is not perfect. The data it is trained on or given can cause AI to inadvertently perpetuate bias. Laying out how AI has come to a result allows us to dissect its decision-making processes and catch its flaws. By identifying a problematic aspect of its decision-making, we can come up with solutions to fix it.
3. Facilitate collaboration between humans and AI
Humans are no strangers to collaboration. We have been doing it for centuries, and not just to foster harmony or a sense of community: working together tends to produce better outcomes than working alone.8 XAI invites this kind of human intervention. The synergy provides the chance to combine human creativity, capability, and intuition with the power of AI. We may end up with something greater when we fuse our strengths with AI’s.
The Human Element
XAI has the power to increase people’s trust in AI technology by simply turning the black box into a glass box. But what psychological factors influence people’s acceptance of AI explanations? First, let’s talk about why we may be resistant to AI in the first place.
1. Emotionlessness of AI
We have an innate pull to understand the world around us—including non-human entities. We often engage in anthropomorphism, ascribing human qualities to non-humans. We do this especially with AI technology, given its cognitive similarity to humans. Despite anthropomorphizing AI, we are still aware of its clear inability to emote. And because emotion plays a role in subjective decisions, like opinions and intuition, we are hesitant to rely on AI when it comes to emotional decision-making.
We view AI as unfeeling, rational machines, so its output may reflect what we think but not what we feel. This ultimately makes us resistant to AI-driven results, especially when the question asked involves complex emotions.9
2. Perception that AI is rigid10
We don’t view ourselves as inflexible beings. We can adapt and change, and we believe in our ability to learn from our mistakes and move past our flaws. But we don’t always view AI as having learning capabilities. This may be influenced by how computers functioned in the past: their simple, non-adaptive algorithms made them seem rigid. Research shows that the label “machine learning” elicits greater preference than “algorithm” because the computer’s capability to learn is pushed to the forefront.11 This suggests that the perception of AI rigidness may contribute to human resistance to it.
How can we overcome these factors? It turns out that building a sense of autonomy in the user is a major influence.
The feeling of autonomy is a fundamental human motive.12 Maladaptive behaviors arise when we feel that we don’t have control, and the fact that AI technology can function without human intervention creates a sense of discomfort in us. For example, 76% of Americans feel that their safety is at risk when riding in cars with self-driving features.13 So, AI technology that restores a sense of control to human users can encourage acceptance. This may be one of the reasons why people accept AI explanations: their transparency makes it possible for us to challenge them, and that ability to intervene creates a feeling of autonomy.
Controversies
The transparency provided by XAI seems to help foster trust amongst users. But there is a historical assumption that the higher the AI model’s explainability, the lower its prediction accuracy. This is known as the accuracy-explainability trade-off. And with that, let’s dissect it.
A black box AI model is considered a black box because its decision-making processes are not interpretable to humans. It is uninterpretable not only because humans cannot see how the model makes decisions but also because those processes are so complex: its predictions are informed by billions of parameters. Information overload theory tells us that humans can only understand AI models with up to seven parameters.14 So, the more complex the AI model, the more incomprehensible it is to us. However, those billions of parameters are also what make black box models more accurate. When we introduce explainability to an AI model, we sacrifice its predictive accuracy, because the model has to be less complex for it to be explained; hence the accuracy-explainability trade-off.
Interestingly, emerging evidence seems to debunk the existence of this trade-off. Researchers conducted a large-scale analysis comparing the performance of black box and glass box (less complex) AI models on an array of datasets. They found that black box and glass box models produced similarly accurate results for nearly 70% of the datasets.15 It appears that, more often than not, there isn’t a trade-off: having an explainable AI model doesn’t necessarily sacrifice accuracy.
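A rough sense of how such a comparison works can be sketched in a few lines. The models, dataset, and metric below are assumptions for illustration and are not the ones used in the cited study: a black box ensemble and a simple glass box model are scored with cross-validation on the same data.

```python
# A minimal sketch of the comparison described above: cross-validated accuracy
# of a black-box model versus a simpler glass-box model on one dataset.
# Models, dataset, and metric are illustrative assumptions, not the ones used
# in the cited study.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

models = {
    "black box (random forest)": RandomForestClassifier(n_estimators=300, random_state=0),
    "glass box (logistic regression)": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

# On many datasets the gap between the two turns out to be small or absent.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```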
Case Study
DARPA XAI Program
The Defense Advanced Research Projects Agency (DARPA) developed an XAI program that aims to create glass box AI models. These AI systems invite human intervention by promoting explainability while maintaining high predictive accuracy. On top of being able to explain their decision-making, the new systems created under the program can also identify their strengths and weaknesses. The systems would also have a human-computer interface capable of translating their decision-making into understandable and useful language for the end user. That way, rather than feeling confused about an AI model’s prediction, the end user can feel more confident.
Related TDL Content
AI algorithms at work: How to use AI to help overcome historical biases
We mentioned that AI technology is not without its flaws. Like humans, it also has the potential to perpetuate biases, which has become one of the reasons people have trouble trusting AI. As it turns out, AI itself can be utilized to mitigate this problem of biased AI. Read more about it in TDL’s article.
Humans and AI: Rivals or Romance?
XAI makes transparent the complex decision-making processes of AI models. While this helps to promote people’s trust in AI-generated results, it also opens doors to the potential of AI and humans working together, symbiotically. This TDL article highlights just how powerful integrating humans and machines can be. Rather than viewing AI as a threat, we should realize how collaboration can help us achieve new heights.
References
- IBM. (n.d.). What is explainable AI? IBM.com. https://www.ibm.com/topics/explainable-ai
- DARPA. (2018). Explainable Artificial Intelligence (XAI). Darpa.mil. https://www.darpa.mil/program/explainable-artificial-intelligence
- McKendrick, J. (2021, August 30). Artificial Intelligence’s Biggest Stumbling Block: Trust. Forbes. https://www.forbes.com/sites/joemckendrick/2021/08/30/artificial-intelligences-biggest-stumbling-block-trust/?sh=ec053067cb35
- Edwards, L., & Veale, M. (2017). Slave to the Algorithm? Why a “Right to Explanation” Is Probably Not the Remedy You Are Looking for. SSRN Electronic Journal, 16(18). https://doi.org/10.2139/ssrn.2972855
- Winograd, T., & Flores, F. (2008). Understanding Computers and Cognition: A New Foundation for Design. Addison-Wesley.
- Xu, F., Uszkoreit, H., Du, Y., Fan, W., Zhao, D., & Zhu, J. (2019). Explainable AI: A Brief Survey on History, Research Areas, Approaches and Challenges. Natural Language Processing and Chinese Computing, 11839, 563–574. https://doi.org/10.1007/978-3-030-32236-6_51
- Marr, B. (2024, March 19). As AI Expands, Public Trust Seems To Be Falling. Forbes. https://www.forbes.com/sites/bernardmarr/2024/03/19/is-the-public-losing-trust-in-ai/?sh=6ae12d9e29b2
- Yoon, S. W., Matsui, M., Yamada, T., & Nof, S. Y. (2009). Analysis of effectiveness and benefits of collaboration modes with information- and knowledge-sharing. Journal of Intelligent Manufacturing, 22(1), 101–112. https://doi.org/10.1007/s10845-009-0282-x
- De Freitas, J., Agarwal, S., Schmitt, B., & Haslam, N. (2023). Psychological factors underlying attitudes toward AI tools. Nature Human Behaviour, 7(11), 1845–1854. https://doi.org/10.1038/s41562-023-01734-2
- See 9.
- Reich, T., Kaju, A., & Maglio, S. (2022). How to overcome algorithm aversion: Learning from mistakes. Journal of Consumer Psychology, 33. https://doi.org/10.1002/jcpy.1313
- See 9.
- Brennan, R., & Sachon, L. (2022, September 14). Self-driving cars make 76% of Americans feel less safe on the road. Policygenius. https://www.policygenius.com/auto-insurance/self-driving-cars-survey-2022/
- Candelon, F., Evgeniou, T., & Martens, D. (2023, May 12). AI Can Be Both Accurate and Transparent. Harvard Business Review. https://hbr.org/2023/05/ai-can-be-both-accurate-and-transparent
- See 14.
About the Author
Samantha Lau
Samantha graduated from the University of Toronto, majoring in psychology and criminology. During her undergraduate degree, she studied how mindfulness meditation impacted human memory which sparked her interest in cognition. Samantha is curious about the way behavioural science impacts design, particularly in the UX field. As she works to make behavioural science more accessible with The Decision Lab, she is preparing to start her Master of Behavioural and Decision Sciences degree at the University of Pennsylvania. In her free time, you can catch her at a concert or in a dance studio.