Explainable AI (XAI)


The idea that computers should be able to explain their results has existed for as long as AI research itself. It especially took hold in the 1980s, when Terry Winograd and Fernando Flores discussed the problems with explanations and transparency in computer systems [5]. During this period, AI technology could explain its reasoning for an output, whether for diagnostic, instructional, or educational purposes.

As with many things, time and expertise bring advancements. AI technology has become increasingly sophisticated with the introduction of machine learning and deep learning, both subfields of AI. These models use algorithms to estimate outcomes without being explicitly programmed to do so. In other words, they can learn from the given data and produce an output with limited human intervention and expertise.

As a result, the output of sophisticated AI technology often cannot be explained by the AI itself, an external source, or even the system’s developer [6]. This has put the output of AI in a gray area when it comes to trustworthiness. These concerns grow as AI becomes more widespread despite its potential to be biased, leading to demands for transparency.

The field has devised various methods to tackle transparency concerns in AI, ranging from establishing an association dedicated to research and advocacy to, most notably, advancing Explainable AI (XAI).

XAI aims to shift from the black box model of AI to a glass box, and it does so through two approaches. The first is transparency design, where the system explains how the AI model functions from the developer’s perspective, using tools such as decision trees and regression analyses. The second is post-hoc explanation, where the system explains why a particular result was inferred from the user’s perspective, through data visualizations and statements of analysis. Together, these efforts aim to increase overall trust in AI technology.
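To make the two approaches concrete, here is a minimal sketch in Python. It is not taken from any particular XAI system. The tree, its feature names (income, debt_ratio), its thresholds, and the function predict_with_explanation are all hypothetical and chosen only for illustration. The tree itself stands in for transparency design, since every rule can be read directly, and the recorded list of steps stands in for a post-hoc explanation of one specific result.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    feature: str
    threshold: float
    low: Union["Node", Leaf]   # branch taken when value <= threshold
    high: Union["Node", Leaf]  # branch taken when value > threshold


def predict_with_explanation(tree, case):
    """Walk the tree and record every rule applied along the way."""
    steps = []
    node = tree
    while isinstance(node, Node):
        value = case[node.feature]
        if value <= node.threshold:
            steps.append(f"{node.feature} = {value} <= {node.threshold}")
            node = node.low
        else:
            steps.append(f"{node.feature} = {value} > {node.threshold}")
            node = node.high
    return node.label, steps


# Hypothetical screening tree; features and thresholds are made up.
TREE = Node(
    feature="income", threshold=40000,
    low=Leaf("decline"),
    high=Node(feature="debt_ratio", threshold=0.4,
              low=Leaf("approve"), high=Leaf("review")),
)

label, steps = predict_with_explanation(
    TREE, {"income": 52000, "debt_ratio": 0.55}
)
print("Result:", label)
for step in steps:
    print(" because", step)
```

A user who disagrees with the result can point to the exact rule they want to challenge, which is the kind of intervention a black box does not allow.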


As businesses continue to tap into AI’s potential for various applications, XAI shows tremendous promise. Something as simple as being able to explain its processes enhances the potential of AI. Here are some examples:

1. Increase trust and credibility
Despite the widespread use of AI technology, public trust in it seems to be following the opposite trajectory [7]. Low trust in AI limits our ability to make the most of this technology. But because XAI can demystify the process behind AI-generated results, we can counter this effect. Being able to understand how something works contributes to people’s ability to trust it.

2. Mitigate bias
AI technology is not perfect. The data it is trained on or given can cause AI to inadvertently perpetuate bias. Laying out how AI has arrived at a result allows us to dissect its decision-making processes and catch its flaws. By identifying a problematic aspect of its decision-making, we can come up with solutions to fix it.

3. Facilitate collaboration between humans and AI
Humans are no strangers to collaboration. We have been doing it for centuries. But we don’t just collaborate to foster harmony or a sense of community; collaboration can also lead to better outcomes [8]. XAI invites this kind of human intervention. The synergy provides the chance to combine human creativity, capability, and intuition with the power of AI. We may end up with something greater when we combine each other’s strengths.

The Human Element

XAI has the power to increase people’s trust in AI technology by simply turning the black box into a glass box. But what psychological factors influence people’s acceptance of AI explanations? First, let’s talk about why we may be resistant to AI in the first place.

1. Emotionlessness of AI
We have an innate pull to understand the world around us, including non-human entities. We often engage in anthropomorphism, ascribing human qualities to non-humans. We do this especially with AI technology, given its cognitive similarity to humans. Despite anthropomorphizing AI, we are still aware of its clear inability to emote. And because emotion plays a role in subjective decisions, like opinions and intuition, we are hesitant to rely on AI when it comes to emotional decision-making.

We view AI as an unfeeling, rational machine, so its output may reflect what we think but not what we feel. This ultimately makes us resistant to AI-driven results, especially when the question asked involves complex emotions [9].

2. Perception that AI is rigid [10]
We don’t view ourselves as inflexible beings. We can adapt and change, and we believe in our ability to learn from our mistakes and move past our flaws. But we don’t always view AI as having learning capabilities. This may be influenced by how computers functioned in the past, when simple, non-adaptive algorithms kept them straight-laced. Research shows that people prefer the label “machine learning” to “algorithm” because it pushes the computer’s capability to learn to the forefront [11]. This suggests that the perception of AI as rigid may contribute to human resistance to it.

How can we overcome these factors? It turns out that building a sense of autonomy in the user plays a major role.

The feeling of autonomy is a fundamental human motive [12]. Maladaptive behaviors can arise when we feel that we don’t have control. The fact that AI technology can function without human intervention creates a sense of discomfort in us. For example, in one survey, 76% of Americans said that self-driving cars make them feel less safe on the road [13]. So, AI technology that restores a sense of control in human users can encourage acceptance. This may be one of the reasons why people accept AI explanations: their transparency allows us to challenge them. Our ability to intervene creates a feeling of autonomy, which may contribute to our acceptance of AI explanations.


The transparency provided by XAI seems to help foster trust among users. But there is a long-standing assumption that the higher an AI model’s explainability, the lower its prediction accuracy. This is known as the accuracy-explainability trade-off. Let’s dissect it.

A black box AI model is considered a black box because its decision-making processes are not interpretable to humans. It is uninterpretable not only because humans cannot see how the model makes decisions but also because the processes are so complex. Its predictions can be informed by billions of parameters. The information overload theory tells us that humans can only understand AI models with up to seven parameters [14]. So, the more complex the AI model, the more incomprehensible it is to us. However, those billions of parameters are also assumed to make the model more accurate. When we make an AI model explainable, the assumption goes, we sacrifice predictive accuracy because the model has to be less complex for it to be explained. Hence the accuracy-explainability trade-off.

Interestingly, emerging evidence calls the existence of this trade-off into question. Researchers conducted a large-scale analysis comparing the performance of black box and glass box (less complex) AI models on an array of datasets. They found that black box and glass box models produced similarly accurate results for nearly 70% of the datasets [15]. It appears that, more often than not, there isn’t a trade-off. So, having an explainable AI model doesn’t necessarily sacrifice accuracy.
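As a rough illustration of how such a comparison can be summarized, the Python sketch below counts the share of datasets on which a glass box model comes within a chosen tolerance of the black box model’s accuracy. The function name share_similarly_accurate, the tolerance value, and every accuracy figure in it are assumptions made up for this example. They are not the study’s data, and the source does not say how the researchers defined “similarly accurate.”

```python
def share_similarly_accurate(results, tolerance=0.02):
    """Return the fraction of datasets where the glass box accuracy is
    at most `tolerance` below the black box accuracy, plus their names.

    `results` maps a dataset name to (black_box_acc, glass_box_acc).
    """
    if not results:
        raise ValueError("results must not be empty")
    similar = [
        name
        for name, (black_box, glass_box) in results.items()
        if black_box - glass_box <= tolerance
    ]
    return len(similar) / len(results), similar


# Placeholder accuracies invented for illustration; NOT from the study.
placeholder_results = {
    "dataset_a": (0.91, 0.90),
    "dataset_b": (0.84, 0.85),
    "dataset_c": (0.95, 0.88),
    "dataset_d": (0.78, 0.77),
}

share, similar = share_similarly_accurate(placeholder_results)
print(f"{share:.0%} of datasets show no meaningful trade-off: {similar}")
```

The tolerance is the key design choice here: a stricter threshold makes the trade-off look more common, and a looser one makes it look rarer, so any figure like this one should be read alongside the threshold used.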

Case Study

The Defense Advanced Research Projects Agency (DARPA) developed an XAI program that aims to create glass box AI models. These AI systems invite human intervention by promoting explainability while maintaining high predictive accuracy. On top of being able to explain their decision-making, the new systems created under the program can also identify their strengths and weaknesses. The systems would also have a human-computer interface capable of translating their decision-making into understandable and useful language for the end user. That way, rather than feeling confused about an AI model’s prediction, the end user can feel more confident.

Related TDL Content

AI algorithms at work: How to use AI to help overcome historical biases
We mentioned that AI technology is not without its flaws. Like humans, it also has the potential to perpetuate biases. This has become one of the reasons people have trouble trusting AI. As it turns out, AI itself can be used to mitigate the problem of biased AI. Read more about it in TDL’s article.

Humans and AI: Rivals or Romance?
XAI makes transparent the complex decision-making processes of AI models. While this helps to promote people’s trust in AI-generated results, it also opens doors to the potential of AI and humans working together, symbiotically. This TDL article highlights just how powerful integrating humans and machines can be. Rather than viewing AI as a threat, we should realize how collaboration can help us achieve new heights.


  1. IBM. (n.d.). What is explainable AI? IBM.com. https://www.ibm.com/topics/explainable-ai
  2. DARPA. (2018). Explainable Artificial Intelligence (XAI). Darpa.mil. https://www.darpa.mil/program/explainable-artificial-intelligence
  3. McKendrick, J. (2021, August 30). Artificial Intelligence’s Biggest Stumbling Block: Trust. Forbes. https://www.forbes.com/sites/joemckendrick/2021/08/30/artificial-intelligences-biggest-stumbling-block-trust/?sh=ec053067cb35
  4. Edwards, L., & Veale, M. (2017). Slave to the Algorithm? Why a “Right to Explanation” Is Probably Not the Remedy You Are Looking for. SSRN Electronic Journal, 16(18). https://doi.org/10.2139/ssrn.2972855
  5. Winograd, T., & Flores, F. (2008). Understanding Computers and Cognition: A New Foundation for Design. Addison-Wesley.
  6. Xu, F., Uszkoreit, H., Du, Y., Fan, W., Zhao, D., & Zhu, J. (2019). Explainable AI: A Brief Survey on History, Research Areas, Approaches and Challenges. Natural Language Processing and Chinese Computing, 11839, 563–574. https://doi.org/10.1007/978-3-030-32236-6_51
  7. Marr, B. (2024, March 19). As AI Expands, Public Trust Seems To Be Falling. Forbes. https://www.forbes.com/sites/bernardmarr/2024/03/19/is-the-public-losing-trust-in-ai/?sh=6ae12d9e29b2
  8. Yoon, S. W., Matsui, M., Yamada, T., & Nof, S. Y. (2009). Analysis of effectiveness and benefits of collaboration modes with information- and knowledge-sharing. Journal of Intelligent Manufacturing, 22(1), 101–112. https://doi.org/10.1007/s10845-009-0282-x
  9. De Freitas, J., Agarwal, S., Schmitt, B., & Haslam, N. (2023). Psychological factors underlying attitudes toward AI tools. Nature Human Behaviour, 7(11), 1845–1854. https://doi.org/10.1038/s41562-023-01734-2
  10. See 9.
  11. Reich, T., Kaju, A., & Maglio, S. (2022). How to overcome algorithm aversion: Learning from mistakes. Journal of Consumer Psychology, 33. https://doi.org/10.1002/jcpy.1313
  12. See 9.
  13. Brennan, R., & Sachon, L. (2022, September 14). Self-driving cars make 76% of Americans feel less safe on the road. Policygenius. https://www.policygenius.com/auto-insurance/self-driving-cars-survey-2022/
  14. Candelon, F., Evgeniou, T., & Martens, D. (2023, May 12). AI Can Be Both Accurate and Transparent. Harvard Business Review. https://hbr.org/2023/05/ai-can-be-both-accurate-and-transparent
  15. See 14. 

About the Author

Samantha Lau

Samantha graduated from the University of Toronto, majoring in psychology and criminology. During her undergraduate degree, she studied how mindfulness meditation impacted human memory which sparked her interest in cognition. Samantha is curious about the way behavioural science impacts design, particularly in the UX field. As she works to make behavioural science more accessible with The Decision Lab, she is preparing to start her Master of Behavioural and Decision Sciences degree at the University of Pennsylvania. In her free time, you can catch her at a concert or in a dance studio.
