
Ethics & Policy

What explainable AI is, why it matters and how we can achieve it


AI is reshaping critical sectors of society like healthcare, finance, and justice. From diagnosing diseases and deciding loan approvals to shaping judicial outcomes, AI’s decisions can deeply affect our lives. But can we trust these systems when their inner workings remain hidden, locked away in complex computational models such as deep neural networks that humans can only perceive as opaque “black boxes”? The need for greater transparency and trustworthiness in AI is growing as these systems become widely deployed, especially in critical sectors.

Explainable AI, in contrast, utilises techniques to clarify how AI models operate, allowing us to trust, verify, and responsibly use these advanced technologies. However, explainable AI is not a perfect solution and faces its own difficulties when attempting to elucidate the machine learning models that underpin AI. The issues surrounding explainability become even more pronounced with large language models (LLMs), which are among the most popular types of AI today. 

For stakeholders such as policymakers, regulators, deployers and end-users of AI technology, it’s helpful to have a basic understanding of explainable AI concepts, methods and approaches, but also its potential and limitations. This can help form realistic demands and raise the explainability standards for AI systems used in future.

Why do we need explainability?

Modern AI can perform impressive tasks, ranging from driving cars and predicting protein folding to designing drugs and writing complex legal texts. Yet, despite these successes, AI systems often operate opaquely, making it challenging to understand or trust their outputs. This lack of transparency isn’t only inconvenient; it poses security, legal, ethical, and practical risks. For instance, an AI system that denies a loan must explain its reasoning to ensure decisions aren’t biased or arbitrary.

Especially when AI is involved in sensitive decisions, such as medical diagnoses and judicial rulings, the necessity for transparency becomes even more pronounced. Without adequate explanations, AI decisions risk violating legal rights, perpetuating biases, or leading to unintended, harmful consequences.

Decisions by AI systems must be understandable, verifiable, and accountable. An increasing number of researchers, regulators, and users recognise that without sufficient explainability, AI cannot become a trusted technology in decision-making processes.

Explainable AI addresses these needs by enabling users and decision-makers to understand the logic behind a model’s operation. It aims to bridge the gap between AI’s performance and our need for understanding its behaviour.

Understanding explainable AI

Explainable AI refers to the ability of an AI model to clearly explain its functioning in a way that humans can understand. This goes beyond technical clarity and involves several related concepts:

  • Transparency: Users can access information about the internal workings of the AI system.
  • Accountability: AI decisions can be traced, and responsibility can be clearly assigned.
  • Traceability: The ability to reconstruct decision-making processes, critical for auditing and oversight purposes.
  • Interpretability: Users can easily understand how input data leads to specific outcomes.
  • Trustworthiness: AI systems consistently produce reliable, fair, and ethically sound results.

The broader term trustworthy AI is used to describe systems that are predictable, robust, fair, ethically designed, and aligned with social and legal norms. Trust is built on understanding. If users grasp the logic of a system, they are more likely to follow its recommendations. Through explainability, AI evolves from a mysterious technology into a tool that humans can control, oversee, and trust.

It’s also worth mentioning that explainability varies depending on the audience. What a data scientist finds clear may confuse a judge or a regulator. Therefore, explainable AI must translate complex AI operations into understandable explanations tailored for specific audiences, ensuring the practical usability of AI across diverse contexts.

AI’s performance-interpretability trade-off

Many powerful AI models, such as deep neural networks, provide impressive accuracy in tasks like image recognition and natural language processing today. They identify complex patterns from vast data sets without explicit human-programmed rules. However, these advanced capabilities come at a cost: these models do not readily offer explanations for their outputs. They are notoriously difficult to interpret due to non-linear dependencies, large numbers of parameters, and multi-layered structures. As a result, the inner workings are often incomprehensible, even to experts. This combination of high performance and low interpretability thus creates a “performance-interpretability trade-off” dynamic.

This creates a dilemma: should we prefer simpler, less accurate models with transparent operations, or complex, high-performance models that remain opaque? The choice depends significantly on the application’s context. In fields where the consequences are substantial, even marginally lower accuracy may be acceptable if it significantly increases the comprehensibility and trustworthiness of an AI system’s decisions.

Approaches to explainability

Explainability approaches generally fall into two categories:

Prioritising intrinsically interpretable models

Interpretable models, such as linear and logistic regression, decision trees or other simpler rule-based systems, inherently provide transparency because their decision processes are straightforward and easily traceable. They enable users to understand directly how specific data points influence outcomes, facilitating verification and auditing. For explainability purposes, it is therefore advantageous to select these models whenever possible.
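To make the idea of direct traceability concrete, here is a minimal sketch of a logistic-regression-style scoring model. The feature names, weights and bias are hypothetical values chosen for illustration, not taken from any real system; in practice they would be learned from data. Because the model is additive, each feature's contribution to the decision can be read off directly.

```python
import math

# Hypothetical weights for a toy loan-approval model (illustrative only).
WEIGHTS = {"income_k": 0.04, "debt_ratio": -3.0, "years_employed": 0.2}
BIAS = -1.5


def explain(applicant):
    """Return the approval probability and each feature's additive
    contribution to the log-odds of approval."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    log_odds = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions


if __name__ == "__main__":
    applicant = {"income_k": 50, "debt_ratio": 0.6, "years_employed": 3}
    probability, contributions = explain(applicant)
    print(f"approval probability: {probability:.2f}")
    # List features by the size of their effect, largest first.
    for name, value in sorted(contributions.items(),
                              key=lambda kv: abs(kv[1]), reverse=True):
        print(f"{name:>15}: {value:+.2f}")
```

An auditor reading this output can see which feature pushed the decision up or down and by how much, which is the property that makes such models easy to verify.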

Applying post-hoc explanation methods

For more complex models, such as random forests and various neural networks, post-hoc explanations can clarify outputs (e.g., decisions or generated content) after they have been made, without interfering with the models themselves. These include:

  • Model-specific methods: Tailored explanations for specific model types (e.g., visualisation of decision trees).
  • Model-agnostic methods: General techniques applicable across various model types that do not require access to internal structures and rely solely on input/output analysis (e.g. permutation feature importance).
  • Local explanations: These focus on individual predictions. Methods such as SHAP or LIME provide clarity on how each input variable contributes to specific decisions.
  • Global explanations: Provide insights into overall model behaviour, identifying general rules and decision patterns across the data (for example, by aggregating SHAP or LIME explanations over many predictions); essential for evaluating fairness and regulatory compliance.

Combining these methods often yields the most effective solutions by tailoring explanations to user needs and context.
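As an illustration of a model-agnostic, global method, the sketch below implements a simple form of permutation feature importance: it shuffles one input column at a time and records how much accuracy drops. The `black_box` function is a hypothetical stand-in for an opaque model, and the data are synthetic; only the model's inputs and outputs are used.

```python
import random


def black_box(row):
    """Hypothetical opaque model: we only call it, never inspect it.
    Internally it depends on feature 0 and ignores feature 1."""
    return int(row[0] > 0.5)


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's output matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, seed=0):
    """Accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the label
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(baseline - accuracy(model, shuffled, labels))
    return importances


if __name__ == "__main__":
    data_rng = random.Random(42)
    rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
    labels = [int(r[0] > 0.5) for r in rows]
    print(permutation_importance(black_box, rows, labels))
```

Shuffling the ignored feature leaves accuracy unchanged, while shuffling the feature the model relies on degrades it. Production libraries typically repeat the shuffle several times and average the results, which this sketch omits for brevity.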

Explainable AI’s challenges

Despite its potential, explainable AI still faces substantial technological hurdles:

  • Inconsistent methods: Explainability methods themselves lack standardised evaluation, which makes it difficult to ensure reliability and reproducibility and hard to determine which explanation is “true” or most useful.
  • Handling unstructured data: Traditional methods struggle with AI systems that process complex unstructured data like text or images, where meaning arises not just from individual words but from context, syntax, and implicit references.
  • Computational limitations: Running explainability methods on large AI models, which have billions of parameters, such as LLMs, requires significant computational power, making real-time explanations challenging.
  • Effective communication: Even accurate explanations become ineffective if they are neither understandable nor actionable for the intended audience (e.g. lawyers, judges, or regulators).

The multidimensional nature of unstructured data, especially text, is an added challenge for explainability methods and creates a need for innovative approaches that remain effective.

Explaining LLMs

Today’s widely used LLMs, such as ChatGPT, Claude, or Gemini, are central to modern AI applications that generate natural language, and they present unique challenges. Although highly capable of generating coherent, context-rich content, their size and complexity make it very hard to explain why a model gave a particular answer, connected certain concepts, or excluded others. Determining why they produce certain outputs or biases is so challenging that they are often seen as the ultimate “black boxes”.

LLMs also often reflect biases present in their training data. Without explainability, it is difficult to determine whether generated content is discriminatory, incorrect, or unethical. This makes using such models risky in many contexts, such as the judicial system, where it could lead to systemic errors or human rights violations.

Explaining text-based outputs is a challenge. Text data are high-dimensional, context-sensitive, and rich in implicit meaning, and conventional explainability methods struggle to capture complex conceptual relationships. For instance, an LLM may produce an accurate legal summary, yet it remains unclear which parts of the original document were deemed relevant and which legal knowledge was implicitly applied.

Since these foundation models offer the underlying technology for the rapidly evolving field of Agentic AI, which will have widespread real-world impact, it is important to assess to what extent current explainability approaches can improve oversight of model behaviour.

To date, there are several approaches, typically categorised by how models are trained and utilised. Zhao et al. (2024) classify explainability techniques for LLMs into two paradigms: 

  • Fine-tuning: Involves adapting a pre-trained model on a smaller, domain-specific dataset to enhance task performance (e.g., legal interpretation, medical diagnostics). Here, the explanations focus on how fine-tuning affects attention structures, semantic representation, or key entities and phrases.
  • Prompting: Uses specially crafted inputs (instructions, questions, examples) without altering the model itself. Explanatory methods analyse which prompt elements influenced the response or how context shaped the outcome. Since internal representations are not accessible, these explanations often rely on changes to input/output.

Understanding behaviour in either paradigm is difficult due to the highly non-linear architecture of LLMs. Minor changes to input can lead to significant differences in the output, complicating the ability to provide stable, repeatable explanations.
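One simple way to explain a model purely through changes to its input and output is leave-one-out (occlusion) analysis: remove each token from the prompt in turn and measure how the output shifts. The sketch below is an illustration under stated assumptions, not a method described by Zhao et al.; `toy_model` is a hypothetical stand-in for an LLM call that returns a numeric score, and its lexicon and negation rule are invented for the example.

```python
def toy_model(prompt):
    """Hypothetical stand-in for an opaque model: returns a 'risk' score.
    A real LLM would be queried through an API instead."""
    lexicon = {"fraud": 0.6, "overdue": 0.3, "loan": 0.1}
    words = prompt.lower().split()
    score = sum(lexicon.get(w, 0.0) for w in words)
    # Non-linear interaction: a negation anywhere halves the whole score.
    if "no" in words:
        score *= 0.5
    return score


def token_influence(model, prompt):
    """Leave-one-out: baseline output minus the output with each token removed."""
    tokens = prompt.split()
    baseline = model(prompt)
    influences = []
    for i, token in enumerate(tokens):
        ablated = " ".join(tokens[:i] + tokens[i + 1:])
        influences.append((token, baseline - model(ablated)))
    return influences


if __name__ == "__main__":
    for token, delta in token_influence(toy_model, "loan overdue no fraud found"):
        print(f"{token:>8}: {delta:+.2f}")
```

A positive value means the token raised the score and a negative value means it lowered it. Even in this toy case the influence of each keyword depends on whether the negation is present, which hints at why stable attributions are hard to obtain for real LLMs, where every ablation also costs an additional model call.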

Other experimental methods include attention visualisation, token influence analysis, and contrastive learning to detect key variations in output. Together, these techniques form a rapidly growing research field aimed at revealing LLMs’ inner workings to support reliability, fairness, and regulatory compliance. For more in-depth information, see Zhao et al. (2024).

Explainability as standard practice

As described, explainable AI is a broader paradigm that applies special methods and approaches to improve our understanding of complex AI models. It is a crucial component on the way to trustworthy AI and is therefore linked to concepts such as transparency, accountability, traceability, interpretability and trust in AI. Prioritising explainability sets the foundation for AI technology to become a supportive tool for humanity based on reinforced trust, enhanced fairness, and alignment with legal and ethical norms. 

There are still many explainability challenges for AI, particularly concerning widely used, complex LLMs. For now, deployers and end-users of AI face challenging trade-offs between model performance and interpretability. What is more, AI may never be perfectly transparent, just as human reasoning always has a degree of opacity. But this should not diminish the ongoing quest for oversight and accountability when applying such a powerful and influential technology.

On the contrary, these limitations should motivate a more serious and committed pursuit of explainability. On the path towards efficient, safe and responsible AI deployment, explainability should be a core design principle and become a universal standard that steers future AI research, regulation, and institutional adaptation.

The post What explainable AI is, why it matters and how we can achieve it appeared first on OECD.AI.





AI and ethics – what is originality? Maybe we’re just not that special when it comes to creativity?


I don’t trust AI, but I use it all the time.

Let’s face it, that’s a sentiment that many of us can buy into if we’re honest about it. It comes from Paul Mallaghan, Head of Creative Strategy at We Are Tilt, a creative transformation content and campaign agency whose clients include the likes of Diageo, KPMG and Barclays.

Taking part in a panel debate on AI ethics at the recent Evolve conference in Brighton, UK, he made another highly pertinent point when he said of people in general:

We know that we are quite susceptible to confident bullshitters. Basically, that is what ChatGPT [is] right now. There’s something that reminds me of the illusory truth effect, where if you hear something a few times, or you hear it said confidently, then you are much more likely to believe it, regardless of the source. I might refer to a certain President who uses that technique fairly regularly, but I think we’re so susceptible to that that we are quite vulnerable.

And, yes, it’s you he’s talking about:

I mean all of us, no matter how intelligent we think we are or how smart over the machines we think we are. When I think about trust – and I’m coming at this very much from the perspective of someone who runs a creative agency – we’re not involved in building a Large Language Model (LLM); we’re involved in using it, understanding it, and thinking about what the implications [are] if we get this wrong. What does it mean to be creative in the world of LLMs?

Genuine

Being genuine is vital, he argues, as is being human: where does Human Intelligence come into the picture, particularly in relation to creativity? His argument:

There’s a certain parasitic quality to what’s being created. We make films, we’re designers, we’re creators, we’re all those sort of things in the company that I run. We have had to just face the fact that we’re using tools that have hoovered up the work of others and then regenerate it and spit it out. There is an ethical dilemma that we face every day when we use those tools.

His firm has come to the conclusion that it has to be responsible for imposing its own guidelines here to some degree, because there’s not a lot happening elsewhere:

To some extent, we are always ahead of regulation, because the nature of being creative is that you’re always going to be experimenting and trying things, and you want to see what the next big thing is. It’s actually very exciting. So that’s all cool, but we’ve realized that if we want to try and do this ethically, we have to establish some of our own ground rules, even if they’re really basic. Like, let’s try and not prompt with the name of an illustrator that we know, because that’s stealing their intellectual property, or the labor of their creative brains.

I’m not a regulatory expert by any means, but I can say that a lot of the clients we work with, to be fair to them, are also trying to get ahead of where I think we are probably at government level, and they’re creating their own frameworks, their own trust frameworks, to try and address some of these things. Everyone is starting to ask questions, and you don’t want to be the person that’s accidentally created a system where everything is then suable because of what you’ve made or what you’ve generated.

Originality

That’s not necessarily an easy ask, of course. What, for example, do we mean by originality? Mallaghan suggests:

Anyone who’s ever tried to create anything knows you’re trying to break patterns. You’re trying to find or re-mix or mash up something that hasn’t happened before. To some extent, that is a good thing that really we’re talking about pattern matching tools. So generally speaking, it’s used in every part of the creative process now. Most agencies, certainly the big ones, certainly anyone that’s working on a lot of marketing stuff, they’re using it to try and drive efficiencies and get incredible margins. They’re going to be on the race to the bottom.

But originality is hard to quantify. I think that actually it doesn’t happen as much as people think anyway, that originality. When you look at ChatGPT or any of these tools, there’s a lot of interesting new tools that are out there that purport to help you in the quest to come up with ideas, and they can be useful. Quite often, we’ll use them to sift out the crappy ideas, because if ChatGPT or an AI tool can come up with it, it’s probably something that’s happened before, something you probably don’t want to use.

More Human Intelligence is needed, it seems:

What I think any creative needs to understand now is you’re going to have to be extremely interesting, and you’re going to have to push even more humanity into what you do, or you’re going to be easily replaced by these tools that probably shouldn’t be doing all the fun stuff that we want to do. [In terms of ethical questions] there’s a bunch, including the copyright thing, but there’s partly just [questions] around purpose and fun. Like, why do we even do this stuff? Why do we do it? There’s a whole industry that exists for people with wonderful brains, and there’s lots of different types of industries [where you] see different types of brains. But why are we trying to do away with something that allows people to get up in the morning and have a reason to live? That is a big question.

My second ethical thing is, what do we do with the next generation who don’t learn craft and quality, and they don’t go through the same hurdles? They may find ways to use [AI] in ways that we can’t imagine, because that’s what young people do, and I have faith in that. But I also think, how are you going to learn the language that helps you interface with, say, a video model, and know what a camera does, and how to ask for the right things, how to tell a story, and what’s right? All that is an ethical issue, like we might be taking that away from an entire generation.

And there’s one last ‘tough love’ question to be posed:

What if we’re not special? Basically, what if all the patterns that are part of us aren’t that special? The only reason I bring that up is that I think that in every career, you associate your identity with what you do. Maybe we shouldn’t, maybe that’s a bad thing, but I know that creatives really associate with what they do. Their identity is tied up in what it is that they actually do, whether they’re an illustrator or whatever. It is a proper existential crisis to look at it and go, ‘Oh, the thing that I thought was special can be regurgitated pretty easily’… It’s a terrifying thing to stare into the Gorgon and look back at it and think, ‘Where are we going with this?’. By the way, I do think we’re special, but maybe we’re not as special as we think we are. A lot of these patterns can be matched.

My take

This was a candid worldview that raised a number of tough questions – and questions are often so much more interesting than answers, aren’t they? The subject of creativity and copyright has been handled at length on diginomica by Chris Middleton and I think Mallaghan’s comments pretty much chime with most of that.

I was particularly taken by the point about the impact on the younger generation of having at their fingertips AI tools that can ‘do everything, until they can’t’. I recall being horrified a good few years ago when doing a shift in a newsroom of a major tech title and noticing that the flow of copy had suddenly dried up. ‘Where are the stories?’, I shouted. Back came the reply, ‘Oh, the Internet’s gone down’. ‘Then pick up the phone and call people, find some stories,’ I snapped. A sad, baffled young face looked back at me and asked, ‘Who should we call?’. Now apart from suddenly feeling about 103, I was shaken by the fact that as soon as the umbilical cord of the Internet was cut, everyone was rendered helpless.

Take that idea and multiply it a billion-fold when it comes to AI dependency and the future looks scary. Human Intelligence matters.





Preparing Timor-Leste to embrace Artificial Intelligence


UNESCO, in collaboration with the Ministry of Transport and Communications, Catalpa International and the national lead consultant, jointly conducted consultative and validation workshops as part of the AI Readiness assessment implementation in Timor-Leste. Held on 8–9 April and 27 May respectively, the workshops convened representatives from government ministries, academia, international organisations and development partners, the Timor-Leste National Commission for UNESCO, civil society, and the private sector for a multi-stakeholder consultation to unpack the current stage of AI adoption and development in the country, guided by UNESCO’s AI Readiness Assessment Methodology (RAM).

In response to growing concerns about the rapid rise of AI, the UNESCO Recommendation on the Ethics of Artificial Intelligence was adopted by 194 Member States in 2021, including Timor-Leste, to ensure ethical governance of AI. To support Member States in implementing this Recommendation, the RAM was developed by UNESCO’s AI experts without borders. It includes a range of quantitative and qualitative questions designed to gather information across different dimensions of a country’s AI ecosystem, including legal and regulatory, social and cultural, economic, scientific and educational, technological and infrastructural aspects.

By compiling comprehensive insights into these areas, the final RAM report helps identify institutional and regulatory gaps, which can assist the government with the necessary AI governance and enable UNESCO to provide tailored support that promotes an ethical AI ecosystem aligned with the Recommendation.

The first day of the workshop was opened by Timor-Leste’s Minister of Transport and Communications, H.E. Miguel Marques Gonçalves Manetelu. In his opening remarks, Minister Manetelu highlighted the pivotal role of AI in shaping the future. He emphasised that the current global trajectory is not only driving the digitalisation of work but also enabling more effective and productive outcomes.





Experts gather to discuss ethics, AI and the future of publishing



Representatives of the founding members sign the memorandum of cooperation at the launch of the Association for International Publishing Education during the 3rd International Conference on Publishing Education in Beijing. CHINA DAILY

Publishing stands at a pivotal juncture, said Jeremy North, president of Global Book Business at Taylor & Francis Group, addressing delegates at the 3rd International Conference on Publishing Education in Beijing. Digital intelligence is fundamentally transforming the sector — and this revolution will inevitably create “AI winners and losers”.

True winners, he argued, will be those who embrace AI not as a replacement for human insight but as a tool that strengthens publishing’s core mission: connecting people through knowledge. The key is balance, North said, using AI to enhance creativity without diminishing human judgment or critical thinking.

This vision set the tone for the event where the Association for International Publishing Education was officially launched — the world’s first global alliance dedicated to advancing publishing education through international collaboration.

Unveiled at the conference cohosted by the Beijing Institute of Graphic Communication and the Publishers Association of China, the AIPE brings together nearly 50 member organizations with a mission to foster joint research, training, and innovation in publishing education.

Tian Zhongli, president of BIGC, stressed the need to anchor publishing education in ethics and humanistic values and reaffirmed BIGC’s commitment to building a global talent platform through AIPE.

BIGC will deepen academic-industry collaboration through AIPE to provide a premium platform for nurturing high-level, holistic, and internationally competent publishing talent, he added.

Zhang Xin, secretary of the CPC Committee at BIGC, emphasized that AIPE is expected to help globalize Chinese publishing scholarship, contribute new ideas to the industry, and cultivate a new generation of publishing professionals for the digital era.

Themed “Mutual Learning and Cooperation: New Ecology of International Publishing Education in the Digital Intelligence Era”, the conference also tackled a wide range of challenges and opportunities brought on by AI — from ethical concerns and content ownership to protecting human creativity and rethinking publishing values in higher education.

Wu Shulin, president of the Publishers Association of China, cautioned that while AI brings major opportunities, “we must not overlook the ethical and security problems it introduces”.

Catriona Stevenson, deputy CEO of the UK Publishers Association, echoed this sentiment. She highlighted how British publishers are adopting AI to amplify human creativity and productivity, while calling for global cooperation to protect intellectual property and combat AI tool infringement.

The conference aims to explore innovative pathways for the publishing industry and education reform, discuss emerging technological trends, advance higher education philosophies and talent development models, promote global academic exchange and collaboration, and empower knowledge production and dissemination through publishing education in the digital intelligence era.

 

 

 


