When artificial intelligence (AI) first burst into the public consciousness with the launch of ChatGPT in late 2022, many people saw the technology as a helpful chatbot.
They found an AI-powered chatbot could help with everything from answering questions to generating text and computer code. Popularity and usage grew rapidly.
Fast forward almost three years and things have changed significantly with the emergence of Agentic AI. This technology can perform multi-step tasks, invoke APIs, run commands, and write and deploy code autonomously.
AI agents go much further than responding to prompts – they’re actually making decisions. While this will make the tools even more useful, it also poses security risks. Once an IT system starts taking autonomous actions, safety and control become paramount.
A challenge two years in the making
The challenge posed by Agentic AI was first flagged back in 2023 with the release of the OWASP Top 10 for LLM Applications report[1]. In it, the term ‘excessive agency’ was coined.
The argument was that, if an AI model is given too much autonomy, it begins to act more like a free agent than a bounded assistant. It might be able to schedule meetings or book conference rooms, but it could also delete files or provision excessive cloud infrastructure.
If not deployed and managed carefully, AI agents can start to behave like a confused deputy. They could even become sleeper agents just waiting to be exploited in a cybersecurity incident.
These are more than idle predictions. In recent real-world examples, agents in major software products such as Microsoft Copilot[2] and Salesforce’s Slack tool[3] were shown to be vulnerable to being tricked into using their elevated privileges to exfiltrate sensitive data.
Standards and protocols
During 2025, there has been a wave of new standards and protocols designed to handle the rising capabilities of AI agents. The most prominent of these is Anthropic’s Model Context Protocol (MCP), a mechanism for maintaining shared memory, task structures, and tool access across long-lived AI agent sessions.
MCP can be considered the ‘glue’ that holds an agent’s context together across tools and time. It enables users to tell an agent what it is allowed to do and what it should remember.
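To make that concrete, the sketch below shows roughly what exposing a single capability to an agent looks like using the FastMCP interface from the Python MCP SDK. The server name and the book_room tool are illustrative assumptions, not part of any shipped product.

```python
# Minimal MCP server exposing one narrowly defined tool.
# Illustrative only: the server name and book_room tool are hypothetical.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("calendar-assistant")

@mcp.tool()
def book_room(room: str, start: str, end: str) -> str:
    """Book a conference room for the given time window."""
    # In a real deployment this would call a calendar API;
    # here it simply reports what would have been done.
    return f"Booked {room} from {start} to {end}"

if __name__ == "__main__":
    # Serve the tool so an MCP-aware agent can discover and call it.
    mcp.run()
```

The important point is that every tool an agent can reach is declared explicitly, and that declared surface is exactly where the governance questions that follow apply.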
While MCP is a much-needed step, it has also raised new questions. This is because the focus with MCP has been on expanding what agents can do, rather than reining them in.
While the protocol helps co-ordinate tool use and preserve memory across agent tasks, it doesn’t yet address critical concerns such as resistance to prompt injection, where an attacker manipulates shared memory to steer the agent.
MCP also doesn’t tackle command scoping, where an agent is tricked into exceeding its permissions, or token abuse, where a leaked memory blob exposes API credentials or user data.
Unfortunately, these are not theoretical problems. A recent examination of security implications revealed that MCP-style architectures are vulnerable to prompt injection, command misuse, and even memory poisoning, especially when shared memory is not adequately scoped or encrypted.
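Teams are not powerless while the protocol matures, though. The hypothetical wrapper below sketches one way to layer command scoping on today: every tool call is checked against an explicit allow-list before it runs. The AGENT_SCOPES table and call_tool helper are assumptions for illustration, not part of MCP.

```python
# Hypothetical command-scoping layer placed between an agent and its tools.
# The scope table and helper names are illustrative assumptions.
AGENT_SCOPES = {
    "meeting-assistant": {"calendar.read", "calendar.book_room"},
    "triage-bot": {"email.read", "email.label"},
}

class ScopeError(PermissionError):
    pass

def call_tool(agent_id: str, tool_name: str, tool_fn, **kwargs):
    """Execute a tool only if the agent has been granted that scope."""
    allowed = AGENT_SCOPES.get(agent_id, set())
    if tool_name not in allowed:
        # Deny by default: an over-permissioned or manipulated agent
        # cannot quietly exceed what it was provisioned for.
        raise ScopeError(f"{agent_id} is not permitted to call {tool_name}")
    return tool_fn(**kwargs)

# Example: the triage bot is blocked from booking rooms even if a
# prompt-injected instruction asks it to (book_room as in the earlier sketch).
# call_tool("triage-bot", "calendar.book_room", book_room,
#           room="A", start="09:00", end="10:00")
```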
An issue requiring immediate attention
This is not a problem that can be ignored as it relates to tools that many developers are already using. Coding agents like Claude Code and Cursor are gaining real traction inside enterprise workflows and delivering significant benefits.
GitHub’s internal research showed developers using Copilot completed tasks up to 55% faster. More recently, Anthropic reported that 79% of Claude Code usage was focused on automated task execution rather than just code suggestions.
This represents a significant productivity boost, but shows the tools are no longer simply copilots – they’re actually flying solo.
And it’s not just software development: MCP is now being integrated into tools that extend well beyond coding. These cover activities such as email triage, meeting preparation, sales planning, document summarisation, and other high-leverage productivity tasks.
While many of these use cases are still in their early stages, they’re maturing rapidly, and this changes the stakes. It demands attention from business unit leaders, CIOs, CISOs, and Chief AI Officers alike.
Preparation is essential
As these agents begin accessing sensitive data and executing cross-functional workflows, organisations must ensure that governance, risk management, and strategic planning are integral from the outset. Integrating autonomous agents into a business without proper controls is a recipe for outages, data leaks, and regulatory blowback.
There are some key steps that should be taken. One is to launch agent pilot programs while also requiring code reviews, tool permissions, and sandboxing.
Agent autonomy should also be limited to what’s actually necessary, as not every agent needs root access or long-term memory. Developers and product teams should also be trained on safe usage patterns, including scope control, fallback behaviours, and escalation paths.
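As one possible shape for those escalation paths, the sketch below gates high-risk actions behind an explicit human approval step while letting low-risk ones proceed. The risk tiers and the approval hook are hypothetical assumptions, not a prescribed implementation.

```python
# Hypothetical escalation policy: low-risk actions run automatically,
# high-risk ones pause for human approval. Tier contents are illustrative.
HIGH_RISK = {"delete_file", "provision_infrastructure", "send_external_email"}

def request_human_approval(action: str, details: dict) -> bool:
    """Placeholder for a real approval flow (ticket, chat prompt, etc.)."""
    print(f"Approval needed for {action}: {details}")
    return False  # deny until a human explicitly says yes

def execute_action(action: str, details: dict, handler):
    """Run low-risk actions directly; escalate high-risk ones to a human."""
    if action in HIGH_RISK and not request_human_approval(action, details):
        # Fallback behaviour: log and stop rather than act autonomously.
        return {"status": "escalated", "action": action}
    return handler(**details)
```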
Organisations that regard AI agents as a part of core infrastructure – rather than novelty tools – will be best placed to enjoy the benefits. The time for considering and acting on the associated challenges is now.