The enterprise AI agent deployment wave has outpaced organizational accountability by years. A new function — agent operations — is emerging to fill the gap, and the companies building it now will have a massive advantage when things inevitably go wrong.
Last February, Harvard Business Review published a job description for a role that didn't have a name twelve months earlier.
The piece was titled "To Thrive in the AI Era, Companies Need Agent Managers." It described someone responsible for orchestrating how AI agents learn, collaborate, perform, and work safely alongside humans. Not a machine learning engineer. Not a DevOps lead. Something new — a person whose entire job is the ongoing governance and optimization of autonomous agent systems.
That HBR published this at all tells you something important: the enterprise AI agent deployment wave has already happened. And most of the organizations that went through it forgot to update their org charts.
The Scale Is Already Here
Gartner published data earlier this year showing that 72% of Global 2000 companies now operate AI agent systems in production. Not pilots. Not proofs-of-concept. Production systems making real decisions, taking real actions, touching real data. Gartner also projects that 40% of enterprise applications will integrate task-specific AI agents by end of 2026 — up from under 5% in 2025.
TELUS Digital is one of the more visible examples of what scale actually looks like. By the end of 2025, they'd processed over 2 trillion AI model tokens through their Fuel iX platform. Employees had organically created more than 53,000 custom AI copilots. Their Agent Trainer product — an AI simulation system for call center training — cut new agent ramp time by 50%.
53,000 copilots. That's not a single team running an AI experiment. That's an organization where the capability to create and deploy AI agents has been democratized across tens of thousands of employees. And if you ask most companies in a similar position who's accountable when one of those agents does something wrong, you'll get a very long pause before you get an answer.
The enterprise deployed AI agents at a pace that organizational accountability could not keep up with. That gap is now a real operational risk.
The Zombie Agent Problem
Here's what the governance failure looks like in practice.
An engineering team spins up a Claude Code session with three MCP servers attached: one for their database, one for their GitHub repositories, one for their internal ticketing system. They configure a scheduled task to run a nightly audit. The project ships. The team moves on to the next thing. The scheduled task keeps running.
Six months later, that team's lead is gone. The credentials the agent was using were tied to her service account. That service account still has production database access. The scheduled task is still running. Nobody knows it's there.
This is the zombie agent problem: an agent is deployed for a specific context, the context disappears, and the agent keeps operating with stale credentials and zero human oversight. At a company running fifty custom copilots, you can track this manually. At TELUS scale — 53,000 — you cannot.
And the problem isn't just orphaned agents. It's that most companies have no inventory at all. Ask your CISO: "What AI agents currently have access to our production systems, what credentials do they hold, who owns each one, and when were they last audited?" If the answer isn't a specific document with specific owners and dates, you have a zombie agent problem whether you know it or not.
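A first pass at that question can be sketched in a few lines: cross-reference agent owners against your directory of active accounts. This is a hedged illustration, not a real integration — the account names and agent IDs below are invented, and in practice both lists would come from exports of your inventory and your identity provider.

```python
# Hypothetical sketch: flag agents whose registered owner is no longer
# an active account. All names below are illustrative.
active_accounts = {
    "platform-team@company.com",
    "security-team@company.com",
}

agents = [
    {"id": "nightly-db-audit", "owner": "platform-team@company.com"},
    {"id": "legacy-report-gen", "owner": "departed-lead@company.com"},
]

# An agent with no active owner is a zombie candidate: it may still be
# running on credentials nobody is accountable for.
zombies = [a["id"] for a in agents if a["owner"] not in active_accounts]
print(zombies)  # → ['legacy-report-gen']
```

A flagged agent isn't necessarily dead weight — but it has no accountable human, which is exactly the condition the inventory exists to prevent.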
What Agent Ops Actually Looks Like
The HBR framing is right — this is a new function, not an expanded version of something that already exists. ML Ops handles model training pipelines and inference infrastructure. DevOps handles deployments, uptime, and CI/CD. Neither covers what happens when an autonomous agent with tool access and a system prompt makes a decision that has downstream consequences across three systems.
Agent operations is the function that answers five questions that neither ML Ops nor DevOps owns:
• What agents are currently running in production, and what can each one do?
• Who authorized each agent's access, and when does that authorization expire?
• When did each agent last receive a human review of its behavior?
• What is the escalation path when an agent takes an unexpected action?
• How do we deprovision an agent when its context or owner disappears?
TELUS built a governance layer into Fuel iX to answer these questions at scale. The platform serves as the control plane — a single place where agent deployments are registered, permissions are scoped, and usage is observable. That's not a feature they added for enterprise optics. It's a necessity when you're operating at a scale where individual oversight is impossible.
Your Agent Inventory Template
You don't need to be TELUS to start. An agent inventory is a flat document that every team running agents — even one or two — should maintain. Here's the minimum viable structure:
```yaml
agents:
  - id: nightly-db-audit
    description: Scans production DB for schema drift, posts report to Slack
    owner: platform-team@company.com
    created: 2025-11-14
    last_reviewed: 2026-02-01
    credentials:
      - type: db_read_only
        scope: production
        expires: 2026-06-01
      - type: slack_webhook
        scope: "#platform-alerts"
        expires: never
    tools:
      - database_query
      - slack_post
    scheduled: "0 2 * * *"  # 2am nightly
    status: active
  - id: pr-reviewer-bot
    description: Reviews open PRs for security patterns, leaves comments
    owner: security-team@company.com
    created: 2026-01-08
    last_reviewed: 2026-03-10
    credentials:
      - type: github_token
        scope: repo:read, pr:comment
        expires: 2026-07-08
    tools:
      - github_read_pr
      - github_post_comment
    scheduled: "*/30 * * * *"  # every 30 min
    status: active
```
Each entry costs fifteen minutes to write. The inventory itself — the habit of maintaining it — is what matters. When a team member leaves, you check the inventory and reassign or deprovision. When security asks for an audit, you have a document. When an agent starts behaving unexpectedly, you have a starting point for the investigation.
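Once the inventory exists, the two fields that rot fastest are `last_reviewed` and `expires` — and both are mechanically checkable. Here's a minimal sketch of that check, assuming the inventory has already been parsed (e.g. with PyYAML) into plain dicts matching the schema above; the 90-day review window is an assumption, not a standard.

```python
from datetime import date, timedelta

# Assumed policy: every agent gets a human review at least quarterly.
REVIEW_WINDOW = timedelta(days=90)

def audit_inventory(agents, today):
    """Return (agent_id, finding) pairs for stale reviews and bad credentials."""
    findings = []
    for agent in agents:
        last = date.fromisoformat(agent["last_reviewed"])
        if today - last > REVIEW_WINDOW:
            findings.append((agent["id"], f"last reviewed {last}, over 90 days ago"))
        for cred in agent.get("credentials", []):
            expires = cred.get("expires", "never")
            if expires == "never":
                findings.append((agent["id"], f"{cred['type']} credential never expires"))
            elif date.fromisoformat(expires) < today:
                findings.append((agent["id"], f"{cred['type']} credential expired {expires}"))
    return findings

# Illustrative data shaped like the first inventory entry above.
inventory = [
    {"id": "nightly-db-audit", "last_reviewed": "2026-02-01",
     "credentials": [{"type": "db_read_only", "expires": "2026-06-01"},
                     {"type": "slack_webhook", "expires": "never"}]},
]

for agent_id, finding in audit_inventory(inventory, date(2026, 6, 15)):
    print(f"{agent_id}: {finding}")
```

Run on that sample as of mid-June 2026, this flags a stale review, an expired database credential, and a webhook that never expires — three findings from one fifteen-minute entry.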
For teams using Claude Code specifically, auditing your current setup takes three commands:
```bash
# Check what MCP servers are configured
cat ~/.claude/settings.json | jq '.mcpServers'

# List active scheduled tasks
cat ~/.claude/scheduled-tasks/*/SKILL.md 2>/dev/null | sort

# Review hook configurations that trigger on agent events
cat ~/.claude/settings.json | jq '.hooks'
```
None of this is sophisticated tooling. It's a starting inventory. The point is to have it, not to automate it perfectly from day one.
The Access Audit Comes Next
Inventory tells you what exists. Access audit tells you what each agent can reach.
The most common governance failure isn't a rogue agent — it's an over-credentialed one. An agent that was given broad access to "make it easier to set up" and never had its permissions scoped down when the setup was complete. In practice, most agents need far less access than they were initially granted, and nobody went back to tighten it.
The audit is straightforward: for each agent in your inventory, list every credential and permission, then ask whether that specific capability is required for the agent's stated purpose. If a nightly audit agent has write access to a database it only needs to read, that's a finding. If a PR review bot has credentials scoped to every repository when it only touches two, that's a finding.
Run this audit quarterly. Put it on the calendar the same way you'd schedule a dependency version audit or a secrets rotation.
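The core comparison in that audit is just a set difference: granted scopes minus required scopes. A minimal sketch, assuming you've listed each agent's granted scopes (from the inventory) and the scopes its stated purpose actually requires — the scope names below echo the PR bot example and are illustrative, not a real permissions API:

```python
def excess_scopes(granted, required):
    """Return scopes the agent holds but does not need, sorted for stable output."""
    return sorted(set(granted) - set(required))

# Illustrative: a PR review bot that was granted write access "to make
# setup easier" but whose stated purpose only needs read + comment.
pr_bot_granted = ["repo:read", "repo:write", "pr:comment"]
pr_bot_required = ["repo:read", "pr:comment"]

print(excess_scopes(pr_bot_granted, pr_bot_required))  # → ['repo:write']
```

Every non-empty result is a finding: either the agent's stated purpose is wrong in the inventory, or the credential should be scoped down.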
You Don't Need to Hire for This Yet
Most engineering organizations don't need a dedicated Agent Manager in 2026. What they do need is someone — a specific human being, not a team, not a committee — who owns the function. That means:
• Maintaining the agent inventory
• Running the quarterly access audit
• Writing and owning the incident runbook for when an agent takes an unexpected action
• Reviewing and approving new agent deployments before they touch production
This is four to six hours per quarter for a team running under twenty agents. It doesn't require a new headcount. It requires that someone has explicit accountability and that accountability is visible to the rest of the organization.
The reason most companies haven't done this isn't complexity — it's that nobody thought to ask the question. AI agent deployment happened incrementally, project by project, team by team, without anyone stepping back to ask who owned the aggregate.
That's the actual gap. Not tooling. Not process. Ownership.
The Bottom Line
HBR naming the Agent Manager role wasn't ahead of its time. It was slightly behind it. The agents are already running. The credentials are already issued. The scheduled tasks are already executing.
The companies that build agent governance functions now — even informally, even with a single owner and a YAML file — will have a significant operational advantage over the ones that wait for a serious incident to force the conversation. TELUS didn't build a governance layer in Fuel iX because they anticipated this problem philosophically. They built it because operating at their scale made the absence of governance immediately painful.
Most companies aren't at 53,000 copilots yet. But the gap between "we have a few agents running" and "we need someone responsible for them" closes faster than you'd expect.
The right time to create your agent inventory is before you need it.
*Related: MCP Hit 97 Million Downloads. Your Security Team Hasn't. — on the governance gap in the MCP ecosystem.*