Evaluating AI ROI: Your agents are costing you more than you think

Learn essential AI cost management techniques to evaluate AI ROI and ensure your agents drive real business value.

By Kesha Williams

Jun 10, 2026 • 6 Minute Read


Table of Contents

Why AI ROI gets fuzzy so quickly
AI cost management: Expensive, agentic problems aren't always obvious
Why this keeps happening: Lack of process prevents AI operational efficiency
Understanding AI ROI: What leaders should measure first
Wrapping up: AI readiness requires visibility, accountability, and performance

Many organizations already have AI agents running in their cloud environments, but they don’t know what those agents are doing or who owns the outcome. Both of these problems feed directly into a third issue that tends to get leadership attention faster than anything else: cost.

I spend a lot of time talking with enterprise leaders who are investing heavily in AI. Cloud computing costs and budgets are going up. AI-specific spending is going up. Teams are adding people, tools, and platforms to support agent development and deployment. But when I ask whether they can tie that growing investment to a measurable business outcome, the answer is often some version of, “We’re still working on that.”

Sometimes, that’s fair. AI adoption has a learning curve, and early investments don’t always yield immediate returns. But there’s a real difference between being early in the journey and leaving agents running unchecked in production for months. The first is a learning curve. The second is a cost problem.

Why AI ROI gets fuzzy so quickly

The organizations I see struggling with AI ROI are usually not struggling because the technology itself failed. More often, they’re struggling because nobody assessed whether the agent is still driving the business outcome it was supposed to support.

Here are some examples:

An agent may have been deployed to reduce ticket resolution time. Did that actually happen?
Another may have been rolled out to automate document review. Is the team consistently moving faster because of it?
Another may have been introduced to improve the quality of customer responses. Has anyone checked whether the responses are actually better, more accurate, or more useful?

Once a system is live, many organizations don't ask these questions. The agent keeps running. The costs keep showing up. Leadership sees the budget line grow, but the value story gets harder to explain. That is usually when broader doubts about AI investment start to surface.

In many cases, the investment itself isn't a mistake. It's just that no one stayed close enough to the system to determine if it's still worth what it costs.

AI cost management: Expensive, agentic problems aren't always obvious

When it comes to AI cost management, model inference is usually the first thing that comes to mind. People also think about API usage, storage, and compute. Those matter, and they are easy to spot. The costs that go unnoticed are typically the ones that build up around the agents: rework, downstream mistakes, overly broad access, and inaction.

Rework

One of the biggest costs is rework. If an agent produces poor output and a human has to redo the task, the organization has paid for both compute and human labor. It has also lost the efficiency gain that the agent was supposed to create.

If that happens often enough, people stop trusting the system and start working around it. At that point, the organization is paying to run a capability that no longer improves the workflow it was meant to support.
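To make the rework arithmetic concrete, here is a minimal sketch. The field names and every figure in it are hypothetical, and it assumes a reworked task costs the full human rate on top of the agent run, which will not hold for every workflow.

```python
from dataclasses import dataclass


@dataclass
class AgentTaskCosts:
    """Hypothetical per-task figures; all values are illustrative."""
    agent_cost_per_task: float  # inference, API calls, and compute for one run
    human_cost_per_task: float  # labor cost when a person does the task by hand
    rework_rate: float          # fraction of agent outputs a human must redo (0.0 to 1.0)


def effective_cost_per_task(c: AgentTaskCosts) -> float:
    """The agent run is always paid for; human labor is paid again on rework."""
    return c.agent_cost_per_task + c.rework_rate * c.human_cost_per_task


def savings_per_task(c: AgentTaskCosts) -> float:
    """Positive means the agent is still cheaper than doing the task by hand."""
    return c.human_cost_per_task - effective_cost_per_task(c)


def break_even_rework_rate(c: AgentTaskCosts) -> float:
    """Rework rate at which the agent stops saving money under this model."""
    return 1.0 - c.agent_cost_per_task / c.human_cost_per_task


if __name__ == "__main__":
    costs = AgentTaskCosts(agent_cost_per_task=0.50,
                           human_cost_per_task=10.00,
                           rework_rate=0.20)
    print(f"effective cost per task: {effective_cost_per_task(costs):.2f}")
    print(f"savings per task:        {savings_per_task(costs):.2f}")
    print(f"break-even rework rate:  {break_even_rework_rate(costs):.2f}")
```

This simple model leaves out the cost of people working around the agent entirely, so the real break-even point is likely lower than the one it reports.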

Downstream mistakes

There is also the cost of downstream mistakes. When systems are connected, one bad output can trigger additional work across downstream steps. The cost of tracing the error, mitigating the impact, and restoring trust can be far greater than the original issue. Engineering time gets pulled into investigation and cleanup rather than being spent on new development.

Overly broad access

Another issue is overly broad access. If agents are connected to tools and data sources through the Model Context Protocol (MCP), and those connections haven’t been reviewed in a while, the organization may be paying for calls, transfers, and interactions that don’t support any meaningful business outcome.

This is easy to miss because it often looks like normal activity until someone takes the time to ask whether the agent should still have that access at all.
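One way to start asking that question is to compare what each agent is allowed to call with what it has actually called. The sketch below assumes you can export two things from your own environment: a map of granted tools per agent and a log of (agent, tool) calls. The agent and tool names are made up, and nothing here uses a real MCP API.

```python
from collections import defaultdict


def find_unused_access(granted: dict[str, set[str]],
                       call_log: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Return, per agent, the granted tools that never appear in the call log."""
    used: dict[str, set[str]] = defaultdict(set)
    for agent, tool in call_log:
        used[agent].add(tool)

    unused = {}
    for agent, tools in granted.items():
        never_called = tools - used[agent]
        if never_called:
            unused[agent] = sorted(never_called)
    return unused


if __name__ == "__main__":
    # Hypothetical grants and call history for two agents.
    granted = {
        "ticket-triage": {"crm.read", "crm.write", "billing.read"},
        "doc-review": {"storage.read"},
    }
    call_log = [
        ("ticket-triage", "crm.read"),
        ("doc-review", "storage.read"),
        ("ticket-triage", "crm.read"),
    ]
    for agent, tools in find_unused_access(granted, call_log).items():
        print(f"{agent}: granted but never used -> {tools}")
```

An unused grant is only a prompt for review, not proof that the access should be removed. A tool that is called often can still be one the agent no longer needs.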

Cost of inaction

Then there’s the cost of inaction. This one shows up more often than many teams realize. In this case, an agent was useful at one point, but now the business process has changed, and the agent was never retired. It’s still running, consuming resources, and showing up in the environment, even though it no longer creates value.

Why this keeps happening: Lack of process prevents AI operational efficiency

The pattern is usually pretty predictable. A team builds an agent to solve a real problem. The agent works well enough to move the initiative forward. People feel good about the progress and shift their attention to the next thing. The agent is never reevaluated in a disciplined way.

This is where the problem compounds. Nobody planned for the system to become stale, but nobody built in a regular process to keep it current, either. There was no scheduled review. No clear owner responsible for checking whether the agent still makes sense. No habit of measuring the agent’s activity against the business outcome it was originally deployed to support.
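A scheduled review can begin as something very small. The sketch below flags agents that have no owner or have gone too long without a review. The record fields, the agent names, and the 90-day threshold are all assumptions for illustration, not a recommended standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AgentRecord:
    """Hypothetical inventory entry for one deployed agent."""
    name: str
    owner: Optional[str]            # the human accountable for the outcome, if any
    last_reviewed: Optional[date]   # date of the last review against its business goal


def review_flags(agent: AgentRecord, today: date, max_age_days: int = 90) -> list[str]:
    """List the process gaps for one agent; an empty list means none were found."""
    flags = []
    if not agent.owner:
        flags.append("no owner")
    if agent.last_reviewed is None:
        flags.append("never reviewed")
    elif (today - agent.last_reviewed).days > max_age_days:
        flags.append("review overdue")
    return flags


if __name__ == "__main__":
    today = date(2026, 6, 10)
    inventory = [
        AgentRecord("ticket-triage", "j.doe", date(2026, 5, 1)),
        AgentRecord("doc-review", None, date(2026, 1, 5)),
        AgentRecord("reply-drafter", "a.lee", None),
    ]
    for agent in inventory:
        print(agent.name, review_flags(agent, today))
```

Running something like this on a schedule does not replace the review itself, but it does make the missing reviews visible.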

That is the gap I keep seeing. Organizations are getting better at deploying agents. But they’re still learning how to manage them as ongoing investments that need regular review, just like any other production system.

Understanding AI ROI: What leaders should measure first

If you’re a leader looking at your AI spend and wondering whether you’re getting the return you expected, I would work through this in three levels.

Level 1: Visibility

The first is visibility. If you can’t answer these questions, measurement will be messy from the start because you don’t have a clear picture of what you’re evaluating.

Do you know what agents are running in your environment?
Do you know what each one can do, what tools it can call, and what data it can access?

Level 2: Accountability

The second is accountability. If accountability is weak, you may be able to identify waste, but you will struggle to fix it because nobody truly owns the fix.

Does every agent have a defined human owner?
Does that person understand what the agent is supposed to deliver and where its boundaries are?
Are MCP connections and agent-to-agent (A2A) interactions being reviewed with enough rigor?
Is there monitoring that tells you what the agent is actually doing, not just whether the infrastructure is healthy?

Level 3: Performance

The third is performance. This is the level where ROI becomes real, and it’s also the level that many organizations have not yet operationalized.

Have you measured whether the agent is actually improving the business outcome it was deployed to support?
Do you have feedback loops that allow the system to be adjusted over time?
Do you have a process for updating, pausing, or retiring agents that are no longer delivering value?
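The three levels above can be walked in order for a single agent. The sketch below records a yes or no answer for each question and reports the earliest level that still has an open question, on the assumption that gaps at an earlier level should be closed first. The shortened question labels and the sample answers are illustrative.

```python
from typing import Optional

# Levels in the order they should be worked through.
LEVELS = ("visibility", "accountability", "performance")


def first_gap(answers: dict[str, dict[str, bool]]) -> Optional[tuple[str, list[str]]]:
    """Return (level, open questions) for the earliest level with a gap, or None."""
    for level in LEVELS:
        if level not in answers:
            # A level nobody has assessed counts as a gap in this sketch.
            return level, ["not assessed"]
        open_questions = [q for q, ok in answers[level].items() if not ok]
        if open_questions:
            return level, open_questions
    return None


if __name__ == "__main__":
    # Hypothetical answers for one agent.
    answers = {
        "visibility": {
            "agent is in the inventory": True,
            "tools and data access are documented": True,
        },
        "accountability": {
            "has a defined human owner": True,
            "behavior is monitored, not just infrastructure": False,
        },
        "performance": {
            "business outcome has been measured": False,
        },
    }
    print(first_gap(answers))
```

Here the review stops at accountability even though performance also has an open question, because a measurement with no monitoring behind it would be hard to trust.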

Next steps

The good news is that this doesn’t have to start as a massive initiative. In many cases, the best place to begin is with one agent. Walk through these three levels with a single system and see what you find. That first review often reveals how much remains unknown, which usually creates the momentum to take a broader look across the environment.

Wrapping up: AI readiness requires visibility, accountability, and performance

Getting value from agentic AI involves three connected challenges: visibility, accountability, and performance. Each one builds on the one before it. You cannot hold agents accountable if you do not know what is running. You cannot measure performance if nobody owns the outcome. And you cannot make a strong ROI case for AI if those first two pieces are still missing.

To help, I created a Hybrid Team Readiness Checklist that brings everything together into a single assessment. The checklist covers agent inventory, ownership, monitoring, security around MCP and A2A, and business outcome measurement. It helps leaders move from these questions to something concrete and operational.

If you want to go deeper on this, I cover the readiness checklist and the full model in my Pluralsight webinar, Humans and AI Agents, Better Together: A Leadership Model for Hybrid Teams.

This session goes further into how to design hybrid workflows, build useful feedback loops, and create governance that holds up as organizations move from single-agent systems to more connected ones.



Kesha Williams is an Atlanta-based AWS Machine Learning Hero and Senior Director of Enterprise Architecture & Engineering. She guides the strategic vision and design of technology solutions across the enterprise while leading engineering teams in building cloud-native solutions with a focus on Artificial Intelligence (AI). Kesha holds multiple AWS certifications and has received leadership training from Harvard Business School. Learn more at https://www.keshawilliams.com/.
