PART 1
The AI‑Agent Landscape
Quote
When I hear ‘AI Agent’ I think opportunity. It’s the future of business … not a fleeting fad.
Nina Butler
Chief of Staff (ex‑Head of Marketing),
Regie.ai

Gen‑AI burst onto GTM, Marketing and CX roadmaps in 2023. Eighteen months later the signal‑to‑noise ratio is still low: everything is "AI‑powered", yet few teams can point to an AI agent that delivers real outcomes. Amid this noise, leaders are being asked to bet big, to integrate agents into prospecting, support, forecasting, and more.

But what is an AI agent, really? Where can it work now? And where does human expertise still matter most?

This section helps you:

  • Understand the spectrum of agentic capabilities: from simple prompt tools to true autonomous workflows.
  • Cut through the BS and spot when a vendor is overselling a tool’s capabilities.
  • Clarify the role of human supervision, and why it’s non-negotiable.
What is an AI Agent (Really?)

It’s tempting to label anything powered by Large Language Models (LLMs) as an “AI agent.” But the truth is that not every tool that uses AI is an agent, and not all agents are created equal.

Why does this distinction matter?

Because the word “agent” implies a level of initiative and autonomy.
It suggests the software can act — not just respond. But nobody seems to agree on what an agent actually is.

The agent definition dilemma

Silicon Valley is all-in on agents or, at least, the idea of them. Salesforce wants to lead the digital labor market. Microsoft promises agents will replace knowledge work. OpenAI’s Sam Altman says they’ll join the workforce. But peel back the hype, and you’ll find a mess of inconsistent definitions.

Some say agents are systems that “independently accomplish tasks.” Others define them as LLMs equipped with tools. And some use the term loosely for anything with a bit of automation.

We’ve landed on a buzzword with megawatts of branding power, but little shared meaning, and growing confusion for customers.

All of this chaos is because agents — like AI itself — are a moving target. They straddle multiple disciplines: software automation, decision science, human-computer interaction. The technology is evolving. And the label is often shaped more by positioning than by technical capability.

As Jim Rowan, Head of AI at Deloitte, explained, the ambiguity is both a feature and a bug. It lets companies tailor agents to their own needs, but it also leads to misaligned expectations and fuzzy ROI.

Making sense of the chaos: A spectrum of agency

David Yockelson at Gartner suggests cutting through this chaos by thinking in terms of a progression of AI capabilities:

Quote
You can write prompts. You can have assistants that take a task and do it. And then you have agents that do a bunch of work on your behalf.
David Yockelson
VP Analyst,
Gartner

As such, rather than chasing a fixed definition, we’ve found it more useful to think of AI capabilities on a spectrum of agency or independent decision-making. This lets us decode what a tool really does, and what level of agency it offers, instead of getting lost in labels.

  • First, you have Prompt Tools that are passive: they generate outputs only when asked. They don’t initiate anything or adapt their behavior based on goals.
  • Then, you have Task Assistants or Copilots that are semi-interactive: they help with small tasks, like drafting replies or summarizing documents, but they never move on their own.
  • AI Agents start to show initiative: they can monitor for triggers (like an incoming ticket), follow a playbook, and take the next step without being told.
  • Agentic AI goes furthest: it is a system architecture. It is often built on multiple agents working in coordination, combining awareness, goals, and memory to execute end-to-end workflows.
Prompt Tool
  • What it is: A tool
  • What it typically does: Responds to human or system-generated prompts with text or code
  • Real-world example: Ask ChatGPT for five CTA ideas.
  • How much agency? None (0%): it answers only the question you typed.

Copilot
  • What it is: A helper inside another system
  • What it typically does: Offers task-level support inside workflows
  • Real-world example: Gmail’s Smart Compose feature analyses the last email thread and drafts a reply.
  • How much agency? Low: it waits for you to hit Send.

AI Agent
  • What it is: A single software agent
  • What it typically does: Takes action based on rules, context, or triggers
  • Real-world example: GitHub’s Copilot suggests code to developers and helps with debugging.
  • How much agency? Medium: it follows defined workflows within guardrails, but does not invent new ones.

Agentic AI
  • What it is: A coordinated system of agents plus a planner/controller
  • What it typically does: Orchestrates multiple agents; perceives → reasons → acts → learns in pursuit of a goal
  • Real-world example: Tesla’s Full Self-Driving system perceives the environment, makes decisions, and learns from every trip.
  • How much agency? High: it operates against a goal and chooses its own sequence of steps.

Key distinction: Prompt Tool ≠ Copilot ≠ AI Agent ≠ Agentic AI
So, what’s our working definition?

In this guide, when we say “AI agent,” we mean:

A semi-autonomous software program that can interpret context, make decisions, and take action within a defined workflow and guardrails, often with minimal prompting.

It doesn’t just respond to prompts; it acts toward a task-level outcome. You give it a goal and a lane, and it moves forward on its own. You still need to define its scope, train it on the task, and monitor results, but it’s no longer waiting passively for a prompt.
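This working definition can be made concrete with a tiny, illustrative sketch. Everything here (the `PLAYBOOK` mapping, `agent_step`, the ticket fields) is hypothetical, not taken from any product; the point is only the structural difference between a prompt tool, which answers when asked, and an agent, which reacts to a trigger and picks its next step from a defined playbook.

```python
# Illustrative sketch only: all names are hypothetical, not drawn from
# any specific product or framework.

def prompt_tool(question: str) -> str:
    """A prompt tool: produces output only when explicitly asked."""
    return f"Answer to: {question}"  # stands in for a model call

# A simple playbook: map a trigger type to a bounded next action.
PLAYBOOK = {
    "billing": "route_to_billing_queue",
    "bug": "open_engineering_ticket",
    "unknown": "escalate_to_human",
}

def agent_step(ticket: dict) -> str:
    """An agent: reacts to a trigger (an incoming ticket), interprets
    context, and picks the next action from a defined playbook."""
    category = ticket.get("category", "unknown")
    return PLAYBOOK.get(category, PLAYBOOK["unknown"])

# The agent acts on the trigger without being prompted for each step:
print(agent_step({"id": 42, "category": "billing"}))  # route_to_billing_queue
print(agent_step({"id": 43, "category": "refund?"}))  # escalate_to_human
```

A real agent would replace the dictionary lookup with an LLM call plus context, but the shape stays the same: a trigger, an interpretation step, and a bounded set of next actions.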

Today, most commercial SaaS offerings stop at the Copilot layer. For all the marketing fanfare, true agency is still rare. Agentic AI is experimental and difficult to implement, but it's also where real breakthroughs in autonomy will likely emerge. In the meantime, understanding where a tool sits on this spectrum is more than taxonomy; it’s crucial for choosing, implementing, and trusting AI in your workflows.

The Mirage of the End-to-End Robot Employee (and What to Ask Instead)

Vendors love to sell the fantasy of a digital teammate who can “do it all”.

But these promises conflate agentic ambition with current capability.

Quote
There’s a huge disconnect between the expectation of the buyer and the capability of this really nascent tech right now.
Nina Butler
Chief of Staff (ex‑Head of Marketing),
Regie.ai

Yockelson notes that most tools on the market still sit on the continuum before true autonomy: they’re prompts, assistants, or unattended scripts that execute narrowly but don’t decide independently.

Quote
Most things that claim to be agents today are assistants at best — there’s still a lot of agent-washing out there.
David Yockelson
VP Analyst,
Gartner

Much as “greenwashing” once devalued sustainability claims, you’ll find vendors now slapping "agent" on features that are really just scripts, macros, or a clever autocomplete wrapped in a dashboard. And when teams don’t know what to look for, they walk straight into failed pilots, frustrated users, and unmet ROI.

To help you avoid that, let’s break down the three most persistent myths about AI agents, and what questions you should ask vendors instead.

Busting the AI agent myths
Myth #1
Agents can replace entire teams.
This is the myth that really sells itself:
Why scale headcount when you can scale software?
It paints a picture of a fully autonomous digital employee that replaces an entire function with zero burnout, no attrition, and infinite scale.
Reality-check

What AI agents actually do well is task-level execution, not full-role substitution. They can handle structured, repeatable, rules-based tasks.

Agents excel at what Nina Butler calls “left-brain tasks”.

Quote
Think about the tasks that you are looking to deploy an agent against and make sure that the agent is being put against what I call left-brain oriented tasks. These are things that are highly analytical, repetitive, monotonous, require high degrees of precision and accuracy. Those are the right tasks to put an agent on today. But tasks that are more right brain oriented - spontaneity, empathy, finesse - do not put an agent on those.
Nina Butler
Chief of Staff (ex‑Head of Marketing),
Regie.ai

This is the nuance that gets lost in marketing speak. Agents are great at list building, data processing, and predictable workflows. They can’t supply judgment, context-switching, or emotional nuance: the things that make roles roles.

That’s why Greg Baumann is skeptical of the “replacement” narrative. His team at Outreach uses AI to enhance performance, not to remove the human:

Quote
Agents will increase your capacity, not do your job for you. If you could manage five reps, maybe now you can manage eight or twelve. If you worked eight accounts, maybe now you can do fifteen. But the job doesn't go away.
Greg Baumann
Sr Director of Sales,
Outreach

In practice, this means you’re getting a task-level sidekick. The agent accelerates pieces of the job, but it doesn’t own the whole thing.

And even when the tech evolves further, adoption and trust will still lag behind capability, especially in human-facing roles.

Quote
It’s not that agents can’t do these things; but I’m not sure people are going to want them to. We still want to interact with a person in many cases. We might get to a fully agentic future, but even if the tech gets there, adoption and trust are a whole other story.
Ori Entis
SVP Product CS & AI,
Gainsight

The productivity gains are real, but so is the boundary. You’re not hiring an AI teammate to replace a human; you’re bringing one in to amplify human output in very specific, defined contexts.

Vendor Questions: What to Ask Instead
If a vendor implies team-level replacement, challenge the claim:

  • Workflow ownership: Which tasks does the agent handle end-to-end? Where must a human step in?
  • Decision limits: At what point does the agent hit a “judgment wall”?
  • Production readiness: What data, training, or prompt engineering is required?
  • Accuracy and failure modes: What failure rate can you expect, and how are errors surfaced?
  • Accountability: If the agent makes a mistake, who owns the outcome: your team or the vendor?
Myth #2
One agent can do everything.
This myth is not always framed as hype; sometimes it sounds perfectly reasonable. You’re promised multi-talented agents that can handle prospecting, scheduling, analytics, campaign execution, even customer support.

It’s the AI-agent version of a Swiss Army knife: a single tool that can span departments, tools, and tasks.
Reality-check

Trying to make one agent do everything usually leads to one of two outcomes: shallow results or brittle systems. Agents work best when they’re purpose-built for a job — and often break down when stretched across too many workflows.

Greg Baumann frames this as a hiring question:

Quote
You wouldn't hire Greg as a worker; you would hire me to do a specific job. So think about that: what is the worker's job? We don’t trust AI agents in part because we haven’t clearly defined what they’re supposed to do.
Greg Baumann
Sr Director of Sales,
Outreach

Just like you don’t expect a single person to be your data analyst, sales rep, and campaign manager, you shouldn’t expect a single agent to do it all.

This is where the horizontal, vertical, and bespoke agent classification comes in handy:

Horizontal
  • Description: General-purpose agents that span functions
  • Built for: Broad, general-purpose tasks across many domains
  • Tradeoffs: Shallow domain understanding, brittle logic

Vertical
  • Description: Domain-specific agents trained for particular use cases
  • Built for: Domain-specific tasks in sales, support, etc.
  • Tradeoffs: Don’t scale to unrelated workflows

Bespoke
  • Description: Custom-built agents for a company’s unique data/workflows
  • Built for: Workflows based on your team’s exact needs
  • Tradeoffs: Require upfront work, expensive to build, slow to scale

Once you’ve scoped the kind of agent you need, the next decision is whether to build it yourself or buy from a vendor. Nina Butler offers a useful rule of thumb here:

Quote
Depending on how nuanced a problem you're trying to solve, you may be better off going with a vendor who’s already had a leg up solving it — versus you stumbling around in the dark trying to build it yourself.
Nina Butler
Chief of Staff (ex‑Head of Marketing),
Regie.ai

Put simply: don’t ask if one agent can do it all. Instead, ask: who’s already solved this well, and how much do we need to tailor it for our workflow?

Vendor Questions: What to Ask Instead
If a vendor claims a “do-everything” agent, press for clarity:

  • Agent classification: Which type is this (horizontal, vertical, or bespoke), and what domain training or custom data underpins that choice?
  • Proven workflows: What specific tasks or departments has this agent been deployed in, and can you share real performance metrics or case studies?
  • Failure modes & brittleness: How does logic break or degrade when stretched into adjacent workflows? What error rates should you expect?
  • Integration & extension: Which systems are supported out of the box, and what connectors or prompt engineering will your team need to build?
  • Maintenance & roadmap: How are updates and breaking changes handled? What support, SLAs, or consulting come with ongoing customization?
Myth #3
Agents are plug-and-play.
This is one of the most pervasive assumptions: that agents are “smart” out of the box, and they’ll just figure things out.

No onboarding, no configuration required.
Especially in vendor demos, where tasks appear to flow seamlessly and outputs look production-ready, it’s easy to assume you can buy an AI agent, drop it into your tech stack, and it’ll immediately start delivering results.
Reality-check

AI agents require structure, context, and supervision.

Murali Kandasamy flags this myth as one of the biggest traps teams fall into:

Quote
Everybody will come and say, yes, it’s a plug-and-play. That for me is a big thing. There is a huge amount of foundational work that goes into this. And we are completely misunderstanding that (…); people are not really sure how far the guardrails need to be extended or restricted.
Murali Kandasamy
VP of Strategy,
PathFactory

Before you deploy any agent, you need structured data, clear workflows, and defined boundaries for what the agent can and can’t do. Otherwise, you’re flying blind.

Derrick Arakaki underscores this with a reminder that AI success begins before you write a single prompt:

Quote
You really gotta understand the activities of your team. Taking in an AI agent won’t solve anything unless you know what’s going to move the needle there.
Derrick Arakaki

And even when the basic structure exists, the agent’s performance is still only as strong as the data and logic behind it:

Quote
It’s not a panacea. My deployment of agents doesn’t necessarily mean my business works better. You need to look at how your business processes operate (…) knowing that garbage processes in still mean garbage outcomes. (AI agents are) only going to be as good as what you tell them to do.
David Yockelson
VP Analyst,
Gartner

Ori Entis agrees, pointing out how fragile things are without the right supervision and safety nets:

Quote
A lot of the architecture right now with agents is trying to take into consideration how reliable they are. You need to wrap the agent with deterministic code, or limit the tools they can use, or restrict behaviors that could lead to a mistake.
Ori Entis
SVP Product CS & AI,
Gainsight

This is especially true when agents are deployed in customer-facing contexts like outreach, support, or sales calls. The cost of failure isn’t just operational; it’s reputational.

Quote
You put an agent on that cold call for example, you could completely burn your brand’s reputation if it starts to say nonsensical things to the prospect on the receiving end of the phone. Think about your risk-reward in the context of the job that needs to be done. Where are you willing to have the imperfections?
Nina Butler
Chief of Staff (ex‑Head of Marketing),
Regie.ai

That’s why the real work of deploying agents isn’t just in buying or building them. It’s in training, tuning, validating, and managing them over time.
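The idea of wrapping an agent with deterministic code can be sketched in a few lines. The names below (`ALLOWED_ACTIONS`, `CONFIDENCE_FLOOR`, `guarded_execute`) are hypothetical and shown only to illustrate the pattern: deterministic checks around a probabilistic output, with a confidence cut-off and a human-review path.

```python
# Hypothetical guardrail wrapper around an agent's proposed action.
ALLOWED_ACTIONS = {"send_followup_email", "update_crm_field", "create_task"}
CONFIDENCE_FLOOR = 0.85  # below this, a human reviews the decision

def guarded_execute(proposal: dict, human_queue: list) -> str:
    """Deterministic checks around a probabilistic agent output."""
    action = proposal.get("action")
    confidence = proposal.get("confidence", 0.0)

    # Guardrail 1: the agent may only use an explicitly allowed tool.
    if action not in ALLOWED_ACTIONS:
        human_queue.append(proposal)
        return "blocked: action not in allowed set"

    # Guardrail 2: low-confidence decisions are escalated, not executed.
    if confidence < CONFIDENCE_FLOOR:
        human_queue.append(proposal)
        return "escalated: confidence below threshold"

    return f"executed: {action}"

queue: list = []
print(guarded_execute({"action": "delete_account", "confidence": 0.99}, queue))
print(guarded_execute({"action": "send_followup_email", "confidence": 0.6}, queue))
print(guarded_execute({"action": "send_followup_email", "confidence": 0.95}, queue))
```

The wrapper itself contains no AI: that is the point. The riskier the context (a cold call, a customer email), the tighter the allowed-action set and the higher the confidence floor should be.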

Vendor Questions: What to Ask Instead
Before believing the plug-and-play story, ask vendors:

  • Required data inputs & configurations: What data sources, schema mappings, or environment setups must be in place before this agent delivers value?
  • Guardrails & confidence thresholds: Which safety checks, probability cut-offs, or rule-based constraints does the agent enforce, and can you adjust them?
  • Incomplete or unstructured data: How does the agent ingest and interpret missing fields, free-form text, or noisy datasets?
  • Feedback loops & auditability: What pathways exist to feed corrections back into the model? Can you inspect its decision logic or audit its outputs over time?
  • Human escalation: Is there a built-in workflow to flag uncertain or high-risk cases for human review, and how seamless is that handoff?

Next up, we’ll zoom into the four workflow buckets where today’s AI agents already prove their worth, and map the ‘skill stack’ you should look for when you’re ready to “hire” your first AI agent.

View Part 2