Give Your Agents Jobs, Not Processes

Every experienced software builder carries the same instinct: start small. Ship the thinnest possible slice, prove value, then broaden from there. Lean startup, agile, MVP, iterative delivery. It’s so deeply embedded in product thinking that it feels like a law of nature.
When building AI agents, that instinct can backfire if “start small” means “hardcode today’s process.” If you focus on replicating how the work gets done now, step by step, you constrain a system that can reason about goals into a brittle script. The result is an agent that breaks on edge cases and never shows what it could have done with more freedom.
When you hire a senior person into a role, you don’t hand them a script of every action to take in every situation. You describe the role, give them access to the systems they need, set clear boundaries on their authority, and trust them to figure out the best approach. The more capable the hire, the more counterproductive it is to micromanage their every step. You wouldn’t tell a senior engineer which IDE to use or what order to write their functions in. You’d tell them what the system needs to do, point them at the codebase and the deployment pipeline, and make sure they understand what requires a code review.
Many teams building AI agents are doing the equivalent of handing that senior hire a keystroke-by-keystroke script, and getting exactly the results you’d expect.
How we got here
For most of the history of software, building a tool to support a workflow meant decomposing that workflow into discrete operations and implementing each one. Software didn’t reason; it executed. If you wanted the system to handle a new situation, you wrote new code. Starting narrow meant picking one path through the workflow, building it end to end, getting feedback, and expanding. This is good product discipline, and it still applies to many things. But it carries an implicit assumption: every capability has to be constructed.
LLMs break that assumption. For the first time, the technology can reason about a goal, choose among available tools, and adapt its approach based on what it encounters. You’re no longer assembling every behavior from scratch; you’re shaping a system that arrives with broad capability already present. And yet many teams still approach AI agents the way they’d approach any workflow tool: decompose the process, implement each step, start narrow, iterate.
Anthropic draws a useful distinction between “workflows” (LLMs orchestrated through predefined code paths) and “agents” (systems where LLMs dynamically direct their own processes and tool usage). When you encode an existing process into a workflow, you’re encoding the limitations that process was designed around. The process was built for software that couldn’t reason or adapt. Those constraints shaped every part of it: the handoffs, the approval gates, the workarounds for systems that don’t talk to each other. Carrying all of that forward into an agent is like hiring that senior engineer and then requiring them to write code the way your 2005 intern did, because that’s what’s documented.
And when you constrain an agent to execute the existing process, the best you’ll ever get is an incremental improvement on the current way of working. The agent does the steps faster, but it never shows you a better way to accomplish the goal, because you never gave it the latitude to find one.
Define the job, not the steps
The Jobs to Be Done framework maps onto AI agent design with unusual precision. JTBD says: don’t ask what features the product needs, ask what job the user is hiring it to do. The job is stable even when the process changes. “Move qualified deals through the pipeline to close” is a job. “Query the CRM, filter by stage, check last activity date, draft a follow-up” is a process. The job persists; the process is an artifact of the tools and constraints that existed when someone first figured out how to get it done.
When you give an agent a job instead of a process, two things happen. First, the agent can reason about how to accomplish the goal, which means it can handle situations the original process never anticipated. A scripted agent fails silently on edge cases; a job-oriented agent can adapt. Second, you’re forced to articulate what success actually means. A process is self-describing: did the agent follow the steps? A job requires you to define the outcome. Is the deal actually qualified, or just at the right stage? Has the prospect been contacted recently by another rep? Does the proposed next step match where the buyer actually is in their decision? These are harder questions than “did step 3 execute,” but they’re the questions that determine whether the work was actually done well.
JTBD turns out to be a natural way to specify agents. Jobs are stable. Processes are contingent. A good agent spec should look more like an intention with success criteria and constraints than a flowchart.
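To make the contrast concrete, here is a minimal sketch of the two kinds of specification side by side. The field names and the CRM example are illustrative, not any real framework:

```python
# A process spec enumerates steps; a job spec states the intention,
# success criteria, and constraints, and leaves the "how" to the agent.
# (All names here are hypothetical.)

process_spec = [
    "query the CRM for open deals",
    "filter by stage == 'qualified'",
    "check last activity date",
    "draft a follow-up email",
]

job_spec = {
    "job": "Move qualified deals through the pipeline to close",
    "success_criteria": [
        "the deal is genuinely qualified, not just at the right stage",
        "the proposed next step matches where the buyer is in their decision",
    ],
    "constraints": [
        "no prospect contacted more than once per quarter",
        "escalate to a human before changing pricing",
    ],
}
```

Notice that the process spec can only be followed or violated, while the job spec can be evaluated: each success criterion is a question you can ask about the outcome.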
The risk question
The biggest practical obstacle to the job-oriented approach is trust. If you haven’t scripted each step, how do you know the agent isn’t doing something you didn’t intend?
This concern is real, but it’s aimed at the wrong thing. Scripted processes feel safe because they’re legible: you can read the instructions and predict what will happen. But legibility is not the same thing as safety. A scripted agent that encounters a state the process didn’t anticipate will either halt with an error message that can’t describe the actual problem, or keep executing steps that no longer make sense in context. The failure is silent precisely because the agent is doing exactly what you told it to.
The resolution requires separating two things that feel like the same thing but aren’t: scope and authority. Scope is what the agent can think about and attempt. Authority is what the agent can actually do in external systems.
Restricting scope makes the agent less capable. Restricting authority makes the agent safer.
Conflating them means you’re making the agent dumber when you’re trying to make it more trustworthy. There is a legitimate tradeoff in debuggability: a scripted process gives you step-by-step traceability that a reasoning agent doesn’t. But well-designed tools produce structured call traces that function as checkpoints, and there’s more to say about evaluation for agent systems in a follow-up.
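The separation can be sketched in a few lines. In this hypothetical example, scope is the full catalog of tools the agent may reason about and attempt, while authority is a deterministic gate that every call passes through and that the agent cannot bypass (all names are illustrative):

```python
# Scope: what the agent can think about and attempt.
TOOL_CATALOG = {"read_crm", "draft_email", "send_email", "issue_refund"}

# Authority: what the agent can actually do without a human.
ALLOWED_ACTIONS = {"read_crm", "draft_email"}

def execute(action: str, payload: dict) -> dict:
    """Every tool call funnels through this gate; rejections carry a
    reason the agent can plan around instead of failing silently."""
    if action not in TOOL_CATALOG:
        return {"ok": False, "reason": f"unknown action: {action}"}
    if action not in ALLOWED_ACTIONS:
        return {"ok": False, "reason": f"'{action}' requires human approval"}
    return {"ok": True, "result": f"executed {action}"}
```

Narrowing `ALLOWED_ACTIONS` makes the system safer without shrinking what the agent can reason about; narrowing `TOOL_CATALOG` makes it dumber.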
Bound the authority, broaden the scope
The practical pattern that’s emerging in production is to let the agent reason freely about how to accomplish a job while forcing every action through a governance layer that enforces organizational constraints deterministically. The agent’s intelligence creates value in the planning and reasoning. The boundaries create safety at the point of execution.
In practice, this breaks down into two layers.
Tools encode requirements. The tools you give the agent are your authority boundary, and they should enforce security, process, and policy constraints directly. If your organization requires that prospects aren’t contacted more than once per quarter, that constraint lives in the tool, not in the prompt. The tool either accepts the request or rejects it with a reason. This is deterministic enforcement: your compliance story doesn’t depend on whether the LLM interpreted a natural-language instruction correctly on this particular inference.

This is also where concerns about hallucination find their architectural answer. An agent might hallucinate a bad plan, but it can’t hallucinate its way past a tool that requires valid parameters and checks policy before executing. The reasoning layer can be imperfect without producing dangerous actions, because the execution layer doesn’t trust it.

Well-designed tools also enable collaboration by managing handoffs between people and agents, protecting shared resources like contact lists or approval queues, and producing structured outputs that downstream systems and teammates can act on.
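As a sketch of what deterministic enforcement looks like, here is the once-per-quarter rule living inside a hypothetical tool. The contact log, the function name, and the 90-day quarter are all assumptions for illustration:

```python
from datetime import date, timedelta

# Hypothetical tool: the contact-frequency policy is checked in code,
# so enforcement never depends on the LLM reading the prompt correctly.
CONTACT_LOG = {"acme-corp": date.today() - timedelta(days=20)}
QUARTER = timedelta(days=90)

def contact_prospect(prospect_id: str, message: str) -> dict:
    """Accept the request or reject it with a reason the agent can use."""
    last = CONTACT_LOG.get(prospect_id)
    if last is not None and date.today() - last < QUARTER:
        days = (date.today() - last).days
        return {"ok": False,
                "reason": f"{prospect_id} was contacted {days} days ago; "
                          "policy allows one contact per quarter"}
    CONTACT_LOG[prospect_id] = date.today()
    return {"ok": True, "sent": message}
```

The agent is free to decide whether and what to send; the tool decides whether sending is permitted. A rejection is structured data the agent can reason about, not a silent failure.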
Skills encode goals and quality. Skills are reusable context modules: domain-specific operating instructions that extend the system prompt for a given scenario, providing the agent with specific goals, expertise, and quality standards. A skill for deal management would include the goal (“move this deal to a decision by identifying and addressing the buyer’s remaining concerns”), the domain knowledge (what signals indicate a deal is stalling, how outreach should differ between a technical evaluator and an executive sponsor, when to escalate to a human), and the quality bar (what “good enough” looks like versus what requires judgment from a person). Skills enable progressive disclosure: as the agent encounters different types of work, the relevant skill activates with context and expectations rather than front-loading everything into a monolithic prompt. They should also explain the reasoning behind the policies enforced by tools, so the agent makes better decisions about when and how to use them.
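A skill might be represented as something like the following sketch, where a context module activates only when the matching kind of work appears. The structure and every field name are hypothetical:

```python
# Hypothetical skill: a reusable context module bundling a goal,
# domain knowledge, and a quality bar for one scenario.
DEAL_MANAGEMENT_SKILL = {
    "activates_on": "deal_management",
    "goal": ("Move this deal to a decision by identifying and addressing "
             "the buyer's remaining concerns"),
    "domain_knowledge": [
        "no activity for 14+ days on a late-stage deal signals stalling",
        "technical evaluators want specifics; executive sponsors want outcomes",
        "escalate to a human when the buyer raises legal or pricing exceptions",
    ],
    "quality_bar": "drafts may go out unreviewed; pricing changes need a person",
}

def build_context(base_prompt: str, work_type: str, skills: list) -> str:
    """Progressive disclosure: append only the skill relevant to this work,
    rather than front-loading everything into one monolithic prompt."""
    for skill in skills:
        if skill["activates_on"] == work_type:
            knowledge = "\n".join("- " + k for k in skill["domain_knowledge"])
            return (f"{base_prompt}\n\nGoal: {skill['goal']}\n"
                    f"{knowledge}\nQuality bar: {skill['quality_bar']}")
    return base_prompt
```

When the work isn’t deal management, the base prompt goes out unchanged; the skill’s context costs nothing until it’s needed.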
This separation matters because you can update policy and expertise independently. When a compliance rule changes, you change the tool. When the team learns that a certain approach produces better results, you update the skill. Neither change requires rearchitecting the other.
Mapping back to the senior hire: tools are the systems and permissions you provision on their first day. Skills are the onboarding docs, the team norms, the goals they’re working toward. You wouldn’t put access controls in an onboarding document, and you wouldn’t put quality standards in an IAM policy. The same logic applies here.
The builder’s job has changed
The instinct to start narrow comes from a world where you were constructing capability from scratch. In that world, narrowing scope was how you managed complexity and risk. With AI agents, broad capability is the starting condition. The builder’s job isn’t to assemble the capability piece by piece. It’s to design the tools that constrain authority, the skills that define goals and encode quality, and the governance layer that makes the whole thing auditable and trustworthy. That’s the product.