The New Bottleneck in Agentic Engineering
I’ve been building a lot of side projects lately, with the help of agents. All the side projects I never had time for are suddenly just a few prompts away from being real.
This weekend I was working on a tool for my wife, who’s a social media content creator, and I decided to optimize my iteration cycle a bit. The result:
3:02 PM. My wife sends a piece of feedback through a button in the app I built for her.
3:04 PM. A Cursor agent has read the feedback, found the relevant code, and opened a PR.
3:06 PM. The preview deploy is live.
3:08 PM. I approve. It ships.
Six minutes from “this is broken” to “this is fixed.” And I didn’t even write any code.
The cost of writing code is going to zero
Last month, Bassim Eledath published The 8 Levels of Agentic Engineering. It’s the cleanest progression I’ve seen for thinking about how teams adopt AI coding. Level 6 is “Harness Engineering” — the part where you stop thinking about the agent and start thinking about everything around it. Almost no one is there yet.
This isn’t controversial anymore. Cursor, Claude Code, Codex, whatever you use — the part of your job that used to be “translate intent into working code” has collapsed from days to minutes for a growing share of tasks.
Most people respond to this by asking how to make the agent better. Better prompts, better context, better tool calls.
I’d like to argue that this is no longer what you should be focusing on.
When the cost of execution drops to near-zero, the value moves to whatever didn’t drop. And the part that didn’t drop is the part before the prompt: figuring out what’s actually worth building, hearing what users actually need, and getting that signal in front of the agent fast enough that it matters.
That’s the new bottleneck.
Where the days actually go
Think about the last user-reported issue your team shipped a fix for. Walk through where the time went.
A user hits a bug. Most of them never tell you. The ones who do report a vague version (“the page is broken”) with no screenshot, no URL, no context about what they were trying to do. The report sits in a queue until a human triages it. The human translates the vague report into something an engineer can act on, usually by pinging the user and waiting hours for a reply. The engineer debugs without access to the original session, the traces, or the logs. They write the fix. They open a PR. It gets reviewed. It ships.
Total wall-clock time: usually a week. Total time spent actually changing code: minutes.
What the new loop looks like
Before this weekend, my iteration loop on this tool looked like this:
I would send her a new version of the tool, and eventually she’d hit a bug. I’d ask her to walk me through it; sometimes I’d go to her desk, sometimes she’d text me an error message. If I wasn’t around, the bug just sat there. Cycle time was hours to days.
So I built a feedback button in the app. She clicks it, types what’s wrong, and a screenshot of the current page is captured automatically. The submission lands in a queue with the URL, the screenshot, recent app state, and a timestamp. A Cursor agent watches the queue. When something lands, it reads the feedback, finds the relevant code, and opens a PR. Vercel auto-builds a preview. I get a notification, I check the preview, and (hopefully) I approve.
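For the curious, the capture half is genuinely small. Here’s a minimal sketch in TypeScript of what the button does, assuming html2canvas for the screenshot and a /api/feedback route in front of the queue; the names are illustrative, not my exact code.

```typescript
// feedback-widget.ts: client-side capture, wired to the feedback button.
// Assumes html2canvas (npm i html2canvas) and a POST /api/feedback route
// in front of whatever queue the agent watches. Names are illustrative.
import html2canvas from "html2canvas";

interface FeedbackPayload {
  message: string;      // what the user typed
  url: string;          // the page they were on
  screenshot: string;   // data-URL PNG of the current page
  appState: unknown;    // recent app state; shape is app-specific
  createdAt: string;    // ISO timestamp
}

export async function submitFeedback(
  message: string,
  appState: unknown
): Promise<void> {
  // Rasterize the current page so the agent sees what the user saw.
  const canvas = await html2canvas(document.body);

  const payload: FeedbackPayload = {
    message,
    url: window.location.href,
    screenshot: canvas.toDataURL("image/png"),
    appState,
    createdAt: new Date().toISOString(),
  };

  // The route just validates and enqueues; the agent consumes from the queue.
  await fetch("/api/feedback", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
}
```

The other half is a small watcher that pops items off the queue and hands the payload (message, URL, screenshot, state) to the agent as a task; the exact dispatch depends on which agent harness you’re using.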
The same bug that used to take a day now takes minutes.
The implication
If you accept the premise — the bottleneck has moved from generating code to getting feedback to the generator — then a few things follow. The one I keep coming back to:
The moat shifts from codebase to iteration speed.
For a long time, the answer to “what makes a software company hard to compete with” was some combination of: the code we wrote, the systems we designed, the people we hired who can write more code. That’s been the moat since software was invented.
Now the cost of code is close to zero. The codebase isn’t the moat. What’s left is how fast you can iterate on what users actually want, which means how short your loop is from signal to ship.
A four-person team with a tight feedback loop ships features users want faster than a forty-person team with a five-day triage queue. The forty-person team has more code. The four-person team has more fit. In a world where code is cheap, fit wins.
This generalizes past startups. If you’re at a big company and your team’s path from “user filed a ticket” to “fix is in production” still goes through three handoffs, a sprint planning meeting, and a JIRA grooming session, your competitors who skipped all of that are going to ship past you.
What I haven’t figured out
Feedback loops are just one part of this new paradigm. Here are some things I’m still thinking about:
Regression: when the agent ships fixes faster than humans can review, how do you catch the ones that quietly break something else? My current answer is “I’m the only user, so I notice.” That doesn’t scale.
Triage: when 50 feedback items come in at once, which does the agent prioritize? Right now I just FIFO. A real product needs something smarter; a rough sketch of what that could look like follows this list.
Trust: getting a user to click a feedback button is its own UX problem. Most people just leave. The button has to be impossibly low-friction or it doesn’t matter how good your loop is downstream.
Spam: in a real product, someone will eventually figure out they can drive automated PRs by submitting fake feedback. I haven’t thought about this at all.
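On the triage point, the upgrade from FIFO probably doesn’t have to be fancy. A hedged sketch in TypeScript, with made-up signals and weights: score each item and pop the highest instead of the oldest.

```typescript
// triage.ts: a rough priority score instead of FIFO. The signals and
// weights here are placeholders; a real product would tune them on data.
interface FeedbackItem {
  message: string;
  createdAt: string;     // ISO timestamp
  affectedUsers: number; // how many reports look like duplicates of this one
  isCrash: boolean;      // did we capture an error alongside the report?
}

function score(item: FeedbackItem): number {
  const ageHours = (Date.now() - Date.parse(item.createdAt)) / 3_600_000;
  return (
    (item.isCrash ? 10 : 0) +     // breakage beats polish
    item.affectedUsers * 2 +      // widely felt bugs first
    Math.min(ageHours, 24) * 0.5  // don't let anything rot forever
  );
}

// The agent pops the highest-scoring item instead of the oldest one.
export function next(queue: FeedbackItem[]): FeedbackItem | undefined {
  return [...queue].sort((a, b) => score(b) - score(a))[0];
}
```

Even something this crude beats FIFO the moment a crash report lands behind ten typo complaints.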
These are real problems. I think they’re solvable. They’re also not the reason most teams haven’t built this — most teams haven’t built this because they’re still optimizing the agent.
Stop thinking about the agent
Bassim Eledath calls the level I’ve been describing Harness Engineering — level 6 of 8 in his progression — and he’s right that almost no one is operating there yet. The conversation about agentic engineering is still almost entirely about the agent. Which model, which framework, which prompt, which tools. That’s the part everyone can see, so that’s the part everyone optimizes.
But the agent is the cheap part now. It’s the easy part. Frontier model APIs are a credit card away. Cursor and Claude Code are off-the-shelf.
What’s hard, and what almost no one is building, is everything around the agent. The feedback intake. The trace pipeline. The preview deploys. The queue. The trust layer between user signal and shipped code.
The teams that figure this out are the ones who get the six-minute loop.