Monday, March 23, 2026

The Mythical Agent-Month – O’Reilly

The following article originally appeared on Wes McKinney’s blog and is being republished here with the author’s permission.

Like a lot of people, I’ve found that AI is terrible for my sleep schedule. In the past I’d wake up briefly at 4:00 or 4:30 in the morning to have a sip of water or use the bathroom; now I have trouble going back to sleep. I could be doing things. Before I would get a solid 7–8 hours a night; now I’m lucky when I get 6. I’ve largely stopped fighting it: Now when I’m rolling around restlessly in bed at 5:07am with ideas to feed my AI coding agents, I just get up and start my day.

Among my inner circle of engineering and data science friends, there’s a lot of discussion about how long our competitive edge as humans will last. Will having good ideas (and lots of them) still matter as the agents begin having better ideas themselves? The human-expert-in-the-loop feels essential now to get good results from the agents, but how long will that last until our wildest ideas can be turned into working, tasteful software while we sleep? Will it be a gentle obsolescence where we happily hand off the reins or something else?

For now, I feel needed. I don’t describe the way I work now as “vibe coding” as this sounds like a pejorative “prompt and chill” way of building AI slop software projects. I’ve been building tools like roborev to bring rigor and continuous supervision to my parallel agent sessions, and to heavily scrutinize the work that my agents are doing. With this radical new way of working it’s hard not to be contemplative about the future of software engineering.

Probably the book I’ve referenced the most in my career is The Mythical Man-Month by Fred Brooks, whose now-famous Brooks’s law argues that “adding manpower to a late software project makes it later.” Lately I find myself asking whether the lessons from this book are applicable in this new era of agentic development. Will a talented developer orchestrating a swarm of AI agents be able to build complex software faster and better, and will the short-term productivity gains lead to long-term project success? Or will we run into the same bottlenecks (scope creep, architectural drift, and coordination overhead) that have plagued software teams for decades?

Revisiting The Mythical Man-Month (TMMM)

One of Brooks’s central arguments is that small teams of elite people outperform large teams of average ones, with one “chief surgeon” supported by specialists. This leads to a high degree of conceptual integrity about the system design, as if “one mind designed it, even if many people built it.”

Agentic engineering appears to amplify these problems, since the quality of the software being built is now only as good as the humans in the loop curating and refining specifications, saying yes or no to features, and taming unnecessary code and architectural complexity. One of the metaphors in TMMM is the “tar pit”: “Everyone can see the beasts struggling in it, and it looks like any one of them could easily free itself, but the tar holds them all together.” Now, we have a new “agentic tar pit” where our parallel Claude Code sessions and git worktrees are engaged in combat with the code bloat and incidental complexity generated by their virtual colleagues. You can systematically refactor, but invariably an agentic codebase will end up larger and more overwrought than anything built by human hand. This is technical debt on an unprecedented scale, accrued at machine speed.

In TMMM, Brooks observed that a working program is maybe 1/9th the way to a programming product, one that has the necessary testing, documentation, and hardening against edge cases and is maintainable by someone other than its author. Agents are now making the “working program” (or “appears-to-work” program, more accurately) a great deal more accessible, though many newly minted AI vibe coders clearly underestimate the work involved with going from prototype to production.

These problems compound when considering the closely related Conway’s law, which asserts that the architecture of software systems tends to resemble the organization’s team or communication structure. What does that look like when applied to a virtual “team” of agents with no persistent memory and no shared understanding of the system they are building?

Another “big idea” from TMMM that has stuck with people is the n(n-1)/2 coordination problem as teams scale. With agentic engineering, there are fewer humans involved, so the coordination problem doesn’t disappear but rather changes shape. Different agent sessions may produce contradictory plans that humans have to reconcile. I’ll leave this agent orchestration question for another post.
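To make the n(n-1)/2 figure concrete, here is a minimal sketch that counts the distinct pairs among n participants. The function name pairwise_channels is hypothetical, and counting parallel agent sessions the same way as human team members is an assumption made only for illustration, not a claim from Brooks or from this post.

```python
def pairwise_channels(n: int) -> int:
    """Return the number of distinct pairs among n participants: n(n-1)/2.

    Hypothetical helper for illustration; "participants" could be people
    or, under the assumption stated above, parallel agent sessions.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # n * (n - 1) is always even, so integer division is exact.
    return n * (n - 1) // 2


if __name__ == "__main__":
    for n in (2, 5, 10, 50):
        print(f"{n} participants -> {pairwise_channels(n)} pairwise channels")
```

Going from 5 to 50 participants raises the pair count from 10 to 1,225, which is why coordination cost grows so much faster than headcount.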

No silver bullet

“There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity.”
—“No Silver Bullet” (1986)

Brooks wrote a follow-up essay to TMMM to look at software design through the lens of essential complexity and accidental complexity. Essential complexity is fundamental to achieving your goal: If you made the system any simpler, it would fall short of its problem statement. Accidental complexity is everything else imposed by our tools and processes: programming languages, tools, and the layer of design and documentation to make the system understandable by engineers.

Coding agents are probably the most powerful tool ever created to tackle accidental complexity. To think: I basically don’t write code anymore, and now write tons of code in a language (Go) I’ve never written by hand. There’s a lot of discussion about whether IDEs are still going to be relevant in a year or two, when maybe all we need is a text editor to review diffs. The productivity gains are enormous, and I say this as someone burning north of 10 billion tokens a month across Claude, Codex, and Gemini.

But Brooks’s “No Silver Bullet” argument predicts exactly the problem I’m experiencing in my agentic engineering: The accidental complexity is no problem at all anymore, but what’s left is the essential complexity, which was always the hard part. Agents can’t reliably tell the difference. LLMs are extraordinary pattern matchers trained on the entirety of humanity’s open source software, so while they are brilliant at dealing with accidental complexity (refactor this code, write these tests, clean up this mess), they struggle with the more subtle essential design problems, which often have no precedent to pattern match against. They also often tend to introduce unnecessary complexity, generating large amounts of defensive boilerplate that’s rarely needed in real-world use.

Put another way, agents are so good at attacking accidental complexity that they generate new accidental complexity that can get in the way of the essential structure that you are trying to build. With a couple of my new projects, roborev and msgvault, I’m already dealing with this problem as I begin to reach the 100 KLOC mark and watch the agents begin to chase their own tails and contextually choke on the bloated codebases they’ve generated. At some point beyond that (the next 100 KLOC, or 200 KLOC) things start to fall apart: Every new change has to hack through the code jungle created by prior agents. Call it a “brownfield barrier.” At Posit we have seen agents struggle much more in 1 million-plus-line codebases such as Positron, a VS Code fork. This seems to support Brooks’s complexity scaling argument.

I would hesitate to place a bet on whether the present is a ceiling or a plateau. The models are clearly getting better fast, and the problems I’m describing here may look charmingly quaint in two years. But Brooks’s essential/accidental distinction gives me some confidence that this isn’t just about the current limitations of the technology. Figuring out what to build was the hard part long before we had LLMs, and I don’t see how a flawless coding agent changes that.

Agentic scope creep

When generating code is free, knowing when to say “no” is your last defense.

With the cost of generating code now converging to zero, there’s practically nothing stopping agents and their human taskmasters from pursuing all avenues that would have previously been cost or time prohibitive. The temptation to spend your day prompting “and now can you just…?” is overwhelming. But any new generated feature or subsystem, while cheap to create, is not costless to maintain, test, debug, and reason about in the future. What seems free now carries a future contextual burden for future agent sessions, and each new bell or whistle becomes a new vector of brittleness or bugs that can harm users.

From this perspective, building great software projects maybe never was about how fast you can type the code. We can “type” 10x, maybe 100x faster with agents than we could before. But we still have to make good design decisions, say no to most product ideas, maintain conceptual integrity, and know when something is “done.” Agents are accelerating the “easy part” while paradoxically making the “hard part” potentially even more difficult.

Agentic scope creep also seems to be actively destroying the open source software world. Now that the bar is lower than ever for contributors to jump in and offer help, projects are drowning in torrents of 3,000-line “helpful” PRs that add new features. As developers become increasingly hands-off and disengaged from the design and planning process, the agents’ runaway scope creep can get out of control quickly. When the person submitting a pull request didn’t write or fully read the code in it, there’s likely no one involved who’s really accountable for the design decisions.

I have seen in my own work on roborev and msgvault that agents will propose overwrought solutions to problems when a simple solution would do just fine. It takes judgment to know when to intervene and how to keep the agent in check.

Design and taste as our last foothold

Brooks’s argument is that design talent and good taste are the most scarce resources, and now with agents doing all of the coding labor, I argue that these skills matter more now than ever. The bottleneck was never hands on keyboards. Now with the new “Mythical Agent-Month,” we can reasonably conclude that design, product scoping, and taste remain the practical constraints on delivering high-quality software. The developers who thrive in this new agentic era won’t be the ones who run the most parallel sessions or burn the most tokens. They’ll be the ones who are able to hold their projects’ conceptual models in their mind, who are shrewd about what to build and what to leave out, and who exercise taste over the enormous volume of output.

The Mythical Man-Month was published in 1975, more than 50 years ago. In that time, a lot has happened: tremendous progress in hardware performance, programming languages, development environments, cloud computing, and now large language models. The tools have changed, but the constraints are still the same.

Maybe I’m trying to justify my own continued relevance, but the reality is more complex than that. Not all software is created equal: CRUD business productivity apps aren’t the same as databases and other critical systems software. I think the median software consulting shop is completely toast. But my thesis is more about development work in the 1% tail of the distribution: problems inaccessible to most engineers. This will continue to require expert humans in the loop, even if they aren’t doing much or any manual coding. As one recent adjacent example, my friend Alex Lupsasca at OpenAI and his world-class physicist collaborators were able to create a formulation of a difficult physics problem and arrive at a solution with AI’s help. Without such experts in the loop, it’s much more doubtful whether LLMs would be able to both pose the questions and come up with the solutions.

For now, I’ll probably still be getting out of bed at 5am to feed and tame my agents for the foreseeable future. The coding is easier now, and actually more fun, and I can spend my time thinking about what to build rather than wrestling with the tools and systems around the engineering process.

Thanks to Martin Blais, Josh Bloom, Phillip Cloud, Jacques Nadeau, and Dan Shapiro for giving feedback on drafts of this post.
