The unbounded development team: promise and perils of AI coding assistants

Once upon a time, there was a software engineer we will call “Bob.” Bob worked at a large technology company that followed a traditional waterfall model of development, with a separation of roles between program managers (“PMs”) who defined the functional requirements and software design engineers (“SDEs”) responsible for writing the code to turn those specs into reality. But program managers did not dream up specifications out of thin air: like every responsible corporate denizen, they were cognizant of so-called “resource constraints,” a euphemism for available developer time.

“Features, quality, schedule; fix any two and the remaining one is automatically determined.”

Schedules are often driven by external factors such as synchronizing with an operating system release or hitting the shelves in time for the Christmas shopping season. Product teams have little influence over these hard deadlines. Meanwhile, no self-respecting PM wants to sacrifice quality. “Let’s ship something buggy with minimal testing that crashes half the time” is not a statement a professional is supposed to put down on paper, although no doubt many have expressed that sentiment in triage meetings when hard decisions must be made as the release deadline approaches. That leaves features as the knob easiest to tweak, and this is where developer estimates come in.

Bob had an interesting quirk. Whenever he was asked to guesstimate the time required to implement some proposed product feature, his answers followed a strictly bimodal distribution with two peaks:

  • Half-day
  • Two weeks

Over time a pattern emerged: features Bob approved of fit in an afternoon, even when they seemed complicated and daunting to other software engineers, who preferred to steer clear of such implementation challenges. Other features that looked straightforward on the surface were recast by Bob as two-week excursions into debugging tar pits.

In Bob’s defense: estimating software schedules is a notoriously difficult problem that every large-scale project has suffered from since long before Fred Brooks made his immortal observations about the mythical man-month. Nor would Bob be the first or last engineer in history whose estimates were unduly influenced by an aesthetic judgment of the proposal. The highly bureaucratic software development shops prevalent in the 20th century relegated engineers to the role of errand boys and girls, tasked with the unglamorous job of “merely” implementing brilliant product visions thrown over the wall from the PM organization. In those dysfunctional environments, playing games with fudged schedule estimates became the primary means of influencing product direction. (It did not help that in these regimented organizations, program management and senior leadership were often drawn from non-technical backgrounds, lacking the credibility to call shenanigans on bogus estimates.)

Underlying this bizarre dynamic is the assumption that engineering time is scarce. There is an abundance of brilliant feature ideas that could delight customers or increase market share, if only their embodiment as running code could see the light of day.

AI coding assistants such as Codex and their integration into agentic development flows have now turned that wisdom on its head. It is easier than ever to go from idea to execution, from functional requirements to code-complete, with code that is actually complete: shipped with a suite of unit tests, properly commented, and separately documented. “Code is king” or “code always wins” used to be the thought-terminating cliché at large software companies, implying that a flawed, half-baked idea implemented in working code beats the most elegant but still theoretical idea on the drawing board. It is safe to say this code-cowboy mentality idolizing implementation over design is now completely bankrupt: turning ideas into working applications has never been easier. Those ideas need not even be expressed in a meticulous specification document with sections dedicated to covering every edge case. Vibe-coding is lowering the barrier to entry across the board, not just for implementation knowledge. When it comes to prompting LLMs, precision in writing still matters; garbage-in, garbage-out still holds. But being able to specify requirements in a structured manner with UML or another formal language is not necessary. If anything, the LLM can reverse-engineer that after the fact from its own implementation, in a hilarious twist on another tenet of the code-cowboy mindset: “the implementation is the spec.”

There is an irony here: LLMs have delivered in the blink of an eye the damage experts once feared outsourcing would wreak on the industry, turning software implementation from the most critical aspect of development, practiced by rarefied talent, into a commodity that could be shipped off to the lowest bidder in Bangalore. (The implications of this change for the “craft” of development are already being lamented.)

The jury is still out on whether flesh-and-blood developers can maintain the torrent of code generated by AI down the road, should old-fashioned manual modifications ever prove necessary. One school of thought expects a looming disaster: clueless engineers blindly shipping code they do not understand to production, knowing full well they are on the hook for troubleshooting when things go sideways. No doubt some are betting they will have long moved on, and that the responsibility will fall on the shoulders of some other unfortunate soul tasked with making sense of the imperfectly functioning yet perfectly incomprehensible code spat out by AI. Another view holds that such concerns are about as archaic as fretting over a developer having to jump in and hand-optimize, or worse, hand-correct assembly language generated by their compiler. In highly esoteric or niche areas of development where LLMs lack sufficient samples to train properly, human judgment may well remain necessary to achieve correctness. But for most engineers, plan B for a misbehaving LLM assistant is asking a different LLM assistant to step in and debug its way out of the mess.

Software designers are now confronted with a variant of the soul-searching question: “If you knew you could not fail, what would you do?” For software projects, failure is and will remain very much an option, but its root causes are bound to be different. LLMs have taken the most romanticized view of failed projects off the table: an ambitious product vision crashing against the hard reality of finite engineering time, or limited developer expertise failing to rise to the occasion. Everyone can now wield the power of a hundred-person team of mercenary engineers with expertise in every imaginable specialty, from low-level systems programming to tweaking webpage layouts. That does not guarantee success, but it does ensure the eventual outcome will play out on a grander scale than was possible before. Good ideas will receive their due and reach their target market, no longer held back by a mismatch between willpower and resources, or by the vagaries of chancing upon the right VC willing to bankroll the operation.

At least, that is the charitable prediction. The downside is that the same logic applies to terrible ideas: they, too, will be executed to perfection. Perhaps those wildly skewed schedule estimates from engineer Bob served a purpose after all: they were a not-so-subtle signal that some proposed feature was a Bad Idea™ that had not been thought through properly. Notorious for sycophancy, AI coding assistants are the exact opposite of that critical mindset. They will not push back. They will not question underlying assumptions or sanity-check the logic behind the product specification. They will simply carry out the instructions as prompted, in what may well become the most pervasive example of “malicious compliance.” In the same way that social media handing everyone a bullhorn did not improve the average quality of discourse on the internet, giving every aspiring product manager the equivalent of 100 developers working around the clock to implement their whims is unlikely to yield the next game-changing application. If anything, making engineering costs “too cheap to meter” may result in companies doubling down on obviously failing ideas for strategic reasons. Imagine if Microsoft did not have to face the harsh reality of market discipline, but could keep iterating on Clippy or Vista indefinitely in hopes that the next iteration would finally take off. In a world where engineering time is scarce, companies are incentivized to cull failures early and redirect precious resources toward more productive avenues. Those constraints disappear when shipping one more variant of the same bankrupt corporate shibboleth (think Google Buzz/Wave/Plus, the Amazon Fire phone, the Windows mobile platform, Apple Vision Pro) is just a matter of instructing the LLM to “think harder” and spinning up a few hundred more hours of iteration on the codebase.

CP