The Fallacies of AI-Assisted Programming
Eight assumptions we are silently making about the new abstraction layer, and the playbook that priced them once before
In April 2026, an autonomous coding agent hit a credential mismatch, decided that deleting the database would clear it, and wiped production in nine seconds. The backups went with it. They lived in the same volume the agent deleted. The recovery it chose was the disaster.
The failure was new. The shape of it was not.
In 1994, a researcher at Sun named Peter Deutsch wrote down the assumptions that distributed-systems engineers were silently making. His draft ran to seven, and James Gosling added the eighth a few years later. The network is reliable. Latency is zero. Bandwidth is infinite. The network is secure. Topology doesn’t change. There is one administrator. Transport cost is zero. The network is homogeneous. The list was short. The list was specific. And the list was uncomfortable, because most engineers reading it were relying on several of those assumptions at any given time.
Three decades later, those eight sentences are still load-bearing in every distributed-systems postmortem worth reading. APNIC ran a 21-years-later retrospective in December 2025 confirming what every on-call engineer already knew. The fallacies did not get solved. They got priced.
We are doing it again, at a higher abstraction layer.
The trade distributed systems made, twenty years ago
Distributed systems made a trade twenty years ago. More defects, faster recovery, net win. Once the network became the substrate, you stopped trying to prevent every failure and started engineering for mean time to recovery instead of mean time between failures. AI-assisted programming is making the same trade, a layer up, by handing authorship to something non-deterministic. Humans were never deterministic either, but the new author is jagged to a higher degree: it fills gaps without flagging them, lacks the common sense to seek out what it hasn’t been shown, and (where it controls the workflow) substitutes reasoning for the static orchestration you used to write yourself. The fallacies that fall out of that (non-determinism at the output, verification at the review, drift at the model, blast radius at the agent) are the same shape as Deutsch’s network fallacies, and they yield to the same playbook. Name them. Price them. Engineer the recovery. That is the move that earned the SRE discipline twenty years of leverage. It is the move that earns AI-assisted programming the same kind of leverage, on a faster clock.
These eight are not a one-to-one re-map of Deutsch’s list. They are the same shape: assumptions that were safe in the old model and quietly false in the new one. I have sequenced them in the order most engineering teams hit them in practice, not in Deutsch’s canonical order. The thread that ties them together is the Metaphorex framing that holds the whole Deutsch literature up: “each fallacy names a specific locality assumption that engineers unconsciously import from single-machine programming.” That is the move worth borrowing. The assumptions we are importing all come from a world where the author was a single, minimally jagged, accountable human whose knowledge had edges it knew about. The new author breaks every part of that.
The model output can be made deterministic
Setting temperature=0 does not make a large language model deterministic. The belief that it does is among the most common misconceptions in production LLM engineering. Tianpan documents the gap at scale on a Qwen3-235B setup: same prompt, same model, temperature=0, one thousand runs, 80 distinct completions. The mechanisms sit below the application layer, in how the inference hardware schedules and batches the math. No model change, no prompt change, just parallel hardware doing what it does at scale.
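One commonly cited mechanism behind this (an assumption here, since the text above only points at scheduling and batching) is that floating-point addition is not associative: the same numbers summed in a different grouping can round differently. The sketch below is a toy illustration in plain Python, not a reproduction of any inference stack, and the function names are hypothetical.

```python
def sum_left_to_right(values):
    """Accumulate in order, as a single sequential loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_pairwise(values):
    """Sum by halves, roughly how a parallel reduction might group terms."""
    if not values:
        return 0.0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return sum_pairwise(values[:mid]) + sum_pairwise(values[mid:])


values = [0.1, 0.2, 0.3]
sequential = sum_left_to_right(values)  # groups as (0.1 + 0.2) + 0.3
grouped = sum_pairwise(values)          # groups as 0.1 + (0.2 + 0.3)

# Same inputs, different grouping, different last bit.
print(sequential == grouped)
print(sequential, grouped)
```

The gap is in the last bit, but when two candidate tokens are nearly tied, a last-bit difference in a score can be enough to change which one wins, and every later token then builds on a different prefix.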
The labs are working on it. Microsoft Research’s LLM-42 is a serious attempt at scheduling-based determinism, so this is not unsolvable in principle. It is unsolved in practice. In every production system running today, “same input, same output” is silently violated, and the plans built on top of that assumption are spending a budget no one costed.
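Pricing this fallacy starts with measuring it. A minimal sketch of such a measurement follows, assuming a hypothetical `generate(prompt)` callable that wraps whatever model client a team uses; the `fake_generate` stand-in only simulates variation so the example runs without a model.

```python
import hashlib
import random
from collections import Counter
from typing import Callable


def count_distinct_completions(
    generate: Callable[[str], str], prompt: str, runs: int
) -> Counter:
    """Call generate(prompt) `runs` times and tally completions by content hash."""
    tally = Counter()
    for _ in range(runs):
        completion = generate(prompt)
        digest = hashlib.sha256(completion.encode("utf-8")).hexdigest()
        tally[digest] += 1
    return tally


# Hypothetical stand-in for a real model call: mostly one answer, sometimes another.
_rng = random.Random(0)


def fake_generate(prompt: str) -> str:
    return prompt + " -> " + _rng.choice(["answer A", "answer A", "answer A", "answer B"])


tally = count_distinct_completions(fake_generate, "same prompt", runs=1000)
distinct = len(tally)
print(f"{distinct} distinct completions across {sum(tally.values())} runs")
```

Hashing the full completion counts any byte-level difference as distinct. A team that only cares about semantic drift would need a looser comparison, which is a design choice this sketch does not make.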
The cost of code has gone to zero
When agents were building brochure sites, the cost felt like zero. Now that real work is in scope, two things have shifted. Agents burn more tokens just on the happy path through production complexity. They also chase far more dead ends before they get there.
The free-lunch framing collapsed in early 2026. GitHub Copilot moved to AI Credit billing. OpenAI shifted Codex to token pricing. Cursor and Windsurf raised their Pro tiers. The vendor framing of “unlimited AI coding” got quietly retired, and LeadDev called it.
The numbers that land are the runaway ones. A single Cursor user burned $4,200 in API fees over a weekend during an autonomous refactor. Two LangChain agents stuck in an infinite conversation for eleven days ran up a $47,000 bill. These read as edge cases until you notice they sit on the tail of a distribution every team is now exposed to.
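The usual engineering response to a tail like this is a hard ceiling on the loop rather than a dashboard someone checks on Monday. What follows is a minimal sketch of that idea, not any vendor's feature: `SpendGuard`, `BudgetExceeded`, and `run_agent` are hypothetical names, and the per-step costs are simulated rather than read from a real billing API.

```python
import itertools
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when an agent run crosses its spend or step ceiling."""


@dataclass
class SpendGuard:
    max_usd: float
    max_steps: int
    spent_usd: float = 0.0
    steps: int = 0

    def charge(self, step_cost_usd: float) -> None:
        """Record one step's cost, then stop the run if a ceiling is crossed."""
        self.steps += 1
        self.spent_usd += step_cost_usd
        if self.spent_usd > self.max_usd:
            raise BudgetExceeded(f"spent ${self.spent_usd:.2f} of ${self.max_usd:.2f}")
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"took {self.steps} steps, limit is {self.max_steps}")


def run_agent(step_costs, guard: SpendGuard) -> int:
    """Drive a simulated agent loop; return how many steps finished under budget."""
    completed = 0
    try:
        for cost in step_costs:
            guard.charge(cost)
            completed += 1
    except BudgetExceeded:
        pass  # a real system would alert and checkpoint here
    return completed


# An agent that would otherwise loop forever at $1.50 per step.
guard = SpendGuard(max_usd=5.0, max_steps=100)
completed = run_agent(itertools.repeat(1.5), guard)
print(completed, guard.steps, guard.spent_usd)
```

Because the guard charges after the fact, the step that crosses the ceiling has already been paid for. The ceiling bounds the overshoot to one step; it does not eliminate it.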
The cost-runaway mechanism is structural. Agents re-send accumulated context on every step. A five-step loop costs 3.2× a single chatbot call. A fifty-step loop, 30×. A two-hundred-step debugging session, 100×. Average agentic developer spend in 2026 is now $400–$1,500 per month. Bryan Catanzaro, Nvidia’s VP of applied deep learning, said the quiet part out loud: “for my team, the cost of compute is far beyond the costs of the employees.” Anthropic, for its part, blocked Claude Pro/Max subscribers from running third-party agent frameworks because flat-rate pricing did not survive contact with autonomous loops.
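The re-sending mechanism can be sketched with a toy model. It assumes no prompt caching, no truncation, a fixed starting context, and a fixed number of tokens added per step; all four are simplifications, and the sizes used are made up. The multipliers it produces depend entirely on those assumed sizes, so they are not comparable to the figures quoted above. The point is only the shape of the growth.

```python
def cumulative_input_tokens(steps: int, base_context: int, added_per_step: int) -> int:
    """Total input tokens when every step re-sends the full accumulated context.

    Step k (0-indexed) sends base_context + k * added_per_step tokens.
    Closed form: steps * base_context + added_per_step * steps * (steps - 1) / 2.
    """
    return sum(base_context + k * added_per_step for k in range(steps))


def multiplier_vs_single_call(steps: int, base_context: int, added_per_step: int) -> float:
    """Input-token cost of the whole loop relative to one call with the base context."""
    return cumulative_input_tokens(steps, base_context, added_per_step) / base_context


# Hypothetical sizes: 2,000-token starting context, 500 tokens added per step.
for steps in (1, 5, 50, 200):
    ratio = multiplier_vs_single_call(steps, base_context=2000, added_per_step=500)
    print(f"{steps:>3} steps -> {ratio:.1f}x a single call")
```

The second term in the closed form is quadratic in the step count, which is why long sessions dominate the bill under this model even when each individual step looks cheap.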
Per-token cost has fallen 280× in two years, which is the number vendors quote. The number they do not quote is that the recent move at the frontier is up: SignalBloom’s pricing analysis tracks GPT-5.5 roughly tripling in price over GPT-5, and Opus 4.7 consuming 32–47% more tokens per task than its predecessor. Enterprise AI spend rose 320% over the same two years, because usage exploded faster than the headline price fell, and inference is now 85% of enterprise AI budgets, up from ~40% in 2023. The economics did not get better. They got more elastic, and the elasticity went the wrong way. The token bill is one face of it. The reviewer’s calendar is the other, which is the next fallacy.
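A back-of-envelope reading of those two headline numbers, treating spend as price times usage, looks like this. It assumes "rose 320%" means spend is now 4.2× the baseline, and that the 280× price drop applies to the same tokens enterprises actually buy. The second assumption is doubtful given the frontier price moves just described, so the result is an upper-bound illustration rather than a measurement.

```python
def implied_usage_multiplier(spend_increase_pct: float, price_drop_factor: float) -> float:
    """Usage growth implied by spend = price * usage.

    spend_increase_pct: percentage increase in total spend (320 means +320%).
    price_drop_factor: how many times cheaper a token became (280 means 1/280 the price).
    """
    spend_multiplier = 1 + spend_increase_pct / 100  # +320% -> 4.2x
    price_multiplier = 1 / price_drop_factor         # 280x cheaper -> 1/280
    return spend_multiplier / price_multiplier


usage_growth = implied_usage_multiplier(spend_increase_pct=320, price_drop_factor=280)
print(f"implied usage growth: about {usage_growth:.0f}x")
```

Under those assumptions, token consumption would have had to grow by roughly three orders of magnitude for the bill to rise while the unit price collapsed.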



