The new calculus of AI-based coding

(blog.joemag.dev)

Comments

jmull 6 hours ago
> Over the past three months... [we] have been building something really cool

The claim is that a fast-moving, high-performing team has become a 10x fast-moving, high-performing team. Over three months at 10x, that's equivalent to 2-1/2 years of development for the team.

Shall we expect the tangible results soon?

I'm perfectly willing to accept that AI coding will make us all a lot more productive, but I need to see the results.

ang_cire 13 hours ago
As a security researcher, I am both salivating at the potential that the proliferation of TDD and other AI-centric "development" brings for me, and scared for IT at the same time.

Before, we just had code that devs didn't know how to build securely.

Now we'll have code whose internal workings the devs don't even know.

Someone found a critical RCE in your code? Good luck learning your own codebase starting now!

"Oh, but we'll just ask AI to write it again, and the code will (maybe) be different enough that the exact same vuln won't work anymore!" <- some person who is going to be updating their resume soon.

I'm going to repurpose the term, and start calling AI-coding "de-dev".

Animats 21 hours ago
> Instead, we use an approach where a human and AI agent collaborate to produce the code changes. For our team, every commit has an engineer's name attached to it, and that engineer ultimately needs to review and stand behind the code. We use steering rules to setup constraints for how the AI agent should operate within our codebase,

This sounds a lot like Tesla's Fake Self Driving. It self drives right up to the crash, then the user is blamed.

zeroq 15 hours ago
When Karpathy wrote Software 2.0 I was super excited.

I naively believed that we'd start building black boxes from requirements and sets of inputs and outputs, and that the sudden changes of heart from stakeholders (which, for many of us, happen almost daily and mandate a near-complete reimagining of the project architecture) would simply require another training pass with new parameters.

Instead, the mainstream is pushing a harder reality, where we mass-produce a ton of code until it starts to work within guard rails.

  Does it really work? Is it maintainable?
  Get out of here. We're moving at 200mph.
Gud 3 hours ago
People bemoan unrecognisable code bases, maybe that is true.

I’ve sure used various LLMs to crack some difficult nuts: problems I was able to verbalise, but unable to solve.

Chances are that if you are using an LLM to mass-produce boilerplate, you are writing too much boilerplate.

Zanfa 10 hours ago
IMO the biggest issue with AI code is that writing code is the easiest part of software development. Reviewing code is so much more difficult than writing it, even more so if you're not already intimately familiar with the codebase in the first place.

It's like with AI images, where they look plausible at first, but then you start noticing all the little things that are off in the sidelines.

sherinjosephroy 13 hours ago
This is an interesting take: shifting focus from "writing the best code" to "defining the right tests" makes sense in an AI-driven world. But I'm skeptical that treating the generated code as essentially disposable is wise. Tests can catch a lot, but they won't automatically enforce readability or maintainability, or ensure unexpected behaviors don't slip through.
philipp-gayret 21 hours ago
This is the first time I've seen "steering rules" mentioned. I do something similar with Claude; curious what it looks like for them and how they integrate it with Q/Kiro.
zkmon 10 hours ago
Is it exciting because work happens at 200mph, or because you get that much business advantage against your competition? Or is it because it now allows you to spend only one hour at work per day?

To quote Joey from Friends - "400 bucks are gone from my pocket and nobody is getting happier?"

reenorap 20 hours ago
No.

The way to code going forward with AI is Test-Driven Development. The code itself no longer matters. You give the AI a set of requirements, i.e. tests that need to pass, and then let it code whatever way it needs to in order to fulfill those requirements. That's it. The new reality we programmers need to face is that code itself has an exact value of $0, because AI can generate it, and with every new iteration of the AI the internal code will get better. What matters now are the prompts.

I always thought TDD was garbage, but now with AI it's the only thing that makes sense. The code itself doesn't matter at all; the only thing that matters is the tests that prove to the AI that its code is good enough. It can be dogshit code, but if it passes all the tests, then it's "good enough". Then just wait a few months, rerun the code generation with a new version of the AI, and the code will be better. Humans don't need to know what the code actually is. If they find a bug, write a new test and force the AI to rewrite the code to pass it.

I think TDD has really found its future now that AI coding is here to stay. Human-written code doesn't matter anymore, and in fact I would wager that hand-modifying AI-generated code is just as bad, and a burden. We will need to make sure the test cases are accurate and describe what the AI needs to generate, but that's it.
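As a rough sketch of the workflow this comment describes (the function name `slugify` and the tests here are hypothetical, purely for illustration): the human owns the test file as the spec, and the implementation beneath it is treated as disposable, regenerated by the AI whenever a test fails rather than patched by hand.

```python
# tests/test_slugify.py -- the human-owned "spec".
# slugify() below stands in for the AI-generated, disposable implementation.
import re

def slugify(title: str) -> str:
    # AI-generated implementation: lowercase, keep alphanumeric runs,
    # join them with single hyphens.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

def test_basic():
    assert slugify("Hello, World!") == "hello-world"

def test_collapses_whitespace():
    assert slugify("a   b") == "a-b"

def test_regression_from_bug_report():
    # Per the comment: found a bug? Write a new test and force a regeneration.
    assert slugify("C++ 101") == "c-101"
```

Under this model, a bug report never results in an edit to `slugify` itself; it becomes one more test like the last one, and the whole function is regenerated until the suite passes.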

keeda 12 hours ago
This article is right, but I think it may underplay the changes that could be coming soon. For instance, as the top comment here about TDD points out, the actual code does not matter anymore. This is an astounding claim! And it has naturally received a lot of objections in the replies.

But I think the objections can mostly be overcome with a minor adjustment: You only need to couple TDD with a functional programming style. Functional programming lets you tightly control the context of each coding task, which makes AI models ridiculously good at generating the right code.

Given that, if most of your code is tightly-scoped, well-tested components implementing orthogonal functionality, the actual code within those components will not matter. Only glue code becomes important and that too could become much more amenable to extensive integration testing.

At that point, even the test code may not matter much, just the test-cases. So as a developer you would only really need to review and tweak the test cases. I call this "Test-Case-Only Development" (TCOD?)

The actual code can be completely abstracted away, and your main task becomes design and architecture.
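A minimal sketch of what "test-case-only development" might look like in practice (the function `parse_duration` and the case table are hypothetical, not from the article): the human reviews only the table of cases; the pure function and the harness around it are assumed to be AI-generated.

```python
import re

# Under "test-case-only development", this table is the only artifact a
# human reviews. parse_duration() and the loop below are AI-generated.
CASES = [
    ("90s",   90),
    ("2m",    120),
    ("1h",    3600),
    ("1h30m", 5400),
]

def parse_duration(text: str) -> int:
    # Hypothetical AI-generated pure function: "1h30m" -> seconds.
    units = {"h": 3600, "m": 60, "s": 1}
    total = 0
    for value, unit in re.findall(r"(\d+)([hms])", text):
        total += int(value) * units[unit]
    return total

for text, expected in CASES:
    assert parse_duration(text) == expected, (text, expected)
```

Because the function is pure and tightly scoped, adding a case to the table is the entire review loop; the implementation can be regenerated freely as long as the table still passes.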

It's not obvious this could work, largely because it violates every professional instinct we have. But apparently somebody has even already tried it with some success: https://www.linkedin.com/feed/update/urn:li:activity:7196786...

All the downsides that have been mentioned will be true, but also may not matter anymore. E.g. in a large team and large codebase, this will lead to a lot of duplicate code with low cohesion. However, if that code does what it is supposed to and is well-tested, does the duplication matter? DRY was an important principle when the cost of code was high, and so you wanted to have as much leverage as possible via reuse. You also wanted to minimize code because it is a liability (bugs, tech debt, etc.) and testing, which required even more code that still didn't guarantee lack of bugs, was also very expensive.

But now that the cost of code is plummeting, that calculus is shifting too. You can churn out code and tests (including even performance tests, which are always an afterthought, if thought of at all) at unimaginable rates.

And all this while reducing the dependencies of developers on libraries and frameworks and each other. Fewer dependencies means higher velocity. The overall code "goodput" will likely vastly outweigh inefficiencies like duplication.

Unfortunately, as TFA indicates, there is a huge impedance mismatch between this and the architectures (e.g. most code is OO, not functional), frameworks, and processes we have today. Companies will have to make tough decisions about where they are and where they want to go.

I suspect AI-assisted coding taken to its logical conclusion is going to look very different from what we're used to.

StilesCrisis 16 hours ago
The biggest thing that stood out to me was that they suddenly started working nonstop, even on weekends…? If AI is so great, why can’t they get a single day off in two months?
bcrosby95 19 hours ago
It's amazing that their metrics exactly match the mythical "10x engineer" in productivity boost.
brazukadev 20 hours ago
But here's the critical part: the quality of what you are creating is way lower than you think, just like AI-written blog posts.
gachaprize 21 hours ago
Classic LLM article:

1) Abstract data showing an increase in "productivity" ... CHECK

2) Completely lacking in any information on what was built with that "productivity" ... CHECK

Hilarious to read this on the backend of the most widely publicized AWS failure.

jcgrillo 6 hours ago
> For our team, every commit has an engineer's name attached to it, and that engineer ultimately needs to review and stand behind the code.

Then they claim (and demonstrate with a picture of a commits/day chart) a team-wide 10x throughput increase. I claim there's got to be a lot of rubber-stamp reviewing going on here. It may help to challenge the "author" to explain things like "why does this lifetime have the scope it does?" or "why did you factor it this way instead of some other way?", i.e. questions which force them to defend the "decisions" they made. I suspect that if you're doing thorough reviews, velocity will actually decrease rather than increase with LLMs.

r0x0r007 19 hours ago
"For me, roughly 80% of the code I commit these days is written by the AI agent." Therefore it is not committed by you, but by you in the name of the AI agent and the holy slop. What to say; I hope that 100x productivity is worth it and you are making tons of money. If this stuff becomes mainstream, I suggest open source developers stop doing the grind part, stop writing and maintaining cool libraries, and just leave it all to the productivity guys; let's see how far they get. Maybe I've seen too many 1000x claims on Hacker News..
rob_c 11 hours ago
Trust but verify. It's not hard.

The corollary being: if you can't verify (through skill or effort), don't trust.

If you break this pattern, you deserve all the follies that befall you as a "professional".

cadamsdotcom 20 hours ago
"We have real mock versions of all our dependencies!"

Congratulations, you invented end-to-end testing.

"We have yellow flags when the build breaks!"

Congratulations! You invented backpressure.

Every team has different needs and path dependencies, so settles on a different interpretation of CI/CD and software eng process. Productizing anything in this space is going to be an uphill battle to yank away teams' hard-earned processes.

Productizing process is hard, but it's been done before! When paired with a LOT of spruiking, it can really progress the field. It's how we got the first CI/CD tools (e.g. https://en.wikipedia.org/wiki/CruiseControl) and testing libraries (e.g. pytest).

So I wish you luck!

Madmallard 16 hours ago
Correct TDD involves solving all the hard problems in the process. What gain does AI give you then?
tcherasaro 13 hours ago
Another day, and another smart person finally discovers the benefits of leveraging AI to write code.
exasperaited 21 hours ago
Absolutely none of that article has ever even so much as brushed past the colloquial definition of "calculus".

These guys actually seem rattled now.

skinnymuch 21 hours ago
Interesting enough to me though I only skimmed.

I switched back to Rails for my side project a month ago, and AI coding on not-too-complex stuff has been great, while the old NextJS codebase was in shambles.

Before I was still doing a good chunk of the NextJS coding. I’m probably going to be directly coding less than 10% of the code base from here on out. I’m now spending time trying to automate things as much as possible, make my workflow better, and see what things can be coded without me in the loop. The stuff I’m talking about is basic CRUD and scraping/crawling.

For serious coding, I’d think coding yourself and having ai as your pair programmer is still the way to go.

moron4hire 20 hours ago
If you are producing real results at 10x, then you should be able to show that you are a year ahead of schedule after only 5 weeks.

Waiting to see anyone show even a month ahead of schedule after 6 months.

Madmallard 18 hours ago
Lots of reasonable criticisms being down-voted here. Are we being AstroTurfed? Is HN falling victim to the AI hype train money too now?
Madmallard 22 hours ago
first the Microsoft guy touting agents

now AWS guy doing it !

"My team is no different—we are producing code at 10x of typical high-velocity team. That's not hyperbole - we've actually collected and analyzed the metrics."

Rofl

"The Cost-Benefit Rebalance"

Here he basically just talks about setting up mock dependencies and introducing intermittent failures into them. Mock dependencies have been around for decades; nothing new here.
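For context on what "mock dependencies with intermittent failures" means here, a minimal sketch (the class `FlakyStore` and the retry helper are hypothetical names, not from the article): a seeded in-memory fake of a dependency that randomly raises, so the retry path of the code under test actually gets exercised.

```python
import random

class FlakyStore:
    """Hypothetical in-memory mock of a storage dependency that fails
    intermittently, so retry/timeout handling is exercised in tests."""

    def __init__(self, failure_rate: float = 0.3, seed: int = 42):
        self._data = {}
        self._failure_rate = failure_rate
        # Seeded RNG: failures are intermittent but reproducible across runs.
        self._rng = random.Random(seed)

    def put(self, key, value):
        if self._rng.random() < self._failure_rate:
            raise TimeoutError("injected fault")
        self._data[key] = value

def put_with_retry(store, key, value, attempts=5):
    # Code under test: retry loop around the flaky dependency.
    for i in range(attempts):
        try:
            store.put(key, value)
            return i + 1  # number of attempts used
        except TimeoutError:
            continue
    raise RuntimeError("store unavailable")
```

Seeding the injected failures is the part that makes this usable in CI: a test that fails because of fault injection fails the same way every run.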

It sounds like the test system you set up is as time-consuming as solving the actual problems you're trying to solve, so what time are you saving?

"Driving Fast Requires Tighter Feedback Loop"

Yes if you're code-vomiting with agents and your test infrastructure isn't rock solid things will fall apart fast, that's obvious. But setting up a rock solid test infrastructure for your system involves basically solving most of the hard problems in the first place. So again, what? What value are you gaining here?

"The communication bottleneck"

Amazon was doing this when I worked there 12 years ago. We all sat in the same room.

"The gains are real - our team's 10x throughput increase isn't theoretical, it's measurable."

Show the data and proof. Doubt.

Yeah I don't know. This reads like complete nonsense honestly.

Paraphrasing: "AI will give us huge gains, and we're already seeing it. But our pipelines and testing will need to be way stronger to withstand the massive increase in velocity!"

Velocity to do what? What are you guys even doing?

Amazon is firing 30,000 people by the way.

zbyforgotp 12 hours ago
TLDR: AI changes the economic calculus of software development; it makes automated testing more beneficial relative to its costs.

I think he is right.

whiterook6 20 hours ago
This reads like "Hey, we're not vibe coding, but when we do, we're careful!" with hints of "AI coding changes the costs associated with writing code, designing features, and refactoring" sprinkled in to stand out.