AI can code, but it can't build software

(bytesauna.com)

Comments

simonw 19 hours ago
This is a good headline. LLMs are remarkably good at writing code. Writing code isn't the same thing as delivering working software.

A human expert needs to identify the need for software, decide what the software should do, figure out what's feasible to deliver, build the first version (AI can help a bunch here), evaluate what they've built, show it to users, talk to them about whether it's fit for purpose, iterate based on their feedback, deploy and communicate the value of the software, and manage its existence and continued evolution in the future.

Some of that stuff can be handled by non-developer humans working with LLMs, but a human expert who understands code will be able to do this stuff a whole lot more effectively.

I guess the big question is whether experienced product management types can pick up enough technical literacy to work like this without programmers, or whether programmers can pick up enough PM skills to work without PMs.

My money is on both roles continuing to exist and benefiting from each other, in a partnership that produces results a lot faster now that the previously slow "writing the code" part has sped up so much.

jumploops 19 hours ago
I've been forcing myself to "pure vibe-code" on a few projects, where I don't read a single line of code (even the diffs in codex/claude code).

Candidly, it's awful. There are countless situations where it would be faster for me to edit the file directly (CSS, I'm looking at you!).

With that said, I've been surprised at how far the coding agents are able to go[0], and a lot less surprised about where I need to step in.

Things that seem to help:

1. Always create a plan/debug markdown file
2. Prompt the agent to ask questions/present multiple solutions
3. Use git more than normal (squash ugly commits on merge)

Planning is key to avoid half-brained solutions, but having "specs" for debug is almost more important. The LLM will happily dive down a path of editing as few files as possible to fix the bug/error/etc. This, unchecked, can often lead to very messy code.

Prompting the agent to ask questions/present multiple solutions allows me to stay "in control" over the how something is built.

I now basically commit every time a plan or debug step is complete. I've tried having the LLM control git, but I feel that it eats into the context a bit too much. Ideally a 3rd party "agent" would handle this.

The last thing I'll mention is that Claude Code (Sonnet 4.5) is still very token-happy, in that it eagerly goes above and beyond even when it's not necessary. Codex (gpt-5-codex), on the other hand, does exactly what you ask, almost to a fault. In both cases, this is where planning up front is super useful.

[0] Caveat: the projects are either TypeScript web apps or Rust utilities; I can't speak to performance in other languages/domains.

pron 18 hours ago
> I don’t really know why AI can't build software (for now)

Could be because programming involves:

1. Long chains of logical reasoning, and

2. Applying abstract principles in practice (in this case, "best practices" of software engineering).

I think LLMs are currently bad at both of these things. They may well be among the things LLMs are worst at atm.

Also, there should be a big asterisk next to "can write code". LLMs do often produce correct code of some size and of certain kinds, but they still fail at it too frequently.

Kim_Bruning 4 hours ago
"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

   --Charles Babbage 

We have now come to the point where you CAN put in the wrong figures and sometimes the right answer comes out (possibly over half the time!). This was and is incredible to me and I feel lucky to be alive to see it.

However, people have taken that to mean that you can ask any old question any old way and have the right answer come out now. I might at one point have almost thought so myself. But LLMs currently are definitely not there yet.

Consider (e.g.) Claude Code to be your English SHell (compare: zsh, bash).

Learn what it can and can't do for you. It's messier to learn than straight and/or/not; I'm not sure there are manuals for it, and any manual will be outdated next quarter anyway; but that's the state of play at this time.

orliesaurus 19 hours ago
Software engineering has always been about managing complexity, not writing code. Code is just the artifact. No-code and low-code are all code underneath, and they don't make for well-engineered applications.

Calamityjanitor 18 hours ago
I feel you can apply this to all roles. When models passed high school exam benchmarks, some people talked as if that made the model equivalent to a person who had passed high school. I may be wrong, but I bet even a state-of-the-art LLM couldn't complete high school. You have to do things like attend classes at the right time and place, take initiative, and keep track of different classes. All of the bigger-picture thinking and soft skills that aren't in a pure exam.

Improving this is what everyone's looking into now. Even larger models, longer context windows, added reasoning, or something else might improve this one day.

subtlesoftware 19 hours ago
True for now because models are mainly used to implement features / build small MVPs, which they’re quite good at.

The next step would be to have a model running continuously on a project with inputs from monitoring services, test coverage, product analytics, etc. Such an agent, powered by a sufficient model, could be considered an effective software engineer.

We’re not there today, but it doesn’t seem that far off.

KurSix 5 hours ago
This whole situation painfully reminds me of the low-code/no-code boom from like 5–10 years ago.

Back then everyone was saying developers would become obsolete and business analysts would just “click together” enterprise solutions. In the end, we got a mess of clunky non-scalable systems that still had to be fixed and integrated by the same engineers.

LLMs are basically low-code on steroids - they make it easier to build a prototype, but exponentially harder to turn it into something actually reliable.

eterm 18 hours ago
I've been experimenting with a little vibe coding.

I've generally found the quality of the .NET code to be quite good. It trips up sometimes when linters ping it for rules that aren't normally enforced, but it does the job reasonably well.

The front-end JavaScript though? It's both an absolute genius and a complete menace at the same time. It'll write reams of code to get things just right, but with no regard for human maintainability.

I lost an entire session to the fact that it cheerfully did:

    npm install fabric
    npm install -D @types/fabric
Now that might look fine, but a human would have realised that the typings package describes a completely different, outdated API; the package was last updated 6 years ago.

Claude however didn't realise this, and wrote a ton of code that would pass the unit tests but fail the type check. It'd run the type checker, rewrite it all to pass the type checker, only for it now to fail the unit tests.

Eventually it semi-gave up on typing and scattered (fabric as any) all over the place, so now it just got runtime exceptions instead.

I intervened when I realised what it was doing, and found the root cause of its problems.

It was a complete blindspot because it just trusted both the library and the typechecker.

So yeah, if you want to snipe a vibe coder, suggest installing fabricjs with typings!

aurintex 3 hours ago
This is a great read and something I've been grappling with myself.

I've found it takes significant time to find the right "mode" of working with AI. It's a constant balance between maintaining a high-level overview (the 'engineering' part) while still getting that velocity boost from the AI (the 'coding' part).

The real trap I've seen (and fallen into) is letting the AI just generate code at me. The "engineering" skill now seems to be more about ruthless pruning and knowing exactly what to ask, rather than just knowing how to write the boilerplate.

abhishekismdhn 17 hours ago
Even the code quality is often quite poor. At the same time, not using critical thinking can have serious consequences for those who treat AI as more than an explorer or a companion. You might think that with AI the number of highly skilled developers would increase, but it could be quite the opposite. Code is just a medium; developers are paid to solve problems, not to write code. But writing code is still important, as it refines your thoughts and sharpens your problem-solving skills.

The human brain learns through mistakes, repetition, breaking down complex problems into simpler parts, and reimagining ideas. The hippocampus naturally discards memories that aren't strongly reinforced, so if you rely solely on AI, you're simply not going to remember much.

hamasho 19 hours ago
The problem with vibe coding is that it demoralizes experienced software engineers. I'm developing an MVP with vibes, and the output from Claude Code and Codex works in many cases for this relatively new project. But the quality of the code is bad. There is already duplicated or unused logic, and a lot of code is unnecessarily complex (especially the React and JSX). And there's little PR review so that "we can keep velocity". I'm paying much less attention to quality now. After all, why bother when the AI produces working code? I can't justify, and don't have the energy for, deep dives into system design or dozens of nitpicking change requests. And it makes me more and more replaceable by an LLM.

aayushdutt 2 hours ago
It's just the frontier getting pushed slowly but surely. The headline is missing the keyword `yet`.

dreamcompiler 16 hours ago
I've worked in a few teams where some member of the [human] team could be described as "Joe can code, but he can't build software."

The difference is what we used to call the "ilities": Reliability, inhabitability, understandability, maintainability, securability, scalability, etc.

None of these things are about the primary function of the code, i.e. "it seems to work." In coding, "it seems to work" is good enough. In software engineering, it isn't.

bradfa 19 hours ago
The context windows are still dramatically too small, and the models don't yet seem to be trained on how to build maintainable software. There is a lot less written down on the public web about how to do this. There's a bunch of high-level public writing, but not many great examples of the real-world situations that happen on every proprietary software project, because that's very messy data locked away inside companies.

I'm sure it'll improve over time, but it won't be nearly as easy as making AI good at coding.

Animats 16 hours ago
OK, he makes a statement, and then just stops.

In some ways, this seems backwards. Once you have a demo that does the right thing, you have a spec, of sorts, for what's supposed to happen. Automated tooling that takes you from demo to production ready ought to be possible. That's a well-understood task. In restricted domains, such as CRUD apps, it might be automated without "AI".

sothatsit 13 hours ago
I like to think of it as: AI can code, but it is terrible at making design decisions.

Vibe-coded apps eventually fall over as they are overwhelmed by 101 bad architectural decisions stacked on top of one another. You need someone technical to make those decisions to avoid this fate.

gherkinnn 12 hours ago
It is only a matter of years before all the idea guys in my org realise this.

"But AI can build this in 30min"

thegrim33 17 hours ago
And here I am, using AI twice within the last 12 hours to ask it two questions about an extremely well-used, extremely well-documented physics library, and both times having it return sample code that uses library methods which don't exist. When I tell it this, I get the "Oh, you're so right to point that out!" response, and the new code returned still just blatantly doesn't work.

preommr 19 hours ago
These discussions are so tiring.

Yes, they're bad now, but they'll get better in a year.

If the generative ability is good enough for small snippets of code, it's good enough for larger software that's better organized. Maybe the models don't have enough of the right kind of training data, or the agents don't have the right reasoning algorithms. But it is there.

smugtrain 7 hours ago
Making it absolutely lovely for people who can build software, but can’t code

zeckalpha 18 hours ago
I think this can be extended (but not necessarily fully mitigated) by working with non-SWE agents interacting with the same codebase. Drafting product requirements, assessing business opportunities, etc. can be done by LLMs.

ruguo 16 hours ago
True. AI might not have a soul, but it’s become an absolute lifesaver for me.

To really get the most out of it though, you still need to have solid knowledge in your own field.

CMCDragonkai 18 hours ago
Many human devs can code, but few can build software.

liqilin1567 15 hours ago
Every time I see "build an app with just one English sentence" hype, I turn away immediately.

xeckr 16 hours ago
Give it a year or two...

johnnienaked 14 hours ago
Quit saying AI can code. AI can't do anything that wasn't done by actual humans before. AI is a plagiarism machine.

jongjong 15 hours ago
I still can't believe my own eyes: when I show an LLM my codebase and tell it in reasonable detail what functionality I want to add, it can produce perfect-looking code that I could have written myself.

I would say that AI is better at coding than most developers. If I had the option to choose between a junior developer to assist me or Claude Code, I would choose Claude Code. That's a massive achievement. It cannot be overstated.

It's a dream come true for someone with a focus on architecture like myself. The coding aspect was dragging me down. LLMs work beautifully with vanilla JavaScript. The combined ability to generate code quickly and then quickly test (no transpilation/bundling step) gives me fast iteration times. Add that to the fact that I have a minimalist coding style. I get really good bang for my bucks/tokens.

The situation is unfortunate for junior developers. That said, I don't think it necessarily means that juniors should abandon the profession; they just need to refocus their attention on the things that AI cannot do well, like spotting contradictions and making decisions. Many developers are currently not great at this; maybe that's the reason why LLMs (which are trained on average code) are not good at it either. Juniors have to think more critically than ever before; on the plus side, they are freed to think about things at a higher level of abstraction.

My observation is that LLMs are so far good news for neurodivergent developers. Bad news for developers who are overly mimetic in their thinking style and interests. You want to be different from the average developer whose code the LLM was trained on.

cdelsolar 16 hours ago
I definitely disagree. I'm a software engineer, but I have been heavily using AI for the last few months and have gotten multiple apps to production since then. I have to guide the LLM along, yes, but it's perfectly capable of doing everything needed, up to and including building the CloudFormation templates for Fargate or whatever.

aussieguy1234 17 hours ago
I'm of the opinion that not a single software engineer has yet lost their job to AI.

Any company claiming they've replaced engineers with AI has done so in an attempt to cover up the real reasons they've gotten rid of a few engineers. "AI automating our work" sounds much better to investors than "We overhired and have to downsize".

apical_dendrite 18 hours ago
I've been working with a data processing pipeline that was vibe-coded by an AI engineer, and while the code works, as software that has to fit into a production environment it's a mess.

Take logging, for example. The pipeline is made up of AWS Lambdas written in Python. The person who built it wanted to add context to each log for debugging, and the LLM generated hundreds of lines of Python in each Lambda to do this (no common library). But he (and the LLM) didn't understand that a bunch of files initialized their own loggers at the top of the file, so all that code to set context on the root logger never applied in those files. And then he wanted to parallelize some tasks, and neither he nor the LLM understood that the logging context was thread-local and wouldn't show up in logs generated in another thread.

So what we ended up with was a 250+ line logging_config.py file in each individual Lambda that was only used for a small portion of the logs generated by the application.
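
To make the thread-local part concrete, here is a minimal sketch of that second failure mode (names are made up; this is not the actual logging_config.py):

    import logging
    import threading

    # Context lives in a threading.local; a Filter copies it onto every
    # record that goes through the root logger.
    _ctx = threading.local()

    class ContextFilter(logging.Filter):
        def filter(self, record):
            record.request_id = getattr(_ctx, "request_id", "-")
            return True

    logging.basicConfig(format="[%(request_id)s] %(message)s", level=logging.INFO)
    logging.getLogger().addFilter(ContextFilter())

    _ctx.request_id = "req-42"
    logging.info("same thread")        # prints: [req-42] same thread

    def worker():
        # A new thread gets fresh threading.local storage, so the filter
        # finds nothing and the context silently disappears.
        logging.info("worker thread")  # prints: [-] worker thread

    threading.Thread(target=worker).start()
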
orionblastar 19 hours ago
I see so many people on the Internet who claim they can fix AI vibe code. Nothing new; I've been super-debugging crappy code for 30 years to make it work.

ergocoder 18 hours ago
Yeah, just like many software engineers. AI has achieved software engineering.

jongjong 13 hours ago
Software development is one of these things which often seems really easy from the outside but can be insanely complicated.

I had this experience with my co-founder, where I was shipping features quickly and he got used to a certain pace of progress. Then we ended up with something like 6 different ways to perform a particular process, with some differences between them. I had reused as much code as possible, all passing through the same function, but without tests it became challenging to avoid bugs/regressions... My co-founder could not understand why I was pushing back on implementing a particular feature which seemed very simple to him at a glance.

He could not believe why I was pushing back and thought I was just being stubborn. It took me like 30 minutes to explain (at a high level) all the technical considerations and trade-offs, and how much complexity this new feature would introduce, and then he agreed with my point of view.

People who aren't used to building software cannot grasp the complexity. Beyond a certain point, it's like every time my co-founder asked me to do something related to a particular part of the code, I'd spend several minutes pointing out the logical contradictions in his own requirements. The non-technical person thinks about software development in a kind of magical way. They don't really understand what they're asking. This isn't even getting into the issue of technical constraints which is another layer.

nsonha 14 hours ago
So are software engineers. Many can code, but there is nothing in the definition of "engineer" (software or otherwise) that implies they can build things.

jongjong 15 hours ago
>> hey, I have this vibe-coded app, would you like to make it production-ready

This makes me cringe, because it's a lot harder to get LLMs to generate good code when you start with a crappy codebase. If you start with a good codebase, it's like the codebase is coding itself. The former approach, trying to get the LLM to write clean code on top of a mess, is akin to mental torture; the latter is highly pleasant.

asah 15 hours ago
FTFY: "for now"