I wonder at the end of this if it's the still worth the risk?
A lot of how I form my thoughts is driven by writing code, and seeing it on screen, running into its limitations.
Maybe it's the kind of work I'm doing, or maybe I just suck, but the code to me is a forcing mechanism into ironing out the details, and I don't get that when I'm writing a specification.
The post touches very briefly on linting in 7. For me, setting up a large number of static code analysis checks has had the highest impact on code quality.
My hierarchy of static analysis looks like this (hierarchy below is Typescript focused but in principle translatable to other languages):
9. Custom script to ensure shared/util directories are not over stuffed (built this using dependency-cruiser as a library rather than an exec)
10. Security check (semgrep)
I stitch all the above in a single `pnpm check` command and defined an agent rule to run this before marking task as complete.
Finally, I make sure `pnpm check` is run as part of a pre-commit hook to make sure that the agent has indeed addressed all the issues.
This makes a dramatic improvement in code quality to the point where I'm able to jump in and manually modify the code easily when the LLM slot machine gets stuck every now and then.
(Edit: added mention of pre-commit hook which I missed mention of in initial comment)
The real value that AI provides is the speed at which it works, and its almost human-like ability to “get it” and reasonably handle ambiguity. Almost like tasking a fellow engineer. That’s the value.
By the time you do everything outlined here you’ve basically recreated waterfall and lost all speed advantage. Might as well write the code yourself and just use AI as first-pass peer review on the code you’ve written.
A lot of the things the writer points out also feel like safeguards against the pitfalls of older models.
I do agree with their 12th point. The smaller your task the easier to verify that the model hasn’t lost the plot. It’s better to go fast with smaller updates that can be validated, and the combination of those small updates gives you your final result. That is still agile without going full “specifications document” waterfall.
In general, I prefer to do the top-level design and the big abstractions myself. I care a lot about cohesion and coupling; I like to give a lot of thought to my interfaces.
And in practice, I am happy enough that the LLM helps me to eliminate some toil, but I think you need to know when it is time to fold your cards, and leave the game. I prefer to fix small bugs in the generated code myself, than asking the agent, as it tends to go too far when fixing its own code.
I can't help but keep finding it ridiculous how everyone now discovers basic best practices (linting, documentation, small incremental changes) that have been known for ages. It's not needed because of AI, you should have been doing it like this before as well.
Remember having to write detailed specs before coding? Then folks realized it was faster and easier to skip the specs and write the code? So now are we back to where we were?
One of the problems with writing detailed specs is it means you understand the problem, but often the problem is not understand - but you learn to understand it through coding and testing.
Ironically, I use the time saved using agents to read technical books ferociously.
Coding agents made me really get something back from the money I pay for my O'Reilly subscription.
So, coding agents are making me a better engineer by giving me time to dive deeper into books instead of having to just read enough to do something that works under time pressure.
The best thing about this is that AI bots will read, train on and digest the million "how to write with AI" posts that are being written right now by some of the smartest coders in the world and the next gen AI will incorporate all of this, making them ironically unnecessary.
The forcing function doesn't disappear - it shifts. When you read and critique AI-generated code carefully, you get a similar cognitive workout: Why did it structure this that way? What edge case did it miss? How does this fit the broader architecture?
The danger is treating the output as a black box. If you skip the review step and just accept whatever it produces, yes, you'll lose proficiency and accumulate debt. But if you stay engaged with the code, reading it as critically as you would a junior dev's PR, you maintain your understanding while moving faster.
The technical debt concern is valid but it's a process problem, not an inherent flaw. We solved "juniors write bad code" with code review, linting, and CI. We can solve "LLMs write inconsistent code" with the same tools - hannofcart's 10-layer static analysis stack is a good example. The LLM lies about passing checks? Pre-commit hook catches it.
Define data structures manually, ask AI to implement specific state changes. So JSON, C .h or other source files of func sigs and put prompts in there. Never tried the Agents.md monolithic definition file approach
Also I demand it stick to a limited set of processing patterns. Usually dynamic, recursive programming techniques and functions. They just make the most sense to my head and using one style I can spot check faster.
I also demand it avoid making up abstractions and stick to mathematical semantics. Unique namespaces are not relevant to software in the AI era. It's all
about using unique vectors as keys to values.
Stick to one behavior or type/object definition per file.
Only allow dependencies that are designed as libraries to begin with. There is a ton of documentation to implement a Vulkan pipeline so just do that. Don't import an entire engine like libgodot.
And for my own agent framework I added observation of my local system telemetry via common Linux files and commands. This data feeds back in to be used to generate right-sized sched_ext schedules and leverage bpf for event driven responses.
Am currently experimenting with generation of small models of my own data. A single path of images for example not the entire Pictures directory. Each small model is spun akin to a Docker container.
LLMs are monolithic (massive) zip files of the entire web. No one really asking for that. And anyone who needs it already has access to the web itself
Too bad that software developers are carrying water for those who hate them and mock them for being obsolete in 6-12 months, while they are eating caviar (probably evading sanctions) and clink the champagne glasses in Davos:
1. Keep things small and review everything AI written, or
2. Keep things bloated and let AI do whatever it wants within the designated interface.
Initially I drew this line for API service / UI components, but it later expanded to other domains. e.g. For my hobby rust project I try to keep "trait"s to be single responsible, never overlap, easy to understand etc etc. but I never look at AI generated "impl"s as long as it passes some sensible tests and conforming the traits.
The first rule is an antipattern. I think describing your architecture or ANY kind of documentation for your AI is an anti-pattern and blows the context window leading to worse results, and actual more deviation.
The controlling systems are not give it more words at the start. Agentic coding needs to work in loop with dedicated context.
You need to think about how can i give as much intent as possible with as little words.
You can built a tremendous amount of custom lint rules ai never needs to read except they miss it.
Every pattern in your repo gets repeated, repo will always win over documentation and when your repo is good structured you don’t need to repeat this to AI
It’s like dev always has been, watch what has gone wrong and make sure the whole type or error can’t happen again.
I use it for scaffolding and often correct it for the layour I prefer. Then I use to check my code, and then scaffold in some more modules. I then connect them together.
Long as you review the code and correct it, it is no more different than using stackoverflow. A stack overflow that reads your code and helps stitch the context.
You still need to know the hard parts: precisely what you want to build, all domain/business knowledge questions solved, but this tool automates the rest of the coding and documentation and testing.
It's going to be a wild future for software development...
I found an easier way that Works For Me (TM). I describe the problem to LLM and ask it to solve it step by step, but strictly in the Ask mode, not Agent. Then I copy or even type the linws to the code. If I wouldn't write the line myself, it doesn't go in, and I iterate some more.
I do allow it to write the tests (lots of typing there), but I break them manually to see how they fail. And I do think about what the tests should cover before asking LLM to tell me (it does come up with some great ideas, but it also doesn't cover all the aspects I find important).
Great tool, but it is very easy to be led astray if you are not careful.
Every engineering org should be pleading devs to not let AI write tests. They're awful and sometimes they literally don't even assert the code that was generated and instead assert the code in tests.
Hi i5heu. Given that you seem to use AI tools for generating images and audio versions of your posts, I hope it is not too rude to ask: how much of the post was drafted, written or edited with AI?
The suggestions you make are all sensible but maybe a little bit generic and obvious. Asking ChatGPT to generate advice on effectively writing quality code with AI generates a lot of similar suggestions (albeit less well written).
If this was written with help of AI, I'd personally appreciate a small notice above the blog post. If not, I'd suggest to augment the post with practical examples or anecdotal experience. At the moment, the target group seems to be novice programmers rather than the typical HN reader.
First article about writing code with AI i can get behind 100%. Stuff i already do, stuff i've thought about doing, and at ideas i've never thought doing ("Mark code review levels" especially is a _great_ idea)
> Use strict linting and formatting rules to ensure code quality and consistency. This will help you and your AI to find issues early.
I've always advocated for using a linter and consistent formatting. But now I'm not so sure. What's the point? If nobody is going to bother reading the code anymore I feel like linting does not matter. I think in 10 years a software application will be very obfuscated implementation code with thousands of very solidly documented test cases and, much like compiled code, how the underlying implementation code looks or is organized won't really matter
That sounds like the advice of someone who doesn't actually write high-quality code. Perhaps a better title would be "how to get something better than pure slop when letting a chatbot code for you" - and then it's not bad advice I suppose. I would still avoid such code if I can help it at all.
How to effectively write quality code with AI
(heidenstedt.org)222 points by i5heu 12 hours ago | 166 comments
Comments
A lot of how I form my thoughts is driven by writing code, and seeing it on screen, running into its limitations.
Maybe it's the kind of work I'm doing, or maybe I just suck, but the code to me is a forcing mechanism into ironing out the details, and I don't get that when I'm writing a specification.
My hierarchy of static analysis looks like this (hierarchy below is Typescript focused but in principle translatable to other languages):
1. Typesafe compiler (tsc)
2. Basic lint rules (eslint)
3. Cyclomatic complexity rules (eslint, sonarjs)
4. Max line length enforcement (via eslint)
5. Max file length enforcement (via eslint)
6. Unused code/export analyser (knip)
7. Code duplication analyser (jscpd)
8. Modularisation enforcement (dependency-cruiser)
9. Custom script to ensure shared/util directories are not over stuffed (built this using dependency-cruiser as a library rather than an exec)
10. Security check (semgrep)
I stitch all the above in a single `pnpm check` command and defined an agent rule to run this before marking task as complete.
Finally, I make sure `pnpm check` is run as part of a pre-commit hook to make sure that the agent has indeed addressed all the issues.
This makes a dramatic improvement in code quality to the point where I'm able to jump in and manually modify the code easily when the LLM slot machine gets stuck every now and then.
(Edit: added mention of pre-commit hook which I missed mention of in initial comment)
By the time you do everything outlined here you’ve basically recreated waterfall and lost all speed advantage. Might as well write the code yourself and just use AI as first-pass peer review on the code you’ve written.
A lot of the things the writer points out also feel like safeguards against the pitfalls of older models.
I do agree with their 12th point. The smaller your task the easier to verify that the model hasn’t lost the plot. It’s better to go fast with smaller updates that can be validated, and the combination of those small updates gives you your final result. That is still agile without going full “specifications document” waterfall.
And in practice, I am happy enough that the LLM helps me to eliminate some toil, but I think you need to know when it is time to fold your cards, and leave the game. I prefer to fix small bugs in the generated code myself, than asking the agent, as it tends to go too far when fixing its own code.
One of the problems with writing detailed specs is it means you understand the problem, but often the problem is not understand - but you learn to understand it through coding and testing.
So where are we now?
Coding agents made me really get something back from the money I pay for my O'Reilly subscription.
So, coding agents are making me a better engineer by giving me time to dive deeper into books instead of having to just read enough to do something that works under time pressure.
The danger is treating the output as a black box. If you skip the review step and just accept whatever it produces, yes, you'll lose proficiency and accumulate debt. But if you stay engaged with the code, reading it as critically as you would a junior dev's PR, you maintain your understanding while moving faster.
The technical debt concern is valid but it's a process problem, not an inherent flaw. We solved "juniors write bad code" with code review, linting, and CI. We can solve "LLMs write inconsistent code" with the same tools - hannofcart's 10-layer static analysis stack is a good example. The LLM lies about passing checks? Pre-commit hook catches it.
Define data structures manually, ask AI to implement specific state changes. So JSON, C .h or other source files of func sigs and put prompts in there. Never tried the Agents.md monolithic definition file approach
Also I demand it stick to a limited set of processing patterns. Usually dynamic, recursive programming techniques and functions. They just make the most sense to my head and using one style I can spot check faster.
I also demand it avoid making up abstractions and stick to mathematical semantics. Unique namespaces are not relevant to software in the AI era. It's all about using unique vectors as keys to values.
Stick to one behavior or type/object definition per file.
Only allow dependencies that are designed as libraries to begin with. There is a ton of documentation to implement a Vulkan pipeline so just do that. Don't import an entire engine like libgodot.
And for my own agent framework I added observation of my local system telemetry via common Linux files and commands. This data feeds back in to be used to generate right-sized sched_ext schedules and leverage bpf for event driven responses.
Am currently experimenting with generation of small models of my own data. A single path of images for example not the entire Pictures directory. Each small model is spun akin to a Docker container.
LLMs are monolithic (massive) zip files of the entire web. No one really asking for that. And anyone who needs it already has access to the web itself
https://xcancel.com/hamptonism/status/2019434933178306971
And all that after stealing everyone's output.
1. Keep things small and review everything AI written, or 2. Keep things bloated and let AI do whatever it wants within the designated interface.
Initially I drew this line for API service / UI components, but it later expanded to other domains. e.g. For my hobby rust project I try to keep "trait"s to be single responsible, never overlap, easy to understand etc etc. but I never look at AI generated "impl"s as long as it passes some sensible tests and conforming the traits.
The controlling systems are not give it more words at the start. Agentic coding needs to work in loop with dedicated context.
You need to think about how can i give as much intent as possible with as little words.
You can built a tremendous amount of custom lint rules ai never needs to read except they miss it.
Every pattern in your repo gets repeated, repo will always win over documentation and when your repo is good structured you don’t need to repeat this to AI
It’s like dev always has been, watch what has gone wrong and make sure the whole type or error can’t happen again.
https://github.com/ryanthedev/code-foundations
I’m currently working on a checklist and profile based code review system.
Long as you review the code and correct it, it is no more different than using stackoverflow. A stack overflow that reads your code and helps stitch the context.
https://github.com/glittercowboy/get-shit-done
You still need to know the hard parts: precisely what you want to build, all domain/business knowledge questions solved, but this tool automates the rest of the coding and documentation and testing.
It's going to be a wild future for software development...
I do allow it to write the tests (lots of typing there), but I break them manually to see how they fail. And I do think about what the tests should cover before asking LLM to tell me (it does come up with some great ideas, but it also doesn't cover all the aspects I find important).
Great tool, but it is very easy to be led astray if you are not careful.
1. Have the LLM write code based on a clear prompt with limited scope 2. Look at the diff and fix everything it got wrong
That's it. I don't gain a lot in velocity, maybe 10-20%, but I've seen the code, and I know it's good.
The suggestions you make are all sensible but maybe a little bit generic and obvious. Asking ChatGPT to generate advice on effectively writing quality code with AI generates a lot of similar suggestions (albeit less well written).
If this was written with help of AI, I'd personally appreciate a small notice above the blog post. If not, I'd suggest to augment the post with practical examples or anecdotal experience. At the moment, the target group seems to be novice programmers rather than the typical HN reader.
I've always advocated for using a linter and consistent formatting. But now I'm not so sure. What's the point? If nobody is going to bother reading the code anymore I feel like linting does not matter. I think in 10 years a software application will be very obfuscated implementation code with thousands of very solidly documented test cases and, much like compiled code, how the underlying implementation code looks or is organized won't really matter