This is the "lines of code per week" metric from the 90s, repackaged. "I'm doing more PRs" is not evidence that AI is working, it's evidence that you are merging more.
Whether that's good depends entirely on what you are merging.
I use AI every day too. But treating throughput of code going to production as a success metric, without any mention of quality, bugs, or maintenance burden, is exactly the kind of thinking developers used to push back on when management proposed it.
Turns out we weren't opposed to bad metrics! We were just opposed to being measured!
Given the chance to pick our own, we jumped straight to the same nonsense.
I think more people should focus on using LLMs to relieve cognitive load rather than parallelize and overload their brains. We need to learn to live with the fact that humans are not good at multi-tasking, and LLMs are not going to make us better at it.
I have started using Claude to develop an implementation plan, but instead of making Claude implement it and then have me spend time figuring out what it did, I simply tell it to walk me through implementing it by hand. This means that I actually understand every step of the development process and get to intervene and make different choices at the point of time where it matters. As opposed to the default mode which spits out hundreds of lines of code changes which overloads my brain, this mode of working actually feels like offloading the cognitive burden of keeping track of the implementation plan and letting me focus on both the details and the big picture without losing track of either one. For truly mechanical sub-tasks I can still save time by asking Claude to do them for me.
>The time saved matters, but the real unlock was the mental overhead removed. Every PR used to be a small context switch: stop thinking about the code, start thinking about how to describe the code. Now I type /git-pr and move on to the next thing.
This one's interesting to me. For a lot of my career, the act of writing the PR has been the last sanity check that surfaces any weirdness or my own misgivings about my choices. Sometimes there would be code that felt natural when I was writing it and getting the feature working, and maybe that code survived my own personal round of code review... but having to write about it in plain English for the benefit of someone doing review with less context was a useful spot to do some self-reflection.
Honest question: if you're using multiple agents, it's usually not to produce a dozen lines of code. It's to produce a big enough feature spanning multiple files, modules and entry points, with tests and all. So far so good. But once that feature is written by the agents... wouldn't you review it? Like reading line by line what's going on and detecting if something is off? And wouldn't that part, the manual reviewing, take an enormous amount of time compared to the time it took the agents to produce it? (You know, it's more difficult to read other people's/machine code than to write it yourself)... meaning all the productivity gained is thrown out the door.
Unless you don't review every generated line manually, and instead rely on, let's say, UI e2e testing, or perhaps unit testing (that the agents also wrote). I don't know, perhaps we are past the phase of "double check what agents write" and are now in the phase of "ship it. if it breaks, let agents fix it, no manual debugging needed!" ?
> The worktree system removed the friction of context-switching - juggling multiple streams of work without them colliding.
I'm so conflicted about this. On the one hand I love the buzz of feeling so productive and working on many different threads. On the other hand my brain gets so fried, and I think this is a big contributor.
> I’m not “using a tool that writes code.” I’m in a tight loop: kick off a task, the agent writes code, I check the preview, read the diff, give feedback or merge, kick off the next task
The assumption behind this workflow is that Claude Code can complete tasks with little or no oversight.
If the flow looks like review->accept, review->accept, it is manageable.
In my personal experience, Claude needs heavy guidance and multiple rounds of feedback before arriving at a mergeable solution (if it does at all).
Interleaving many long running tasks with multiple rounds of feedback does not scale well unfortunately.
I can only remember so much, and at some point I spend more time trying to understand what has been done so far to give accurate feedback than actually giving feedback for the next iteration.
Maybe OT - I find Claude Code hit or miss, I spend a lot of time removing dumb code or asking Claude to remove it eg "why do you have a separate..." Claude: "Good catch — there's no real reason...." and so on.
Where I find it incredible - learning new things, I recently started flutter/dart dev - I just ask Claude to tell me about the bits, or explaining things to me, it's truly revolutionary imho, I'm building things in flutter after a week without reading a book or manual. It's like a talking encyclopaedia, or having an expert on tap, do many people use it like this? or am I just out of the loop, I always think of Star Trek when I'm doing it. I architected / designed a new system by asking Claude for alternatives and it gave me an option I'd never considered to a problem, it's amazing for this, after all it's read all the books and manuals in the world, it's just a matter of asking the right questions.
I don't understand the "being more productive" part. Like, sure, LLMs make us iterate faster but our managers know we're using them! They don't naively think we suddenly became 10x engineers. Companies pay for these tools and every engineer has access to them. So if everyone is equally productive, the baseline just shifted up... same as always, no?
Mentioning LLM usage as a distinction is like bragging about using a modern compiler instead of writing assembly. Yeah it's faster, but so is everyone else's code...
Besides, I wouldn't brag about being more productive with LLMs because it's a double-edged sword: it's very easy to use them, and nobody is reviewing all the lines of code you are pushing to prod (really, when was the last time you reviewed a PR generated by AI that changed 20+ files and added/removed thousands of lines of code?), so you don't know what the long game of your changes is; they seem to work now but who knows how it will turn out later?
This is basically the same workflow I've come to adopt. I don't use any "pre-built" skills; mine are actually still .md files in the .claude/commands/ folder because that's when I started. The workflow is so good, I'm the bottleneck.
I've started to use git worktrees to parallelize my work. I spend so much time waiting...why not wait less on 2 things? This is not a solved problem in my setup. I have a hard time managing just two agents and keeping them isolated. But again, I'm the bottleneck. I think I could use 5 agents if my brain were smarter........or if the tools were better.
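For anyone who hasn't tried the worktree setup mentioned above, here is a minimal sketch of giving each agent its own checkout. It only builds the `git worktree add` command; the sibling-directory layout, the `agent/<task>` branch naming, and the example paths are assumptions made for illustration, not something from the article or this thread.

```python
from pathlib import PurePosixPath


def worktree_add_command(repo_root: str, task: str, base: str = "main") -> list[str]:
    """Build a `git worktree add` command for one isolated agent checkout.

    Assumptions (hypothetical, for this sketch only): worktrees live next to
    the main checkout, and each one gets a new branch named agent/<task>.
    """
    root = PurePosixPath(repo_root)
    # e.g. /work/myrepo -> /work/myrepo-fix-login
    path = root.parent / f"{root.name}-{task}"
    return [
        "git", "-C", str(root),
        "worktree", "add",
        "-b", f"agent/{task}",  # create a new branch for this task
        str(path),              # directory for the new working tree
        base,                   # commit-ish the new branch starts from
    ]


if __name__ == "__main__":
    for task in ("fix-login", "update-docs"):
        print(" ".join(worktree_add_command("/work/myrepo", task)))
```

Separate directories keep the agents' working files apart, but the branches still have to be merged and reviewed afterwards, so this does nothing for the "I'm the bottleneck" part.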
I am also a PM by day and I'm in Claude Code for PM work almost 90% of my day.
> What’s become more fun is building the infrastructure that makes the agents effective.
Solving new problems is a thing engineers get to do constantly, whereas building an agent infrastructure is mostly a one-ish time thing. Yes, it evolves, but I worry that once the fun of building an agentic engineering system is done, we’re stuck doing arguably the most tedious job in the SDLC, reviewing code. It’s like if you were a principal researcher who stopped doing research and instead only peer reviewed other people’s papers.
The silver lining is if the feeling of faster progress through these AI tools gives enough satisfaction to replace the missing satisfaction of problem-solving. Different people will derive different levels of contentment from this. For me, it has not been an obvious upgrade in satisfaction. I’m definitely spending less time in flow.
The amount of code changes I find acceptable, to simplify and shrink my code base, is now almost unbounded.
Overstating things of course. But paying off technical debt never felt so good. And the expected decrease in forward friction has never been so achievable so quickly.
Hello! OP here, a lot of comments have this common theme of wondering if this is overloading / context switching / the brain thrashing.
That helped me surface an important distinction on why it doesn't really happen for me. I think there are three parts to it:
1. I work on only one thing at a time, and try to keep chunks meaty
2. I make sure my agents can run a lot longer so every meaty chunk gets the time it deserves, and I'm not babysitting every change in parallel, that would be horrible! (how I do this is what this post focuses on)
3. New small items that keep coming up / bug fixes get their own thread in the middle of the flow when they do come up, so I can fire and forget, come back to it when I have time. This works better for me because I'm not also thinking about these X other bugs that are pending, and I can focus on what I'm currently doing.
What I had to figure out was how to adapt this workflow to my strengths (I love reviewing code and working on one thing at a time, but also get distracted easily). For my trade-offs, it was ideal to offload context to agents whenever a new thing pops up, so I continue focusing on my main task.
The # of PRs might look huge (and they are to me), but I'm focusing on one big chonky thing a day, the others are smaller things, which together mean progress on my product is much faster than it otherwise would be.
I have a little ai-commit.sh wired up as "send" in package.json, which describes my changes and commits them. Formatting has already been solved by linters. Neither my approach nor OP's is ground-breaking, but I think mine is faster: you can also run !p send (p is an alias for pnpm) from inside Claude, with no need for it to make a skill and create overhead.
Thinking about it, a PR skill is pretty much an antipattern; even telling the AI to just create a PR is faster.
I think some vibe coders should let AI teach them some CLI tooling.
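The ai-commit.sh script itself isn't shown in the thread, so the following is only a rough sketch of the idea (read the staged diff, have something describe it, commit with that message). The `describe` callable is a hypothetical stand-in for whatever model call the real script makes, and the function names are made up for this example.

```python
import subprocess
from typing import Callable


def staged_diff() -> str:
    """Return the staged diff of the repository in the current directory."""
    result = subprocess.run(
        ["git", "diff", "--staged"], capture_output=True, text=True, check=True
    )
    return result.stdout


def ai_commit(
    describe: Callable[[str], str],
    get_diff: Callable[[], str] = staged_diff,
    run: Callable[[list[str]], object] = subprocess.run,
) -> list[str]:
    """Describe the staged diff and commit it; returns the git command used.

    `describe` is hypothetical: it takes the diff text and returns a commit
    message. `get_diff` and `run` are injectable so the flow can be tested
    without touching a real repository.
    """
    diff = get_diff()
    if not diff.strip():
        raise ValueError("nothing staged to commit")
    message = describe(diff).strip()
    command = ["git", "commit", "-m", message]
    run(command)
    return command
```

As the comment says, something like this sits behind a single package.json script entry, so there is nothing for the agent to learn beyond running that one command.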
Oh look someone over glazing AI and its usefulness. I hope this is a real person authentically sharing their opinion and not some AI startup guerrilla marketing.
if you can't be bothered to write your own PR descriptions because it's drudgery, how can you expect others to read your (now-lengthier-because-AI) PR descriptions?
This is an honest question, as someone who is also now doing this.
I don't know if I am just in an unlucky A/B assignment or anything but I really don't understand people juggling multiple agent sessions. For me Opus 4.6 High performance went from unbelievable to mediocre. And this keeps happening making the whole agentic coding very unreliable and frustrating. I do use it but I have to babysit and I get overwhelmed even with a single session.
I'm very sceptical on how well AI can "read the full diff and summarise the changes properly".
A colleague has been using Claude for this exact purpose for the past 2-3 months. Left alone, Claude just kept spewing spammy, formulaic, uninteresting summaries. E.g. phrases like "updated migrations" or "updated admin" were frequent occurrences for changes in our Django project. On the other hand, important implementation choices were left undocumented.
Basically, my conclusion was that, for the time being, Claude's summaries aren't worthy of inclusion in our git log. They missed most things that would make the log message useful, and included mostly stuff that Claude could generate on demand at any time. I.e. spam.
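One way to catch that kind of summary before it reaches the log is a crude lint over the message. This is a hypothetical heuristic written for illustration, not a tool anyone in the thread uses: it flags lines that are nothing more than a generic verb plus one or two words, like the "updated migrations" example above.

```python
import re

# Hypothetical pattern: a generic verb followed by at most two words and nothing else.
FORMULAIC = re.compile(
    r"^(updated?|changed?|modified|fixed|added)\s+\w+(\s+\w+)?\.?$",
    re.IGNORECASE,
)


def formulaic_lines(message: str) -> list[str]:
    """Return the lines of a commit or PR message that look formulaic."""
    flagged = []
    for line in message.splitlines():
        # Drop list markers such as "- " or "* " before matching.
        text = line.strip().lstrip("-* ").strip()
        if text and FORMULAIC.match(text):
            flagged.append(text)
    return flagged


if __name__ == "__main__":
    example = (
        "Updated migrations\n"
        "- updated admin\n"
        "Switch to bulk_create because per-row saves timed out on large imports\n"
    )
    print(formulaic_lines(example))
```

It will produce false positives on short but legitimate lines, and it cannot tell whether an important implementation choice was left undocumented; it only makes the filler visible.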
Ah, another pro-AI coding post written by someone whose livelihood depends on promoting/selling AI-assisted coding products. Color me shocked. And they used AI to write the post itself.
As an outsider it seems like agentic coders get buried in the weeds of running agents in parallel and churning out commits. (Even after a sheepish “commits are a bad metric but”) And every week there is a new orchestration, something, who even cares.
Is that the end game? Well why can’t the agents orchestrate the agents? Agents all the way down?
The whole agent coding scene seems like people selling their soul for very shiny inflatable balloons. Now you have twelve bespoke apps tailored for you that you don’t even care about.
> The PR descriptions are more thorough than what I’d write
Why do people do this? Why do they outsource something that is meant to have been written by a human, so that another human can actually understand what that first human wanted to do, so why do people outsource that to AI? It just doesn't make sense.
I've been doing a lot of parallel work and it can be draining. It feels exciting to have 6 agents spinning on things, but unless you have very well scoped plans, you need to still check in frequently.
If you have the tokens for it, having a team of agents checking and improving on the work does help a lot and reduces the slop.
> /git-pr removed the friction of formatting - turning code changes into a presentable PR.
What I want from a PR is what's not in the patch, especially the end goal of the PR, or the reasoning for the solution represented by the changes.
> SWC removed the friction of waiting - the dead time between making a change and seeing it.
Not sure how that relates to Claude Code.
> The preview removed the friction of verifying changes - I could quickly see what’s happening.
How Claude is "verifying" UI changes is left very vague in the article.
> The worktree system removed the friction of context-switching - juggling multiple streams of work without them colliding.
Ultimately, there's only one (or two) main branches. All those changes need to be merged back together again and they need to be reviewed. Not sure how collisions and conflicts are miraculously solved.
> The PR descriptions are more thorough than what I’d write, because it reads the full diff and summarises the changes properly. I’d gotten so used to the drudgery that I’d stopped noticing it was drudgery.
Who are you creating PR descriptions for, exactly? If you consider it "drudgery", how do you think your coworkers will feel having to read pages of generic "AI" text? If reviewing can be considered "drudgery" as well, can we also offload that to "AI"? In which case, why even bother with PRs at all? Why are you still participating in a ceremony that was useful for humans to share knowledge and improve the codebase, when machines don't need any of it?
> My role has changed. I used to derive joy from figuring out a complicated problem, spending hours crafting the perfect UI. [...] What’s become more fun is building the infrastructure that makes the agents effective. Being a manager of a team of ten versus being a solo dev.
Yeah, it's great that you enjoy being a "manager" now. Personally, that is not what I enjoy doing, nor why I joined this industry.
Quick question: do you think your manager role is safe from being automated away? If machines can write code and prose now better than you, couldn't they also manage other machines into producing useful output better than you? So which role is left for you, and would you enjoy doing it if "manager" is not available?
Purely rhetorical, of course, since I don't think the base premise is true, besides the fact that it's ignoring important factors in software development such as quality, reliability, maintainability, etc. This idea that the role of an IC has now shifted into management is amusing. It sounds like a coping mechanism for people to prove that they can still provide value while facing redundancy.
So many pretend they are more productive but so few are able to articulate what they actually produced.
Some say features. Well, are they used? Are they beneficial in any way for our society or humanity? Or are we producing junk for the sake of producing?
How I'm Productive with Claude Code
(neilkakkar.com) 253 points by neilkakkar 22 hours ago | 156 comments
Comments
> I switched the build to SWC, and server restarts dropped to under a second.
What is SWC? The blog assumes I know it. Is it https://swc.rs/ ? or this https://docs.nestjs.com/recipes/swc ?
but a chart of commits/contribs is such a lousy metric for productivity.
It's about on par with the ridiculousness of LOC implying code quality.
Is that how it works? Do managers claim credit for the work of those below them, despite not doing the work?
I hope they also get penalised when a lowly worker does a bad thing, even if the worker is an LLM silently misinterpreting a vague instruction.
However, I agree with you that commits are a terrible (or an unreliable) metric; more commits do not necessarily equal higher productivity.
Meanwhile in the real world the expectations shift to normalise the 10x and your boss wants to know why your output isn’t 12x like that of Max
Oh really? I enjoy doing one thing at a time, with focus.
AI, as you're using it OP, isn't making you faster, it is making you work more for the same amount of money. You're burning yourself out for no reason.