Jules: An asynchronous coding agent (jules.google)
525 points by travisennis | 19 May 2025 | 232 comments

So, you can assign GitHub issues to this thing, and it can handle them, merge the results in, and mark the bug as fixed?
I kind of wonder what would happen if you added a "lead dev" AI that wrote up bugs, assigned them out, and "reviewed" the work. Then you'd add a "boss" AI that made new feature demands of the lead dev AI. Maybe the boss AI could run the program and inspect the experience in some way so it could demand more specific changes. I wonder what would happen if you just let that run for a while. Presumably it'd devolve into some sort of crazed noise, but it'd be interesting to watch. You could package the whole thing up as a startup simulator, and you could watch it like a little ant farm to see how their little note-taking app was coming along.
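A minimal sketch of what that ant farm could look like as code, in case anyone wants to build it: three LLM-backed roles wired into a loop. Everything here is hypothetical; `call_llm` stands in for whatever model client you would actually use, and the prompts are illustrative only.

```python
# Hypothetical "startup simulator": a boss agent demands features, a lead
# dev agent writes tickets and reviews the work, and worker agents implement
# them. call_llm is a placeholder for a real model client.

def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for an actual model call (Gemini, OpenAI, etc.)."""
    raise NotImplementedError

def boss(product_state: str) -> str:
    # The "boss" inspects the product and demands one new feature.
    return call_llm("You are a demanding startup founder.",
                    f"Current product:\n{product_state}\nDemand one new feature.")

def lead_dev(feature_request: str, codebase: str) -> list[str]:
    # The "lead dev" breaks the demand into ticket-sized tasks.
    tickets = call_llm("You are a pragmatic tech lead. Output one task per line.",
                       f"Feature: {feature_request}\nCodebase:\n{codebase}")
    return [t for t in tickets.splitlines() if t.strip()]

def worker(ticket: str, codebase: str) -> str:
    # A worker agent "implements" a ticket by rewriting the codebase.
    return call_llm("You are a junior developer. Return the full updated code.",
                    f"Task: {ticket}\nCodebase:\n{codebase}")

def run_ant_farm(iterations: int = 10) -> str:
    codebase = "# note-taking app, v0\n"
    for _ in range(iterations):
        demand = boss(codebase)
        for ticket in lead_dev(demand, codebase):
            candidate = worker(ticket, codebase)
            # The lead dev "reviews" the work before it is merged.
            verdict = call_llm("You are a skeptical reviewer. Reply APPROVE or REJECT.",
                               f"Task: {ticket}\nProposed code:\n{candidate}")
            if "APPROVE" in verdict:
                codebase = candidate  # merge
    return codebase
```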
I was interested. Clicked the try button, and it's just another waitlist. When will Google learn that the method that worked so well with Gmail doesn't work any more? There are so many shiny toys to play with now that I will have forgotten about this by tomorrow.
I decided to be an engineer rather than a manager because I didn't like people management. Now it looks like I'm forced to manage robots that talk like people. At least I can be as non-empathetic as I want to be. Unless a startup starts doing HR for AI agents, in which case I'm screwed.
Google’s ability to offer inference for free is a massive competitive advantage vs everyone else:
> Is Jules free of charge?
> Yes, for now, Jules is free of charge. Jules is in beta and available without payment while we learn from usage. In the future, we expect to introduce pricing, but our focus right now is improving the developer experience.

https://jules-documentation.web.app/faq
The copy, though: "Spend your time doing what you want to do!", followed by images of playing video games (I presume), riding a bicycle, reading a book, and playing table tennis.
I am cool with all of that but it feels like they're suggesting that coding is a chore to be avoided, rather than a creative and enjoyable activity.
Both Google and Microsoft have sensibly decided to focus on low-level, junior automation first rather than bespoke end-to-end systems. Not exactly breadth over depth, but rather reliability over capability. Several benefits from the agent development perspective:
- Less access required means lower risk of disaster
- Structured tasks mean more data for better RL
- Low stakes mean improvements in task- and process-level reliability, which is a prerequisite for meaningful end-to-end results on senior-level assignments
- Even junior-level tasks require getting interface and integration right, which is also required for a scalable data and training pipeline
Seems like we're finally getting to the deployment stage of agentic coding, which means a blessed relief from the pontification that inevitably results from a visible outline without a concrete product.
Notice how no one (up until now) mentioned "Devin" or compared it to any other AI agent?
It appears that AI moves so quickly that Devin was completely forgotten, or almost no one wanted to pay its original prices.
Here's the timeline:
1. Devin was $200 - $500.
2. Then Lovable, Bolt, GitHub Copilot, and Replit reduced their AI agent prices to $20 - $40.
3. Devin was then reduced to $20.
4. Then Cursor and Windsurf AI agents started at $18 - $20.
5. Afterwards, we also have Claude Code and OpenAI Codex Agents starting at around $20.
6. Then we have GitHub Copilot Agents embedded directly into GitHub and VS Code for just $0 - $10.
Now we have Jules from Google, which is... $0 (free).
Just as Google Search is free, the race to zero is only going to accelerate. It was a trap to begin with: only the large big tech incumbents will be able to keep cutting prices for a very long time.
Wow, it looks like Google and Microsoft timed their announcements for the same day, or perhaps one of them rushed their launch because the other company announced sooner than expected. These are exciting times!

https://github.blog/changelog/2025-05-19-github-copilot-codi...
These coding agents are coming out so fast I literally don't have time to compare them to each other. They all look great, but keeping up with this would be its own full time job. Maybe that's the next agent.
> Also, you can get caught up fast. Jules creates an audio summary of the changes.
This is an unusual angle. Of course Google can do this, since they have the tech behind NotebookLM, but I'm not sure what the value is of being told how your prompt was implemented.
Now that every company has a bot, I wish we had some way to better quantify the features.
For example, how is Google's "Jules" different from JetBrains' "Junie"? They both read much the same and, based on my experience with Junie, Jules seems to offer a similar experience: https://www.jetbrains.com/junie/
I really want to try out Google's new Gemini 2.5 Pro model that everyone says is so great at coding. However, the fact that Jules runs in cloud-based VMs instead of on my local machine makes it much less useful to me than Claude Code, even if the model were better.
The projects I work on have lots of bespoke build scripts and other stuff that is specific to my machine and environment. Making that work in Google's cloud VM would be a significant undertaking in itself.
> Jules creates a PR of the changes. Approve the PR, merge it to your branch, and publish it on GitHub.
Then who is testing the change? Even for a dependency update with good test coverage, I would still test the change.
What takes time when updating dependencies is not the number of lines typed but the time it takes to review the new version and test the output.
I'm worried that agents like this will promote bad practices.
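A minimal sketch of the part of that manual step that can at least be automated, assuming a Python project with pytest; `requests` is just an example of a bumped dependency, and the version number is made up:

```python
# Sketch: pin down the contract your code relies on so a breaking bump
# fails in CI before merge. "requests" and the version are examples only.

import importlib.metadata

from packaging.version import Version
import requests

def test_version_was_actually_bumped():
    # Guards against a PR that edits a manifest without updating the env.
    assert Version(importlib.metadata.version("requests")) >= Version("2.32.0")

def test_api_surface_we_depend_on():
    # The cheap, automatable slice of "testing the change": the symbols we
    # call still exist. Behavioral regressions still need the full suite.
    assert callable(requests.get)
    assert callable(requests.Session)
```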
Just my two cents, but I had a persistent issue with this web app and tried probably 50 different prompts to fix it across o3, 2.5 Pro, and 3.7, to no avail. I asked Jules to fix it and (although it took well over an hour because of the traffic) it one-shotted the issue. Feels like this is the next step in "thinking" with large enough repos. I like it.
Glad to see they're joining the game; there is so much work to do here. I've been using Gemini 2.5 Pro as an autonomous coding agent for a while because it is free. Their work with AlphaEvolve is also pushing the edge. I did a small write-up on AlphaEvolve with an agentic workflow here: https://toolkami.com/alphaevolve-toolkami-style/
I am really looking forward to "version bumps" that don't break the dependency tree, at the very least, which is something Dependabot almost gets right.
From a security use-case perspective, it would be great if it could bump libraries to fix most of the vulnerabilities without breaking my app. That's something no tool does today, i.e. being aware of both the code and the breaking changes.
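For what it's worth, here is a rough sketch of what "code and breaking-change aware" could mean mechanically: snapshot a library's public API surface before and after the bump, and fail if anything disappears or changes signature. The stdlib `json` module stands in for the bumped library; this only catches removed or renamed symbols, not behavioral changes.

```python
# Sketch: diff a module's public API surface across a version bump. Only
# catches removed/renamed symbols or changed signatures, not behavior.

import importlib
import inspect

def public_api(module_name: str) -> set[str]:
    """Collect public top-level names, with signatures where available."""
    mod = importlib.import_module(module_name)
    surface = set()
    for name in dir(mod):
        if name.startswith("_"):
            continue
        obj = getattr(mod, name)
        try:
            surface.add(f"{name}{inspect.signature(obj)}")
        except (TypeError, ValueError):
            surface.add(name)  # not callable, or no introspectable signature
    return surface

# Usage: snapshot before the bump, upgrade in a scratch env, snapshot again,
# and fail the PR if anything we import has vanished or changed shape.
before = public_api("json")  # stand-in for the library being bumped
after = public_api("json")   # re-run this after the upgrade
broken = before - after
assert not broken, f"potentially breaking changes: {broken}"
```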
Is the "asynchronous" bit important? How long does it take to do its thing?
My normal development workflow of ticket -> assignment -> review -> feedback -> more feedback -> approval -> merging is asynchronous, but it would be better synchronous. It's only asynchronous because the people I assign the work to don't complete it in seconds.
There doesn't appear to be a way to add files like .npmrc or .env that are not part of what gets pushed to GitHub, which makes this largely useless for most of my projects.
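One generic workaround (not a Jules feature, just a common pattern, and it assumes the VM exposes the needed values as environment variables): commit a small setup script that rebuilds the untracked files at the start of a run, so the secrets themselves never get pushed to GitHub. The file contents and variable names below are hypothetical.

```python
# Sketch: a committed setup script that rebuilds untracked secret files
# from environment variables. Filenames, templates, and variable names
# below are hypothetical examples.

import os
from pathlib import Path

TEMPLATES = {
    ".npmrc": "//registry.npmjs.org/:_authToken={NPM_TOKEN}\n",
    ".env": "DATABASE_URL={DATABASE_URL}\n",
}

for filename, template in TEMPLATES.items():
    try:
        # Fill each template from the environment and write the file.
        Path(filename).write_text(template.format_map(os.environ))
    except KeyError as missing:
        raise SystemExit(f"missing required environment variable: {missing}")
```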
This dev automation tech seems to be targeting the junior dev market, and it will lead to ever fewer junior dev roles. Fewer junior dev roles means fewer senior devs. For all the code-smart folks who live here, I find very little critical thinking about the consequences of this tech for the dev market and the industry in general. No, it won't take your job. And no, just because it doesn't affect you now does not mean it won't be bad for you in the near future. Do you want to spend your career BUILDING cool stuff or FIXING and REVIEWING AI codebases?
That’s the trajectory. Let’s stay sharp.
proceeds to list ALL coding tasks.
There are a million places to do dev that aren’t Microsoft, but you’d never know it from looking at app launches.
It’s almost like people who don’t use GitHub, Gmail, and Instagram are becoming second-class citizens on the web.
Why would I ever want this over Cursor? The sync thing is kinda cool, but I basically already do this with Cursor.
Codex and the Codex CLI are the best of what I have tested so far. Codex is really neat, as I can use it from the ChatGPT app.
Well, here's hoping it's better than Cursor. I doubt it, considering my experiences with Gemini have been awful, but I'm willing to give it a shot!