I just published an extensive review of the new feature, which is actually Claude Code Interpreter (the official name, bafflingly, is Upgraded file creation and analysis - that's what you turn on in the features page at least).
I reverse-engineered it a bit, figured out its container specs, used it to render a PDF join diagram for a SQLite database and then re-ran a much more complex "recreate this chart from this screenshot and XLSX file" example that I previously ran against ChatGPT Code Interpreter last night.
It looks to me like a variant of the Code Interpreter pattern, where Claude has a (presumably sandboxed) server-side container environment in which it can run Python. When you ask it to make a spreadsheet it runs this:
And then generates and runs a Python script. What's weird is that when you enable it in https://claude.ai/settings/features it automatically disables the old Analysis tool - which used JavaScript running in your browser. For some reason you can have one of those enabled but not both.
The new feature is being described exclusively as a system for creating files though! I'm trying to figure out if that gets used for code analysis too now, in place of the analysis tool.
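The tool call itself isn't quoted here, but the loop the Code Interpreter pattern implies is easy to sketch. This is a hypothetical harness, not Anthropic's actual implementation; a real deployment would execute the code inside a sandboxed, server-side container rather than on the host like this:

```python
import os
import subprocess
import sys
import tempfile

def run_python_tool(code: str) -> str:
    """Execute model-generated Python and return its combined output.

    Stand-in for the sandboxed container: here we just use a
    throwaway directory and a 30-second timeout.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "script.py")
        with open(script, "w") as f:
            f.write(code)
        result = subprocess.run(
            [sys.executable, script],
            cwd=workdir, capture_output=True, text=True, timeout=30,
        )
        return result.stdout + result.stderr

# The harness feeds stdout/stderr back to the model so it can iterate
# on errors; a full version would also collect files the script wrote.
print(run_python_tool("print(2 + 2)"))  # prints 4
```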
Anyone else having serious reliability issues with artifact editing? I find that artifacts quite often get "stuck": the LLM tries to edit the artifact but the state of the artifact doesn't change. It seems the LLM is silently failing to apply the edits while believing it has made them. The way to resolve this is to ask Claude to make a new artifact, which then has all the changes Claude thought it was making. But you have to do this relatively often.
My experience is similar. At first Claude was super smart and got even very complicated things right. Now even super simple tasks are almost impossible to finish correctly, even if I really chop things into small steps. It's also much slower on a Pro account than it was a few weeks ago.
For the past two to three weeks I've noticed Claude just consistently lagging or potentially even being throttled for pretty minor coding or CLI tasks. It'll basically stop showing any progress for at least a couple minutes. Sometimes exiting the query and re-trying gets it to work but other times it keeps happening. I pay for Pro so I don't think it's just API rate limiting.
Would appreciate if that could be fixed but of course new features are more interesting for them to prioritize.
To everyone who has been feeling like their Max subscription is a waste of money: give GLM 4.5 a try. I use it with Claude Code daily on the $3 plan and it has been great.
Oh, nice! One of my biggest issues with mainstream LLMs/apps was that working on long text (an article, script, documentation, etc.) is limited to a copy-paste dance. That's especially frustrating compared to AI coding assistants, which can work on code directly in the file system while using the internet and MCPs at the same time.
I just tried this new feature to work on a text document in a project, and it's a big difference. Now I really want to have this feature (for text at least) in ChatGPT to be able to work on documents through voice and without looking at the screen.
Does Microsoft Copilot do this already? Isn't it integrated into Windows and MSFT Office products? Has it been working out for Copilot? Is it helpful? Adoption rates of AI are interesting to say the least.
I don't have access to this yet, so can someone who does tell me whether:
It can take a .PDF with a single table containing, say, a list of food items and prices, plus a .docx in the same folder with a table of, say, prices and calories. Can this thing then, in one shot, produce a .xlsx with the items and calories and save it to the same directory? It really doesn't matter what the lists are of; just keep it very simple A=B, B=C, therefore A=C stuff.
Because, strangely enough, that's pretty much my definition of AGI.
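For what it's worth, once the two tables have been extracted from the PDF and the .docx, the A=B, B=C join being asked for is a one-line relational merge. A minimal sketch with hypothetical stand-in data (pandas assumed):

```python
import pandas as pd

# Hypothetical stand-ins for tables extracted from the PDF
# (item/price) and the .docx (price/calories).
items = pd.DataFrame({"item": ["apple", "bagel"], "price": [1.00, 2.50]})
nutrition = pd.DataFrame({"price": [1.00, 2.50], "calories": [95, 289]})

# A=B, B=C, therefore A=C: merge on the shared price column.
merged = items.merge(nutrition, on="price")[["item", "calories"]]
print(merged.to_dict("records"))
# [{'item': 'apple', 'calories': 95}, {'item': 'bagel', 'calories': 289}]

# Saving to .xlsx is one more call (requires openpyxl):
# merged.to_excel("items_calories.xlsx", index=False)
```

The hard part, of course, is the extraction step, not the join.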
This will either result in a lot of people being able to sleep more, or an absolute avalanche of crap is about to be released upon society.
A lot of the people I graduated with spent their 20s making powerpoint and excel. There would be people with a master's in engineering getting phone calls at 1am, with an instruction to change the fonts on slide 75, or to slightly modify some calculation. Most of the real decision making was, funnily enough, not based on these documents. But it still meant people were working 100 hour weeks.
I could see this resulting in the same work being done in a few minutes. But I could also see it resulting in the MDs asking for 10x the number of slide decks.
The security concerns here are really significant. In the section[1] on security, they write "we recommend you monitor Claude while using this feature." This borders on irresponsible IMO. Monitor what exactly? How should we monitor? What logs and metrics are exposed for security monitoring? How would a user recognize suspicious patterns...?
A Linux desktop version of Claude would be great. Given that it's basically just a Tauri app, it should be pretty trivial...
You could even ask claude code with scopecraft/cmd to plan it all out and implement this.
For Anthropic, the excuse that there's not enough time to implement this is a pretty glaring admission about the state and success of AI-assisted development.
‘…now has access to a server-side container environment’
Headline demonstrates why SWEs don't have to worry about vibe coders eating their lunch. Vibe coders don't know what a container is, why it would be good for it to be in the context of an environment (what's an environment?), or why it should be server-side, for that matter. Now if there were a course that taught all this kind of architectural tradecraft that isn't covered in university CS courses (but bootcamps..?), then adding vibe coding alongside might pose a concern, at least until the debugging technical debt comes due. But by then the vibe coder has validated a new market on the back of their v0, so thank them for the fresh revenue streams.
I tested this feature out today, applying the same prompt and CSV data to both Claude Opus 4.1 and GPT-5-Thinking. They both chugged away writing Pandas code and produced similar output. It's nice to have another option for data analysis to act as a second opinion on GPT, if nothing else.
The real money is in collecting and reselling user data, so if Joe gives his recent finances to "Anthropic" in order to plan a trip to Italy (this is one of the examples in the submission), perhaps credit rating agencies would like a copy.
Finally they figure out that there is no money or interest in code-plagiarizing apps!
I noticed the other day that ChatGPT started preferring to provide me with a download link for code rather than putting it up in Canvas. It also started offering me diffs, but as I just write fairly basic data-munging scripts for neuroimaging analyses, I don't like to dive too deep into the coding toolboxes/chains... copy-paste is easy. Although I would like versioning without making copies of my script for backup.
Not Claude specific, but related to the agent model of things...
I've been paying $10/month for GitHub Copilot, which I use via Microsoft's Visual Studio Code, and about a month ago, they added ChatGPT5 (preview), which uses the agent model of interaction. It's a qualitative jump that I'm still learning to appreciate in full.
It seems like the worst possible thing, in terms of security, to let an LLM play with your stuff, but I really didn't understand just how much easier it could be to work with an LLM if it's an agent. Previously I'd end up with a blizzard of Python error messages and just give up on a project; now it fixes its own mess. What a relief!
Is it able to process a prompt on each file in a folder-full of files and then return the collated results?
That's the functionality which I could use for my day job, but I'm not finding an LLM which directly affords that capability (without programming or other steps which are difficult on my work computer).
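None of the mainstream apps seem to expose this directly, but for reference the glue is small if scripting ever becomes an option. A sketch with a stubbed-out model call (the helper names are made up; swap a real API client in behind `ask_llm`):

```python
from pathlib import Path

def ask_llm(prompt: str, text: str) -> str:
    # Placeholder "model" so the sketch runs offline: count the words.
    # A real version would call an LLM API with the prompt and file text.
    return f"{len(text.split())} words"

def collate(folder: str, prompt: str) -> dict[str, str]:
    """Run the same prompt against every .txt file in a folder and
    collect the per-file results."""
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        results[path.name] = ask_llm(prompt, path.read_text())
    return results
```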
Wasn't this already doable by instructing the LLM to output PDF, XML, or PowerPoint markup etc. and writing the glue layer (with AI assistance)? It's not nothing, but it's also not that difficult. I don't see how Claude's version of this can be much better.
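The DIY glue layer described above can indeed be quite small. One hypothetical version: instruct the model to emit a trivial slide markup, parse it, and hand the result to a library like python-pptx:

```python
def parse_slides(markup: str) -> list[dict]:
    """Parse a simple markup an LLM could be instructed to emit:
    '# Title' starts a slide, '- text' adds a bullet to it."""
    slides = []
    for line in markup.splitlines():
        line = line.strip()
        if line.startswith("# "):
            slides.append({"title": line[2:], "bullets": []})
        elif line.startswith("- ") and slides:
            slides[-1]["bullets"].append(line[2:])
    return slides

# Feeding the parsed structure into python-pptx is a few more lines:
# from pptx import Presentation
# prs = Presentation()
# for s in parse_slides(markup):
#     slide = prs.slides.add_slide(prs.slide_layouts[1])
#     slide.shapes.title.text = s["title"]
#     slide.placeholders[1].text = "\n".join(s["bullets"])
# prs.save("deck.pptx")
```

The difference with Claude's version is presumably just that this loop, plus the libraries, now runs server-side without the user writing any glue.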
If the ultimate goal for Claude is to remove the human from the process (the AI can do everything), what's the point of having these files? If they are just going to be fed back into a model to interpret them, wouldn't it be better to use something simpler and easier to parse?
A sign of changing strategy? Claude has been the favourite of engineers, and it seems it's now trying to win back the general consumer market, where ChatGPT has taken the majority. But at the cost of Claude Code? Codex is like a shark chasing CC nowadays.
They need to focus on fixing reliability first. Their systems constantly go down and it appears they are having to quantise the models to keep up with demand, reducing intelligence significantly. New features like this feel pointless when the underlying model is becoming unusable.
Now we see where these AI foundation companies are heading. They are literally building the next operating system to replace the old gatekeepers, much like Netscape tried to do with Microsoft in the '90s.
Wow, that's like...a huge deal. It's a major feat of engineering when some software can create and edit files. That's like half of CRUD! Seems like they are really advanced, like magic!
Claude now has access to a server-side container environment
(anthropic.com) | 654 points by meetpateltech | 9 September 2025 | 342 comments
Comments
Here's my review: https://simonwillison.net/2025/Sep/9/claude-code-interpreter...
It can actually drive Emacs itself: creating buffers, being told not to edit the buffers and simply respond in the chat, etc.
I actually _like_ working with efrit vs other LLM integrations in editors.
In fact I kind of need to have my Anthropic console up to watch my usage... whoops!
At the start of summer you could still ask for any kind of file as an artifact, and Claude would produce it and let you download it.
Then they changed it so that artifacts were only ever pages that you could share or view in the app.
Yes this is going to transform how I use Claude... BACK to the way I used it in June!
As a user this post is frustrating as hell to read because I've missed this feature so much, but at the same time thanks for giving it back I guess?
All SaaS projects building on it to resell functionality will go away, because there will be no point in paying the added costs.
I'm on the $100 Max plan, and I would even buy two $200 plans if Opus would stop randomly being dumb. Especially after 7am ET.
Hope not.
Something with OAuth authentication.
Our org isn't interested in running a local, unofficial MCP server and having users create their own API keys.
ChatGPT can package up files as a download.
Both Gemini and ChatGPT accept zip files with lots of files in them.
Claude does neither of those things.
Malware writers are rejoicing!