I have been developing software since the late 80s, mostly CAM software for metal cutting machines, and I have been refereeing tabletop roleplaying games like Dungeons & Dragons since the late 70s.
I get the power of LLMs, and I do find them useful. But I find them useful in much the same way I find a really good set of random tables useful, or a good set of rules for procedurally generating something like a star sector for a science fiction campaign.
For my day job developing software, and for the RPG campaigns and books I run and publish today, LLMs are, in many cases, random tables on steroids. After using them for two years, even with all their improvements, I am continually reminded by the results I get that, at the heart of it, I am still dealing with what amounts to randomly generated content.
Yes, I know it is more accurate to call the process probabilistic rather than random. And yes, somebody can construct a technically deterministic setup with fixed weights, fixed seeds, fixed sampling parameters, and a frozen runtime environment. But that is like saying you can recreate a rainstorm if you get a thousand butterflies to flap their wings in exactly the right way. It may be technically true, but it is not how the technology behaves in normal day-to-day use.
For practical purposes, given the same prompt and the same apparent starting conditions, the result can differ each time you use a model. The outputs will often be highly correlated, and often useful, but they are not deterministic software in the ordinary sense.
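To make the random-table comparison concrete, here is a minimal sketch in Python. It is a toy, not how any real model is implemented: the table entries, weights, and function names are all hypothetical. It only illustrates the point above: pin the seed and the rolls repeat exactly, leave it unpinned (the everyday case) and the same "prompt" can come back different each time.

```python
import random

# Hypothetical weighted encounter table (entry, weight); not from any published game.
ENCOUNTER_TABLE = [
    ("merchant caravan", 5),
    ("pirate raider", 3),
    ("derelict hulk", 1),
    ("patrol cutter", 1),
]

def roll(table, rng):
    """Pick one entry, with probability proportional to its weight."""
    entries, weights = zip(*table)
    return rng.choices(entries, weights=weights, k=1)[0]

def generate(table, n, seed=None):
    """Roll n times. seed=None draws from system entropy, like normal use."""
    rng = random.Random(seed)
    return [roll(table, rng) for _ in range(n)]

# Same table, same seed: identical results (the "frozen" setup).
pinned_a = generate(ENCOUNTER_TABLE, 5, seed=42)
pinned_b = generate(ENCOUNTER_TABLE, 5, seed=42)
print(pinned_a == pinned_b)

# Same table, no seed: correlated by the weights, but free to differ per run.
print(generate(ENCOUNTER_TABLE, 5))
```

The unseeded rolls still favor the heavily weighted entries, which is the "highly correlated, often useful, but not deterministic" behavior described above.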
So far, I am failing to see how the inherent probabilistic nature of the technology can be fully overcome. I understand how we got to where we are today from older neural net technology, including the systems used for vision and sound. What we have now can be very useful. But my view is that it is being badly oversold and overhyped. Its probabilistic nature is being vastly underestimated, and that is a major reason for much of the weirdness and many of the failures we keep seeing.
In tabletop roleplaying, there have been times when hobbyists relied too much on procedurally generated content and ultimately got burned by it, either through campaigns that were not as fun or products that were subpar. Each time, the lesson was the same: there is no substitute for human judgment.
Any workflow or technology incorporating LLMs has to keep humans in the loop, and not merely as rubber stamps. The human has to remain the primary decision maker.
It started failing two days ago, when it suddenly couldn't access Gmail threads reliably. Then it started popping up warnings that I was over quota when I wasn't. It even let me use Fable briefly, or pretended to. Meanwhile search finally started working, so there's that.
Out of desperation, I moved to ChatGPT and it's working better than I remember. All these companies are playing games under load, under failure. No wonder we can't agree on what's good for what.
Actual 90d uptime: 97.6838% (calculated by Codex from live data)
Computed from the page’s own data for 2026-03-26 through 2026-06-23:
- Partial outage: 43h 15m 1s
- Major outage: 6h 46m 48s
- Total affected time: 50h 1m 49s
- Major-only uptime: 99.6861%
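For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes the window is the 90 days from 2026-03-26 through 2026-06-23 inclusive, and that partial outages count as full downtime for the headline figure; the function and variable names are mine, not anything from the status page.

```python
from datetime import date, timedelta

# Figures quoted in the comment above.
window_start = date(2026, 3, 26)
window_end = date(2026, 6, 23)  # treated as inclusive (assumption)
partial_outage = timedelta(hours=43, minutes=15, seconds=1)
major_outage = timedelta(hours=6, minutes=46, seconds=48)

def uptime_percent(downtime, window):
    """Share of the window not covered by downtime, as a percentage."""
    return 100.0 * (1.0 - downtime / window)

# Add one day so that both endpoint dates are counted.
window = (window_end - window_start) + timedelta(days=1)
total_affected = partial_outage + major_outage

print(window.days)                                        # 90
print(total_affected.total_seconds())                     # 180109.0 (50h 1m 49s)
print(f"{uptime_percent(total_affected, window):.4f}%")   # 97.6838%
print(f"{uptime_percent(major_outage, window):.4f}%")     # 99.6861%
```

Both percentages match the numbers posted, so the 97.68% figure is simply what you get when partial outages are not discounted.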
I signed up for a paid plan on Claude just 3 hours ago for the first time and was scratching my head over how that thing gets praised so much if I can't even send a question half of the time.
I have two sessions going. One is fine, one keeps timing out. Both are Opus 4.8 in Claude Code in the terminal. They must be routed to different infra that isn't equally impacted.
I don’t prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops.
— Boris Cherny, head of Claude Code
Reliability is a direct reflection of the quality of the underlying infrastructural code. If even Anthropic, the company with the world's best agentic vibecoders, has horribly unreliable infrastructure, it really says something about the quality of the world's best agentically produced code.
Protip: in the olden days we used to be able to read and write code ourselves. Worth trying while Claude is down! You might have fun and learn something!
Imagine a future where Anthropic holds your company hostage because no one can code properly by hand anymore, and demands a 200% higher price for usage.
Incredible how we can claim productivity increases when it's either Claude or GitHub shitting the bed every other day. It must even itself out to a net neutral gain in the long term.
Hey you. Touch grass. Go outside. If a minor downtime of a developer tool triggers you, it means you likely have heavy anxiety. Don’t worry about it and calm down.
Anthropic has massive capacity issues due to massive user growth. It happens often when EU and US work hours overlap. They have smart people working on it. Don't waste your energy complaining.
It would be hilarious if they don't know how to fix it because this was built by "running loops calling Claude" and they haven't the faintest idea of the present underlying architecture.
I request an official statement from Anthropic explaining how they're going to limit outages in the future. "Elevated errors" almost always means it's down for me, and I can't be that unlucky, statistically speaking. It seems that Anthropic does not have a good grip on the ops side of things.
Elevated error rate across multiple models
(status.claude.com) | 188 points by rob 6 hours ago | 241 comments
> Optimize your bottom line for token spending so you collect $$$$
> Release Ultracode feature that optimizes for token spending (a.k.a. Dynamic Workflows)
> Tokenmaxxing achieved + 529 Overloaded unsustainable APIs everywhere
This video, wow: https://www.threads.com/@founder__growth/post/DZz_9Ikj3Wx
ClaudeCode still has 99.27% uptime
ClaudeCowork has 99.52% uptime
ClaudeForGovernment has 99.93% uptime
Today is the Latvian holiday of Jāņi, to mark the passage of the summer solstice: https://en.wikipedia.org/wiki/J%C4%81%C5%86i
Grab yourselves some beer or beverage of choice and some cheese (we usually have caraway cheese), alongside skewered meat and get some rest!
I mean, what else am I going to do while Claude is down, write code manually, like they did in the 90s or something?
"it's look like when the lights turning off, we return to socialize lol"
What can your company do?
Cheers
P.S. If you say you're still capable of developing software without the Internet, you're lying. Perhaps to yourself.
:)