The post is light on details, and I agree with the sentiment that it reads like marketing. That said, Opus 4.6 is actually a legitimate step up in capability for security research, and the red team at Anthropic – who wrote this post – are sincere in their efforts to demonstrate frontier risks.
Opus 4.6 is a very eager model that doesn't give up easily. Yesterday, Opus 4.6 took the initiative to aggressively fuzz a public API of a frontier lab I was investigating, and it found a real vulnerability after 100+ uninterrupted tool calls. That would have required lots of prodding with previous models.
If you want to experience this directly, I'd recommend recording network traffic while using a web app, and then pointing Claude Code at the results (in Chrome, this is Dev Tools > Network > Export HAR). It makes for hours of fun, but it's also a bit scary.
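Before handing an exported HAR to a model, it can help to see which endpoints were actually captured. Here is a minimal Python sketch, assuming the standard HAR layout (`log.entries[].request.method` and `.url`); the function name `summarize_har` is made up for illustration and is not part of any tool mentioned here.

```python
import json
import sys
from collections import Counter
from urllib.parse import urlsplit


def summarize_har(har: dict) -> list[tuple[str, str, str, int]]:
    """Return (method, host, path, count) rows, most frequent first.

    Hypothetical helper: assumes the standard HAR structure, where each
    entry under log.entries has a request with a method and a url.
    """
    counts: Counter = Counter()
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        parts = urlsplit(request.get("url", ""))
        # Query strings are dropped so repeated calls to one endpoint group together.
        counts[(request.get("method", "?"), parts.netloc, parts.path)] += 1
    return [(m, h, p, n) for (m, h, p), n in counts.most_common()]


if __name__ == "__main__" and len(sys.argv) > 1:
    # utf-8-sig tolerates a byte-order mark if the exporter wrote one.
    with open(sys.argv[1], encoding="utf-8-sig") as f:
        for method, host, path, n in summarize_har(json.load(f)):
            print(f"{n:4d}  {method:6s} {host}{path}")
```

Run it with the exported file as the only argument. Keep in mind that HAR exports can contain cookies and auth tokens, so treat the file as sensitive.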
Glad to see that they brought in humans to validate and patch vulnerabilities, although I really wish they had linked to the actual patches. Here's what I could find:
Grepping for strcat() is at the "forefront of cybersecurity"? The other one that applied a GitHub comment to a different location does not look too difficult either.
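For reference, "grepping for strcat()" amounts to something like the sketch below. It is deliberately naive and is not a claim about what the post's authors ran: the name `find_risky_calls` and the extra functions beyond `strcat` are assumptions for illustration, and it does not skip comments or string literals.

```python
import re

# Assumed list of unbounded C string functions; only strcat comes from the thread.
RISKY_CALL = re.compile(r"\b(strcat|strcpy|sprintf|gets)\s*\(")


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, function_name) for each risky call in C source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in RISKY_CALL.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

The word boundary keeps bounded variants such as `strncat` and `fgets` out of the results. A match is only a lead: whether it is exploitable depends on the buffer sizes and the inputs that reach it.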
Everything that comes out of Anthropic is just noise but their marketing team is unparalleled.
Evaluating and mitigating the growing risk of LLM-discovered 0-days
(red.anthropic.com) 45 points by lebovic 5 February 2026 | 14 comments
Comments
https://clang-analyzer.llvm.org
Alternatively, testing these projects with ASan enabled:
https://clang.llvm.org/docs/AddressSanitizer.html
https://cgit.ghostscript.com/cgi-bin/cgit.cgi/ghostpdl.git/c...
https://github.com/OpenSC/OpenSC/pull/3554
https://github.com/dloebl/cgif/pull/84
Yawn.