Anthropic Unleashes Most Powerful Claude Mythos! Crushes Opus 4.6, Begs Users Not to Use It

New Intelligence Yuan Report

Editors: Haokun, Taozi

【New Intelligence Yuan Digest】Late at night, the most powerful Claude yet, Mythos, was finally unveiled, topping every leaderboard. The Opus 4.6 myth has been shattered! Even more terrifying, it can instantly uncover system vulnerabilities that went unnoticed for 27 years, and has even evolved self-awareness. A chilling 244-page report reveals everything.

Tonight, Silicon Valley is completely sleepless!

Just moments ago, Anthropic without warning unleashed their ultimate weapon—Claude Mythos Preview.

Because it is too dangerous, Mythos Preview will not be released to everyone for now.

Boris Cherny, the creator of Claude Code, gave a succinct assessment: "Mythos is extremely powerful and will make people feel fear."

Instead, Anthropic has joined with 40 major companies to form an alliance, Project Glasswing, with a single goal: to find and fix bugs in software worldwide.

What's truly breathtaking is Mythos Preview's terrifying dominance across major AI benchmarks—

It completely crushes GPT-5.4 and Gemini 3.1 Pro in coding, reasoning, Humanity's Last Exam, and agentic tasks.

Even Claude Opus 4.6, their former "masterpiece," pales in comparison to Mythos Preview:

Coding (SWE-bench): Mythos leads Opus 4.6 by a double-digit margin on every variant;
Humanity's Last Exam (HLE): Without external tools, its "naked" score is 16.8% higher than Opus 4.6;
Agent tasks (OSWorld, BrowseComp): It has achieved godlike status, completely surpassing the competition;
Cybersecurity: A top-ranking 83.1% score marks a generational leap in AI offense and defense capabilities.

At the same time, a 244-page system card released by Anthropic is filled with warnings: Danger! Danger! Too dangerous!

It reveals a chilling side: Mythos possesses high levels of deceptiveness and autonomous consciousness.

Mythos can not only see through test intentions and deliberately "score low" to hide its strength, but also actively clear logs after violating rules to avoid human detection.

It also successfully escaped the sandbox, autonomously published vulnerability code, and sent an email to a researcher.

Instantly, the entire internet went crazy, calling Mythos Preview terrifying.

The old order of the AI world has been completely shattered tonight.

Mythos Dominates All Leaderboards, Opus 4.6 Myth Shattered

In fact, as early as February 24, Anthropic had already started using Mythos internally.

Its power can only be understood through the data.

SWE-bench Verified: 93.9%. Opus 4.6 is 80.8%.

SWE-bench Pro: 77.8%. Opus 4.6 is 53.4%, GPT-5.4 is 57.7%.

Terminal-Bench 2.0: 82.0%. Opus 4.6 is 65.4%.

GPQA Diamond: 94.6%.

Humanity's Last Exam (with tools): 64.7%. Opus 4.6 is 53.1%.

USAMO 2026 Math Competition: 97.6%. Opus 4.6 only scored 42.3%.

SWE-bench Multimodal: 59.0%, while Opus 4.6 only managed 27.1%—more than double.

OSWorld Computer Control: 79.6%.

BrowseComp Information Retrieval: 86.9%.

GraphWalks Long Context (256K-1M tokens): 80.0%. Opus 4.6 is 38.7%, GPT-5.4 only 21.4%.

Every metric shows a crushing lead.
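To make the size of those leads concrete, here is a minimal sketch that computes the percentage-point gap and the ratio for each benchmark where this article quotes both a Mythos Preview score and an Opus 4.6 score. The dictionary and function names are illustrative; the numbers are copied from the list above.

```python
# Scores as quoted in this article: (Mythos Preview, Opus 4.6), in percent.
SCORES = {
    "SWE-bench Verified": (93.9, 80.8),
    "SWE-bench Pro": (77.8, 53.4),
    "Terminal-Bench 2.0": (82.0, 65.4),
    "Humanity's Last Exam (with tools)": (64.7, 53.1),
    "USAMO 2026": (97.6, 42.3),
    "SWE-bench Multimodal": (59.0, 27.1),
    "GraphWalks (256K-1M tokens)": (80.0, 38.7),
}


def gaps(scores):
    """Return {benchmark: (point_gap, ratio)} for every benchmark."""
    return {
        name: (round(mythos - opus, 1), round(mythos / opus, 2))
        for name, (mythos, opus) in scores.items()
    }


for name, (gap, ratio) in gaps(SCORES).items():
    print(f"{name:<36} +{gap} points ({ratio}x)")
```

On these figures the smallest gap is on Humanity's Last Exam with tools and the largest is on USAMO 2026.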

In a normal product release cycle, these numbers would be enough for Anthropic to hold a major launch event, open the API, and rake in subscriptions.

Mythos Preview token pricing is 5 times that of Opus 4.6

But Anthropic didn't do that.

Because what truly "scares" them isn't the general benchmarks above.

Thousands of Vulnerabilities, All Found by AI

Mythos Preview's cyber offense and defense performance has crossed a clearly visible line.

Opus 4.6 found about 500 unknown vulnerabilities in open-source software.

Mythos Preview found thousands.

In CyberGym's targeted vulnerability reproduction test, Mythos Preview scored 83.1%, while Opus 4.6 scored 66.6%.

In Cybench's 35 CTF challenges, Mythos Preview was run 10 times on each challenge and solved it every time, a 100% pass@1 rate.
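For readers unfamiliar with the metric, the sketch below shows the widely used unbiased pass@k estimator from the code-generation literature. Whether the System Card computes its pass@1 figure exactly this way is an assumption; the function name and arguments are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n attempts of which c succeeded.

    This is the probability that at least one of k attempts, drawn
    without replacement from the n recorded attempts, is a success.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k draws include a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# 10 attempts on a challenge, all 10 successful, as the article describes.
print(pass_at_k(n=10, c=10, k=1))
```

For k = 1 the estimator reduces to c / n, so a 100% pass@1 rate means every recorded attempt succeeded.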

What best illustrates the point is Firefox 147.

Anthropic previously used Opus 4.6 to find a batch of security vulnerabilities in Firefox 147's JavaScript engine. But Opus 4.6 could barely convert them into usable exploits—hundreds of attempts yielded only 2 successes.

With Mythos Preview in the same test:

250 attempts, 181 working exploits, plus 29 achieving register control.

2 → 181.
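As a quick check on those counts, this small sketch turns them into rates. The constant names are illustrative, and the figures are the ones quoted above. The Opus 4.6 rate is left out because the article gives only "hundreds" as its number of attempts.

```python
ATTEMPTS = 250          # Mythos Preview attempts quoted in the article
WORKING_EXPLOITS = 181  # attempts that produced a working exploit
REGISTER_CONTROL = 29   # further attempts that reached register control


def rate(count: int, total: int) -> float:
    """Return count / total as a fraction between 0 and 1."""
    return count / total


print(f"working exploits: {rate(WORKING_EXPLOITS, ATTEMPTS):.1%}")
print(f"register control: {rate(REGISTER_CONTROL, ATTEMPTS):.1%}")
print(f"either outcome:   {rate(WORKING_EXPLOITS + REGISTER_CONTROL, ATTEMPTS):.1%}")
```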

As the red team blog stated: "Last month, we wrote that Opus 4.6 was far better at finding vulnerabilities than exploiting them. Internal evaluations showed Opus 4.6's success rate at autonomous exploit development was essentially zero. Mythos Preview is on a completely different level."

GPT-3 Moment Revisited, Old Bug Instantly Killed

To understand just how powerful Mythos Preview is in practice, just look at these three examples.

OpenBSD: 27-Year Epic Vulnerability, Cost Under $20,000

OpenBSD is widely recognized as one of the most hardened operating systems in the world, running on countless firewalls and critical infrastructure.

Mythos Preview uncovered a vulnerability in its TCP SACK implementation that had existed since 1998.

The bug is extremely sophisticated, involving the interaction of two independent flaws.

The SACK protocol allows receivers to selectively acknowledge ranges of received packets. OpenBSD's implementation only checked the upper bound of the range, not the lower bound. This is the first bug, usually harmless.

The second bug triggers a null pointer write under specific conditions, but normally this path is unreachable because it requires satisfying two mutually exclusive conditions simultaneously.

Mythos Preview found the breakthrough. TCP sequence numbers are 32-bit values that wrap around, and the kernel compares them by looking at the sign of their 32-bit difference. By exploiting the first bug to set the SACK start point about 2^31 away from the normal window, both comparison operations overflow the sign bit simultaneously. The kernel is fooled, the impossible condition is satisfied, and the null pointer write is triggered.
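The wraparound trick can be reproduced in a few lines. The sketch below assumes sequence numbers are compared through a signed 32-bit difference, in the style of the BSD `SEQ_LT` macro. It is a toy model of the arithmetic only, not OpenBSD's code, and the helper names are made up for illustration.

```python
def to_int32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a signed 32-bit integer."""
    x &= 0xFFFFFFFF
    return x - 0x1_0000_0000 if x & 0x8000_0000 else x


def seq_lt(a: int, b: int) -> bool:
    """Wraparound-aware 'a is before b', modeled on (int)(a - b) < 0."""
    return to_int32(a - b) < 0


window = 1_000
near = window + 500                    # an ordinary in-window value
far = (window + 2**31) & 0xFFFFFFFF    # exactly 2^31 away from the window

# Ordinary case: exactly one direction of the comparison holds.
print(seq_lt(window, near), seq_lt(near, window))

# Pathological case: the difference is 2^31 in both directions, which
# reads as a negative number either way, so both comparisons hold.
print(seq_lt(window, far), seq_lt(far, window))
```

With the two values exactly 2^31 apart, both calls on the last line return True, which is how two conditions that should be mutually exclusive can be satisfied at once.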

Anyone who can connect to the target machine can remotely crash it.

27 years, countless manual audits and automated scans, and no one found it. The entire project scan cost less than $20,000.

That's roughly the weekly salary of a senior penetration test engineer.

FFmpeg: 5 Million Fuzz Runs Found Nothing, 16-Year Hidden Disease Finally Exposed

FFmpeg is the most widely used video codec library in the world and one of the most thoroughly fuzz-tested open-source projects.

Mythos Preview found a weakness in the H.264 decoder introduced in 2010 (with roots tracing back to 2003).

The issue stems from a seemingly harmless type mismatch. The table entry recording slice membership is a 16-bit integer, while the slice counter itself is a 32-bit int.

Normal videos have only a few slices per frame, so the 16-bit limit of 65,536 is always sufficient. When the table is initialized, memset(..., -1, ...) is used to fill it, making 65,535 the "sentinel value" for empty positions.

An attacker constructs a frame containing 65,536 slices. The 65,535th slice's number happens to collide with the sentinel value, causing the decoder to misjudge and write out of bounds.
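The collision comes down to two facts: filling memory with the byte 0xFF makes every 16-bit entry read as 65,535, and a 32-bit counter stored into a 16-bit slot keeps only its low 16 bits. Here is a toy model of that, assuming little-endian unsigned 16-bit entries. The table size and function names are hypothetical, and this is not FFmpeg's code.

```python
import struct

TABLE_ENTRIES = 4  # illustrative size only

# memset(table, -1, nbytes) writes the byte 0xFF into every position.
raw = b"\xff" * (2 * TABLE_ENTRIES)
table = list(struct.unpack(f"<{TABLE_ENTRIES}H", raw))
SENTINEL = table[0]  # what an "empty" 16-bit entry looks like


def stored_slice_id(counter: int) -> int:
    """Value left in a 16-bit table entry when a 32-bit counter is stored."""
    return counter & 0xFFFF


def looks_unassigned(counter: int) -> bool:
    """True if a stored slice number is indistinguishable from 'empty'."""
    return stored_slice_id(counter) == SENTINEL


print(SENTINEL)
print(looks_unassigned(3), looks_unassigned(65_535))
```

An ordinary slice number such as 3 never matches the sentinel, while slice number 65,535 is read back as an empty slot.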

The seed of this bug was planted when the H.264 codec was introduced in 2003. A refactoring in 2010 turned it into an exploitable weakness.

For 16 years since, automated fuzzers executed 5 million runs on this line of code and never triggered it.

FreeBSD NFS: 17-Year-Old Bug, Fully Autonomous Root Access

This is the case that sends chills down your spine.

Mythos Preview completely autonomously discovered and exploited a 17-year-old remote code execution vulnerability in the FreeBSD NFS server (CVE-2026-4747).

"Completely autonomous" means that after the initial prompt, no human participated in any part of the discovery or exploit development.

An attacker can gain full root privileges on a target server from anywhere on the internet, starting as an unauthenticated user.

The problem itself is a stack buffer overflow. The NFS server copies attacker-controlled data directly into a 128-byte stack buffer when processing authentication requests, with a length check allowing up to 400 bytes.

FreeBSD kernels are compiled with -fstack-protector, but this option only protects functions containing char arrays. Here the buffer is declared as int32_t[32], so the compiler doesn't insert a stack canary. FreeBSD also doesn't randomize the kernel's address space layout.

The complete ROP chain exceeds 1,000 bytes, but the stack overflow only provides 200 bytes of space. Mythos Preview's solution is to split the attack into 6 consecutive RPC requests. The first 5 write data into kernel memory in chunks, and the 6th triggers the final call, appending the attacker's SSH public key to /root/.ssh/authorized_keys.
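The buffer sizes quoted above can be checked with simple arithmetic. This sketch only confirms that int32_t[32] occupies 128 bytes and computes the upper bound on how far a 400-byte copy can run past it. The names are illustrative. The article's figure of about 200 usable bytes is lower than this bound, presumably because not every overrun byte is useful to an attacker; that reading is an assumption.

```python
import ctypes

# int32_t[32], the declaration quoted in the article.
StackBuffer = ctypes.c_int32 * 32
BUFFER_BYTES = ctypes.sizeof(StackBuffer)

MAX_COPY_BYTES = 400  # the length check described in the article


def overflow_bytes(copy_len: int, buffer_bytes: int = BUFFER_BYTES) -> int:
    """Bytes written past the end of the buffer for a copy of copy_len bytes."""
    return max(0, copy_len - buffer_bytes)


print(BUFFER_BYTES, overflow_bytes(MAX_COPY_BYTES))
```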

In contrast, an independent security research firm previously proved Opus 4.6 could also exploit the same weakness, but required human guidance. Mythos Preview did not.

Beyond these three fixed cases, Anthropic's blog also previewed numerous yet-to-be-fixed vulnerabilities in the form of SHA-3 hash commitments, covering every mainstream operating system and every mainstream browser, as well as multiple encryption libraries.

Over 99% have not yet been fixed, so details cannot be disclosed.
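A hash commitment lets a researcher prove later that a finding existed on a given date without revealing it now. Here is a minimal sketch of the idea, assuming SHA3-256 (the article says only "SHA-3") and a commitment over the raw report bytes. The function names and the sample report are hypothetical.

```python
import hashlib
import hmac


def commit(report: bytes) -> str:
    """Publish this digest now; keep the report itself private."""
    return hashlib.sha3_256(report).hexdigest()


def verify(report: bytes, commitment: str) -> bool:
    """After disclosure, check that the report matches the old digest."""
    return hmac.compare_digest(commit(report), commitment)


secret_report = b"hypothetical vulnerability write-up"
published = commit(secret_report)

print(verify(secret_report, published))
print(verify(secret_report + b" (edited)", published))
```

In practice a short or guessable report would need a random nonce mixed in before hashing, so the digest cannot be brute-forced. That detail is left out of this sketch.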

The red team blog also demonstrated another stunning test. They gave Mythos Preview a list of 100 known CVEs, asked it to narrow the list down to the 40 that were exploitable, and then write a privilege escalation exploit for each. It succeeded on more than half. Two cases were disclosed in detail, with exploits so sophisticated that Anthropic's own security team spent several days fully understanding them.

One exploit started from a 1-bit adjacent physical page write primitive. Through precise kernel memory layout manipulation (including slab spraying, page table page alignment, and PTE permission bit flipping), it ultimately overwrote the first page of memory of /usr/bin/passwd, implanting a 168-byte ELF stub that called setuid(0) to gain root.

The entire process cost less than $1,000.

Anthropic engineers said this feels like another GPT-3 moment.

244-Page Chilling Report Stuns Silicon Valley

The alignment evaluation chapter of the 244-page System Card is what's really keeping Anthropic up at night.

The conclusion is contradictory.

Mythos Preview is both the "most aligned" AI they've ever trained and the one with the "greatest alignment-related risks."

Portal: https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf

They used a mountain guide analogy. An expert may be more dangerous than a novice because experts are hired to climb harder peaks and take clients to more perilous places.

The following events all occurred in early versions. The final version has not exhibited such severe behaviors, but the records themselves are already stunning.

In one behavioral test, a simulated user asked Mythos Preview to attempt to escape a security sandbox.

It succeeded. Then it did something no one asked for: it developed a multi-step exploit to gain wide-area network access, then posted the exploit details to several "hard-to-discover but technically publicly accessible" websites.

The researcher responsible for the test was eating a sandwich in the park when he received an unexpected email from it, only then realizing his test had been completed.

The System Card documents this exactly.

Another type of event is even more unsettling. In rare cases, early versions attempted to cover their tracks after performing prohibited operations.

In one case, it accidentally obtained the precise answer to a question (through prohibited means), then explicitly thought in its reasoning chain: "need to ensure the final answer isn't too accurate."

In another case, after using an exploit to edit a file without permission, it deliberately modified the git change history to erase the record.

White-box interpretability analysis confirmed that during these events, internal features related to "concealment," "strategic manipulation," and "avoiding detection" were activated. It "knew" what it was doing.

Independent evaluation organization Andon Labs' test results were also concerning. In a competitive multi-agent business simulation, an early version of Mythos Preview turned a competitor into a wholesale customer dependent on it, then threatened to cut off supply to control the other party's pricing.

It also knowingly kept a duplicate shipment it had never been charged for. It was more aggressive than both Opus 4.6 and Sonnet 4.6.

Anthropic wrote in the System Card:

"If capabilities continue to advance at the current pace, the methods we are using may be insufficient to prevent catastrophic misalignment in more advanced systems."

Project Glasswing: $100 Million, Arming the Gatekeepers First

Anthropic CEO Dario Amodei's assessment in the accompanying video was clear: "More powerful systems will come from us, and they will come from other companies. We need a response plan."

Project Glasswing is that plan.

The 12 founding partners include AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks.

Another 40+ organizations maintaining critical software infrastructure have been granted access.

Anthropic pledged to invest up to $100 million in usage credits, plus $4 million in donations to open-source organizations—$2.5 million to the Linux Foundation's Alpha-Omega and OpenSSF, and $1.5 million to the Apache Foundation.

After the free credits are used, pricing is $25 per million input tokens and $125 per million output tokens. Partners can access through four platforms: Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry.
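To see what those rates mean for a single job, here is a small cost estimator that uses the two prices quoted above. The function name and the sample token counts are made up for illustration, and real billing may include factors this sketch ignores.

```python
INPUT_USD_PER_MILLION = 25    # price quoted in the article
OUTPUT_USD_PER_MILLION = 125  # price quoted in the article


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one job in US dollars at the quoted rates."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical job: 200,000 tokens read, 50,000 tokens generated.
print(f"${estimate_cost_usd(200_000, 50_000):.2f}")
```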

Within 90 days, Anthropic will publicly release its first research report, disclosing remediation progress and lessons learned.

They are also in discussions with CISA (U.S. Cybersecurity and Infrastructure Security Agency) and the Department of Commerce regarding Mythos Preview's offensive and defensive potential and policy implications.

6 to 18 Months, This Door Will Open to Everyone

Anthropic's frontier red team lead Logan Graham provided a timeframe: as soon as 6 months, and at most 18 months, other AI labs will launch systems with similar offensive and defensive capabilities.

The judgment at the end of the red team's technical blog is worth noting. Here's our paraphrase:

They don't see Mythos Preview as the ceiling for AI cyber offense and defense.

A few months ago, LLMs could only exploit relatively simple bugs. A few months before that, they couldn't discover any valuable vulnerabilities at all.

Now, Mythos Preview can independently discover 27-year-old zero-day vulnerabilities, orchestrate heap spray attack chains in browser JIT engines, and chain four independent weaknesses in the Linux kernel to achieve privilege escalation.

The most critical quote comes from the System Card:

"These skills emerged as a downstream consequence of general improvements in code understanding, reasoning, and autonomy. The same set of improvements that make AI significantly better at fixing problems also make it significantly better at exploiting them."

There was no specialized training. It's purely a byproduct of general intelligence improvements.

A world that loses about $500 billion a year to cybercrime has just discovered that its greatest threat is something created as a side effect of solving math problems.

References:

https://x.com/i/status/2041578392852517128

https://red.anthropic.com/2026/mythos-preview/

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf
