More than three months ago, the Hunyuan team made a decision: to tear down and rebuild from the ground up.
Pre-training, rebuilt. Reinforcement learning, rebuilt. Infrastructure, rebuilt.
Today marks the debut of the first result of that rebuild: Hy3 preview.
Let's start with the conclusion
Hy3 preview is a Mixture-of-Experts (MoE) model that integrates fast and slow thinking. With 295B total parameters, 21B activated parameters, and a 256K context window, it is Hunyuan's most intelligent model to date. It is built for all-round practical use, with significantly enhanced Agent capabilities.
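To make the parameter figures concrete, here is a minimal Python sketch of the arithmetic. It uses only the two numbers quoted above; the variable names are illustrative, and the calculation says nothing about how the model routes tokens to experts internally.

```python
# Figures quoted in the announcement, in billions of parameters.
TOTAL_PARAMS_B = 295
ACTIVE_PARAMS_B = 21

# Fraction of the model's parameters that are activated at a time.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(f"Activated share: {active_fraction:.1%} of all parameters")
```

In other words, roughly 7% of the parameters are activated at a time, which is where the cost advantage of an MoE design comes from.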
| Several Hard Metrics
Complex Reasoning: Solid Performance in Hardcore Exams
Reasoning is the foundation of all capabilities. Hy3 preview performs outstandingly on high-difficulty STEM leaderboards such as FrontierScience-Olympiad and IMOAnswerBench, and has also achieved excellent results in real-world exam settings:
Tsinghua University Qiuzhen College Mathematics PhD Qualifying Exam (Spring 2026) — Highest score among domestic models
China High School Biology Olympiad (CHSBO) 2025 — Excellent performance
Results on both leaderboards and real exams suggest that this capability is not rote test preparation but a structural improvement. Whether you need to derive formulas from a research paper or explain a hard problem to a student, it is very likely up to the task.
Code and Agents: Rapidly Closing Gaps in Key Capabilities
Agent capability is one of the most improved areas in this release. Writing code, researching materials, and using tools to complete tasks: it can actually get work done for you. Complex multi-step information retrieval tasks, such as cross-webpage comparison, filtering, and summarization, have also been comprehensively improved in this update.
From a single prompt, you can get a mini-program that runs on WeChat, or even a small game.
Prompt: I want to make a small game about gathering and building on a small planet. The player lands on a desolate small planet in the clouds, moves via joystick, automatically gathers vegetation and ores, and consumes resources to build automatically. Accompanied by fresh visuals and light sound effects.
In the Tencent Docs AI assistant "Kaiwu AI," a single prompt is enough to draft a slide deck (PPT) directly.
A single prompt also yields a mini-program ready to run on WeChat. Hy3 preview outputs all page code and configuration files at once; import them into WeChat Developer Tools and you can preview immediately, with no back-and-forth debugging.
Prompt: Help me make a hiking route recommendation mini-program. It needs a homepage carousel, route detail pages, and a favorites function.
For technical readers who want the details: Hy3 preview achieves competitive results on mainstream evaluations such as SWE-Bench Verified, Terminal-Bench 2.0, BrowseComp, and WideSearch.
Code Capability Evaluation
Agent Comprehensive Capability Evaluation: Hy3 preview demonstrates a high cost-performance ratio
Long Context and Instruction Following: Proven in Real Scenarios
Information in real-world scenarios is always messy: meeting minutes packed with seven or eight hidden premises, a travel plan concealing sudden budget changes, a task description that mixes "who is on leave this week" with "who is working overtime next week"...
Project planning, travel summaries, reading records, chat planning, business transformation... Hy3 preview can peel back the layers to extract intent and requirements from what was said, without missing details or guessing blindly, and accurately condenses them into to-dos.
| Natural Conversation: Adding More Human Touch to Chatting
Earlier responses always had a bit of a "machine flavor": you say "I haven't been feeling well lately," and it lists five suggestions.
Now, Hy3 preview first picks up on how you feel, then continues the conversation.
When you ask it to write, the machine flavor fades; when you ask questions, the metaphors are more vivid and the examples more pertinent.
When you pour your heart out, it no longer keeps its distance. It responds more like a person who listens carefully and thinks before answering.
| The Products You Use Have Already Upgraded to the New Model
Yuanbao
Writing, chatting, and searching have been comprehensively upgraded. In everyday use, conversations feel more "human," and off-topic answers are less frequent.
"It understands your meaning better, and what it writes feels more human."
— Yuanbao Product Manager
CodeBuddy / WorkBuddy
Response speed has nearly doubled, and it can reliably complete complex tasks of nearly 500 steps. Tencent's internal engineers already use it daily for coding, with an internal blind-test win rate of 55%–56%.
"First response time improved by 54%, task completion time shortened by 47%, success rate 99.99%+"
— CodeBuddy/WorkBuddy Product Manager
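As a rough check on how these figures relate, a time reduction can be converted into a speedup factor. The sketch below assumes that "shortened by 47%" and "improved by 54%" both mean the elapsed time dropped by that fraction; the source does not give exact definitions, and `speedup_from_reduction` is a hypothetical helper written for illustration.

```python
def speedup_from_reduction(reduction: float) -> float:
    """Convert a fractional time reduction (0 <= r < 1) into a speedup factor."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    # If the new time is (1 - r) of the old time, work finishes 1 / (1 - r) times faster.
    return 1 / (1 - reduction)


# Percentages quoted by the product manager, read as time reductions (assumption).
task_speedup = speedup_from_reduction(0.47)
first_response_speedup = speedup_from_reduction(0.54)

print(f"Task completion: {task_speedup:.2f}x, first response: {first_response_speedup:.2f}x")
```

Under that reading, a 47% shorter task time corresponds to about a 1.9x speedup, which is consistent with the "nearly doubled" description.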
ima
Throw in a document of tens of thousands of words; whether in the knowledge base or for general Q&A, it finds what needs to be found and summarizes comprehensively.
"Excellent capability in processing long texts; accuracy, coverage, and comprehensiveness of answers perform very well."
— ima Product Manager
| Three Principles of the Hy3 Preview Reconstruction
Systematic Capability
We do not favor a "lopsided" model: even a Code Agent depends on reasoning, instruction following, long-context handling, and dialogue capabilities working together.
Authenticity of Evaluation
High benchmark scores ≠ usability. We deliberately step away from public leaderboards that are easy to game. Instead, we assess real-world effectiveness through 50+ in-house evaluation systems, recent exams, human evaluation, and crowdsourced product testing.
Pursuit of Cost-Performance
We deeply co-design the model architecture and inference framework to significantly reduce task costs, making intelligence both affordable and effective.
| Also Open Source: Developers Can Use It Directly
Hy3 preview's inference efficiency has improved by 40%. Model weights and code are fully open source and free to download on GitHub and Hugging Face.
If you prefer to call the model via API, Tencent Cloud TokenHub offers dedicated packages:
Input as low as 1.2 CNY/million tokens, output as low as 4 CNY/million tokens. For most individual developers, 28 CNY a month is basically sufficient.
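To get a feel for what that budget covers, here is a minimal Python sketch of the arithmetic. It uses the "as low as" prices quoted above, so the results are best-case upper bounds. The input/output mix (`input_share`) and the helper name `affordable_tokens_millions` are assumptions for illustration, not figures or APIs from TokenHub.

```python
# "As low as" prices quoted above, in CNY per million tokens.
INPUT_PRICE = 1.2
OUTPUT_PRICE = 4.0


def affordable_tokens_millions(budget_cny: float, input_share: float) -> float:
    """Total tokens (in millions) a budget buys for a given input/output mix.

    input_share is the fraction of all tokens that are input tokens; it is an
    assumption about the workload, not a number from the announcement.
    """
    if not 0 <= input_share <= 1:
        raise ValueError("input_share must be in [0, 1]")
    # Average price per million tokens, weighted by the input/output mix.
    blended_price = input_share * INPUT_PRICE + (1 - input_share) * OUTPUT_PRICE
    return budget_cny / blended_price


# Monthly budget mentioned above, under three hypothetical workload mixes.
for share in (1.0, 0.75, 0.0):
    total = affordable_tokens_millions(28, share)
    print(f"input share {share:.0%}: about {total:.1f}M tokens")
```

At these floor prices, 28 CNY covers between 7 million tokens (all output) and about 23 million tokens (all input), with a 3:1 input-to-output mix landing near 15 million.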
| This Is Just the Beginning
Hy3 preview is a starting point.
The Hunyuan team is continuously expanding the scale of pre-training and reinforcement learning; larger models are already in training. Meanwhile, through in-depth co-design with more Tencent product scenarios, we will continue to improve the model's performance in real-world scenarios.
You are welcome to try it, and welcome to critique it.
The feedback that comes from your real usage is more valuable than anything we test ourselves.