Official Introduction: Hunyuan Hy3 Preview

More than three months ago, the Hunyuan team made a decision: to tear down and rebuild from the ground up.

Pre-training, rebuilt. Reinforcement learning, rebuilt. Infrastructure, rebuilt.

Today the first result of that rebuild makes its debut: Hy3 preview, available to try directly on the official website.

Let's start with the conclusion

It is a Mixture-of-Experts (MoE) model integrating fast and slow thinking. With 295B total parameters, 21B activated parameters, and a 256K context window, this is Hunyuan's most intelligent model to date. It focuses on comprehensive practicality, with significantly enhanced Agent capabilities.
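To put the two parameter counts in perspective, the short sketch below computes what share of the model is active for any single token. It uses only the 295B and 21B figures quoted above; the function name is illustrative, and nothing here describes how Hy3 preview actually routes tokens to experts.

```python
# Figures quoted in the text, in billions of parameters.
TOTAL_PARAMS_B = 295    # total parameters
ACTIVE_PARAMS_B = 21    # parameters activated per token


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active_b / total_b


frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share per token: {frac:.1%}")  # Active share per token: 7.1%
```

In other words, roughly one parameter in fourteen takes part in each forward pass, which is the usual reason an MoE model of this size can be cheaper to serve than a dense model with the same total parameter count.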

| Several Hard Metrics

Complex Reasoning: Solid Performance in Hardcore Exams

Reasoning is the foundation of all capabilities. Hy3 preview performs outstandingly on high-difficulty STEM leaderboards such as FrontierScience-Olympiad and IMOAnswerBench, and has also achieved excellent results in real-world exam settings:

Tsinghua University Qiuzhen College Mathematics PhD Qualifying Exam (Spring 2026) — Highest score among domestic models

China High School Biology Olympiad (CHSBO) 2025 — Excellent performance

Validation on both leaderboards and real exams indicates that this capability is not simply memorized for tests; it reflects a structural strengthening. Whether deriving formulas from research papers or explaining difficult problems to students, it is very likely up to the task.

Code and Agents: Rapidly Closing the Gap in Key Capabilities

Agent capability is one of the most improved areas in this iteration. Writing code, researching materials, using tools to complete tasks: it can genuinely get work done for you. Complex multi-step information retrieval tasks, such as cross-webpage comparison, filtering, and summarization, are also handled far better in this update.

Input a single prompt, and you can get a mini-program that runs on WeChat, or even a small game.

Prompt: I want to make a small game about gathering and building on a small planet. The player lands on a desolate small planet in the clouds, moves via joystick, automatically gathers vegetation and ores, and consumes resources to build automatically. Accompanied by fresh visuals and light sound effects.

In "Kaiwu AI," the Tencent Docs AI assistant, a single prompt is enough to draft a PPT directly.

Input a single prompt, and you get a mini-program ready to run on WeChat. Hy3 preview outputs all page code and configuration files at once; importing them into the WeChat Developer Tools allows for immediate preview without back-and-forth debugging.

Prompt: Help me make a hiking route recommendation mini-program. It needs a homepage carousel, route detail pages, and a favorites function.

For technical readers who want the details: Hy3 preview achieved competitive results on mainstream evaluations such as SWE-Bench Verified, Terminal-Bench 2.0, BrowseComp, and WideSearch.

[Chart: Code Capability Evaluation]

[Chart: Agent Comprehensive Capability Evaluation. Hy3 preview demonstrates a high cost-performance ratio.]

Long Context and Instruction Following: Proven in Real Scenarios

Information in real-world scenarios is always messy: meeting minutes packed with seven or eight hidden premises, a travel plan that hides a sudden budget change, a task description that mixes "who is on leave this week" with "who is working overtime next week"...

Across project planning, travel summaries, reading records, chat planning, and business transformation, Hy3 preview can peel back the layers to extract intent and requirement clues from what was said, without missing details or guessing blindly, and accurately condense them into to-dos.

| Natural Conversation: Adding More Human Touch to Chatting

Earlier responses always felt a little mechanical: you say "I haven't been feeling well lately," and it lists five suggestions.

Now Hy3 preview first acknowledges how you feel, then continues the conversation.

When you ask it to write, the mechanical tone fades; when you ask questions, the metaphors are more vivid and the examples more pertinent.

When you pour your heart out, it no longer keeps its distance. It responds more like a person who listens carefully and thinks before replying.

| The Products You Use Have Already Upgraded to the New Model

Yuanbao

Writing, chatting, and searching have been comprehensively upgraded. In daily chatting, writing, and research, conversations feel more "human," and off-topic answers are less frequent.

"It understands your meaning better, and what it writes feels more human."

— Yuanbao Product Manager

CodeBuddy / WorkBuddy

Response speed has nearly doubled, and it can stably complete complex tasks of nearly 500 steps. Tencent's internal engineers are already using it daily for coding, with an internal blind test win rate of 55%–56%.

"First response time improved by 54%, task completion time shortened by 47%, success rate 99.99%+"

— CodeBuddy/WorkBuddy Product Manager
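The "nearly doubled" response speed and the percentage figures in the quote can be related with a line of arithmetic. The sketch below assumes that "shortened by 47%" and "improved by 54%" both mean the elapsed time fell by that fraction; the source does not define these metrics precisely, so treat the mapping as an interpretation, and the function name as illustrative.

```python
def speedup_from_time_reduction(reduction: float) -> float:
    """Convert a fractional reduction in elapsed time into a speed multiplier.

    If a task used to take T and now takes T * (1 - reduction),
    throughput rises by a factor of 1 / (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


# Assumed reading: task completion time fell by 47%.
print(f"{speedup_from_time_reduction(0.47):.2f}x")  # 1.89x
# Assumed reading: time to first response fell by 54%.
print(f"{speedup_from_time_reduction(0.54):.2f}x")  # 2.17x
```

Under that reading, a 47% shorter completion time is about a 1.9x speedup, which is consistent with the "nearly doubled" description.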

ima

Throw in a document of tens of thousands of words; whether in the knowledge base or for general Q&A, it finds what needs to be found and summarizes comprehensively.

"Excellent capability in processing long texts; accuracy, coverage, and comprehensiveness of answers perform very well."

— ima Product Manager

| Three Principles of the Hy3 Preview Reconstruction

Systematic Capability

We do not favor a "lopsided" model that excels in only one subject: even a Code Agent relies, behind the scenes, on the synergy of reasoning, instruction following, long-context handling, and dialogue capabilities.

Authenticity of Evaluation

High benchmark scores ≠ usability. We deliberately step away from public leaderboards that are easy to game. Instead, we assess real-world effectiveness through 50+ self-built evaluation systems, the latest exams, human evaluation, and product crowdsourcing.

Pursuit of Cost-Performance

We deeply co-design the model architecture and the inference framework to significantly reduce task costs, making intelligence both affordable and effective.

| Also Open Source: Developers Can Use It Directly

Hy3 preview's inference efficiency has improved by 40%. Model weights and code are fully open source and free to download on GitHub and Hugging Face.

If you prefer to call the model through an API, Tencent Cloud TokenHub offers exclusive packages:

Pricing chart

Input as low as 1.2 CNY/million tokens, output as low as 4 CNY/million tokens. For most individual developers, 28 CNY a month is basically sufficient.
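As a rough sanity check on the 28 CNY figure, the sketch below estimates a monthly bill from token counts. It assumes flat pay-per-token billing at the quoted floor prices; the actual TokenHub packages may be tiered or bundled differently, and the function name and the example token volumes are illustrative.

```python
# Floor prices quoted in the text, in CNY per million tokens.
INPUT_CNY_PER_M = 1.2
OUTPUT_CNY_PER_M = 4.0


def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in CNY, assuming flat per-token billing at the floor prices."""
    return (input_tokens * INPUT_CNY_PER_M
            + output_tokens * OUTPUT_CNY_PER_M) / 1_000_000


# Hypothetical monthly usage: 10M input tokens and 4M output tokens.
print(f"{monthly_cost(10_000_000, 4_000_000):.2f} CNY")  # 28.00 CNY
```

Under these assumptions, 28 CNY covers about 10 million input tokens plus 4 million output tokens in a month; an input-heavy workload would stretch further, an output-heavy one less.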

| This Is Just the Beginning

Hy3 preview is a starting point.

The Hunyuan team is continuously expanding the scale of pre-training and reinforcement learning; larger models are already in training. Meanwhile, through in-depth co-design with more Tencent product scenarios, we will continue to improve the model's performance in real-world scenarios.

We welcome you to try it, and we welcome your criticism.

The feedback that comes from your real usage is more valuable than anything we test ourselves.
