Today, the preview version of our new model series, DeepSeek-V4, is officially launched and open-sourced simultaneously.

DeepSeek-V4 features an ultra-long context window of one million tokens, achieving leading performance domestically and within the open-source community in agent capabilities, world knowledge, and reasoning. The model is available in two versions based on size:

Model comparison chart

Starting today, log in to the official website chat.deepseek.com or the official app to converse with the latest DeepSeek-V4 and explore the new experience of 1M ultra-long context memory. API services have been updated synchronously; simply modify the model_name to deepseek-v4-pro or deepseek-v4-flash to invoke the models.

DeepSeek-V4-Pro: Performance Rivaling Top Closed-Source Models

Performance metrics

Significantly Enhanced Agent Capabilities: Compared to previous generations, DeepSeek-V4-Pro demonstrates a marked improvement in agent capabilities. In Agentic Coding evaluations, V4-Pro has reached the best level among current open-source models and performs excellently in other agent-related benchmarks. Currently, DeepSeek-V4 has become the Agentic Coding model used internally by company employees. Evaluation feedback indicates that the user experience surpasses Sonnet 4.5, and delivery quality approaches Opus 4.6 in non-thinking mode, though a gap remains compared to Opus 4.6 in thinking mode.
Rich World Knowledge: In world knowledge assessments, DeepSeek-V4-Pro significantly leads other open-source models, trailing only slightly behind the top-tier closed-source model Gemini-Pro-3.1.
World-Class Reasoning Performance: In evaluations covering mathematics, STEM, and competitive coding, DeepSeek-V4-Pro surpasses all currently publicly evaluated open-source models, achieving results comparable to world-leading closed-source models.

Reasoning performance chart

DeepSeek-V4-Flash: A Faster, More Economical Choice

Compared to DeepSeek-V4-Pro, DeepSeek-V4-Flash is slightly less proficient in world knowledge reserves but demonstrates reasoning capabilities close to its Pro counterpart. Due to smaller model parameters and activation, V4-Flash offers faster and more economical API services.
In agent evaluations, DeepSeek-V4-Flash performs on par with DeepSeek-V4-Pro on simple tasks but shows a gap in high-difficulty tasks.

Structural Innovation and Ultra-High Context Efficiency

DeepSeek-V4 pioneers a novel attention mechanism that compresses along the token dimension, combined with DSA (DeepSeek Sparse Attention), achieving globally leading long-context capabilities while significantly reducing computational and VRAM requirements compared to traditional methods. From now on, a 1M (one million) token context window will be standard across all official DeepSeek services.

Context efficiency comparison

Computational load and VRAM capacity changes relative to context length for DeepSeek-V4 and DeepSeek-V3.2.

Specialized Optimization for Agent Capabilities

DeepSeek-V4 has been adapted and optimized for mainstream agent products such as Claude Code, OpenClaw, OpenCode, and CodeBuddy, showing improvements in code tasks and document generation. The image below shows an example of a PPT slide generated by V4-Pro within an agent framework:

PPT generation example

Swipe up/down or click to enlarge.

API Access

Currently, the DeepSeek API has synchronously launched V4-Pro and V4-Flash, supporting both OpenAI ChatCompletions and Anthropic interfaces. When accessing the new models, the base_url remains unchanged, but the model parameter must be set to deepseek-v4-pro or deepseek-v4-flash.

API configuration

Both V4-Pro and V4-Flash support a maximum context length of 1M and are available in both non-thinking mode and thinking mode. The thinking mode supports the reasoning_effort parameter to set the thinking intensity (high/max). For complex agent scenarios, it is recommended to use thinking mode with the intensity set to max. Please refer to the API documentation for model invocation and parameter adjustment methods:

https://api-docs.deepseek.com/zh-cn/guides/thinking_mode

Please note: The two legacy model names for the existing API interface, deepseek-chat and deepseek-reasoner, will be discontinued in three months (on 2026-07-24). During the current phase, these two model names point to the non-thinking and thinking modes of deepseek-v4-flash, respectively.

Open Weights and Local Deployment

DeepSeek-V4 model open-source links:

https://huggingface.co/collections/deepseek-ai/deepseek-v4

https://modelscope.cn/collections/deepseek-ai/DeepSeek-V4

DeepSeek-V4 Technical Report:

https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf

Closing Thoughts

"Not swayed by praise, nor frightened by slander; follow the path, and remain upright."

We thank every user for their trust and support. Your affirmation, suggestions, and expectations are the driving force behind our relentless exploration and continuous progress, allowing us to stay true to our original aspiration and focus on ceaseless innovation.

We will always uphold the principle of long-termism, moving forward steadily through trial and reflection, striving to get closer to the goal of achieving AGI.

DeepSeek Logo

DeepSeek-V4 Preview: Entering the Era of Accessible Million-Token Context