DeepSeek V4 Launches With 1M Context and Lower-Cost Claims

Key Takeaway

DeepSeek V4 is now available in preview through V4-Pro and V4-Flash, with open weights, a 1 million-token context window, lower-cost inference claims and new agentic coding capabilities. The launch is less likely to shock markets than R1, but it gives China another high-profile open-source AI release as competition with US models and domestic Chinese rivals intensifies.

DeepSeek V4 Launches With 1M Context and Lower-Cost Claims (Credit - ChatGPT, The AI Track)

DeepSeek V4 Launches – Key Points

The Story

China’s DeepSeek V4 preview launched on Friday, giving users and developers access to V4-Pro and V4-Flash through chat, API access and open weights on Hugging Face. Both models support a 1 million-token context window, dual Thinking and Non-Thinking modes, and improved reasoning, coding, knowledge processing and agentic workflows. Analysts expect a more limited market reaction than the one triggered by DeepSeek’s January 2025 R1 release, which rattled global tech markets and raised questions about AI infrastructure spending. DeepSeek V4 also highlights China’s push to build competitive open-source AI systems around long-context models, lower-cost inference and domestic AI infrastructure.

The Facts

  • DeepSeek V4 includes two preview models.

    V4-Pro and V4-Flash are available through chat.deepseek.com, the DeepSeek API and open weights on Hugging Face. The release is positioned as a major upgrade in reasoning, coding and agentic workflows.

  • V4-Pro is the flagship model.

    V4-Pro has 1.6 trillion total parameters and 49 billion active parameters. DeepSeek’s own benchmark claims place it ahead of rival open models in maths, STEM, coding, agentic workflows and world knowledge, while positioning it near leading closed-source systems.

  • V4-Flash is the faster, cheaper model.

    V4-Flash has 284 billion total parameters and 13 billion active parameters. It is designed for faster responses, more economical API usage and simpler agent tasks, where its reasoning capabilities are described as close to V4-Pro’s.

  • Both DeepSeek V4 models support a 1 million-token context window.

    The 1M context window is now the default across official DeepSeek services. The release uses token-wise compression and DeepSeek Sparse Attention to reduce compute and memory costs.

  • DeepSeek V4 targets lower inference costs.

    Inference costs are the computational and financial costs of running a trained AI model to generate outputs. The V4 release is framed around cheaper long-context use and more cost-effective agentic workflows.

  • DeepSeek V4 is optimized for agent tools.

    The models integrate with Claude Code, OpenClaw and OpenCode, and are already being used for DeepSeek’s internal agentic coding workflows.

  • Early hands-on testing raises questions about real-world output quality.

    Practical tests involving browser-based OS cloning, Slack-style UI generation, Three.js tasks, a Minecraft-style build and product-viewer generation produced mixed results. Several outputs appeared basic, buggy or visually underdeveloped, while V4-Flash sometimes performed better than expected on simpler prompts.

  • DeepSeek V4 follows the breakout R1 model from January 2025.

    R1 gained global attention after DeepSeek claimed broadly comparable capabilities to ChatGPT and Gemini, with a reported two-month build time and less than $6 million in computing costs using lower-capacity Nvidia chips.

  • R1 had a major market impact.

    The release alarmed investors, hit American AI stocks and triggered debate over whether Big Tech’s massive AI infrastructure spending was justified. Marc Andreessen called it “AI’s Sputnik moment.”

  • Analysts expect DeepSeek V4 to create less market disruption.

    Ivan Su of Morningstar said R1 shocked US markets because few expected a Chinese model to compete at that level. V4 is now seen as part of a competitive trend markets have already absorbed.

  • DeepSeek V4 remains open source.

    Like earlier DeepSeek models, V4 is released with open weights that developers can use and modify, reportedly under the MIT license.

  • DeepSeek V4 reflects China’s push toward domestic AI infrastructure.

    Huawei confirmed that its latest AI computing cluster, powered by Ascend AI processors, can support DeepSeek V4. It remains unclear how extensively Huawei chips were used in training compared with Nvidia chips.
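The API access and agent-tool integration described above can be sketched as follows. This is a hypothetical example that assumes DeepSeek keeps its existing OpenAI-compatible chat-completions request format; the model identifier, the endpoint, and the `thinking` field standing in for the dual Thinking/Non-Thinking mode are illustrative assumptions, not confirmed details from the V4 launch.

```python
# Hypothetical sketch of calling a DeepSeek V4 preview model. DeepSeek's
# existing API follows the OpenAI chat-completions format; the model name,
# endpoint, and "thinking" toggle below are assumptions for illustration.
import json

def build_chat_request(model: str, prompt: str, thinking: bool) -> dict:
    """Assemble a chat-completions payload. `thinking` stands in for the
    dual Thinking / Non-Thinking mode (exact field name unconfirmed)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "thinking": thinking,
    }

payload = build_chat_request("deepseek-v4-flash", "Summarize this diff.", thinking=False)

# In practice you would POST json.dumps(payload) with an Authorization
# header to the API's /chat/completions endpoint (network call omitted).
print(json.dumps(payload, indent=2))
```

The same payload shape would let agent tools such as the coding assistants named above swap V4-Flash in for simpler tasks and V4-Pro for heavier reasoning.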

Benchmarks

DeepSeek’s own benchmarks place V4-Pro ahead of rival open models in maths, STEM, coding, agentic workflows and world knowledge, while positioning it near top closed-source models. V4-Flash is presented as a faster, cheaper model with reasoning capabilities close to V4-Pro on simpler agent tasks. These remain company-reported claims from DeepSeek’s announcement and technical material. Early real-world testing raises a separate concern: DeepSeek V4 may perform better on benchmark-style tasks than on practical UI, frontend, SVG, Three.js and coding-generation work.

Numbers that Matter

  • 2023: DeepSeek was founded.
  • Late 2024: DeepSeek gained attention with its free, open-source V3 model.
  • January 2025: DeepSeek released R1, the reasoning model that triggered its global breakout.
  • Less than $6 million: DeepSeek’s claimed computing cost for R1.
  • April 24, 2026: DeepSeek unveiled V4-Pro and V4-Flash previews.
  • 1M tokens: Default context length across official DeepSeek V4 services.
  • 1.6T / 49B: V4-Pro’s total and active parameter counts.
  • 284B / 13B: V4-Flash’s total and active parameter counts.
  • $0.14 / $3.48: V4-Pro pricing per 1 million input and output tokens, based on early pricing information from the preview materials.
  • $0.03 / about $0.28: V4-Flash pricing per 1 million input and output tokens, from the same early pricing information.
  • July 24, 2026, 15:59 UTC: deepseek-chat and deepseek-reasoner are scheduled to be fully retired and become inaccessible.
  • 9% and 15%: Gains for SMIC and Hua Hong Semiconductor, respectively, in Hong Kong trading after the V4 announcement.
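The early pricing figures above translate into concrete per-request costs. A quick back-of-envelope calculator, using the quoted preliminary rates (which may change before general availability):

```python
# Back-of-envelope cost check using the early pricing figures quoted above
# (USD per 1M tokens): V4-Pro $0.14 in / $3.48 out, V4-Flash $0.03 in /
# about $0.28 out. Preliminary numbers, not confirmed list prices.

PRICES = {  # model: (input $/1M tokens, output $/1M tokens)
    "v4-pro": (0.14, 3.48),
    "v4-flash": (0.03, 0.28),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the quoted per-million-token rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a full 1M-token context plus a 2K-token answer on V4-Pro.
print(request_cost("v4-pro", 1_000_000, 2_000))
```

At these rates, even a maxed-out 1 million-token prompt on the flagship model costs well under a dollar, which is the basis of the lower-cost long-context claims.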

Market Timing

DeepSeek V4 arrives after markets have already absorbed the idea that Chinese AI models can be competitive and cheaper to use than US alternatives. That is why analysts expect a more muted reaction than the shock created by R1. At the same time, V4 changes the competitive frame inside China, where domestic open-source AI players are now direct rivals rather than simply challengers to US models.

Why This Matters

DeepSeek V4 shifts the AI competition from a single-model performance story to a broader infrastructure, pricing and ecosystem story. The key question is not only whether DeepSeek V4 can rival OpenAI, Anthropic or Google on benchmarks, but whether China can scale competitive open-source AI systems using long-context models, domestic chips and lower-cost inference while delivering reliable real-world outputs.


This article was drafted with the assistance of generative AI. All facts and details were reviewed and confirmed by an editor prior to publication.

