Key Takeaway:
DeepSeek, a Chinese startup, has released two 685-billion-parameter models, DeepSeek-V3.2 and V3.2-Speciale, that match or surpass GPT-5 and Gemini-3.0-Pro on elite math, coding, and reasoning benchmarks, while running far more efficiently and being fully open source under an MIT license. The models are comparatively cheap to operate, designed for real-world reasoning with built-in tool use, and their open release intensifies global AI competition.
DeepSeek-V3.2 Released – Key Points
Two frontier-scale models released as free, open-source rivals to GPT-5
Hangzhou-based DeepSeek released DeepSeek-V3.2, an everyday reasoning assistant, and DeepSeek-V3.2-Speciale, a high-compute variant tuned for elite competitions. Both models are open source, inexpensive to run, and structured for real-world reasoning with integrated tool-use capabilities. Their release challenges current U.S. dominance in frontier AI and expands global access to systems that historically required significant capital and cloud-scale resources.
DeepSeek Sparse Attention (DSA) dramatically cuts long-context costs
DeepSeek introduced DeepSeek Sparse Attention (DSA), an attention mechanism that uses a lightweight “lightning indexer” to identify the most relevant token regions in long inputs. Rather than applying full attention to every token, the model effectively “skims”: the company reports that this trims overall attention overhead by approximately 50% and cuts compute demands for long documents by up to 70%. Processing a 128,000-token context now costs roughly $0.70 per million tokens, compared with $2.40 for the V3.1-Terminus model, a price reduction of about 70%.
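To make the idea concrete, here is a toy NumPy sketch of top-k sparse attention gated by a cheap indexer pass. The shapes, the reduced indexer dimension, and the top_k value are illustrative assumptions, not DeepSeek's published design:

```python
# Toy sketch of DSA-style sparse attention (illustrative, not DeepSeek's code).
# A cheap low-dimensional "lightning indexer" scores every past token, then
# full attention is computed only over the top-k highest-scoring positions.
import numpy as np

def sparse_attention(q, K, V, q_small, K_small, top_k=64):
    T, d = K.shape
    idx_scores = K_small @ q_small                   # cheap "skim" over all tokens
    keep = np.argsort(idx_scores)[-min(top_k, T):]   # positions the indexer keeps
    logits = K[keep] @ q / np.sqrt(d)                # full attention, survivors only
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ V[keep]

rng = np.random.default_rng(0)
T, d, d_idx = 4096, 128, 16                          # toy sizes, not model config
q, K, V = rng.normal(size=d), rng.normal(size=(T, d)), rng.normal(size=(T, d))
q_small, K_small = rng.normal(size=d_idx), rng.normal(size=(T, d_idx))
out = sparse_attention(q, K, V, q_small, K_small)    # cost ~O(T*d_idx + top_k*d)
```

The indexer pass scales with the small dimension d_idx, so the expensive full-attention step touches only top_k tokens instead of all T, which is where the long-context savings come from.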
685B parameters, 128K context, and efficient deployment at scale
DeepSeek-V3.2 and V3.2-Speciale both employ a 685-billion-parameter architecture with a 128,000-token context window, enabling entire research papers, large codebases, and multi-step workflows to be processed within a single pass. Their efficiency makes them accessible even to small teams and individual developers rather than only cloud-scale labs. The models support BF16 and FP8 deployment and are compatible with multiple inference providers, reflecting a strong focus on practical, production-grade use.
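As a hedged deployment sketch, loading the open weights in BF16 with vLLM might look like the following; the repository id, parallelism degree, and context setting are assumptions to check against DeepSeek's official documentation and your hardware:

```python
# Hedged deployment sketch (repo id and settings are assumptions, not confirmed).
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3.2",  # hypothetical Hugging Face repo id
    dtype="bfloat16",                   # FP8 deployment is also supported
    tensor_parallel_size=8,             # a 685B model needs multi-GPU sharding
    max_model_len=131072,               # the advertised ~128K-token window
)
outputs = llm.generate(["Summarize this paper: ..."],
                       SamplingParams(max_tokens=256))
print(outputs[0].outputs[0].text)
```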
Olympiad-level performance that meets or beats GPT-5 and Gemini
On AIME 2025, DeepSeek-V3.2-Speciale achieves a 96.0% score, ahead of GPT-5-High at 94.6% and Gemini-3.0-Pro at 95.0%. On the Harvard-MIT Mathematics Tournament (HMMT), it scores 99.2%, exceeding Gemini’s 97.5%. The base V3.2 model, optimized for general use, reaches 93.1% on AIME and 92.5% on HMMT, slightly below top proprietary models but with substantially lower compute requirements. Overall, the Speciale variant demonstrates gold-medal-level performance across major global competitions and outperforms leading American models on elite mathematical and algorithmic tasks.
Gold medals in IMO and IOI, high ranks in ICPC and CMO
DeepSeek-V3.2-Speciale earns 35/42 points at the 2025 International Mathematical Olympiad (IMO), securing a gold medal; 492/600 at the International Olympiad in Informatics (IOI), also gold and 10th overall; and solves 10 of 12 problems at the ICPC World Finals to place second. It additionally wins gold at the China Mathematical Olympiad (CMO). All tests were conducted without internet access or external tools, strictly following official contest constraints, and DeepSeek has released final Olympiad submissions for independent verification.
Strong coding performance and real-world software debugging
On SWE-bench Verified, a benchmark measuring real-world software bug fixing, DeepSeek-V3.2 resolves 73.1% of issues versus GPT-5-High’s 74.9%. On Terminal Bench 2.0, focused on complex terminal-based coding workflows, DeepSeek scores 46.4%, significantly ahead of GPT-5-High at 35.2%. These results position V3.2 as a highly capable coding assistant that can handle multi-step, environment-aware development tasks and maintain competitive coding accuracy even when operating without external tools.
“Thinking with tools” and large-scale agentic training pipeline
DeepSeek-V3.2 maintains reasoning continuity across tool calls, addressing a longstanding weakness of AI agents that lose internal state when executing external actions such as running code or performing web searches. Training used more than 1,800 synthetic environments and over 85,000 complex instructions, covering scenarios like multi-day travel planning with tight budget constraints, multi-language debugging, and research workflows requiring repeated browser operations. This enables the model to execute multi-step, interdependent tasks such as planning a vacation under strict lodging and food constraints while simultaneously performing code tests and checking exchange rates.
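As a minimal sketch of the pattern this training targets, the loop below keeps the model's turns and tool results in one growing message list so reasoning context survives each tool call. The model name and the get_exchange_rate tool are illustrative assumptions written against a generic OpenAI-compatible chat API, not DeepSeek's published agent code:

```python
# Hedged sketch of an agentic tool loop against an OpenAI-compatible API.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_exchange_rate",  # hypothetical tool, for illustration only
        "description": "Return the exchange rate between two currencies.",
        "parameters": {
            "type": "object",
            "properties": {"base": {"type": "string"},
                           "quote": {"type": "string"}},
            "required": ["base", "quote"],
        },
    },
}]

messages = [{"role": "user",
             "content": "Plan a 3-day Tokyo trip under $500; quote prices in USD."}]

while True:
    reply = client.chat.completions.create(
        model="deepseek-chat", messages=messages, tools=tools)
    msg = reply.choices[0].message
    messages.append(msg)          # keep the model's turn (and tool calls) in context
    if not msg.tool_calls:
        break                     # final answer: no further tool use requested
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        result = {"rate": 0.0066}  # stub: a real agent would fetch live data
        messages.append({"role": "tool",
                         "tool_call_id": call.id,
                         "content": json.dumps(result)})

print(msg.content)
```

Because every tool result is appended rather than replacing the conversation, the model re-reads its own earlier reasoning on each iteration, which is the continuity the paragraph above describes.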
Reinforcement learning and post-training compute scaled beyond prior models
A 2025 technical report on Hugging Face describes a reinforcement learning framework that significantly increases post-training compute, which now exceeds 10% of pre-training cost. This expanded post-training phase, combined with the tool-rich synthetic pipeline, is credited with driving the model’s high reasoning accuracy and competitive performance against proprietary frontier systems. The report notes that the breadth of world knowledge in DeepSeek-V3.2 still lags some leading closed models, and further increases in pre-training compute are planned to reduce that gap.
MIT-licensed release, OpenAI-compatible chat template, and migration tools
The models are released under the MIT license, allowing unrestricted copying, modification, and commercialization of the weights. Migration from existing GPT-style applications is simplified by an OpenAI-compatible chat template, a new “developer” role for agentic workflows, and Python utilities for encoding and decoding messages. This level of openness contrasts with prevailing industry practice, where model weights are typically protected as proprietary intellectual property and accessed only via paid APIs.
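A minimal migration sketch, assuming the documented OpenAI-compatible endpoint and an illustrative model identifier (consult DeepSeek's docs for current values):

```python
# Minimal migration sketch: point an existing OpenAI-style client at
# DeepSeek's OpenAI-compatible endpoint. The model name is illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",  # OpenAI-compatible API endpoint
    api_key="YOUR_DEEPSEEK_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # served V3.2 model; name may differ
    messages=[
        # The new "developer" role carries agent-level instructions;
        # exact template conventions are per DeepSeek's migration docs.
        {"role": "developer", "content": "You are a careful coding agent."},
        {"role": "user", "content": "Summarize this stack trace: ..."},
    ],
)
print(response.choices[0].message.content)
```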
Regulatory and geopolitical headwinds in Europe and the United States
German authorities have challenged DeepSeek over data-transfer practices, characterizing transfers of German user data to China as unlawful under EU rules and urging Apple and Google to block the app. Italy banned the app earlier in 2025, and lawmakers in the United States have requested removal of the service from government devices on national security grounds. These developments illustrate how geopolitical context and data-governance concerns shape perceptions of DeepSeek’s open model strategy and raise questions about whether U.S. firms can justify premium pricing when open alternatives offer comparable performance.
Temporary Speciale API, merging roadmap, and implications for AI competition
DeepSeek-V3.2-Speciale is initially accessible only through a temporary API until 15 December 2025, after which its capabilities will be merged into the public V3.2 release. This roadmap signals a broader industry shift: the global AI race is no longer only about model capabilities but increasingly about access, cost, and control. By releasing a GPT-5-class, 685B-parameter system under an MIT license, DeepSeek challenges the economic logic of closed, API-based frontier models and reshapes the competitive landscape between open and proprietary ecosystems.
Why This Matters
DeepSeek-V3.2 and V3.2-Speciale introduce unprecedented open access to a GPT-5-class system, combining gold-medal performance in top math and informatics competitions with a roughly 70% reduction in long-context inference costs. Their MIT-licensed release disrupts commercial models built around proprietary APIs and challenges assumptions that frontier AI requires exclusive, resource-intensive laboratories. Enterprises, researchers, and independent developers gain GPT-scale capabilities that are both free and operationally efficient. At the same time, regulatory blocks in Europe and scrutiny in the United States show that geopolitical factors and data-governance rules will determine where and how such open models can actually be deployed. The release indicates that the next phase of AI competition will be defined not only by raw capability, but also by openness, cost structure, regulatory constraints, and global political context.
This article was drafted with the assistance of generative AI. All facts and details were reviewed and confirmed by an editor prior to publication.