DeepSeek-V3.2 Outperforms GPT-5 On Reasoning Tasks

DeepSeek released DeepSeek-V3.2, a family of open-source reasoning and agentic AI models. The high-compute version, DeepSeek-V3.2-Speciale, performs better than GPT-5 and comparably to Gemini-3.0-Pro on several reasoning benchmarks.

DeepSeek applied three new techniques in the development of DeepSeek-V3.2. First, they used a more efficient attention mechanism called DeepSeek Sparse Attention (DSA) that reduces the computational complexity of the model. They also scaled the reinforcement learning phase, which consumed more compute budget than did pre-training. Finally, they developed an agentic task synthesis pipeline to improve the models’ tool use. The result was a model that outperforms most other open models on a range of coding, reasoning, and agentic benchmarks, and performs as well as or better than closed frontier models such as GPT-5 and Gemini-3.0-Pro. However, the DeepSeek team pointed out:

Despite these achievements, we acknowledge certain limitations when compared to frontier closed-source models…First, due to fewer total training FLOPs, the breadth of world knowledge in DeepSeek-V3.2 still lags behind that of leading proprietary models. We plan to address this knowledge gap in future iterations by scaling up the pre-training compute. Second, token efficiency remains a challenge…Future work will focus on optimizing the intelligence density of the model’s reasoning chains to improve efficiency. Third, solving complex tasks is still inferior to frontier models, motivating us to further refine our foundation model and post-training recipe.

InfoQ covered several of DeepSeek’s previous releases, including the initial DeepSeek-V3 launch and DeepSeek-R1, their first reasoning model; both were released in early 2025. Later in 2025, InfoQ covered DeepSeek-V3.1, a hybrid reasoning model that combines thinking and non-thinking modes in a single system.

DeepSeek-V3.2 Benchmark Performance. Image Source: DeepSeek Tech Report

DeepSeek-V3.2 uses the same architecture as DeepSeek-V3.1, except that it adopts the new DSA attention mechanism. The team started with a checkpoint of DeepSeek-V3.1 and extended the context length to 128K before continuing pre-training to produce DeepSeek-V3.2. The new attention mechanism reduces the computational complexity of attention from $O(L^2)$ to $O(Lk)$, where $L$ is the context length and $k \ll L$ is the number of tokens selected for each query.
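To make the complexity claim concrete, the following is a minimal sketch of top-k sparse attention for a single query, written in plain Python. It is not DeepSeek's implementation. The function names, the `index_scores` input (assumed to come from some cheaper scoring step), and the value of k used below are all illustrative assumptions. The sketch only shows why attending to k selected tokens instead of all L tokens shrinks the number of query-key scores from roughly L·L to L·k, ignoring constant factors such as causal masking.

```python
import math


def _dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def _softmax_mix(scores, values):
    """Softmax-weighted average of value vectors."""
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


def dense_attention(q, keys, values):
    """One query attends to every key: L scores per query."""
    scale = math.sqrt(len(q))
    scores = [_dot(q, key) / scale for key in keys]
    return _softmax_mix(scores, values)


def topk_sparse_attention(q, keys, values, index_scores, k):
    """One query attends only to the k keys with the highest index score.

    index_scores is a hypothetical per-key relevance score, assumed to be
    produced by a cheaper mechanism than full attention.
    """
    top = sorted(range(len(keys)), key=lambda i: index_scores[i],
                 reverse=True)[:k]
    scale = math.sqrt(len(q))
    scores = [_dot(q, keys[i]) / scale for i in top]
    return _softmax_mix(scores, [values[i] for i in top])


def score_counts(L, k):
    """Query-key scores needed for L queries: dense (L*L) vs. top-k (L*k)."""
    return L * L, L * k
```

For a 128K-token context (taken here as 128,000) and an illustrative k of 2,048, `score_counts(128_000, 2_048)` returns 16,384,000,000 dense scores against 262,144,000 sparse ones, a 62.5x reduction. When k equals the number of keys, `topk_sparse_attention` gives the same result as `dense_attention`.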

For post-training, the team used specialist distillation. They trained a set of specialist models, each dedicated to a particular domain: coding, math, or one of several agent tasks. These specialist models then produced synthetic training data that was used to fine-tune the main model.
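The data flow of specialist distillation can be sketched as follows. This is a toy illustration under stated assumptions, not DeepSeek's pipeline: the `Specialist` type, the `build_distillation_set` helper, and the stand-in specialists are all hypothetical, and real specialists would be trained models rather than string functions. The sketch only shows how per-domain specialists turn prompts into a single pooled dataset that a main model could then be fine-tuned on.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: a specialist maps a prompt to a response.
Specialist = Callable[[str], str]
Record = Tuple[str, str, str]  # (domain, prompt, response)


def build_distillation_set(
    specialists: Dict[str, Specialist],
    prompts: Dict[str, List[str]],
) -> List[Record]:
    """Collect synthetic training records from each domain specialist."""
    records: List[Record] = []
    for domain, specialist in specialists.items():
        # Each specialist only answers prompts from its own domain.
        for prompt in prompts.get(domain, []):
            records.append((domain, prompt, specialist(prompt)))
    return records


# Stand-in specialists for illustration only.
toy_specialists: Dict[str, Specialist] = {
    "coding": lambda p: "code answer to: " + p,
    "math": lambda p: "math answer to: " + p,
}
toy_prompts = {
    "coding": ["reverse a list"],
    "math": ["sum of 1..10", "derivative of x^2"],
}

dataset = build_distillation_set(toy_specialists, toy_prompts)
```

The pooled `dataset` is what would be passed to a fine-tuning step for the main model; that step is omitted here because the article does not describe it in detail.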

In a Hacker News discussion about DeepSeek-V3.2, several users pointed out the advantages of a high-performing open model. One user wrote:

If you’re trying to build AI based applications you can and should compare the costs between vendor based solutions and hosting open models with your own hardware…Then you compare that to the cost of something like GPT-5, which is a bit simpler because the cost per (million) token is something you can grab off of a website. You’d be surprised how much money running something like DeepSeek (or if you prefer a more established company, Qwen3) will save you over the cloud systems…DeepSeek and Qwen will function on cheap GPUs that other models will simply choke on.

The DeepSeek-V3.2 model files are available to download from Hugging Face. However, the high-compute DeepSeek-V3.2-Speciale variant is currently available only via DeepSeek’s API.