DeepSeek has launched and open-sourced DeepSeek-V3.2-Exp, an experimental large language model positioned as a step toward its next-generation architecture. The model introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism designed to improve training and inference efficiency on long texts while maintaining output quality.
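The article does not describe how DSA works internally; as a rough illustration of the general family of techniques, the sketch below implements a generic top-k sparse attention in PyTorch, where each query attends only to its highest-scoring keys rather than the full sequence. This is an assumption-laden illustration of sparse attention in general, not DeepSeek's actual mechanism.

```python
# Illustrative sketch only: generic top-k sparse attention, NOT DeepSeek's
# DSA (whose design this article does not detail). Each query attends to
# only its top_k highest-scoring keys instead of all keys, which is the
# broad idea behind cutting long-context attention cost. Note that this
# toy version still computes the full score matrix; a real sparse kernel
# would avoid that to get the actual efficiency win.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=64):
    """q, k, v: (batch, seq_len, dim). Each query keeps its top_k keys."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5       # (batch, seq, seq)
    top_k = min(top_k, scores.size(-1))
    # Threshold at each query's k-th largest score; mask everything below it.
    kth = scores.topk(top_k, dim=-1).values[..., -1:]
    scores = scores.masked_fill(scores < kth, float("-inf"))
    return F.softmax(scores, dim=-1) @ v              # (batch, seq, dim)

# Example: 1024 tokens, 64-dim heads; each token attends to only 64 keys.
q = k = v = torch.randn(1, 1024, 64)
out = topk_sparse_attention(q, k, v, top_k=64)
print(out.shape)  # torch.Size([1, 1024, 64])
```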
Benchmarked against the previous V3.1-Terminus model under aligned training settings, V3.2-Exp delivered comparable results across public evaluation datasets. The model is now available on Hugging Face and ModelScope, with the accompanying paper released on GitHub. DeepSeek has also updated its apps and developer platforms to V3.2-Exp and cut API prices by more than 50%. [TechNode reporting]