DeepSeek has unveiled DeepSeek-OCR: Contexts Optical Compression, an open-source model from its DeepSeek-AI research team. The system introduces a vision-based method that compresses long text contexts into visual tokens, improving recognition efficiency while cutting computation costs.
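To make the token-compression idea concrete, below is a minimal back-of-the-envelope sketch in Python. It compares a rough text-token count for a document against the number of vision tokens a patch-based encoder might produce for the same document rendered as a page image. The characters-per-token ratio, page and patch sizes, and compression factor are illustrative assumptions, not figures from DeepSeek-OCR.

```python
# Hypothetical sketch of "optical compression": instead of feeding a long
# document to a language model as text tokens, render it onto a page image and
# feed a smaller number of vision tokens. All numbers are assumed for
# illustration and are not taken from the DeepSeek-OCR paper.

def estimate_text_tokens(num_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough text-token count assuming ~4 characters per token on average."""
    return max(1, round(num_chars / chars_per_token))


def estimate_vision_tokens(width: int, height: int, patch: int = 16,
                           compression: int = 16) -> int:
    """ViT-style patch count for a rendered page, reduced by an assumed
    token-compression factor applied before the language model."""
    patches = (width // patch) * (height // patch)
    return max(1, patches // compression)


if __name__ == "__main__":
    # Pretend a ~3,000-word document (~18,000 characters) is rendered onto a
    # single 1024x1024 page image.
    text_tokens = estimate_text_tokens(18_000)
    vision_tokens = estimate_vision_tokens(1024, 1024)

    print(f"text tokens   ~ {text_tokens}")
    print(f"vision tokens ~ {vision_tokens}")
    print(f"compression   ~ {text_tokens / vision_tokens:.1f}x")
```

Under these assumed numbers, one rendered page replaces thousands of text tokens with a few hundred vision tokens, which is the kind of trade-off the model is designed to exploit.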
According to the team, DeepSeek-OCR surpasses several mainstream models on benchmark tests while using far fewer visual tokens. It can also produce more than 200,000 pages of training data per day on a single A100-40G GPU, supporting the development of both large language models and vision-language models. [TechNode reporting]