World of Software
Copyright © All Rights Reserved. World of Software.
KTransformers enables DeepSeek-R1 with low-cost graphics card · TechNode

News Room
Published 17 February 2025 · Last updated: 2025/02/17 at 7:19 AM

The KVCache.AI team from Tsinghua University, in partnership with APPROACHING.AI, announced a major update to the KTransformers open-source project last week, local media outlet National Business Daily reported on Saturday. With a single NVIDIA RTX 4090D GPU with 24GB of VRAM, users can now run the full 671B-parameter versions of DeepSeek-R1 and V3 locally. Prefill (prompt-processing) speeds can reach up to 286 tokens per second, while generation speeds peak at 14 tokens per second.
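
To put the two throughput figures in perspective, the short Python sketch below estimates a best-case response time for a single request. The function name and the request sizes are hypothetical and chosen only for illustration. The 286 and 14 tokens-per-second values are the peak figures quoted above, so real-world latency would likely be higher.

```python
# Peak throughput figures quoted in the article (tokens per second).
PREFILL_TOKENS_PER_SEC = 286.0  # prompt processing
DECODE_TOKENS_PER_SEC = 14.0    # token generation


def estimate_latency_seconds(prompt_tokens: int, output_tokens: int) -> float:
    """Best-case wall-clock time: prompt processing plus token generation."""
    prefill_time = prompt_tokens / PREFILL_TOKENS_PER_SEC
    decode_time = output_tokens / DECODE_TOKENS_PER_SEC
    return prefill_time + decode_time


if __name__ == "__main__":
    # Hypothetical request: a 2,860-token prompt and a 700-token answer.
    seconds = estimate_latency_seconds(prompt_tokens=2860, output_tokens=700)
    print(f"Estimated best-case latency: {seconds:.1f} s")
```

Under these assumptions, generation dominates: the prompt is processed in about 10 seconds, while producing the answer takes about 50 seconds.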

Why it matters: Currently, users access DeepSeek-R1 mainly through cloud services or local deployment, but the official servers often suffer from downtime, and personal deployments usually involve a distilled version with 90% fewer parameters. Running the full version of DeepSeek-R1 on standard hardware is a major challenge for most users. Even developers find the cost of renting servers to be a heavy burden. The KTransformers open-source project offers an affordable solution to this issue.

Details: KTransformers removes the dependence of large AI models on expensive cloud servers, according to the National Business Daily report.

  • A user analyzed the solution’s costs and found that the DeepSeek-R1 model could be run locally for under RMB 70,000 ($9,650), over 95% cheaper than using NVIDIA A100/H100 servers, which can cost up to RMB 2 million ($280,000).
  • KTransformers optimizes the deployment of large language models (LLMs) on local machines to overcome resource limitations. The framework leverages innovative techniques, including heterogeneous computing, advanced quantization, and sparse attention mechanisms, to enhance computational efficiency while enabling the processing of long-context sequences.
  • However, the inference speed of KTransformers cannot compare with that of high-end servers, and it can serve only a single user at a time, whereas servers can simultaneously meet the demands of dozens of users, the report noted.
  • Currently, the overall solution also relies on Intel’s AMX instruction set, and CPUs from other brands are not yet capable of performing these operations. Additionally, the solution is primarily designed for DeepSeek’s MoE (mixture-of-experts) models; applying it to other mainstream models may not deliver optimal performance.
  • To use the KTransformers setup, Chinese media outlet IThome listed the following prerequisites: an Intel Xeon Gold 6454S CPU (2 NUMA nodes), 1TB of standard DDR5-4800 server memory, an RTX 4090D GPU with 24GB VRAM, and CUDA version 12.1 or higher.
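
The cost comparison in the list above can be checked with a few lines of arithmetic. The Python sketch below uses only the two RMB figures reported in the article; the function name is hypothetical, and both prices are upper bounds ("under" and "up to"), so the result is an approximation.

```python
# Cost figures reported in the article, in RMB.
LOCAL_SETUP_RMB = 70_000      # upper bound for the local KTransformers setup
GPU_SERVER_RMB = 2_000_000    # upper bound for an NVIDIA A100/H100 server


def savings_fraction(local_cost: float, server_cost: float) -> float:
    """Fraction of the server cost saved by choosing the local setup."""
    if server_cost <= 0:
        raise ValueError("server_cost must be positive")
    return 1.0 - local_cost / server_cost


if __name__ == "__main__":
    saved = savings_fraction(LOCAL_SETUP_RMB, GPU_SERVER_RMB)
    print(f"Local setup is about {saved:.1%} cheaper")
```

With these inputs the saving works out to roughly 96.5%, which is consistent with the "over 95%" figure cited in the report.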

Context: On Jan. 20, the release of DeepSeek-R1 made headlines around the world and led many to suggest that the AI industry had entered a new phase in which competition is more global, open-source models thrive, and cost efficiency is becoming a major factor in the development and deployment of AI systems.

  • The published API (Application Programming Interface) pricing for DeepSeek-R1 is as follows: RMB 1 ($0.14) per million input tokens (cache hit), RMB 4 ($0.55) per million input tokens (cache miss), and RMB 16 ($2.21) per million output tokens. This is roughly 1/30th of the operational cost of OpenAI’s GPT-4.
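
As a rough illustration of how the published per-token prices translate into a bill, the Python sketch below computes the cost of a workload in RMB. The `Pricing` class and `request_cost_rmb` function are hypothetical names introduced here, not part of any DeepSeek SDK. The rates are the ones quoted above, and the token counts in the example are made up.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """DeepSeek-R1 API prices quoted in the article, in RMB per million tokens."""
    input_cache_hit: float = 1.0
    input_cache_miss: float = 4.0
    output: float = 16.0


def request_cost_rmb(
    cache_hit_tokens: int,
    cache_miss_tokens: int,
    output_tokens: int,
    pricing: Pricing = Pricing(),
) -> float:
    """Total cost in RMB for the given token counts."""
    total = (
        cache_hit_tokens * pricing.input_cache_hit
        + cache_miss_tokens * pricing.input_cache_miss
        + output_tokens * pricing.output
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Hypothetical workload: 2M cached input, 1M uncached input, 0.5M output tokens.
    cost = request_cost_rmb(2_000_000, 1_000_000, 500_000)
    print(f"Estimated API cost: RMB {cost:.2f}")
```

Because output tokens are priced at four times the cache-miss input rate, long generated answers tend to dominate the total in a calculation like this.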

Jessie Wu is a tech reporter based in Shanghai. She covers consumer electronics, semiconductors, and the gaming industry for TechNode.