DeepSeek is looking to press home its advantage. The Chinese startup triggered a $1 trillion (roughly Rs. … competitors.
Now, the Hangzhou-based firm is accelerating the launch of the successor to January’s R1 model, according to three people familiar with the company.
DeepSeek had planned to release R2 in early May but now wants it out as early as possible, two of them said, without providing specifics.
The company says it hopes the new model will produce better coding and be able to reason in languages beyond English. Details of the accelerated timeline for R2’s release have not been previously reported.
DeepSeek did not respond to a request for comment for this story.
Rivals are still digesting the implications of R1, which was built with less-powerful Nvidia chips but is competitive with those developed at a cost of hundreds of billions of dollars by US tech giants.
“The launch of DeepSeek’s R2 model could be a pivotal moment in the AI industry,” said Vijayasimha Alilughatta, chief operating officer of Indian tech services provider Zensar. DeepSeek’s success at creating cost-effective AI models “will likely spur companies worldwide to accelerate their own efforts … breaking the stranglehold of the few dominant players in the field,” he said.
R2 is likely to worry the US government, which has identified leadership of AI as a national priority. Its release may further galvanise Chinese authorities and companies, dozens of which say they have started integrating DeepSeek models into their products.
Little is known about DeepSeek, whose founder Liang Wenfeng became a billionaire through his quantitative hedge fund High-Flyer. Liang, who was described by a former employer as “low-key and introverted,” has not spoken to any media since July 2024.
Reuters interviewed a dozen former employees, as well as quant fund professionals knowledgeable about the operations of DeepSeek and its parent company. It also reviewed state media articles, social-media posts from the companies and research papers dating back to 2019.
They told a story of a company that functioned more like a research lab than a for-profit enterprise and was unencumbered by the hierarchical traditions of China’s high-pressure tech industry, even as it became responsible for what many investors see as the latest breakthrough in AI.
Different path
Liang was born in 1985 in a rural village in the southern province of Guangdong. He later obtained communication engineering degrees at the elite Zhejiang University.
One of his first jobs was running a research department at a smart imaging firm in Shanghai. His then-boss, Zhou Chaoen, told state media on February 9 that Liang had hired prize-winning algorithm engineers and operated with a “flat management style.”
At DeepSeek and High-Flyer, Liang has shunned the practices of Chinese tech giants known for rigid top-down management, low pay for young employees and “996”, or working from 9 a.m. to 9 p.m. six days a week.
Liang opened his Beijing office within walking distance of Tsinghua University and Peking University, China’s two most prestigious education institutions. He regularly delved into technical details and was happy to work alongside Gen-Z interns and recent graduates that comprised the bulk of its workforce, according to two former employees. They also described usually working eight-hour days in a collaborative atmosphere.
“Liang gave us control and treated us as experts. He constantly asked questions and learned alongside us,” said 26-year-old researcher Benjamin Liu, who left the company in September. “DeepSeek allowed me to take ownership of critical parts of the pipeline, which was very exciting.”
Liang did not respond to questions sent via DeepSeek.
While other Chinese tech giants were racing to build their consumer-facing versions of ChatGPT in 2023 and profit off of the global AI boom, Liang told Chinese media outlet Waves last year that he had avoided spending heavily on app development, focusing instead on refining the AI model’s quality.
Both DeepSeek and High-Flyer are known for paying generously, according to three people familiar with their compensation practices. At High-Flyer, it is not uncommon for a senior data scientist to make CNY 1.5 million (roughly Rs. … quant fund manager who knows Liang.
The largesse was funded by High-Flyer, which became one of China’s most successful quant funds and, even after a government crackdown on the sector, still manages tens of billions of yuan, according to two people in the industry.
Computing power
DeepSeek’s success with a low-cost AI model is based on High-Flyer’s decade-long and substantial investment in research and computing power, three people said.
The quant fund was an early pioneer in AI trading, and a top executive said in 2020 that High-Flyer was going “all in” on AI by re-investing 70 percent of its revenue, mostly into AI research.
High-Flyer spent CNY 1.2 billion (roughly Rs. 1,441 crore) … used for training AI models.
DeepSeek had not been established at that time, so the accumulation of computing power caught the attention of Chinese securities regulators, said a person with direct knowledge of officials’ thinking.
“Regulators wanted to know why they need so many chips?” the person said. “How they were going to use it? What kind of impact would that have on the market?”
Authorities decided not to intervene, in a move that would prove crucial for DeepSeek’s fortunes: the US banned the export of A100 chips to China in 2022, at which point Fire-Flyer II was already in operation.
Beijing now celebrates DeepSeek, but has instructed it not to engage with the media without approval, according to a person familiar with Chinese official thinking.
Authorities had asked Liang to keep a low profile because they were worried that too much hype in the media would draw unnecessary attention, the person said.
China’s cabinet and commerce ministry, as well as China’s securities regulator, did not respond to requests for comment.
As one of the few companies with a large A100 cluster, High-Flyer and DeepSeek were able to attract some of China’s best research talent, two former employees said.
“The key advantage of vast (computing) resources is that it allows for large-scale experimentation,” said Liu, the former employee.
Some Western AI entrepreneurs, like Scale AI CEO Alexandr Wang, have claimed that DeepSeek had as many as … Nvidia chips that are banned for export to China. He has not produced evidence for the allegation or responded to Reuters’ requests to provide proof.
DeepSeek has not responded to Wang’s claims. Two former employees attributed the company’s success to Liang’s focus on more cost-effective AI architecture.
The startup used techniques like Mixture-of-Experts (MoE) and multi-head latent attention (MLA), which incur far lower computing costs, its research papers show.
The MoE technique divides an AI model into different areas of expertise and activates only those related to a query, as opposed to more common architectures that use the entire model.
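The routing idea behind MoE can be illustrated with a short Python sketch. It is a toy under stated assumptions, not DeepSeek’s implementation: the experts are simple scaling functions, and the gate weights, the `moe_forward` name and the choice of `top_k=2` are all hypothetical. What it shows is that a router scores every expert for a given input, but only the highest-scoring few are actually run.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: a tiny function standing in for a sub-network."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k experts and blend their outputs.

    gate_weights holds one vector per expert; the dot product with x is
    that expert's router score. Experts outside the top_k are never called.
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)  # renormalise over the chosen experts
    out = [0.0] * len(x)
    for i in chosen:
        weight = probs[i] / norm
        out = [o + weight * y for o, y in zip(out, experts[i](x))]
    return out, chosen


# Illustrative setup: four experts, each with its own (made-up) gate vector.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, active = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print("active experts:", active)
print("output:", output)
```

In this toy setup only two of the four experts do any work for the query, which is the source of the compute savings the paragraph above describes. A dense model would run all four.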
MLA architecture allows a model to process different aspects of one piece of information simultaneously, helping it detect key details more effectively.
While competitors like France’s Mistral have developed models based on MoE, DeepSeek was the first firm to depend heavily on this architecture while achieving parity with more expensively built models.
DeepSeek’s pricing was 20 to 40 times cheaper than what OpenAI charged for equivalent models, analysts at Bernstein brokerage estimated in early February.
For now, Western and Chinese tech giants have signalled plans to continue heavy AI spending, but DeepSeek’s success with R1 and its earlier V3 model has prompted some to alter strategies.
OpenAI cut prices this month, while Google’s Gemini has introduced discounted tiers of access. Since R1’s launch, OpenAI has also released an o3-mini model that relies on less computing power.
Adnan Masood of US tech services provider UST told Reuters that his laboratory had run benchmarks that found R1 often used three times as many tokens, or units of data processed by the AI model, as OpenAI’s scaled-down model.
State embrace
Even before R1 gripped global attention, there were signs that DeepSeek had caught Beijing’s favour. In January, state media reported that Liang attended a meeting with Chinese Premier Li Qiang in Beijing as the designated representative of the AI sector, ahead of the leaders of better-known firms.
The subsequent fanfare over the cost competitiveness of its models has buoyed Beijing’s belief that it can out-innovate the US, with Chinese companies and government bodies embracing DeepSeek models at a pace that has not been offered to other firms.
At least 13 Chinese city governments and 10 state-owned energy companies say they have deployed DeepSeek into their systems, while tech giants Lenovo, Baidu and Tencent, owner of China’s largest social media app WeChat, have integrated DeepSeek’s models into their products.
Chinese leader Xi Jinping and Li “have signalled they endorse DeepSeek,” said Alfred Wu, an expert on Chinese policymaking at Singapore’s Lee Kuan Yew School of Public Policy. “Now everyone just endorses it.”
The Chinese embrace comes as governments from South Korea to Italy remove DeepSeek from national app stores, citing privacy concerns.
“If DeepSeek becomes the go-to AI model across Chinese state entities, Western regulators might … AI expert and founder of hedge fund Carthage Capital.
Further limits on advanced AI chips are a challenge that Liang has acknowledged.
“Our problem has never been funding,” he told Waves in July. “It’s the embargo on high-end chips.”
© Thomson Reuters 2025
(This story has not been edited by NDTV staff and is auto-generated from a syndicated feed.)