Today, DeepSeek is one of the only leading AI firms in China that does not rely on funding from tech giants such as Baidu, Alibaba, or ByteDance.
A young group of talents eager to prove themselves
According to Liang, when he put together DeepSeek's research team, he was not looking for experienced engineers to build a consumer-facing product. Instead, he focused on PhD students from China's top universities, including Peking University and Tsinghua University, who were eager to prove themselves. Many had been published in top journals and won awards at international academic conferences, but lacked industry experience, according to the Chinese tech publication QBitAI.
“Our core technical positions are mostly filled by people who graduated this year or in the past one or two years,” Liang told 36Kr in 2023. The hiring strategy helped create a culture where people were free to use ample computing resources to pursue unconventional research projects. This is a very different way of operating from established internet companies in China, where teams often compete for resources. (A recent example: ByteDance accused a former intern, a prestigious academic award winner no less, of sabotaging his colleagues' work in order to secure more computing resources for his team.)
Liang said that students can be a better fit for high-investment, low-profit research. “Most people, when they are young, can devote themselves completely to a mission without utilitarian considerations,” he explained. His pitch to prospective hires is that DeepSeek was created to “solve the most difficult questions in the world.”
Experts note that these young researchers were educated almost entirely in China. “This younger generation also embodies a sense of patriotism, particularly as they navigate US restrictions and choke points in critical hardware and software technologies,” Zhang explains. “Their determination to overcome these obstacles reflects not only personal ambition but also a broader commitment to advancing China's position as a global innovation leader.”
Innovation arose from a crisis
In October 2022, the US government began putting together export controls that severely restricted Chinese AI companies from accessing Nvidia's state-of-the-art chips. The move presented a problem for DeepSeek. The firm had started out with a stockpile of 10,000 A100 chips, but it needed more to compete with firms such as OpenAI and Meta. “The problem we are facing has never been funding, but the export control on advanced chips,” Liang told 36Kr in a second interview in 2024.
DeepSeek had to come up with more efficient ways to train its models. “They optimized their model architecture using a battery of engineering tricks: custom communication schemes between chips, reducing the size of fields to save memory, and innovative use of the mix-of-models approach,” says Chang, a software engineer turned policy analyst at the Mercator Institute for China Studies. “Many of these approaches are not new ideas, but successfully combining them to produce a state-of-the-art model is a remarkable achievement.”
DeepSeek has also made significant progress on Multi-head Latent Attention (MLA) and Mixture-of-Experts, two technical designs that make DeepSeek models more cost-effective by requiring fewer computing resources to train. In fact, DeepSeek's latest model is so efficient that it required only a fraction of the computing power of Meta's comparable Llama 3.1 model to train, according to the research institution Epoch AI.
DeepSeek's willingness to share these innovations with the public has earned it considerable goodwill within the global AI research community. For many Chinese AI companies, developing open-source models is the only way to play catch-up with their Western counterparts, because it attracts more users and contributors, which in turn help the models grow. “They have now demonstrated that state-of-the-art models can be built using less, though still a lot of, money, and that the current norms of model-building leave plenty of room for optimization,” Chang says. “We are sure to see a lot more attempts in this direction.”
The news could spell trouble for the current US export controls, which focus on creating bottlenecks in computing resources. “Existing estimates of how much AI computing power China has, and what they can achieve with it, could be upended,” says Chang.