The actual cost of developing DeepSeek's new models remains unknown, however, since a single figure quoted in one research paper cannot capture the full picture of its expenses. "I don't believe it's $6 million, but even if it's $60 million, it is a game changer," says Umesh Padval, managing director of Thomvest Ventures, a firm that has invested in Cohere and other AI companies. "It will put pressure on the profitability of companies that are focused on consumer AI."
Shortly after DeepSeek disclosed the details of its latest model, Ghodsi of Databricks says customers began asking whether they could use it, and DeepSeek's underlying techniques, to cut AI costs within their own organizations. He says one approach employed by DeepSeek's engineers, known as distillation, involves using the output of a larger language model to train another model, and is relatively cheap and straightforward.
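For readers unfamiliar with the term, the sketch below shows the general idea of distillation in its classic form: a small "student" network is trained to imitate the output distribution of a larger, already-trained "teacher" network. This is an illustrative toy example, not DeepSeek's actual recipe; for large language models the same idea is often applied at the text level, by fine-tuning a smaller model on outputs generated by a bigger one. All model sizes and training settings here are assumptions chosen for brevity.

```python
# Minimal, hypothetical sketch of knowledge distillation with toy networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-ins for a big "teacher" and a small "student" model (sizes are arbitrary).
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
teacher.eval()  # pretend the teacher is already trained; we only query its outputs

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's distribution so the student sees richer signal

for step in range(200):
    x = torch.randn(64, 32)  # unlabeled inputs; the teacher effectively provides the labels
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x) / temperature, dim=-1)
    student_log_probs = F.log_softmax(student(x) / temperature, dim=-1)
    # KL divergence pushes the student's predictions toward the teacher's.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("final distillation loss:", loss.item())
```

The appeal, as Ghodsi's customers noticed, is that querying an existing large model is far cheaper than training a comparable model from scratch.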
Padval says the existence of models like DeepSeek's will ultimately benefit companies looking to spend less on AI, but he notes that many firms may have reservations about relying on a Chinese model for sensitive tasks. So far, at least one prominent AI firm, Perplexity, has publicly declared it is using DeepSeek's R1 model, though it says the model is being hosted "completely independent" of China.
Amjad Masad, CEO of Replit, a startup that offers AI coding tools, told WIRED that he finds DeepSeek's latest models impressive. While he still considers Anthropic's Sonnet model better at many computer-engineering tasks, he has found that R1 is especially good at turning text commands into code that can be executed on a computer. "We are using it especially for agentic reasoning," he says.
DeepSeek's two latest offerings, DeepSeek R1 and DeepSeek R1-Zero, are capable of the same kind of simulated reasoning as the most advanced systems from OpenAI and Google. They work by breaking problems into component parts in order to tackle them more effectively, a process that requires a substantial amount of additional training to ensure that the AI reliably reaches the correct answer.
A paper posted by DeepSeek researchers last week outlines the approach the company used to create its R1 model, which it claims performs on some benchmarks about as well as OpenAI's groundbreaking reasoning model known as o1. The tactics it describes include a more automated method for learning to solve problems correctly, as well as a strategy for transferring skills from larger models to smaller ones.
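To make the "more automated method" concrete, the sketch below shows one common way such training can be automated: scoring a model's generated solutions with a simple, verifiable reward, here, whether the final answer matches a known ground truth, which a reinforcement-style loop could then use to favor the generations that score well. This is a hedged illustration, not DeepSeek's published training code; the `Answer:` format, the helper names, and the sample outputs are all assumptions made for the example.

```python
# Hypothetical rule-based reward for checking whether a generated solution ends in the right answer.
import re


def extract_answer(generated_text: str):
    """Pull the final answer out of a generated solution, assuming an 'Answer: ...' convention."""
    match = re.search(r"Answer:\s*(.+)", generated_text)
    return match.group(1).strip() if match else None


def accuracy_reward(generated_text: str, ground_truth: str) -> float:
    """Rule-based reward: 1.0 if the extracted answer matches the reference, else 0.0."""
    answer = extract_answer(generated_text)
    return 1.0 if answer is not None and answer == ground_truth.strip() else 0.0


# Hypothetical model outputs for the math problem "What is 12 * 7?"
samples = [
    "First, 12 * 7 = 84.\nAnswer: 84",    # correct reasoning and answer -> reward 1.0
    "12 * 7 is roughly 70.\nAnswer: 70",  # wrong answer -> reward 0.0
    "I think the product is 84.",         # no parsable answer -> reward 0.0
]

rewards = [accuracy_reward(sample, "84") for sample in samples]
print(rewards)  # [1.0, 0.0, 0.0]
```

Because the reward can be computed automatically, no human needs to grade each solution, which is what makes this style of training comparatively cheap to scale.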
One of the hottest topics of speculation about DeepSeek is the hardware it might have used. The question is especially notable because the US government has introduced a series of export controls and other trade restrictions over the last few years aimed at limiting China's ability to acquire and manufacture the cutting-edge chips that are needed to build advanced AI.
In a research paper from August 2024, DeepSeek indicated that it has access to a cluster of 10,000 Nvidia A100 chips, which were placed under US restrictions announced in October 2022. In a separate paper from June of that year, DeepSeek stated that an earlier model it created, called DeepSeek-V2, was developed using clusters of Nvidia H800 computer chips, a less capable component that Nvidia developed to comply with US export controls.
A source at one AI company that trains large AI models, who asked to remain anonymous to protect their professional relationships, estimates that DeepSeek likely used around 50,000 Nvidia chips to build its technology.
Nvidia declined to comment directly on which of its chips DeepSeek may have relied on. "DeepSeek is an excellent AI advancement," a spokesperson for Nvidia said in a statement, adding that the startup's reasoning approach "requires significant numbers of Nvidia GPUs and high-performance networking."
However DeepSeek's models were built, they appear to show that a less closed approach to developing AI is gaining ground. In December, Clem Delangue, the CEO of Hugging Face, a platform that hosts artificial intelligence models, predicted that a Chinese company would take the lead in AI because of the speed of innovation happening in open source models, which China has largely embraced. "This went faster than I thought," he says.