Champions aren't forever. Last week, DeepSeek AI sent shivers down the spines of investors and tech companies alike with its high-flying performance on the cheap. Now, two computer chip startups are drafting on those vibes.

Cerebras Systems makes huge computer chips—the size of dinner plates—with a radical design. Groq, meanwhile, makes chips tailor-made for large language models. In a head-to-head test, these alt-chips have blown the competition out of the water running a version of DeepSeek's viral AI.

Whereas answers can take minutes to complete on other hardware, Cerebras said that its version of DeepSeek knocked out some coding tasks in as little as 1.5 seconds. According to Artificial Analysis, the company's wafer-scale chips were 57 times faster than competitors running the AI on GPUs and hands down the fastest. That was last week. Yesterday, Groq overtook Cerebras at the top with a new offering.

By the numbers, DeepSeek's advance is more nuanced than it appears, but the trend is real. Even as labs plan to significantly scale up AI models, the algorithms themselves are getting substantially more efficient. On the hardware side, Nvidia is matching those gains, but so are chip startups like Cerebras and Groq, which can outperform GPUs on inference.

Big tech is committed to buying more hardware, and Nvidia won't be cast aside soon, but alternatives may begin nibbling at the edges, especially if they can serve AI models faster or cheaper than more traditional options.

Be Reasonable

DeepSeek's new AI, R1, is a "reasoning" model, like OpenAI's o1. This means that instead of spitting out the first answer generated, it chews on the problem, piecing its answer together step by step.

For a casual chat, this doesn't make much difference, but for complex—and valuable—problems, like coding or mathematics, it's a leap forward.

DeepSeek's R1 is already extremely efficient. That was the news last week.

Not only was R1 cheaper to train (allegedly just $6 million, though what this number means is disputed), it's also cheap to run, and its weights and engineering details are open. That stands in contrast to headlines about impending investments, larger than the Apollo program, in proprietary AI efforts.

The news gave investors pause—maybe AI won't need as much cash and as many chips as tech leaders think. Nvidia, the likely beneficiary of those investments, took a big stock market hit.

Small, Quick—Still Smart

All that's on the software side, where algorithms are getting cheaper and more efficient. But the chips training or running AI are improving too.

Last year, Groq, a startup founded by Jonathan Ross, the engineer who previously developed Google's in-house AI chips, made headlines with chips tailor-made for large language models. Whereas popular chatbot responses spooled out line-by-line on GPUs, conversations on Groq's chips approached real time.

That was then. The new crop of reasoning AI models takes much longer to provide answers, by design.

The approach is called "test-time compute": these models churn out multiple answers in the background, select the best one, and offer a rationale for their answer. Companies say the answers get better the longer they're allowed to "think." These models don't beat older models across the board, but they've made strides in areas where older algorithms struggle, like math and coding.
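
For readers who want to see the idea in miniature, the sample-and-select loop described above can be sketched in a few lines of Python. This is only an illustration of one simple form of test-time compute, often called best-of-N sampling, and not a claim about how R1 or o1 work internally. Everything here is a hypothetical stand-in: the "model" is a random guesser, the scorer cheats by knowing the right answer (real systems would more likely use a learned scorer or automated checks), and the names `best_of_n`, `make_noisy_solver`, `make_scorer`, and `error_with_budget` are invented for this sketch.

```python
import random
from typing import Callable


def best_of_n(generate: Callable[[], int], score: Callable[[int], float], n: int) -> int:
    """Sample n candidate answers and return the one the scorer rates highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=score)


def make_noisy_solver(true_answer: int, rng: random.Random) -> Callable[[], int]:
    """Hypothetical stand-in for a model: the right answer plus a random error."""
    def generate() -> int:
        return true_answer + rng.randint(-20, 20)
    return generate


def make_scorer(true_answer: int) -> Callable[[int], float]:
    """Hypothetical verifier (assumption: it knows the truth). Higher is better."""
    def score(candidate: int) -> float:
        return -abs(candidate - true_answer)
    return score


def error_with_budget(n: int, seed: int = 0) -> int:
    """Distance from the correct answer when the 'model' gets n attempts."""
    true_answer = 17 * 24
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    answer = best_of_n(make_noisy_solver(true_answer, rng), make_scorer(true_answer), n)
    return abs(answer - true_answer)


if __name__ == "__main__":
    # A larger sampling budget means more compute spent per question.
    for budget in (1, 4, 16, 64):
        print(f"attempts={budget:>2}  error={error_with_budget(budget)}")
```

Because each larger budget reuses the same seeded attempts and adds more, the error in this toy can only shrink or stay flat as the budget grows. The price is that every extra attempt costs more computation at answer time, which is why inference speed starts to matter.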

As reasoning models shift the focus to inference, the stage where a finished AI model responds to a user's query, speed and cost matter more. People want answers fast, and they don't want to pay more for them. Here, especially, Nvidia is facing growing competition.