The stock price had assumptions baked in about the number of units expected to be sold. DeepSeek cut that hardware estimate by as much as 45x. That's the fairly obvious connection between the model being very efficient to train and NVDA dropping 18%.
I don’t get it. The labs have regularly made improvements that dramatically lower the cost of training an equal-performing model. When they do this, they also train a larger model with even higher performance. This time, DeepSeek did the first part but didn’t do the second. Now every lab in the world will throw their compute into the effort to replicate and beat DeepSeek’s model with larger scale. It’s not like everyone is just going to say “well I guess AI is smart enough now, no point improving it anymore!” and stop building bigger training clusters.
If anything, r1 makes GPU demand even more likely to grow, since it mitigated, or at least delayed, the risk that AI has hit a dead end (in which case ceasing development might actually make sense).
Define "dramatically" with numbers. From all the sources I've read, this one was significant: it was run on a far more limited cluster, and the results are as good as the other frontier models'. Optimizations have been coming steadily, but I think the ones DeepSeek found were significantly larger.
It still doesn't make sense to me. If the money for training is still there, wouldn't companies that can afford it use the efficiency gains and also scale up models?
Unless AI is a bubble, and it pops, I can't see the demand for compute going down.
I think AI is a bubble. The amount of compute needed for inference is vastly overestimated, because a lot of caching is coming. The estimates are driven by maniacal statements like Sam Altman's insistence that we must spend trillions on compute to achieve AGI, and that this is more important than anything else.
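To make the caching point concrete, here's a minimal sketch (entirely hypothetical, not any provider's actual implementation) of the simplest form: memoizing responses to identical prompts so repeated requests skip the expensive GPU call. Production systems go further and reuse KV-cache state for shared prompt prefixes, but even this toy version shows how cache hits reduce inference compute.

```python
import hashlib

class InferenceCache:
    """Toy response cache keyed by prompt text (illustrative only)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, prompt: str) -> str:
        # Hash the prompt so the cache key is fixed-size.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def generate(self, prompt: str, model_fn) -> str:
        """Return a cached response, calling model_fn only on a miss."""
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = model_fn(prompt)  # the expensive model call
        self._store[key] = result
        return result
```

With a workload where many users send the same prompt, every repeat is served from the cache and never touches the model, which is the mechanism behind the claim that inference compute demand is overestimated.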
Project Stargate is some large fraction of that, and of course SoftBank is no stranger to losing money by overestimating demand (for example, WeWork). To be fair, China has plenty of overestimated demand too (for example, Evergrande). The other factor is that rapid competition leads to overinvestment by all parties.
Which is great for us, we'll have loads of cheap compute and hopefully a bunch more carbon free energy supply, assuming that the AI stuff all ends in tears (for now).
Yep! Shareholders and capitalists overinvesting in stuff is great if it leaves behind great infrastructure. They take the risk and the public benefits.
There is a belief that we've peaked in terms of "bigger model = better results." I think GPT-4, for instance, actually has a smaller parameter count than 3.5 while performing better. There is also thought to be a finite amount of useful training data, which we may have already exhausted, so adding more parameters isn't helpful if you don't have new data to train them on.
Can someone explain how DeepSeek cut that estimate? Their (fast) API is always down, and the third-party providers on OpenRouter are more expensive than Claude.