Training cost has increased so much precisely because inference cost is the bigger problem: models are now trained on close to three orders of magnitude more data than what is compute-optimal (per the Chinchilla paper), because the savings on inference make it worthwhile to overtrain a smaller model, spending extra training compute to reach similar performance.
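To make that trade-off concrete, here's a rough back-of-envelope sketch in Python using the usual approximations of ~6*N*D FLOPs for training and ~2*N FLOPs per generated token for inference. All the model sizes, token counts, and the lifetime serving volume below are made-up illustrative assumptions, not figures from any actual deployment.

    # Back-of-envelope sketch of the train-vs-inference trade-off.
    # Assumptions: ~6 FLOPs per parameter per training token,
    # ~2 FLOPs per parameter per generated token at inference.

    def training_flops(n_params: float, n_train_tokens: float) -> float:
        """Approximate total training compute."""
        return 6 * n_params * n_train_tokens

    def inference_flops(n_params: float, n_served_tokens: float) -> float:
        """Approximate total inference compute over the model's lifetime."""
        return 2 * n_params * n_served_tokens

    # Hypothetical comparison: a Chinchilla-style model (~20 tokens/param)
    # vs. a smaller model overtrained on far more data to reach similar quality.
    chinchilla = {"params": 70e9, "train_tokens": 1.4e12}   # ~20 tokens/param
    overtrained = {"params": 8e9, "train_tokens": 15e12}    # ~1900 tokens/param

    served_tokens_lifetime = 1e13  # assumed lifetime inference volume

    for name, m in [("chinchilla-optimal", chinchilla),
                    ("overtrained-small", overtrained)]:
        train = training_flops(m["params"], m["train_tokens"])
        serve = inference_flops(m["params"], served_tokens_lifetime)
        print(f"{name:>18}: train {train:.2e} FLOPs, "
              f"inference {serve:.2e} FLOPs, total {train + serve:.2e}")

With these made-up numbers the smaller overtrained model costs more to train per token of data but far less to serve, so its total lifetime compute comes out lower once the serving volume is large enough, which is the whole argument for overtraining.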
Interesting. I understand that, but I don't know to what degree.
I mean the training, while expensive, is done once. The inference, besides being done by perhaps millions of clients, goes on for, well, the life of the model. Surely that adds up.
It's hard to know, but I assume the user taking up the burden of the inference is perhaps doing so more efficiently? I mean, when I run a local model, it plods along, nowhere near as quick as the online model. So it's slow, and therefore, I assume, necessarily more power-efficient.