My MAGA dad grew up dirt poor in a house with literal dirt floors in some rooms. Four kids. No dad. Government support. He got a job as a cop and raised me and thinks anyone can get out of poverty if they work hard. He's a corner case but he'll never see it that way because he literally bootstrapped himself. How do I tell him he didn't and just got lucky? He kinda has a point.
Odd how you counter claims as fictional with zero evidence, just speculation. I realize this is a discussion forum and not a policy office, but it pisses me off when feelings are met with feelings and not explanation or examples. Now's your chance to really prove your point.
That we're all less poor today than our ancestors 200 years ago is absolutely, totally self-evident. No additional evidence needed. Indeed, the inverse claim would require evidence, and extraordinary evidence at that.
My wife and I got the duo package because we do a lot of writing and need citations and sources. Compared to Google and DDG it is less noisy and returns fewer spammy pages. We're giving it a year to see if it is worth it.
But why? Maybe surplus training hardware will appear, but everything that's running inference has real, paying customers on it at the current capability level. Even if the bubble bursts, why would that demand evaporate? People would still want to use ChatGPT, Claude Code, etc., even if they stop getting any better tomorrow.
Yes, but how much do people want to pay to use this stuff? That is the real question.
It might be that more than $30 a month is too much and people stop using it. However, I suspect it would end up a fair bit higher than many would think. If you are a 'whale' (to use the gambling term) then there is a good chance that they could charge in the hundreds of dollars.
Maybe it becomes a wealth divide between those who can afford it and those who cannot. Inequality yet again. Then the echoes of Dune, the 'Butlerian Jihad' and 'Thou shalt not make a machine in the likeness of a human mind' will start to come up.
I'd expect at least API pricing for all major players to be margin-positive. Positive ROI if you include model training cost? Maybe not, but in a hypothetical scenario where all capital for new model training goes away, the latest frontier model weights still continue to exist.
Subscription prices for Claude Code et al.? No idea.
No, that's a rumor lots of people have been taking at face value.
If you do the math, inference is very lucrative.
Here someone deployed a big model; the cost was $0.20/1M tokens:
https://lmsys.org/blog/2025-05-05-large-scale-ep/
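To put that figure in perspective, here is a hedged back-of-envelope margin check. The $0.20/1M-token serving cost is the number from the lmsys post above; the $3.00 sale price is an invented placeholder, not any vendor's actual rate:

```python
# Back-of-envelope: quoted serving cost vs. a hypothetical API price.
# cost_per_million comes from the lmsys blog post linked above;
# price_per_million is a made-up placeholder for illustration only.
cost_per_million = 0.20   # USD per 1M tokens to serve (from the post)
price_per_million = 3.00  # USD per 1M tokens charged (hypothetical)

gross_margin = (price_per_million - cost_per_million) / price_per_million
print(f"gross margin: {gross_margin:.1%}")  # → gross margin: 93.3%
```

Even if the real serving cost is off by several multiples, the gap between cost and typical API prices leaves a lot of room before inference itself stops being margin-positive.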
If some large companies implode local models will fill the gaps soon enough. I can see a crash in data center construction and operation once the economics get too far out of whack.
I was an intel cpu architect when transmeta started making claims. We were baffled by those claims. We were pushing the limit of our pipelines to get incremental gains and they were claiming to beat a dedicated arch on the fly! None of their claims made sense to ANYONE with a shred of cpu arch experience. I think your summary has rose colored lenses, or reflects the layman’s perspective.
I think this is a classic hill-climbing dilemma. If you start in the same place, and one org has worked very hard and spent a lot of money optimizing the system, they will probably come out on top. But if you start in a different place, reimagining the problem from first principles, you may or may not find yourself with a taller hill to climb. Decisions made very early on in your hill-climbing process lock you in to a path, and then the people tasked with optimizing the system later can't fight the organizational inertia to backtrack and pick a different path. But a new startup can.
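The metaphor can be made concrete with a toy sketch. The terrain below is invented purely for illustration: the same greedy climber, started from two different points, locks in to two different local maxima, just like two orgs starting from different design decisions:

```python
def hill_climb(heights, start):
    """Greedy ascent on a 1-D terrain: move to a taller neighbor until stuck."""
    i = start
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = max(neighbors, key=lambda j: heights[j])
        if heights[best] <= heights[i]:
            return i, heights[i]  # local maximum: no taller neighbor exists
        i = best

# A terrain with a short hill (height 3) and a taller one (height 9).
terrain = [0, 1, 3, 1, 0, 2, 5, 9, 5, 2]

print(hill_climb(terrain, start=1))  # → (2, 3): stuck on the short hill
print(hill_climb(terrain, start=5))  # → (7, 9): reaches the taller hill
```

The climber starting at index 1 can never reach the taller peak without first going downhill, which a pure hill-climber (or an org with sunk costs) refuses to do.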
It's worth noting that Google actually did succeed with a wildly different architecture a couple years later. They figured "Well, if CPU performance is hitting a wall - why use just one CPU? Why not put together thousands of commodity CPUs that individually are not that powerful, and then use software to distribute workloads across those CPUs?" And the obvious objection to that is "If we did that, it won't be compatible with all the products out there that depend upon x86 binary compatibility", and Google's response was the ultimate in hubris: "Well we'll just build new products then, ones that are bigger and better than the whole industry." Miraculously it worked, and made a multi-trillion-dollar company (multiple multi-trillion-dollar companies, if you now consider how AWS, Facebook, TSMC, and NVidia revenue depends upon the cloud).
Transmeta's mistake was that they didn't re-examine enough assumptions. They assumed they were building a CPU rather than an industry. If they'd backed up even farther they would've found that there actually was fertile territory there.
> It's worth noting that Google actually did succeed with a wildly different architecture a couple years later. They figured "Well, if CPU performance is hitting a wall - why use just one CPU? Why not put together thousands of commodity CPUs that individually are not that powerful, and then use software to distribute workloads across those CPUs?" And the obvious objection to that is "If we did that, it won't be compatible with all the products out there that depend upon x86 binary compatibility", and Google's response was the ultimate in hubris: "Well we'll just build new products then, ones that are bigger and better than the whole industry." Miraculously it worked, and made a multi-trillion-dollar company (multiple multi-trillion-dollar companies, if you now consider how AWS, Facebook, TSMC, and NVidia revenue depends upon the cloud).
Except "the cloud" at that point was specifically just a large number of normal desktop-architecture machines. Specifically not a new ISA or machine type, running entirely normal OS and libraries. At no point did Google or Amazon or Microsoft make people port/rewrite all of their software for cloud deployment.
At the point that Google's "bunch of cheap computers" was new, CPU performance was still rapidly improving. The competition was traditional "big iron" or mainframe systems, and the novelty was in achieving high reliability through distribution, rather than building on fault-tolerant hardware. By the time the rate of CPU performance improvement was slowing in the mid 2000s, large clusters of smaller machines were omnipresent in supercomputing and HPC applications.
The real "new architecture(s)" of this century are GPUs, but much of the development and success of them is the result of many iterations and a lot of convergent evolution.
> At the point that Google's "bunch of cheap computers" was new
It wasn't even new, people just don't know the history. Inktomi and HotBot were based on a fleet of commodity PC servers with low reliability, whereas other large web properties of the time were buying big iron like Sun E10K. And of course Beowulf clusters were a thing.
And as far as I know, google's early ethos didn't come as some far sighted strategy, but just the practical reality of Page and Brin building the first versions of their search engine on borrowed/scavenged hardware as grad students and then continuing that trajectory.
Not revisionist I think just more that a lot of people first encountered the concept with the story of Google and don't know it had plenty of precedent.
Wasn't Intel trying to do something similar with Itanium, i.e. use software to translate code into VLIW instructions to exploit many parallel execution units? Only they wanted the C++ compiler to do it rather than a dynamic recompiler? At least some people at Intel thought that was a good idea.
I wonder if the x86 teams at Intel were similarly baffled by that.
EPIC aka Itanium was conceived around trace optimizing compilers being able to find enough instruction level parallelism to pack operations into VLIW bundles, as this would eliminate the increasingly complex and expensive machinery necessary to do out of order superscalar execution.
This wasn't a proven idea at the time, but it also wasn't considered trivially wrong.
What happened is that the combination of OoO speculation, branch predictors, and fat caches ended up working a lot better than anticipated. In particular branch predictors went from fairly naive assumptions initially to shockingly good predictions on real world code.
The result is that conventional designs increasingly trounced Itanium while the latter was still baking in the oven. By the time it was shipping it was clear the concept was off target, but at that point Intel/HP et al. had committed so much they tried to just bully the market into making it work. The later versions of Itanium ended up adding branch prediction and more cache capacity as a capitulation to reality, but that wasn't enough to save the platform.
Transmeta was making a slightly different bet, which is that x86 code could be dynamically translated to run efficiently on a VLIW CPU. The goal here was twofold:
First, to sidestep IP issues around shipping an x86-compatible chip. There's a reason AMD and Cyrix were the only companies to ship Intel alternatives in volume in that era: Transmeta didn't have the legal cover they did, so this dynamic translation approach sidestepped a lot of potential litigation.
Second, dynamic translation to VLIW could in theory be more power efficient than a conventional architecture. VLIW at the hardware level is kinda like if a cpu just didn't have a decoder. Everything being statically scheduled also reduces design pressure on register file ports, etc. This is why VLIW is quite successful in embedded DSP style stuff. In theory, because the dynamic translation pays the cost of compiling a block once then calls that block many times, you could get a net efficiency gain despite the cost of the initial translation. Additionally, having access to dynamic profiling information could in theory counterbalance the problems EPIC/Itanium ran into.
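That translate-once, run-many amortization can be sketched with a toy cost model. All the numbers here are invented to show the shape of the trade-off, nothing more; "translating" a block is just a dictionary insert:

```python
# Toy cost model for the translate-once / run-many bet (arbitrary units).
TRANSLATE_COST = 100  # one-time cost to translate a block to native VLIW code
NATIVE_RUN_COST = 1   # cost per execution of the translated block
INTERP_RUN_COST = 10  # cost per execution if re-decoded/interpreted each time

translation_cache = {}

def run_block(block_id, times):
    """Translate on first use, then reuse the cached translation."""
    cost = 0
    if block_id not in translation_cache:
        translation_cache[block_id] = True  # stand-in for emitted native code
        cost += TRANSLATE_COST
    cost += NATIVE_RUN_COST * times
    return cost

hot_loop = run_block("loop", times=10_000)  # 100 + 10_000 = 10_100
interpreted = INTERP_RUN_COST * 10_000      # 100_000
print(hot_loop, interpreted)
```

For hot code the up-front translation cost vanishes into the noise; the catch, as the comment above notes, is cold or self-modifying code, where the translation cost is paid but never amortized.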
So this also wasn't a trivially bad idea at the time. Transmeta specifically targeted x86 compatible laptops as that was a bit of a sore point in the Wintel world at the time, where the potential power efficiency benefits could motivate sales even if absolute performance still was inferior to intel.
From what I recall hearing from people who had them at the time, the Transmeta hardware wasn't bad but had the sort of random compatibility issues you'd expect and otherwise wasn't compelling enough to win in the market vs Intel. Note this was also before ARM rose to dominate low power mobile computing.
Transmeta ultimately failed, but some of their technical concepts live on in how language JITs and GPU shader IRs work today, and in how Apple used translation to migrate off PowerPC and then x86 in turn.
In both the case of Itanium and Transmeta I'd say it's historically inaccurate to say they were obviously or trivially wrong at the time people made these bets.
Itanium felt like Intel trying the same bet: move the speculative and analysis logic into the compiler and off the CPU. But where it differed is that it exposed some internal details of that decoding process so the compiler could target them directly, in a way that Transmeta didn't manage.
The discussions on comp.arch from that era are a gold mine. There were lead architects from the P4 team, from the Alpha team, Linus himself during his Transmeta days... all talking very frankly about the concerns of computer architecture at the time.
Not completely baffling. Intel made an attempt to create a Transmeta like hybrid software/hardware architecture at the time on one of their "VLIW" processors. It was an expensive experiment that didn't work out.
I recall one of the biggest concerns around the time was that OOOE techniques would not continue scaling in width or depth, and that other techniques would be needed. This turned out to be true, but it was not some fringe idea -- the entire industry turned on this. Intel designed the narrow and less "brainy" Pentium 4 and hoped to achieve performance with frequency, and with HP they designed the in-order Itanium lines. AMD did some speed demon K9. IBM did the in-order POWER6 that got performance with high frequency and runahead speculative execution. Nvidia did a similar thing to Transmeta too, quite a while later IIRC.
All failures. Everybody went back to more conventional out of order designs and were able to find ways to keep scaling those.
I'm sure there were some people at all these companies who were always OOOE proponents and disagreed with these other approaches, but I think your summary has poop colored lenses :) It's a little uncharitable to say their ideas were nonsense. The reality is that this was a very uncertain and exploratory time, and many people with large shreds of cpu arch experience all did wildly different things, and many went down the wrong roads (with hindsight).
We're all retired in some sense. Some on farms, some elsewhere. But on a broader, deeper, more meaningful level, we're not retired at all... working harder than ever.