Yes. If you look at the diagram that plots performance against the number of output tokens, you can see that R1T2 uses about 1/3 of the output tokens that R1-0528 uses.
Keep in mind, the speed improvement doesn’t come from the model running any faster (it’s the exact same architecture as R1, after all) but from using fewer output tokens while still achieving very good results.
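To put rough numbers on that, here is a minimal back-of-the-envelope sketch. The throughput and token counts are made-up values for illustration only, not measurements; the only figure taken from the diagram is the roughly 1/3 ratio. It assumes both models decode at the same tokens-per-second rate, which seems reasonable given the shared architecture.

```python
def generation_time_s(output_tokens: int, tokens_per_second: float) -> float:
    """Decode time in seconds, assuming a constant per-token throughput."""
    return output_tokens / tokens_per_second


# Hypothetical throughput, assumed identical for both models (same architecture).
TOKENS_PER_SECOND = 30.0

# Hypothetical output length for R1-0528 on some task.
R1_0528_TOKENS = 9000
# R1T2 uses roughly 1/3 of the output tokens, per the diagram.
R1T2_TOKENS = R1_0528_TOKENS // 3

t_r1_0528 = generation_time_s(R1_0528_TOKENS, TOKENS_PER_SECOND)
t_r1t2 = generation_time_s(R1T2_TOKENS, TOKENS_PER_SECOND)
speedup = t_r1_0528 / t_r1t2

print(f"R1-0528: {t_r1_0528:.0f} s, R1T2: {t_r1t2:.0f} s, speedup: {speedup:.1f}x")
```

So under these assumptions the whole speedup is simply the ratio of output lengths; per-token speed never changes.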
Fair point. More benchmarks are definitely good, but I’m optimistic that they will show similar results.
Anecdotally, my personal experience with the model is in line with what the benchmarks claim: it’s a bit smarter than R1, a bit faster than R1, and much faster than R1-0528, but not quite as smart as R1-0528. (“Faster” here means fewer output tokens.) For me, it’s at a sweet spot, and I use it as my daily driver.