Hacker News

A vCore is one hyperthread of an unknown CPU, so in reality 1,000 vCores is 500 real cores. With all the overheads it's more like 450, and given the low utilization while the dataset loads, to keep it at 10 sec you would need ~90 real cores, or 4 x 3-node dual-socket boxes (eBay, ~$1.5K each) and 2 InfiniBand switches (eBay, 2 x $300). For ~$6,600 you have a dedicated solution with no latency bubbles, at a fixed low cost.
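The arithmetic above can be sketched in a few lines; the prices are the eBay figures quoted in the comment, and the ~10% overhead factor is one reading of "more like 450":

```python
# Back-of-the-envelope check on the bare-metal cost estimate above.
# Prices are the quoted eBay figures; the 10% overhead is an assumption.

VCORES = 1000
real_cores = VCORES // 2            # each vCore is one hyperthread -> 500 real cores
effective = int(real_cores * 0.9)   # ~10% overhead -> ~450 effective cores

boxes, box_price = 4, 1500          # 3-node dual-socket boxes, USD each
switches, switch_price = 2, 300     # InfiniBand switches, USD each

total = boxes * box_price + switches * switch_price
print(effective, total)             # 450 6600
```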


Briefly: we have many data sets, and the <10 sec calculations happen every few seconds for every data set in active use. Caching results is rarely helpful in our case because the number of possible results is immense. The back end drives an interactive/real-time experience for the user, so we need the speed. Our loads are somewhat spiky; overnight in US time zones we're very quiet, and during the daytime we can use more than 1k vCPUs.

We've considered a few kinds of platforms (AWS spot fleet/GCE autoscaled preemptible VMs, AWS Lambda, bare metal hosting, even Beowulf clusters), and while bare metal has its benefits as you've pointed out, at our current stage it doesn't make sense for us financially.

I omitted from the blog post that we don't rely exclusively on object storage services, because their performance is relatively low. We cache files on the compute nodes, so much of the time we avoid the "80% of time is spent reading data" problem.
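A minimal sketch of that node-local file cache, assuming a hypothetical fetch_from_object_storage(key, dest) downloader (the real system's storage API isn't described in the post):

```python
import os
import tempfile

# Hypothetical local cache directory on the compute node.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "dataset_cache")

def fetch_from_object_storage(key, dest):
    """Stand-in for an S3/GCS download; writes placeholder bytes."""
    with open(dest, "wb") as f:
        f.write(b"data for " + key.encode())

def cached_path(key):
    """Return a local path for `key`, downloading only on a cache miss."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, key.replace("/", "_"))
    if not os.path.exists(path):
        tmp = path + ".tmp"
        fetch_from_object_storage(key, tmp)
        os.replace(tmp, path)  # atomic rename: readers never see partial files
    return path
```

Repeat reads of the same data set then hit the local disk instead of object storage, which is where the latency win comes from.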

(Re: Netflix, in qaq's other comment, I don't have a hard number for this, but I thought a typical AWS data center is only under 20-30% load at any given time.)



