
Do you think it's a cost/control decision not to buy big iron (outsourcing to a degree) vs. building/scaling themselves? I'd like to hear some HN thoughts if people don't mind sharing.


"Big iron" isn't all it's cracked up to be. Everything is a trade-off. Very few people are doing pure computation and that is where those machines excel (in addition to lots of aggregate I/O). The government research labs and the like get a lot of use from these machines.

If you're trying to scale an Internet-style app on one of these machines, you might need to expand past one machine after a while. By staying on one machine, you avoid all the complexity needed in your software to coordinate between multiple machines, but the moment you no longer fit on a single box, you have to add that complexity anyway. So what do 10 beefy boxes buy you as opposed to 1,000 smaller ones? There is of course an operational/DC/power cost to having more boxes, but I think most shops consider that an easily solvable problem. For example, a maxed-out POWER7 box from IBM will give you 256 processors and all the memory and I/O trimmings you need; if you need more than 256 processors or more than the local RAM, you pay the software complexity cost anyway.


Well, the 10 beefy boxes will be much, much faster if your problem is not very distributable. Facebook, say, shards very easily as an application, because most users don't interact much with each other. Other applications might have far more interaction between users.

What you're really paying for when buying a 256-processor POWER7 box is the fact that the interconnect (and therefore the time to acquire a lock or update data from another node) is much faster and more reliable than commodity networks/kernels/stacks.
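
To make the sharding point concrete, here is a toy sketch (C++; the names and shard count are mine, not any real system's). Almost every Facebook-style query is keyed by a single user id, so routing is one hash; only cross-user operations have to pay the interconnect cost described above.

    #include <cstdint>

    // Hypothetical shard router for a user-keyed workload.
    constexpr int kNumShards = 1000;

    // A single-user read/write touches exactly one shard: cheap even on
    // commodity networks, with no cross-node locking needed.
    int shard_for_user(std::uint64_t user_id) {
        return static_cast<int>(user_id % kNumShards);
    }

    // A cross-user operation (e.g. mutual friends) fans out to many
    // shards; that is where a fast, reliable interconnect -- or one big
    // shared-memory box -- starts to earn its price.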


Depends on what you are programming in. If it's a language far removed from the machine, your mileage may vary.

I had the opportunity to try out Google's implementation of MapReduce, written in C++, way back in time (6 years ago). Those jobs ran on fairly impoverished, essentially laptop-grade processors. I have done stuff on Yahoo's Hadoop setup as well; those used high-end multicore machines provisioned with oodles of RAM (I don't think I should share more than that). Being generous, Hadoop ran 4 times slower as measured by wall-clock time. Not only that, Hadoop required about 4 times more memory for similar-sized jobs. So you ended up requiring more RAM, running longer, and potentially burning more electricity. This is by no means a benchmark or anything like that, just an anecdote.
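
For readers who have used neither system, a job of the shape I am describing is roughly the word count example from the MapReduce paper. Here is a self-contained, in-process sketch of that shape (my own code, not Google's or Hadoop's actual API):

    #include <map>
    #include <sstream>
    #include <string>
    #include <utility>
    #include <vector>

    // Map step: emit a (word, 1) pair for every word in a document.
    std::vector<std::pair<std::string, int>> map_doc(const std::string& doc) {
        std::vector<std::pair<std::string, int>> out;
        std::istringstream in(doc);
        std::string word;
        while (in >> word) out.emplace_back(word, 1);
        return out;
    }

    // Reduce step: sum the counts for each key. The real systems shuffle
    // these pairs across machines between the two steps; that shuffle is
    // where the RAM and wall-clock differences show up.
    std::map<std::string, int> reduce_counts(
            const std::vector<std::pair<std::string, int>>& pairs) {
        std::map<std::string, int> counts;
        for (const auto& kv : pairs) counts[kv.first] += kv.second;
        return counts;
    }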

That Hadoop would require much more memory did not surprise me; that was expected. What was really surprising was that it was so much slower. The JVM is one of the best-optimized virtual machines out there, but its view of the processor is very antiquated, and it does not surface hardware-level advances to the programmer. You pay for a hot-rod machine but run it like an old faithful Crown Victoria.
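
To illustrate what "not surfacing hardware-level advances" meant in practice, here is the kind of thing C++ code could ask for explicitly (my illustration, using SSE intrinsics; the Java of that era had no equivalent, so you were left hoping the JIT auto-vectorized):

    #include <immintrin.h>
    #include <cstddef>

    // Sum floats four at a time using the x86 vector unit.
    float sum_sse(const float* data, std::size_t n) {
        __m128 acc = _mm_setzero_ps();
        std::size_t i = 0;
        for (; i + 4 <= n; i += 4)
            acc = _mm_add_ps(acc, _mm_loadu_ps(data + i));
        float lanes[4];
        _mm_storeu_ps(lanes, acc);               // spill the four lanes
        float total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
        for (; i < n; ++i) total += data[i];     // scalar tail
        return total;
    }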

Four times might not seem like much (and for one thing, I am being generous), but it makes a big difference when you can make multiple runs through the data in a single day and iterate on the code/model. Debugging and ironing out issues is a lot more efficient.

I think Google's in-house MapReduce gave it a significant competitive advantage over everyone else on Hadoop, and probably still does.


The interconnect may be faster, but as a whole system it is hard to compete with the raw speed of an x64 box with the latest/greatest chipsets. You usually wind up having to write non-portable code to eke full performance out of the massive box, and in the end your apps will probably still be faster on x64. These machines are best suited for massively parallel computation where you aren't afraid of getting down to the metal and taking advantage of the special chip instructions in asm. (Or, alternatively, you want POWER specifically because it has hardware DFP support.) The total gain from running on x64 will most likely exceed any loss from a network hop in a case where both have to go off to a SAN for their data.


Yes. Facebook's "cost of revenue" (which they state is mostly infrastructure) was $1.875 billion in 2013, a year when they made $1.5 billion in net income. For comparison, research and development was $1.4 billion.

Facebook's business model involves getting 1 billion people to post a ton of stuff inside Facebook, costing them about $2/user/year in infrastructure and $3.50/user/year in other costs, while making about $7/user/year in advertising revenue, yielding about $1.50/user/year in profit. So cutting costs on that $2 makes them significantly more profitable.
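
To see why that $2 matters, run the per-user/year numbers; the halved-infrastructure scenario below is my own illustration, not Facebook's:

    #include <cstdio>

    // Back-of-envelope per-user/year economics from the figures above.
    int main() {
        double revenue = 7.00, infra = 2.00, other = 3.50;
        std::printf("profit today:      $%.2f\n", revenue - infra - other);
        // Hypothetical: halving infrastructure lifts profit by ~67%.
        std::printf("with infra halved: $%.2f\n", revenue - infra / 2 - other);
        return 0;
    }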


If you're a pure technology company, like Google and Facebook, you're not going to outsource your core competency. It's not an issue of costs; buying "big iron", even if they could, would be akin to style drift for an investment manager.


You could definitely still do a lot less in-house than FB does, and be successful. FB seems to delight in building tools and infrastructure.

Bloomberg is probably a better example of a company that builds "optional" technology in-house just to be awesome, though -- they're not at the scale of FB (where "traditional" solutions break down), but from what I've seen they do a lot of interesting work in-house because their staff want to do it, and because it lets them keep really top-quality staff in a highly competitive market.


> they're not at the scale of FB (where "traditional" solutions break down)

I'm not so sure about that. Bloomberg processes an incredible amount of data, and they have strict latency requirements. In many cases, traditional solutions would in fact break down under those requirements.


I work on the infrastructure team at Bloomberg. There are lots of problems solved by OSS, but there are also lots of pieces of infrastructure we have to build ourselves to scale the things we need to. Latency is killer, indeed. (Low-latency data aggregation/generation/distribution is only one part of the business, though.)



