> We’re growing at a rate of over 400 new users and 1000 new repositories every day and these rates are only increasing with time
Crazy. Interesting how the repo growth is faster than the new user growth. That's a lot of projects :)
The point-by-point listing of requirements in this post almost reads like an outline of what EY cannot do. I know that the split is supposedly amicable (on the surface), but this doesn't look so great for EY.
Also interesting is how the requirements of a given application gradually shatter through various ceilings of performance requirements, one notable one here being the following:
> The benefits of running bare metal are obvious and have been empirically proven. We need to have the option to run bare metal when it is appropriate to the task at hand
The split probably was amicable, frankly. No one disputes that Rackspace can do things that EY can't at this point. People outgrow things all the time. You start just accepting PayPal, and then you add credit cards when enough users want that. You start with an AIR app, but go to dedicated Windows/Mac/Linux apps when you have enough users on each that are worth it.
There's been nothing that I've read - anywhere - that would suggest that someone else shouldn't use EY for their Rails app.
It's probably coincidence, but I noticed that all of those other beginning technologies you mention are cheaper than the next step up. Not so with Engine Yard.
It's like buying a Porsche, realizing you need more seats, and then switching to a nice, reliable, high-end Civic.
Yes, and I was told by someone that once worked for EY that their price was the only selling point. In other words, their product wasn't better, but their price made a lot of folks believe that it was.
If they hadn't charged more than everyone else and hired Ezra and a few other community luminaries, they'd be just another niche hosting company.
Wait, Rackspace is more expensive than their Engine Yard arrangement, which was free.
And, actually, processing credit cards yourself is cheaper than PayPal, but there's a setup step that takes time and has up-front costs.
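To make that concrete, here's a rough break-even sketch; the fee percentages and the monthly merchant-account fee below are placeholder numbers for illustration, not actual PayPal or gateway pricing.

```python
# Rough break-even sketch for "own merchant account vs. PayPal".
# All fee figures are hypothetical placeholders, not real pricing.
PAYPAL_PCT = 0.029            # assumed per-transaction percentage via PayPal
MERCHANT_PCT = 0.022          # assumed lower rate with your own merchant account
MERCHANT_MONTHLY_FEE = 30.0   # assumed fixed monthly cost (gateway + statement fees)

def monthly_fees(volume, pct, fixed=0.0):
    """Total fees for a month in which `volume` dollars are processed."""
    return volume * pct + fixed

# Find the monthly sales volume where doing it yourself starts winning.
volume = 0
while monthly_fees(volume, MERCHANT_PCT, MERCHANT_MONTHLY_FEE) >= monthly_fees(volume, PAYPAL_PCT):
    volume += 100

print(f"Break-even at roughly ${volume}/month in sales")
# With these made-up numbers: 0.029v = 0.022v + 30  =>  v ~ $4,300/month
```

Below that volume the fixed costs eat you; above it, the lower percentage wins every month.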
And your last analogy is really faulty, Dustin. If you need more seats, your needs have changed. Then you re-evaluate the options based on your new criteria. That's what Github did.
Man, I thought I got through to you during our train ride. :)
You're correct that Rackspace is more expensive than our current arrangement with Engine Yard, but it would have been quite the opposite had we elected to stay.
> Crazy. Interesting how the repo growth is faster than the new user growth. That's a lot of projects :)
Not that surprising. I see a lot of people fork repos only to do nothing with them. It's like forking on github is a 'packrat-ish' form of the 'watching' feature.
I'd be more interested in a number that excluded either all forked repos, or forked repos with no new commits since the fork.
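If anyone wants to eyeball that for a given user, here's a rough sketch against today's GitHub REST API (which postdates this thread); it assumes the `fork`, `created_at`, and `pushed_at` fields on the repo listing, treats "pushed after creation" as a proxy for "new commits since the fork", and ignores pagination for brevity.

```python
import json
import urllib.request

def repo_stats(user):
    """Count a user's repos: total, excluding forks, and excluding 'dead' forks.

    A fork that has been pushed to after it was created presumably has new
    commits; `pushed_at` can be null for an empty repo, so guard against that.
    """
    url = f"https://api.github.com/users/{user}/repos?per_page=100"
    with urllib.request.urlopen(url) as resp:
        repos = json.load(resp)

    total = len(repos)
    non_forks = sum(1 for r in repos if not r["fork"])
    active = sum(
        1 for r in repos
        if not r["fork"] or (r["pushed_at"] or "") > r["created_at"]
    )
    return total, non_forks, active

print(repo_stats("defunkt"))  # (total, excluding forks, excluding dead forks)
```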
That's impressive then. I wonder if there are any stats on the size of these repos (i.e. are people creating new repos for everything -- even tiny one-off scripts -- rather than just rolling the smaller stuff into a single repo?)
RedHat GFS. It works great for low and medium IO situations, but can cause sporadic issues on high IO installations like ours, and we aren't able to attach any more servers to it without further impacting performance.
Given our experience with GlusterFS, I imagine Github would have crashed and burned a long time ago if they were using it. Our use case (checking whether a file exists and grabbing images to occasionally resize them) ran into all sorts of bizarre issues. We had about 1 million small files.
"If you want something done right, do it yourself." still applies to deployment, apparently. I wonder if/when the "cloud" will grow up. Any bets on how many years until rolling your own deployment infrastructure seems silly?
Hiring Rackspace to handle ops isn't exactly "doing it yourself," either. It's just paying for a custom solution instead of using the generic one-size-fits-all "cloud."
That is far from true. There are lots of things to do to get a site up and running that don't require managing infrastructure.
If you spend time "doing it yourself," that's time and money you're not spending on other parts of the business or the site. There is a scalability path that includes cloud computing.
If you have a large complex site like GitHub with an architecture that isn't conveniently accommodated by your cloud host and you can do it yourself, then it may be better.
It's a lot cheaper to run on your own hardware, but maybe there's a scale issue here as well -- if you're big enough you might get a cheaper price per unit.
Still not always true. It's not a lot cheaper to run on your own hardware. Have you factored in shipping costs, depreciation, infrastructure management time, distractions, opportunity costs, colocation costs, and interest?
These things add up.
Imagine comparing a $10/mo hosting account with a colocated server for $100/mo.
There are a plethora of scenarios that don't warrant owning or running your own hardware.
I obviously don't have numbers for Rackspace's dedicated servers, but their 15.5GiB slice is $800 a month. Are they going to charge less for dedicated than for virtual hosts? You can get a server with 8 cores and 16GiB RAM for under $2K from places like serversdirect.com, or by building from parts. I build from parts, and generally end up paying $1500 for a 32GiB RAM box. Co-location is another $150/month or so. (My costs are closer to $75/month for a dual-socket box, but I have two full racks.)
Unless you screwed up assembling the thing, the chances of something other than a drive failing are pretty small. I mean, not zero; don't count on hardware... I'm just saying, even if you have to get a $150/hr guy in to fix your bad hardware, that just doesn't happen very often. Way less than once per server. Way less than once per every 10 servers (assuming you didn't screw up the assembly, and that you rotate the server out after 3 years).
So yeah, there are many situations where it makes sense to rent, especially if you can't use 8 cores / 32GiB RAM of capacity (go virtual until you need that capacity, I say). But if you use a whole server, the cost advantages of owning the hardware yourself are overwhelming. Sure, there are reasons to rent... I'm just saying, owning ends up costing you a /lot/ less money.
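To put rough numbers on that, here's a back-of-the-envelope comparison using the figures above; the repair allowance is an assumption (one $150/hr visit over the box's life), and a self-built 32GiB box vs. a 15.5GiB slice isn't a perfectly fair match-up.

```python
# Back-of-the-envelope own-vs-rent comparison using the figures quoted above.
# The repair line is a guess (one ~2-hour visit at $150/hr over the box's life);
# it likely overstates how often hardware actually fails.
MONTHS = 36                   # rotate the box out after 3 years, as above

hardware = 1500               # self-built 32GiB RAM box
colo_per_month = 150          # quoted colo cost for a single box
assumed_repair = 150 * 2      # hypothetical one repair visit over the box's life

owned_monthly = (hardware + assumed_repair) / MONTHS + colo_per_month
rented_monthly = 800          # Rackspace 15.5GiB slice quoted above

print(f"owned : ~${owned_monthly:.0f}/month")    # ~$200/month
print(f"rented: ~${rented_monthly}/month")
print(f"3-year difference: ~${(rented_monthly - owned_monthly) * MONTHS:,.0f}")
```

Even with generous allowances for breakage, owning comes out around a quarter of the rented price per month over the box's life.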
Yep, my experience with EC2 is that it gets pricey fast, and that the slower instances aren't really that speedy. The sweet spot is probably where you don't need critical performance (unlike, say, a database box that needs a really fast RAID array).
My solution is to run most stuff in a colo on Dell boxes running the linux-vserver kernel patch, sort of a cheap man's cloud solution.
Don't worry, I'll be making some technical blog posts down the road explaining in excruciating detail exactly how I've done the federated architecture.