Monday, February 05, 2007

scaling rails

I've said it before, but it bears repeating: There's nothing interesting about how Ruby on Rails scales. We've gone the easy route and merely followed what makes Yahoo!, LiveJournal, and other high-profile LAMP stacks scale high and mighty.

Take state out of the application servers and push it to database/memcached/shared network drive (that's the whole Shared Nothing thang). Use load balancers between your tiers, so you have load balancers -> web servers -> load balancers -> app servers -> load balancers -> database/memcached/shared network drive servers. (Past the entry point, load balancers can just be software, like haproxy).

In a setup like that, you can add almost any number of web and app servers without changing a thing.

Scaling the database is the "hard part", but still a solved problem. Once you get beyond what can be easily managed by a descent master-slave setup (and that'll probably take millions and millions of pageviews per day), you start doing partitioning.

Users 1-100K on cluster A, 100K-200K on cluster B, and so on. But again, this is nothing new. LiveJournal scales like that. I hear eBay too. And probably everyone else that has to deal with huge numbers.

So the scaling part is solved. What's left is judging whether the economics of it are sensible to you. And that's really a performance issue, not a scalability one.

If your app server costs $500 per month (like our dual xeons does) and can drive 30 requests/second on Rails and 60 requests/second on Java/PHP/.NET/whatever (these are totally arbitrary numbers pulled out of my...), then you're faced with the cost of $500 for 2.6 million requests/day on the Rails setup and $250 for the same on the other one.

Now. How much is productivity worth to you? Let's just take a $60K/year programmer. That's $5K/month. If you need to handle 5 million requests/day, your programmer needs to be 10% more productive on Rails to make it even. If he's 15% more productive, you're up $250. And this is not even considering the joy and happiness programmers derive from working with more productive tools (nor that people have claimed to be many times more productive).

Of course, the silly math above hinges on the assumption that the whatever stack is twice as fast as Rails. That’s a very big if. And totally dependent on the application, the people, and so on. Some have found Rails to be as fast or faster than comparable “best-of-breed J2EE stacks” — see http://weblog.rubyonrails.com/archives/2005/04/04/justingehtland-is-back-with-numbers-to-back-it-up/

The point is that the cost per request is plummeting, but the cost of programming is not. Thus, we have to find ways to trade efficiency in the runtime for efficiency in the “thought time” in order to make the development of applications cheaper. I believed we’ve long since entered an age where simplicity of development and maintenance is where the real value lies.
David Heinemeier Hansson
Tuesday, July 12, 2005

No comments: