-
Speeding up a Rails request by 150ms by changing 1 line
2009-05-08 07:59 in /tech/ruby
We’re pretty obsessed with performance at Gilt Groupe. You can get a taste for what we’re dealing with, and how we’re dealing with it, from our recent presentation at RailsConf.
One of the techniques we’re using is to precompute what certain high-volume pages will look like at a given time in the future, and store the result as static HTML that we serve to the actual users at that time. For ease of initial development, and because there’s still a fair bit of business logic involved in determining which version of a particular page to serve, this was done inside our normal controller actions which look for a static file to serve, before falling back to generating it dynamically.
We’re now running on Rails 2.3 and, of course, Rails Metal is the new hotness in 2.3. I spent the last couple days looking into how much improvement in static file serving we would see by moving it into the Metal layer. Based on most of what I’ve read, I expected we might shave off a couple milliseconds. This expectation turned out to be dramatically wrong.
Metal components operate outside the realm of the usual Rails timing and logging components, so you don’t get any internal measurements of page performance. Instead, I fired up ab to measure the serving times externally. What I found for the page I was benchmarking was that the Metal implementation took about 5ms. The old controller action took 170ms. But, wait... the Rails logs were only reporting 8ms for that action. Something was fishy.
I started inserting timers at various places in the Rails stack, trying to figure out where the other 160ms was going. A little bit was routing logic and other miscellaneous overhead, but even setting a timer around the very entry points into the Rails request serving path, I was only seeing 15ms being spent. This was getting really puzzling, because at this point where a Rack response is returned to the web server, I expected things to look identical between Metal and ActionController. However, looking more closely at the response objects I discovered the critical difference. The Metal response returns an
[String]
, while the controller returned an ActionController::Response.I went into the Rails source and found the
each
method for ActionController::Response. Here it is:def each(&callback) if @body.respond_to?(:call) @writer = lambda { |x| callback.call(x) } @body.call(self, self) elsif @body.is_a?(String) @body.each_line(&callback) else @body.each(&callback) end @writer = callback @block.call(self) if @block end
The critical line is the case where the body is a String. The code iterates over each line in the response. Each line is written individually to the network socket. In the case of the particular page I was looking at, that was 1300 writes. Ouch.
To confirm this was the problem, I changed that line to
yield @body
With the whole body being sent in a single write, ab reported 15ms per request, right in line with what I measured inside Rails.
1 line changed. 150ms gained. Not too bad.
This sort of performance pessimization we uncovered is particularly insidious because it’s completely invisible to all the usual Rails monitoring tools. It doesn’t show up in your logged response time; you won’t see it in NewRelic or TuneUp. The only way you’re going to find out about it is by running an external benchmarking tool. Of course, this is always a good idea, but it’s easy to forget to do it, because the tools that work inside the Rails ecosystem are so nice. But the lesson here is, if you’re working on performance optimizations, make sure to always get a second opinion.
-
A Rails 2.3 Performance Gotcha
2009-04-27 14:30 in /tech/ruby
Recently we upgraded to Rails 2.3 at work. Upon doing so, we saw our application take an aggregate performance hit of about 50%. Spending a little quality time with ruby-prof, one thing that stood out as peculiar was an awful lot of time being spent creating and tending to Ruby Thread objects. Tracing up the call stack, these were coming from inside the memcached client.
Buried in the Rails 2.3 release notes is the innocuous looking statement:
The bundled memcached client has been updated to version 1.6.4.99.
What this fails to tell you is that this upgrade added a timeout option to the client which, as the RDoc will happily inform you, “is a major performance penalty in Ruby 1.8”. Moreover, the default value for the timeout is 0.5s, rather than nil.
Our application makes heavy use of caching, so we were getting massacred by this. To add insult to injury, our most heavily optimized pages were hit the hardest, since they used memcached the most. Adding
:timeout => nil
to our cache layer immediately restored the previous performance.Lessons from this story: if you’re a Rails developer using memcached, go add that option to your code; if you’re a library author adding an option with significant performance costs, default to the old behavior; if you’re on the Ruby core team, please add an interface to
select
so that people don’t do silly things like use Threads to implement timeouts on socket reads.