For Rails 3, we wanted to take the performance optimizations in Merb and bring them over to Rails. In this post, I’ll talk about just a few of the performance optimizations we’ve added to Rails 3: reducing general controller overhead and (greatly) speeding up rendering a collection of partials.
For our initial performance work, we focused on a few specific but commonly used parts of Rails:
- General overhead (the router plus the cost of getting in and out of a controller)
- render :text
- render :template
- render :partial
- rendering a number (10 and 100) of the same partials in a loop
- rendering a number (10 and 100) of the same partials via the collection feature
This was definitely a limited evaluation, but it covered most of the cases where performance might be at a premium and the Rails developer was unable to do anything about it.
General Controller Overhead
The first thing was improving the general overhead of a Rails controller. Rails 2.3 doesn’t have any way to test this, because you’re forced to use render :string to send back text to the client, which implicates the render pipeline. Still, we wanted to reduce it as much as possible.
When doing this work, we used Stefan Kaes’ fork of ruby-prof that comes with the CallStackPrinter (the best way I’ve ever seen to visualize profile data from a Ruby application). We also wrote a number of benchmarks that could double as profile runs if I wanted to zero in and get more precise data.
When we looked at overhead, it was dominated by setting the response. Digging a bit deeper, it turned out that ActionController was setting headers directly, which then needed to be re-parsed before returning the response to get additional information. A good example of this phenomenon was in the Content-Type header, which had two components (the content-type itself and an optional charset). The two components were available on the Response object as getters and setters:

    def content_type=(mime_type)
      self.headers["Content-Type"] =
        if mime_type =~ /charset/ || (c = charset).nil?
          mime_type.to_s
        else
          "#{mime_type}; charset=#{c}"
        end
    end

    # Returns the response's content MIME type, or nil if content type has been set.
    def content_type
      content_type = String(headers["Content-Type"] || headers["type"]).split(";")[0]
      content_type.blank? ? nil : content_type
    end

    # Set the charset of the Content-Type header. Set to nil to remove it.
    # If no content type is set, it defaults to HTML.
    def charset=(charset)
      headers["Content-Type"] =
        if charset
          "#{content_type || Mime::HTML}; charset=#{charset}"
        else
          content_type || Mime::HTML.to_s
        end
    end

    def charset
      charset = String(headers["Content-Type"] || headers["type"]).split(";")[1]
      charset.blank? ? nil : charset.strip.split("=")[1]
    end

As you can see, the Response object was working directly against the Content-Type header, and parsing out the part of the header as needed. This was especially problematic because as part of preparing the response to be sent back to the client, the Response did additional work on the headers:

    def assign_default_content_type_and_charset!
      self.content_type ||= Mime::HTML
      self.charset ||= default_charset unless sending_file?
    end

So before sending the response, Rails was once again splitting the Content-Type header on the semicolon, and then doing some more String work to put it back together again. And of course, Response#content_type= was used in other parts of Rails, so that it was correctly set based on the template type or via respond_to blocks.
This was not costing hundreds of milliseconds per request, but in applications that are extremely cache-heavy, the overhead cost could be larger than the cost of pulling something out of cache and returning it to the client.
The solution in this case was to store the content type and charset in instance variables in the response, and merge them in a quick, simple operation when preparing the response.

    attr_accessor :charset, :content_type

    def assign_default_content_type_and_charset!
      return if headers[CONTENT_TYPE].present?

      @content_type ||= Mime::HTML
      @charset      ||= self.class.default_charset

      type = @content_type.to_s.dup
      type << "; charset=#{@charset}" unless @sending_file

      headers[CONTENT_TYPE] = type
    end

So now, we’re just looking up instance variables and creating a single String. A number of changes along these lines got overhead down from about 400usec to 100usec. Again, not a huge amount of time, but it could really add up in performance-sensitive applications.
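To make the contrast concrete, here is a minimal, self-contained sketch of the two approaches side by side. The class names HeaderParsingResponse and IvarResponse are hypothetical and are not part of Rails, the prepare! method stands in for assign_default_content_type_and_charset!, and the "text/html" and "utf-8" defaults are assumptions standing in for Mime::HTML and default_charset. It illustrates only the idea of keeping the two components in instance variables and building the header string once.

```ruby
# Hypothetical illustration only; neither class exists in Rails.

# Old style: the header string is the source of truth, so every read
# splits it and every write rebuilds it.
class HeaderParsingResponse
  attr_reader :headers

  def initialize
    @headers = {}
  end

  def content_type
    type = headers["Content-Type"].to_s.split(";")[0]
    type.nil? || type.empty? ? nil : type
  end

  def content_type=(mime_type)
    c = charset
    headers["Content-Type"] = c ? "#{mime_type}; charset=#{c}" : mime_type.to_s
  end

  def charset
    part = headers["Content-Type"].to_s.split(";")[1]
    part.nil? || part.strip.empty? ? nil : part.strip.split("=")[1]
  end

  def charset=(value)
    type = content_type || "text/html"
    headers["Content-Type"] = value ? "#{type}; charset=#{value}" : type
  end

  # Parses the header again just to fill in the defaults.
  def prepare!
    self.content_type ||= "text/html"
    self.charset ||= "utf-8"
    headers
  end
end

# New style: plain instance variables, merged into one String at the end.
class IvarResponse
  CONTENT_TYPE = "Content-Type".freeze

  attr_accessor :charset, :content_type
  attr_reader :headers

  def initialize
    @headers = {}
  end

  def prepare!
    return headers if headers.key?(CONTENT_TYPE)

    type = (@content_type || "text/html").to_s.dup
    type << "; charset=#{@charset || 'utf-8'}"
    headers[CONTENT_TYPE] = type
    headers
  end
end

old_response = HeaderParsingResponse.new
old_response.content_type = "application/json"

new_response = IvarResponse.new
new_response.content_type = "application/json"

puts old_response.prepare!["Content-Type"] # application/json; charset=utf-8
puts new_response.prepare!["Content-Type"] # application/json; charset=utf-8
```

Both objects end up with the same header, but the second one never splits a String to get there.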
Render Collections of Partials
Rendering collections of partials presented another good opportunity for optimization. And this time, the improvement ranked in milliseconds not microseconds!
First, here was the Rails 2.3 implementation:

    def render_partial_collection(options = {}) #:nodoc:
      return nil if options[:collection].blank?

      partial = options[:partial]
      spacer = options[:spacer_template] ? render(:partial => options[:spacer_template]) : ''
      local_assigns = options[:locals] ? options[:locals].clone : {}
      as = options[:as]

      index = 0
      options[:collection].map do |object|
        _partial_path ||= partial ||
          ActionController::RecordIdentifier.partial_path(object, controller.class.controller_path)
        template = _pick_partial_template(_partial_path)
        local_assigns[template.counter_name] = index
        result = template.render_partial(self, object, local_assigns.dup, as)
        index += 1
        result
      end.join(spacer).html_safe!
    end

The important part here is what happened inside the loop, which could occur hundreds of times in a large collection of partials. Here, Merb had a higher-performance implementation which we were able to bring over to Rails. This is the Merb implementation:

    with = [opts.delete(:with)].flatten
    as = (opts.delete(:as) || template.match(%r[(?:.*/)?_([^\./]*)])[1]).to_sym

    # Ensure that as is in the locals hash even if it isn't passed in here
    # so that it's included in the preamble.
    locals = opts.merge(:collection_index => -1, :collection_size => with.size, as => opts[as])
    template_method, template_location = _template_for(
      template,
      opts.delete(:format) || content_type,
      kontroller,
      template_path,
      locals.keys)

    # this handles an edge-case where the name of the partial is _foo.* and your opts
    # have :foo as a key.
    named_local = opts.key?(as)

    sent_template = with.map do |temp|
      locals[as] = temp unless named_local

      if template_method && self.respond_to?(template_method)
        locals[:collection_index] += 1
        send(template_method, locals)
      else
        raise TemplateNotFound, "Could not find template at #{template_location}.*"
      end
    end.join

    sent_template

Now this wasn't perfect by a long shot. There was a lot going on here (and I'd personally like to have seen the method refactored). But the interesting part is what happened inside the loop (starting from sent_template = with.map). Unlike ActionView, which figured out the name of the template, got the template object, got the counter name, and so on, Merb limited the activity inside the loop to setting a couple of Hash values and calling a method.
For a collection of 100 partials, this could be the difference between overhead of around 10ms and overhead of around 3ms. For a collection of small partials, this could be significant (and a reason to inline partials that were appropriate to be partials in the first place).
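The difference between the two loops can be shown with a small, self-contained sketch. Everything here is hypothetical: CountingLookup, render_each_time and render_hoisted are illustrative names that appear in neither Rails nor Merb, and the lookup counter is a stand-in for the real cost of resolving a template. The point is only that the per-item work shrinks when the loop-invariant steps move out of the loop.

```ruby
# Hypothetical stand-ins; none of these names come from Rails or Merb.
class CountingLookup
  attr_reader :lookups

  def initialize
    @lookups = 0
    @templates = { "post" => lambda { |locals| "<li>#{locals[:post]}</li>" } }
  end

  # Pretend this is the expensive part: resolving a name to a template.
  def find(name)
    @lookups += 1
    @templates.fetch(name)
  end
end

# Rails 2.3 style: resolve the template and build fresh locals per item.
def render_each_time(store, name, collection)
  collection.each_with_index.map do |object, index|
    template = store.find(name)
    locals = { name.to_sym => object, :"#{name}_counter" => index }
    template.call(locals)
  end.join
end

# Merb style: resolve once, then only set Hash values and call in the loop.
def render_hoisted(store, name, collection)
  template = store.find(name)
  as = name.to_sym
  counter = :"#{name}_counter"
  locals = { counter => -1 }

  collection.map do |object|
    locals[counter] += 1
    locals[as] = object
    template.call(locals)
  end.join
end

slow_store = CountingLookup.new
fast_store = CountingLookup.new
items = %w[a b c]

puts render_each_time(slow_store, "post", items) # <li>a</li><li>b</li><li>c</li>
puts render_hoisted(fast_store, "post", items)   # <li>a</li><li>b</li><li>c</li>
puts slow_store.lookups # 3
puts fast_store.lookups # 1
```

The output is identical; only the number of lookups (and Hash allocations) per item changes.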
In Rails 3, we've improved performance by reducing what happens inside the loop. Unfortunately, there was a specific feature of Rails that made it a bit harder to optimize this generically. Specifically, you could render a partial with a heterogeneous collection (a collection containing Post, Article and Page objects, for instance) and Rails would render the correct template for each object (Article objects render _article.html.erb, etc.). This means that it was not always possible to determine the template to render up front.
In order to deal with this problem, we haven't been able to optimize the heterogeneous case completely, but we have made render :partial => "name", :collection => @array faster. In order to achieve this, we split the code paths, with a fast path for when we knew the template, and a slow path for where it had to be determined based on the object.
So now, here's what rendering a collection looks like, when we know the template:

    def collection_with_template(template = @template)
      segments, locals, as = [], @locals, @options[:as] || template.variable_name

      counter_name = template.counter_name
      locals[counter_name] = -1

      @collection.each do |object|
        locals[counter_name] += 1
        locals[as] = object

        segments << template.render(@view, locals)
      end

      @template = template
      segments
    end

Importantly, the loop is now tiny (even simpler than what happened in Merb inside the loop). Something else worth mentioning is that in improving the performance of this code, we created a PartialRenderer object to track state. Even though you might expect that creating a new object would be expensive, it turns out that object allocations are relatively cheap in Ruby, and objects can provide opportunities for caching that are more difficult in procedural code.
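The split between the two code paths can be sketched in a few lines. This is not the actual PartialRenderer: TEMPLATES, fast_path, slow_path, template_name_for and render_collection are hypothetical names, templates are modeled as lambdas, and the counter local is left out to keep the example short. It only shows the shape of the decision: resolve the template once when the caller names it, and fall back to a per-object lookup when the collection may be mixed.

```ruby
# Hypothetical sketch; every name below is illustrative, not Rails API.
Post    = Struct.new(:title)
Article = Struct.new(:title)

TEMPLATES = {
  "post"    => lambda { |locals| "<h2>#{locals[:post].title}</h2>" },
  "article" => lambda { |locals| "<h1>#{locals[:article].title}</h1>" }
}

# Derive a template name from the object's class (Post -> "post").
def template_name_for(object)
  object.class.name.split("::").last.downcase
end

# Fast path: the template is known up front, so the loop only sets a
# Hash value and calls the template.
def fast_path(collection, name)
  template = TEMPLATES.fetch(name)
  as = name.to_sym
  locals = {}

  collection.map do |object|
    locals[as] = object
    template.call(locals)
  end
end

# Slow path: the template depends on each object, so it is resolved
# inside the loop.
def slow_path(collection)
  collection.map do |object|
    name = template_name_for(object)
    TEMPLATES.fetch(name).call(name.to_sym => object)
  end
end

def render_collection(collection, name = nil)
  segments = name ? fast_path(collection, name) : slow_path(collection)
  segments.join
end

puts render_collection([Post.new("a"), Post.new("b")], "post")
# <h2>a</h2><h2>b</h2>
puts render_collection([Post.new("a"), Article.new("b")])
# <h2>a</h2><h1>b</h1>
```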
For those of you who want to see the improvements in pictures, here are a few things to look at: first, we have the improvement between Rails 2.3 and Rails 3 edge on Ruby 1.9 (smaller is faster).

And here it is for more expensive operations:

Last we've got a comparison of Rails 3 across four implementations (Ruby 1.8, Ruby 1.9, Rubinius, and JRuby):

You can see that Rails 3 is significantly faster than Rails 2.3 across the board, and that all implementations (including Rubinius!) are significantly improved over Ruby 1.8. All in all, a great year for Ruby!
Next post, I'll talk about improvements in the Rails 3 API for plugin authors—keep an eye out, and as always, leave your comments!
Error in the third code block. There are <'s where there should be <
Thanks for pointing it out. Seems to be a bug with our syntax highlighting. I'll see if I can get it taken care of :/
They're some big improvements in Rails3. Nice work, and thanks for taking the time to post.
Re the performance cross-section across Ruby versions – is your Rails1.8 just a distro package?
Er.. *Ruby1.8
I do not get it: what's the point? After one year of work, Rails3 reaches Merb1 speed? What a loss of resources… If RoR3 had been built upon Merb1 instead of RoR2, we could have RoR3 done already.
Your logic is flawed in my perspective.
First of all, what is "RoR3 done" in your opinion?
You're assuming that because they're optimizing Rails you instead could have taken the path where you built around "merb-core" and implemented the "Rails" functionality? I can't imagine this being any more productive than taking the "merb ideology" and then optimizing Rails where needed.
More than not Merb and Rails were becoming competitors and since the community is already at critical mass around Rails both teams made the decision it makes sense to transfer the best practices of Merb into Rails. That means digging into rails and optimizing where it seems fit. This is more practical and a natural cycle in software-development (optimize, refactor)
If you went the other way around and built on top of Merb Core, in my assumption that would require a lot more raw grunt-work as you're not going to simply "port rails" over into Merb. Everything about Rails would probably be looked at and rewritten to be optimized which would take much longer than a year to get done.
By diving into Rails and optimizing with the Merb ideology you get improvements over time without an insanely long wait for essentially an entire rewrite of a massive library.
By RoR3 being "done" I meant the "final release". What is RoR3 status now? Pre-alpha? Alpha? Beta? How far are we to final release? Is there any timeline established? Another year, two, three? No wonder, many Rails programmers I know, are moving to Sinatra or Lift (Scala).
I know it is nothing new, but why, for God's sake, it was decided to transfer "best practices of Merb2 into Rails2"? Everybody knows RoR2 was a mess inside a year ago. And Merb1 was already modular and fast. And this is my point (and I am not the only one): it could be much faster to start with Merb towards RoR3 than starting from RoR2 towards RoR3. Merb1, had much more features we expect to see in RoR, mainly: speed & modularity.
And as far as I remember, RoR3 is not be only "improved Rails2", it will be also Merb2 (at least it was said months ago)
But, now it is too late. Decisions were made. I wish RoR3 good luck, but I do not know if I will find enough patience to wait another year. And Lift is a very tempting alternative… It is faster than Ruby can ever be, and Scala is a more powerful language as well.
I can almost see that vein in your forehead about to burst. Calm down man. The world does not always respect your wishes. What is the point of this? Yehuda and his team have done some excellent work to better a product that thousands of people use and enjoy. That's the point. Nothing is lost. It is *not* "too late" for anything. You have contributions that you would like to make? Fork the project and have a go at it. Just please, quit your complaining.
"Your ideas, feedback and even complaints will be 100% welcome in the future, just as they have been in the past." (Yehuda Katz, 23 xii 2008, http://tinyurl.com/9gsw65)
What's your problem, Edvin? You are not used to listening to different opinions, are you?
You are right my friend. I should not have asked you to refrain from posting your thoughts. It is your right to voice your opinions. I just reacted to the feeling of futility in your post. Honestly though, I'm sure you can help make Merb/Rails 3 a product to be proud of by contributing your effort rather than your general condemnation.
Merb was still relatively new as well. While it might have had a better codebase and performance at the time, there were far more Ruby on Rails applications out there, compared to Merb ones. By expanding upon Rails 3, they made sure that the upgrade path for the majority of the users would be easy. Had they expanded upon Merb instead, many Rails applications would have had to be slowly migrated over, and many plugins/gems would no longer have worked (which many might have been resistant to do given the effort and cost involved, negating the effort put into improving it). And yes, while the same app/plugin/gem breakages are likely true for Rails 2 -> Rails 3 migration, it'll be far easier to fix the issues you find on a framework you're familiar with, than something you've probably never used before.
I remember that it was promised that migration from Merb1 to Merb2/RoR3 would be as simple as possible and similar to migration from RoR2 to RoR3. Has something changed??
Gems? What gems do you mean? Besides few of them they are mostly for Ruby, not for particular web framework.
Plugins? There was always a problem with them in RoR2. I do not even count how many times I could not upgrade a plugin for a newer RoR version. And if I am correct, Merb2/RoR3 will use the Merb philosophy here, so there would be no more place for the old plugins system at all. Merb has no plugins, only gems. And I suppose Merb2/RoR3 will also use gems instead of plugins. Simpler and easier to manage after all.
Re migration: I was referring to the update from Rails 2 -> Rails 3 being easier than if they went with Merb as the base. I'm not sure about the state of Merb 1 -> Rails 3 migration.
Re gems: There aren't a lot I know of, but there are some. I was mainly thinking of ones like New Relic gem. If Merb were the base, I'm guessing they'd have had to rewrite their Rails adapter.
Re plugins: Right, I'm not saying plugins won't break between Rails 2 and Rails 3. But, until the plugin is written to use the new api, the code to make things work again should be easier to patch in than it would be to rewrite entirely (or wait for it to be rewritten) for Merb.
Actually, after one year, we still know very little about RoR3. Is there any list of all features and enhancements we could expect in RoR3? Is there a timeline for alpha, beta, rc and final release? Is there any timeline at all? Will RoR3 keep the old plugins system (with only an improved API) or will it move to pure gems instead (like in Merb)? Will the old promise (that migration from Merb1 to Merb2/RoR3 will be as simple as from RoR2) be kept?
Check one of the dozens of Yehuda's presentations about Rails 3. They are a pretty good source of knowledge. You can also check the documentation, specs and code in the current github version of rails.
I've started writing application in Rails 3 and it basically works. I've managed to rewrite some gems using new Rails 3 modules (ActiveSupport::Callbacks, ActiveModel::Validations and so on) and it's well structured easy to understand code.
If you feel that you know very little, at least *try* to find any information. Merb was also not so well documented and you had to figure out many things from code.
Several of the developers and a few of the community have been making blog posts or tweets about what's been going into Rails 3 for some time now. If you're following the right people, you'll know that the bulk of features are implemented. In no way is the following list exhaustive of what is already implemented in Rails 3:
* Increasing Performance and reducing overhead (this post)
* Separation of responsibility and Dependency declaration (previous post)
* ARel integration with ActiveRecord (no blog posts I know of, but a lot of commits for it recently)
* Rails router DSL improved – http://rizwanreza.com/2009/12/20/revamped-routes-... and http://yehudakatz.com/2009/12/26/the-rails-3-rout...
* Gem bundler (allowing references to Rubygems to be removed) – http://yehudakatz.com/2009/11/03/using-the-new-ge...
* Major cleanup / refactoring of internals (particularly around dispatching/respond_to)
* many many more….
As for timelines, while there isn't a specific one available, mainly because with a project like this it is very hard to accurately predict them, DHH has estimated a beta release by end of January: http://twitter.com/dhh/statuses/7208225785
From what I have seen regarding plugins, Rails 3 will maintain backward compatibility, but includes new features such as offloading the rake task loading to the plugin, and adding support for initializers within the plugin. As an example of this, ActiveRecord is now treated as a plugin of Railties (which should make swapping out the ORM a lot less painful). Example: http://github.com/rails/rails/blob/master/activer... . These added abilities should make development via gems a lot easier.
Great to see that it performs so well on JRuby. Hope that it can help drive adoption for a JVM-based Ruby implementation.
What if you partitioned the @collection by type such that you could determine the template to render with up front (for each segment)? You could loop over the segments and, for each type, run your existing loop to call render(). This would presumably incur an up-front O(n) penalty for the partition() call, but I imagine that being faster than doing the type-to-template conversion for every element in the collection.
I assume you're using a standard Ruby 1.8.7 installation? It would be nice to see how these optimization compare between that and an alternative implementation of the same version, namely Ruby Enterprise Edition version of Ruby 1.8.7. Would you be able to get this information?
Yehuda, it's laudable for the digging into Rails depths and weed these out. You and your team not only merging Rails and Merb, but brings us to a new stage of development and forcing the fuss.
Thanks!
Yehuda, I'm currently switching from Java to Ruby on Rails and from what I can see, these are times of change. I don't want to get too much into rails 2.3 since it's going to drastically change over the next period. What do you suggest for someone that's just arrived? I devoured the Ruby 1.9 book, pretty comfortable with it now. Should I wait for RoR3? When is it going out in a stable version?
i'm no yehuda… but rails is always undergoing change, big or small… rails 1.x.x -> rails 2.x.x -> rails 3.x.x… you should just dive in, rails developers aren't scared of change. we embrace change :)
These are valid questions. If we have to wait it would be good to know at least the RoR3 TIMELINE. Does anybody know anything about it? Me and many of my friends are stuck now. RoR2 is not modular. RoR3 is not ready for serious usage. Merb has a dead homesite and docs. If you want to stick with Ruby maybe Sinatra would be a choice? But, on the other side, Sinatra is not rich in generators, helpers etc… I mentioned Lift, but it also needs the switch from Ruby to Scala (a more powerful but also more complex language). Django? No, it sounds like a heresy. ;)
Jarosław, Rodrigo:
As it has been said many times before, the Rails 3 API will be almost entirely backwards compatible. There are some changes in the public API (like router changes and the new responds format), but a huge amount of work was done *inside* Rails.
It implies two things:
1. You don't have to wait for Rails 3, most of the knowledge learned for rails 2 will be still valid.
2. Almost all the plugins will be broken, so if you start with Rails 3 try to not use big plugins.
So the short answer:
If you are a newbie, start with rails 2.
Jaroslaw, DHH has estimated a beta release by end of January: http://twitter.com/dhh/statuses/7208225785 It may change, but at least it gives some indication of how close Rails 3 is
There is nothing stopping you from using it right now though. Rails master has been stable for some time now. All tests are passing. I've been able to generate and run an application off Rails 3 for some time now.
So rather than saying "RoR3 is not ready for serious usage", say exactly why it isn't ready so developers might be able to address the issues. Statements like that without explanation don't help anyone.
Object allocation may be cheap, but GC actually does take its toll. Any numbers on memory churn comparing Rails 2 and 3?
yehuda – why not simply
template_for = Hash.new{|hash, object| hash.update object.class => compute_template_for(object.class)}
…
@collection.each{|object| template_for[object]….
that way you don't have to do any up front template selection and worst case == best case == one lookup
I second Rodrigo Dellacqua's question.
When will we see a stable release for RoR3?
Rails 3 works well on simple scaffold blog application with ruby 1.9.2 for me, except it works slower than Rails 2 :(
Try this:
1. generate scaffold title:string body:text (sqlite3)
2. run server in production mode
3. ab -n 1000 http://localhost:3000/posts (on 20 records)
I've got about 50 req/s for Rails 2 and only 35 req/s for the same page on Rails 3, which is strange after all that buzz about performance optimizations in Rails 3. It seems all improvements were for ruby 1.8.7 only (haven't checked if Rails 3 is really faster on 1.8.7 though).
Also, in the development environment the performance degradation compared to Rails 2 is even worse :(
Anybody have similar results?
Our app *decreased* in performance by 25% after migrating from Rails 2.3.5 to Rails 3.0.0.beta3. This is a similar reduction to the above, and measured with similar "ab" performance testing methodology.
We're not on Ruby 1.9.2–we're on Ruby 1.8.7 (actually, REE).