Since December, Rails has undergone a fairly significant internal refactoring in quite a number of areas. While it was quite tricky at first, we mere mortals have started to hone a process for diving into a new area of the codebase and emerging some time later with a much improved area that does basically the same thing. Here’s the approach we’ve adopted and advocate:
First, refactoring needs to be refactoring, not revision. By that I mean that while you are in the process of invasively improving some code, it is not the appropriate time to also change the functionality of that code. If you do both at the same time, it will be difficult to track down whether a bug in the code is the result of refactoring or functionality changes.
We’ve held fast to this requirement for the Rails 3 work Carl and I have been doing, which has resulted in an extremely stable edge, despite making fairly invasive changes.
Second, any kind of significant refactoring without tests is folly. The first thing you should do is take a look at the test suite for the area in question and beef it up if necessary.
Thankfully, Rails has a fairly reasonable test suite, and the addition of Sam Ruby’s Agile Web Development on Rails test suite has provided an additional level of confidence in the changes we’re making.
Third, once you’re ready to dive in, read through the code carefully. It can be tempting to just go in and hack away at a particularly egregious part of the codebase, but you’ll frequently be changing code that exists for a reason.
Something I’ve noticed both in Rails applications and in Rails itself is that code that looks very strange at the beginning of a period of refactoring tends to exist for a reason.
Fourth, as you proceed, make very small changes, then run the full test suite after every change. Commit often. What you want to look for is cases where the boundary APIs around the code you’re writing are messy (so you have multiple ways in to a particular class or area of code where one would suffice).
One Rails example would be rendering a template in ActionView from ActionController. When I started in December, ActionController called into ActionView using a number of public and private APIs, so making any changes around those boundaries was very tricky. Some things we wanted to do, like improve the way layouts were selected, was too complex because of the number of ways templates and layouts were rendered.
The very first thing I did in the early days of the merge was work toward reducing the number of ways that ActionController told ActionView to render a template. In the end, we settled on just a single API: render_template_from_controller, which takes a Template object from the template to render, and a Template object for the layout. Once this was done, it became a lot easier to make changes on either side of the boundary, without fear that a small change in ActionView could break any number of things in ActionController.
Of course, this assumes that you understand what your boundaries are. This is something that’s learned over time, but a fundamental requirement in good refactoring is having functionality broken up into units that are easy to understand, with small surface area. This is commonly achieved using classes, which is a good starting point, but Ruby has other tricks up its sleeve as you get more advanced, like judicious use of modules (and the new Rails ActiveSupport::Concern).
Fifth, once you have reasonable boundaries, dive in and start making changes. A pretty good rule of thumb is to clean up cases where a public API has started being used for private, internal use. This might mean that changing the internals of your code breaks the public functionality (which, again, should be sacrosanct during this process). Have a zero-tolerance policy for failing tests as you make small changes, especially as you separate out public and private functionality.
One example of this in Rails was extensive usage of ActionView’s public render method by private functionality. As a result, the public render method had snippets of code inside to handle special cases (like render :file taking a Template object). The solution in this case was to extract out the private functionality, and have the public render method as well as the private internals call the new extracted methods. This ensures that internal functionality is kept internally, where it can be refactored more easily.
Sixth, don’t be afraid to git reset --hard if you find yourself sinking into quicksand, with rising confusion due to changes you made. Over the course of working with Rails, I’ve lost an hour or more at a time to changes made too rapidly and carelessly, and the only advice I can give is to give up on ratholes as early as you notice them.
So that’s it. Six easy steps to refactoring Rails.



Great article, Yehuda. Some points are fairly obvious, but it never hurts to both re-iterate und re-read them. Thanks!
Something I could have been more clear about: If you don't have adequate tests, don't refactor a single line before you do. Period.
Awesome advice, Yehuda. I've definitely found that on they trickiest part step 6 is key, even repeatedly. If it's a really tricky corner of the refactor running through it a couple times gets me a cleaner understanding, and each time my approach improves, until I finally come up with one that doesn't suck.
Great work!
While reading I had a silent thought: "These guys repaired rails."
Thanks very much!
Thanks for writing this great article and thanks for your work on improving rails.
From the article: A pretty good rule of thumb is to clean up cases where a public API has started being used for private, internal use.
If I understand, the suggestion here is not to use the public API anywhere in the internals. This is slightly different than the rule of thumb: don't use a private method as if it were a public API method. Can you elaborate on the benefits of not using a public API method in the internals as opposed to not using a method that is intended to be private as a public API method? Thanks!
@jsd what I've seen time and time again is that innocent use of a public API in a private context eventually made the public interface far more complex than it needed to be. More importantly, it was not quite difficult to determine whether some functionality in a public method was meant to be public, or meant to be internal.
Just to be clear, when I say "public" with regard to Rails, I mean functionality intended to be used by users of the library or framework, not the "public" visibility in Ruby.
Indeed, this article could also be titled "an extremely abridged version of 'Refactoring' by Fowler, Beck, et. al". :P
"We’ve held fast to this requirement for the Rails 3 work Carl and I have been doing, which has resulted in an extremely stable edge, despite making fairly invasive changes."
Did you try both ways? If not, how can you derive this causation?
I've done very invasive changes, while adding features, without even any automated tests, and maintained stable code. It was (as everything else in programming) a matter of being careful, not any particular strategy or methodology.
@alan as a matter of fact, I have tried it both ways. It might be possible for a single, very disciplined, excellent programmer to keep a relatively stable codebase without any tests, but even for such a developer, good tests add value.
The rule you're commenting on actually came directly out of work we did on Merb, as well as an observation about Ruby 1.9. In the case of Merb, our early tests (the 0.3 to 0.5 era) were fairly coupled, and we routinely made refactoring and new-feature changes at the same time. I talked some about the coupling problem at RubyConf last year.
Ruby 1.9 has a similar problem. Because the internals were changed at the same time as language features were added, it's very difficult to determine whether a breaking change is intentional or not. Had they focused on swapping in a new VM (like JRuby and Rubinius did), it's trivial to point at a breaking change and know it's accidental.
The work we're doing on Rails is really just the culmination of a lot of experiences trying different refactoring strategies. Especially when working with members of a team and the open-source community, the quality of the recommendation that refactoring is a separate phase from new features has significant communication benefits that make it worthwhile all on its own.
Step 7: merge yourself with another master developer to form a superpowered human being named Carlhuda and leave a fire trail on your keyboard while hacking
That last point I can definitely vouch for — 'git reset –hard' has come in spades for a parsing project I am working on.
Any of this point not new for me. But anyway, thanks for systematization.
What happened to your RSS feeds?
yomero — we inavertently broke some safari feeds when we migrated to wordpress a few months ago. if you resubscribe to our feedburner feed you'll be cool as a cucumber