<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Engine Yard Blog &#187; Brian Ford</title>
	<atom:link href="http://www.engineyard.com/blog/author/brianford/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.engineyard.com/blog</link>
	<description></description>
	<lastBuildDate>Tue, 07 Feb 2012 01:49:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Rubinius wants to help YOU make Ruby better</title>
		<link>http://www.engineyard.com/blog/2010/rubinius-wants-to-help-you-make-ruby-better/</link>
		<comments>http://www.engineyard.com/blog/2010/rubinius-wants-to-help-you-make-ruby-better/#comments</comments>
		<pubDate>Mon, 30 Aug 2010 16:48:12 +0000</pubDate>
		<dc:creator>Brian Ford</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Rubinius]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=4063</guid>
		<description><![CDATA[<p>It is a great time to be a Rubyist. This year we have already seen IronRuby 1.0 and JRuby 1.5, with Ruby 1.9 due to be released shortly. Ruby is simply becoming better and faster on every platform. And, wherever Ruby is, Rails is sure to be nearby. Rails 3 looks more awesome each day.</p>
<p>Recently, our very own <a href="http://rubini.us">Rubinius</a> officially joined the ranks with a 1.0 release. We are excited to see folks trying it out. All the feedback and issues reported have been a great help. Many people are reporting that their apps "just work".</p>
<p>With all this great news, the Ruby world looks rosy indeed. However, we can make Ruby even better. To do so, we need your help. You may not realize this, but the quality of the Ruby code you write can have a significant impact on how great we can make Ruby. I'd like to share some tips about how you can improve your Ruby code while helping us make Ruby better too.</p>
<h2>0. Rubinius</h2>
<p>Rubinius is a completely new implementation of Ruby. When <a href="http://blog.fallingsnow.net/">Evan Phoenix</a> started Rubinius, he put some stakes in the sand. Rubinius has a modern, bytecode virtual machine, a cutting-edge garbage collector, a just-in-time (JIT) compiler utilizing the awesome <a href="http://llvm.org">LLVM</a> project, and a Ruby core library and bytecode compiler written in Ruby. We are only just getting started with 1.0. We have a whole list of features coming, including support for Windows and Ruby version 1.9, as well as improvements to the JIT compiler that should make Ruby several times faster, and removal of the global interpreter lock (GIL) so that your threads will execute Ruby code concurrently.</p>
<p>Rubinius does a lot of things differently than MRI under the covers. As Rubinius has grown up, we've definitely seen a wide cross-section of Ruby code while working on features and compatibility. The tips for writing better Ruby code below are based on some of the challenges we have faced.</p>
<h2>1. Sending Messages</h2>
<p>Rubinius is unique among the various Ruby implementations in that it implements the Ruby core library primarily in Ruby. Even the primitive methods, operations implemented in C++ that must access the virtual machine directly, appear to other Ruby code as normal Ruby methods. Importantly, calling these primitive methods from Ruby code is like calling any other Ruby method.</p>
<p>Early on in the Rubinius project, a lot of attention was focused on the idea of <em>Ruby in Ruby</em>. This was a good idea for several reasons: Ruby is a more elegant and expressive language than C or Java, and Ruby programmers tend to understand Ruby code pretty well. This familiarity with Ruby makes Rubinius easier to develop and maintain, and more approachable for many Ruby developers. The validity of these reasons has been demonstrated in the life of the project. However, there are two other very important reasons that don't attract quite as much attention.</p>
<p>The first of these is performance. As Evan often points out, Ruby is the currency of the Rubinius VM. It understands Ruby inside and out. The VM knows how to find a Ruby method, how to look up a constant, and what it means for an object to reference another object. The Rubinius VM operates on a special representation of Ruby code. This representation is often referred to as <em>bytecode</em> and is essentially a stream of instructions for the virtual machine. The JIT compiler, which can significantly improve Ruby performance, also operates on bytecode. What this means is that to the JIT, your program and the Ruby core library look an awful lot alike. So much, in fact, that the JIT compiler can mix them all together, which gives the optimizer much greater opportunity to generate <em>really</em> fast code.</p>
<p>The second reason is the consistency and elegance of an object-oriented language. When the Ruby core library is written in Ruby, you call a Ruby method, well, by calling a Ruby method. That may sound redundant, but I assure you, it is not. In MRI, for example, with the Ruby core library written in C, the code will often call directly to a C function rather than dispatching normally through Ruby method calls. What this means for you is that MRI may invoke "Ruby" functionality without engaging you in the conversation at all. That inconsistency may prevent you from using simple and elegant object-oriented code that extends the functionality of core classes.</p>
<p>In contrast, when functionality is invoked through normal Ruby dispatch, your code can be elegant and participate in the process. However, this is a significant double-edged sword, as we have become painfully aware in Rubinius. When we implement all the complex behavior of the core library in Ruby, it's quite possible to do something crazy, like remove all the Ruby methods we need to make an object work! That is pretty crazy, right? Fortunately, in this coding wild west, there is a very important principle that can lend some law and order.</p>
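<p>Here is a minimal sketch of what participating in dispatch looks like. <code>TinyStore</code> and <code>LoggingStore</code> are hypothetical stand-ins, not Rubinius classes; the only point is that <code>update</code> is written in terms of <code>[]=</code>, so a subclass that overrides <code>[]=</code> is consulted automatically.</p>

```ruby
# Hypothetical stand-in for a core class whose methods are written in Ruby.
class TinyStore
  def initialize
    @pairs = {}
  end

  def []=(key, value)
    @pairs[key] = value
  end

  def [](key)
    @pairs[key]
  end

  # update is written in terms of []=, so it dispatches through Ruby.
  def update(other)
    other.each { |key, value| self[key] = value }
    self
  end
end

# A subclass that extends []= automatically takes part in update as well.
class LoggingStore < TinyStore
  attr_reader :log

  def initialize
    super
    @log = []
  end

  def []=(key, value)
    @log << key
    super
  end
end

store = LoggingStore.new
store.update(:a => 1, :b => 2)
store.log  # the overridden []= was called once per key
```

<p>If <code>update</code> wrote to <code>@pairs</code> directly instead, the log would stay empty, which is roughly the situation described above when a C function is called behind your back.</p>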
<h2>2. Liskov Substitution Principle</h2>
<p>You may have heard this term tossed around in discussions. If you haven't, don't worry, we'll delve into this fairly intuitive idea. If you have, I hope to renew your commitment and respect for this principle.</p>
<p>So, what are we talking about here? Barbara Liskov and her collaborators were concerned with how to write reliable object-oriented software. As you know, one of the principal ideas in class-based object-oriented languages is inheritance, or the relationship between a class and its subclasses. What sort of rules should govern this relationship? What should we expect when we use a subtype in place of a supertype in our program? These are the questions that Barbara Liskov and others were pondering.</p>
<p>What they proposed is referred to as the <em>Subtype Requirement</em>, which they defined as:</p>
<blockquote><p><em>Let q(x) be a property provable about objects x of type T. Then q(y) should be true for objects y of type S where S is a subtype of T.</em></p></blockquote>
<p>(see <em>Behavioral Subtyping Using Invariants and Constraints</em>, by Barbara H. Liskov and Jeannette M. Wing.)</p>
<p>Let's consider this in terms of some Ruby code. Suppose you have this class in your program:</p>
<pre escaped="true">  class FancyArray &lt; Array
    def initialize(size)
       # ...
    end
  end</pre>
<p>What is wrong with this picture? Well, in my Ruby code, I can do <code>x = Array.new</code>.  But what happens when I attempt to use the FancyArray class in place of Array? If I do <code>x = FancyArray.new</code>, I will surely get an ArgumentError exception because FancyArray requires that I pass one argument when calling the <em>new</em> method.</p>
<p>Let's phrase this in terms of the <em>Subtype Requirement</em>: Let <em>x</em> be an instance of Array. Then q(x) = <em>the arity of the initialize method is -1</em>. Let <em>y</em> be an instance of FancyArray, which is a subclass of Array. Then q(y) = <em>arity of the initialize method is -1</em> by the <em>Subtype Requirement</em>.</p>
<p>Now let's relate the above to Ruby code and check if the <em>Subtype Requirement</em> holds:</p>
<pre escaped="true">irb(main):001:0&gt; x = Array.instance_method(:initialize).arity
=&gt; -1
irb(main):002:0&gt; y = FancyArray.instance_method(:initialize).arity
=&gt; 1
irb(main):003:0&gt; x == y
=&gt; false</pre>
<p>It is clear from this that FancyArray does not conform to the <em>Subtype Requirement</em>. Consequently, code that expects to use an Array will not function correctly when a FancyArray is substituted. It's important to also note that the <em>Subtype Requirement</em> applies to any observable property of the object. The example used in the paper is of a Stack and Queue. Both classes may provide <em>push</em> and <em>pop</em> methods, but the semantics of the methods are quite different between the two classes.</p>
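<p>For comparison, here is a sketch of a subclass that keeps this particular property intact. <code>PoliteArray</code> is a hypothetical name; it accepts whatever <code>Array#initialize</code> accepts and forwards it unchanged, so the arity check comes out equal.</p>

```ruby
# PoliteArray is a hypothetical name used only for this illustration.
class PoliteArray < Array
  # Accept the same arguments as Array#initialize and pass them along.
  def initialize(*args, &block)
    super
    # any extra setup would go here
  end
end

array_arity  = Array.instance_method(:initialize).arity
polite_arity = PoliteArray.instance_method(:initialize).arity
same_arity   = array_arity == polite_arity

# Both call styles that work for Array keep working for the subclass.
empty  = PoliteArray.new
filled = PoliteArray.new(3, 0)
```

<p>Matching arity is only one observable property, of course. The subclass still has to behave like an Array everywhere else.</p>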
<p>Now, you may say, "But, I have a very good reason for requiring an argument to <em>new</em>." Well then, I would venture to say you have an important reason to consider the difference between composition and inheritance for designing your program.</p>
<h2>3. Composition versus Inheritance</h2>
<p>Of the three object-oriented principles—inheritance, encapsulation, and polymorphism—inheritance has been so abused there could be a 12-step program devoted entirely to it. Fortunately, the remedy for inappropriate use of inheritance is quite simple: compose your objects of other objects.</p>
<p>Inheritance models an <em><strong>is a</strong></em> relationship, while composition models a <em><strong>has a</strong></em> relationship. If your object is a String, then it will do all the normal String things <em>just as a String would do them</em>. This is very important. It needs to do <em>String things</em> not just externally, when you call the methods, but internally, when the other String methods call each other. Is your FancyTemplate class really a String? Then, for example, I should always be able to request its length. However, your FancyTemplate instance probably doesn't have a length when it is being built. Therefore, String methods that may be employed during the construction phase could be highly confused. In such case, I suggest your FancyTemplate <strong>has a</strong> String internally, and it can be urged to give you a representation of that String at some point in time. Yet, it is not a String from the perspective of inheritance and conforming to the <em>Liskov Substitution Principle</em>.</p>
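<p>A minimal sketch of the <em>has a</em> version might look like the following. The method names <code>add</code> and <code>to_s</code> are assumptions made for the example; the point is only that the String lives inside the object and is handed out on request.</p>

```ruby
# A hypothetical FancyTemplate that has a String but is not one.
class FancyTemplate
  def initialize
    @buffer = String.new
  end

  # Collect fragments while the template is being built.
  def add(fragment)
    @buffer << fragment
    self
  end

  # Hand out a real String only when asked for one.
  def to_s
    @buffer.dup
  end
end

template = FancyTemplate.new
template.add("Hello, ").add("Rubinius")
rendered = template.to_s
```

<p>Because <code>FancyTemplate</code> does not inherit from String, no String method can be surprised by a half-built template.</p>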
<p>Only you can tell whether your model is best represented by inheritance or composition. When designing your classes, be sure to consider the view from inside and out. If you are contorting your methods to act like the class you are inheriting from, perhaps your class only <em>has one</em> of those things, rather than <em>being one</em> of them. Most importantly, remember that you are not the only kid on the playground.</p>
<h2>4. Playing Nicely</h2>
<p>This is more about general advice than specific admonitions. We are lucky to have such a powerful, expressive language in Ruby. Opening a core class to patch a method is tremendously useful and powerful. However, remember that with great power, comes great responsibility.</p>
<p>First and foremost, simply be conscious of what you are asking Ruby to do for you. I used this example earlier, and I'm going to repeat it because in Rubinius we have encountered this more times than we can count. Ruby is an object-oriented language. You cause computation to occur by sending messages to an object. <em><strong>How can the object work if it has no methods?</strong></em> (I say with my best Zoolander impersonation). If your code does:</p>
<pre escaped="true">  class SomeClass
    instance_methods(false).each { |m| undef_method m }
  end</pre>
<p><em>you are (most likely) doing it wrong</em>. There are many variations on this theme, but they all share the same problem: the assumption that those methods you are removing are as superfluous as Johnny's appendix. I assure you, we don't randomly add methods to classes in Rubinius. Again, your code may work fine in MRI when you do this because MRI calls C functions on that object behind your back with impunity. But, we do want to have nice things, right? If you ever wonder what consequences your code may have, just drop into the #rubinius channel on freenode. We will happily discuss it with you.</p>
<p>A related problem occurs when code inherits from a core Ruby class and redefines one of the core methods. When the core classes are implemented in Ruby, the methods may depend on one another to perform their tasks. For example, in Hash it would not be entirely unreasonable for <em>each_value</em> to be implemented in terms of <em>each</em>. Well, not unreasonable, that is, until you try to run REXML in the Ruby Standard Library. REXML has an Attributes class that inherits from Hash. The Attributes class then implements an <em>each_attribute</em> method.  For good measure, it overrides <em>each</em> to use <em>each_attribute</em>. And <em>each_attribute</em> calls <em>each_value</em>. <em>Waiter, I believe there's a StackError in my Attributes</em>. The moral of the story: the two edges on this wonderful Ruby sword are sharp. It does take extra work to consider how methods on a particular class interact with one another; to some extent, this is an implementation detail. However, it's something to be aware of when you write code. Of course, you can always browse the Ruby implementation of the core classes in Rubinius.</p>
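<p>The loop is easier to see in a stripped-down sketch. <code>TinyHash</code> and <code>LoopingAttributes</code> are hypothetical stand-ins for a Ruby-implemented Hash and the REXML class; they only mirror the shape of the calls described above.</p>

```ruby
# Hypothetical stand-in for a Hash whose methods are written in Ruby.
class TinyHash
  def initialize(pairs)
    @pairs = pairs.to_a
  end

  def each
    @pairs.each { |key, value| yield key, value }
    self
  end

  # A reasonable choice: each_value is built on top of each.
  def each_value
    each { |_key, value| yield value }
    self
  end
end

# Mirrors the shape of the problem: each is redefined in terms of a
# method that eventually calls each again.
class LoopingAttributes < TinyHash
  def each_attribute
    each_value { |value| yield value }
  end

  # each -> each_attribute -> each_value -> each -> ...
  def each
    each_attribute { |value| yield value }
  end
end

outcome =
  begin
    LoopingAttributes.new("id" => "1").each { |_value| }
    :finished
  rescue SystemStackError
    :stack_overflow
  end
```

<p>Neither class is wrong on its own. The trouble only appears when the subclass and the superclass make incompatible assumptions about which method is the primitive one.</p>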
<p>Playing nicely is more than being conscientious about how you write your own code. It's also important to consider how you use code others have written. Your code should not depend on implementation details of the classes and libraries you use. However, it's often hard to know what those implementation details are. Often the dependency will be subtle and implicit. Your code will appear to work fine in MRI but break in one of the alternative implementations. There is no general solution to this problem, but you can usually avoid it by checking the assumptions your code makes about the other code it uses. One example of this is mutating a collection in the block passed to an iterating method. Consider the following code:</p>
<pre escaped="true">some_hash.each { |key, value| some_hash.delete(key) if fancy_test(value) }</pre>
<p>Hash is a fairly complex data structure and this bit of code can have very different behavior depending on how Hash is implemented. Thankfully, Matz has explicitly said <a href="http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/23633">this behavior is undefined</a>.</p>
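<p>Two ways to sidestep the question are sketched below. <code>fancy_test</code> is given a made-up body here just so the example runs; substitute your own predicate.</p>

```ruby
# fancy_test is a hypothetical predicate standing in for the one in the text.
def fancy_test(value)
  value.odd?
end

some_hash = { :a => 1, :b => 2, :c => 3 }

# Option 1: let Hash do the removal itself.
some_hash.delete_if { |_key, value| fancy_test(value) }

# Option 2: decide what to remove first, then mutate outside the iteration.
other_hash = { :a => 1, :b => 2, :c => 3 }
doomed = other_hash.keys.select { |key| fancy_test(other_hash[key]) }
doomed.each { |key| other_hash.delete(key) }
```

<p>Either form states the intent without depending on how a particular Hash implementation walks its entries.</p>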
<h2>5. Neighborly C Extensions</h2>
<p>While playing nicely in Ruby code is important, it's also very important when writing C extensions. These are programs typically written in C/C++ that directly access the C functions that MRI uses to implement Ruby.  You probably regularly use one or more gems or libraries that are partially implemented by a C extension. C extensions are often used to access native libraries from Ruby, for example, when writing database adapters.</p>
<p>C extensions are not the only way to access native libraries from Ruby. There are also the FFI and DL libraries. Rubinius was the first implementation to popularize the use of the foreign-function interface (FFI) library for accessing native code. In fact, vital pieces of the core library in Rubinius are implemented via FFI, which is a modern implementation of DL, the dynamic load library that MRI has included for years. There are now quality implementations of FFI available on both JRuby and MRI.</p>
<p>FFI is generally the preferred way to interface with native libraries. The benefits include not needing a C compiler and being able to harness the speed or power of a native library while writing pure Ruby code. However, there are still two core use cases for C extensions: 1) when the data marshaling through the FFI layer imposes too large a performance cost; or 2) when your code already relies on an existing C extension. These use cases are hard to get around. Fortunately, we have put a lot of effort into getting C extensions working quite well on Rubinius. In fact, many C extensions just work.</p>
<p>However, there is one particular problem with some C extensions that limits our ability to support them: some have explicit dependencies on MRI data structures, for example, RHash. Depending on a data structure your code does not control makes your program vulnerable to breaking if the other code changes its implementation. Unfortunately, the C programming language doesn't do much to enforce good practices here. If the C compiler can see a structure or function in a header file, you are free to use it in your program. Yet, just because you can, does not mean you should. Instead, you should always use a function interface (also known as an API) to access the data. Treat data structures that are not your own as opaque.</p>
<p>Of course, that is the ideal world. MRI cannot foretell every use case that a C extension may have. So some of these problems are simply the result of people being more creative than the MRI developers imagined, which is mostly a good thing. In version 1.9, MRI is enforcing the use of APIs over raw struct access. For example, rather than using <code>RSTRING(obj)-&gt;ptr</code>, your code should do <code>RSTRING_PTR(obj)</code> instead. Since Rubinius is compatible with MRI version 1.8.7, we still support both forms in this case. However, to make your code robust and portable, you should use the RSTRING_PTR API.</p>
<p>One thing Rubinius does not support is code like <code>RHASH(obj)-&gt;tbl</code> that accesses the RHash struct directly. This is partially because, in Rubinius, Hash is implemented entirely in Ruby. However, most C extension code needs to do something like iterate over the entries rather than just access the structure. In this case, the <em>rb_hash_foreach</em> function is available, so it's quite easy to change a C extension so it will run on Rubinius. In fact, a number of C extensions have already been updated in this way. If you encounter a problem with a C extension, please file an issue for it.</p>
<p>We understand there are valid use cases for writing C extensions. While Rubinius is implemented very differently than MRI, we want your C extensions to be able to run in Rubinius and we have worked hard to ensure that most C extensions do run. If you encounter cases where there is no function API to work with MRI data, let us know. We can collaborate with Matz and the MRI developers to add such APIs. That way, you can help us help you to make Ruby better for everyone. Win!</p>
<p>Ruby is a terrific language and with your help, it can be even better. Do you have any tips for writing better Ruby code? Please, let us know.</p>
<p>If you are new to Rubinius, you may find these previous posts informative:</p>
<ul>
<li><a href="http://www.engineyard.com/blog/2010/making-ruby-fast-the-rubinius-jit/">Making Ruby Fast: The Rubinius JIT</a></li>
<li><a href="http://www.engineyard.com/blog/2009/improving-the-rubinius-bytecode-compiler/">Improving the Rubinius Bytecode Compiler</a></li>
<li><a href="http://www.engineyard.com/blog/2009/the-anatomy-of-a-ruby-jit-compile/">Compiling Ruby: From Text to Bytecode</a></li>
<li><a href="http://www.engineyard.com/blog/2009/5-things-youll-love-about-rubinius/">5 Things You'll Love About Rubinius</a></li>
<li><a href="http://www.engineyard.com/blog/2009/rubinius-the-book-tour/">Rubinius: The Book Tour</a></li>
</ul>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/rubinius-wants-to-help-you-make-ruby-better/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Improving the Rubinius Bytecode Compiler</title>
		<link>http://www.engineyard.com/blog/2009/improving-the-rubinius-bytecode-compiler/</link>
		<comments>http://www.engineyard.com/blog/2009/improving-the-rubinius-bytecode-compiler/#comments</comments>
		<pubDate>Thu, 22 Oct 2009 17:00:37 +0000</pubDate>
		<dc:creator>Brian Ford</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Compiler]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[ParseTree]]></category>
		<category><![CDATA[Rubinius]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=2649</guid>
		<description><![CDATA[<p>The Rubinius bytecode compiler is the gateway to all the magic that makes your Ruby code run. As you probably know, the Rubinius virtual machine is a bytecode interpreter. The Rubinius JIT compiler also processes bytecode, converting it into native machine code. Without bytecode, we'd be dead in the water. Recently, I've been working on improving the Rubinius bytecode compiler.</p>
<p>In this post, I'll explain what I've been doing and how it relates to getting Rubinius ready for 1.0. We recently released version 0.12 and we're going to be doing releases about every two weeks. If you haven't built Rubinius yet to check it out, head over to our <a href="http://rubini.us/">download page</a> and get started!</p>
<h2>Principles</h2>
<p>Before we get into code examples, let's lay the groundwork and review some general principles for writing software. We learn early in computer science that programming involves a series of trade-offs. There's always the classic trade-off between execution time and storage space. If we write a function that saves each value it computes, it can return the saved value instead of re-computing it, but saving the value requires using memory or disk space. A function that computes millions of values would potentially need millions of storage locations.</p>
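<p>As a small illustration of that trade-off, here is a sketch of a function that saves every value it computes. <code>MemoizedFib</code> is a hypothetical helper, not part of Rubinius.</p>

```ruby
# A hypothetical Fibonacci helper that trades memory for time.
class MemoizedFib
  attr_reader :cache

  def initialize
    @cache = { 0 => 0, 1 => 1 }
  end

  def call(n)
    # Reuse a stored value if we have one; otherwise compute and store it.
    @cache[n] ||= call(n - 1) + call(n - 2)
  end
end

fib = MemoizedFib.new
value  = fib.call(30)
stored = fib.cache.size  # one storage slot per distinct input seen so far
```

<p>The call is fast because nothing is computed twice, but the cache grows by one entry for every new input.</p>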
<p>While speed versus space is probably the most well-known classic trade-off, there are others. With a nod to <a href="http://bit.ly/three-pillars-of-zen">Philip Kapleau</a>, here are the three pillars of developing Rubinius: <strong>Code Quality</strong>, <strong>Compatibility</strong>, and <strong>Performance</strong>, or <strong>QCP</strong>. These interact in complex ways and there's no simple way to prioritize them. By compatibility, I mean conforming to the same behavior as MRI (or Matz's Ruby).</p>
<p>These three characteristics are interdependent. Clean, high-quality code enables working more quickly on compatibility. When code is behaving correctly, it's easier to profile and improve performance. Good performance, in turn, can make it easier to identify and fix compatibility issues. Quality code is easier to understand and work with when improving performance. These feedback loops push each of these code characteristics forward. The converse is also true. At times, it may be tempting to sacrifice quality to improve performance, but quick and dirty <em>never</em> pays off in the end.</p>
<p>To sum up, these are the goals we're pursuing: simplify and improve the bytecode compiler code, improve performance, make it simpler and easier to bootstrap Rubinius, and ultimately make it easier to fix compatibility issues running fantastic Ruby software, like your Rails applications.</p>
<h2>Parsers and Compilers</h2>
<p>There's a certain mystique that surrounds compilers. Rather than just accept the mystique, let's take a peek behind the unicorns and dragons and check out the simple gears and pistons that make it all work. In general terms, a compiler is a process for converting data from one form into another. In the specific case of Ruby, the compiler converts text in the form of Ruby syntax into operations performed by the computer's CPU.</p>
<p>In discussing compilers, two rather distinct operations are often lumped together, namely, parsing and code generation. Parsing is the process of converting the source code into a data structure that the compiler can process. Code generation then turns that data structure into code that the computer, or a virtual machine, can execute. There are specific issues with each operation so let's look at them separately.</p>
<h2>A. The Parser</h2>
<p>Humans are the most adept and complex parsers in existence. One natural language can choke up a powerful computer, but humans typically handle one and sometimes several languages with relative ease. Not just the words or sounds of a language either, but also the intonations, facial expressions and body language that accompany even the simplest communications.</p>
<p>Compared to natural languages, programming languages are very simple, but even the simplest languages can be challenging to parse. Syntactically, Ruby is a rich and complex language. We love it for the expressive programs we can write, but parsing Ruby is <em>hard</em>. Every Ruby parser in the Ruby implementations that I know of is based on Matz's parser. From the beginning, Rubinius has used a directly imported version of Matz's parser with a few minor modifications.</p>
<h3>Detour—Bootstrapping</h3>
<p>Before continuing, we need to take a short detour. I'll be writing a post on the Rubinius bootstrapping process in the future, but I'll start by briefly describing part of the problem here.</p>
<p>The Rubinius bytecode compiler, and most of the Ruby core library (classes like <code>Array</code> and <code>Hash</code>), are written in pure Ruby—and herein lies the challenge. The Rubinius VM interprets bytecode. To run the Rubinius compiler, it needs to translate the Ruby source code for the compiler into bytecode, but to do so, Rubinius needs to run the compiler. You can probably see where this is heading: around and around without getting much done.</p>
<p>That is the essence of the problem of bootstrapping. To break this loop, we need to insert a process that does not depend on Rubinius. In other words, we need something that's <em>not</em> Rubinius to compile the core library and bytecode compiler so that Rubinius can load the bytecode and run the compiler. <em>Then</em> Rubinius can compile its own compiler and core library.</p>
<p>One way to do this would be to load the Ruby source code in MRI and use the <a href="http://bit.ly/parse-tree">ParseTree</a> gem to extract the MRI parse tree as a recursive array of symbols and values, something also known as an S-expression or sexp. But, since you can't change the MRI parser without impacting its run time behavior, <a href="http://blog.fallingsnow.net">Evan Phoenix</a> wrote the sydparse gem.</p>
<p>This gem is essentially a C extension combining the MRI parser with the sexp generation code from ParseTree. The gem enabled Evan to modify the parser if desired and get the parse results as a sexp. The sydparse gem is basically built into Rubinius currently under the <em>vm/parser</em> directory.</p>
<p>For example, try the following code:</p>
<pre>$ bin/rbx
irb(main):001:0&gt; "a = 1".to_sexp
=&gt; s(:lasgn, :a, s(:lit, 1))</pre>
<p>So, with the <code>String#to_sexp</code> method, we have something that will take Ruby code and transform it to a data structure that should be fairly easy for a computer to process. And that is precisely how the early Rubinius bytecode compiler worked: it processed sexps into bytecode. It didn't matter whether the sexps were output by sydparse running in MRI or by the Rubinius built-in parser. With that, a big part of bootstrapping was solved.<br />
<strong><span id="more-2649"></span></strong></p>
<h3>Back to the Main Road</h3>
<p>So we've got a compiler that processes sexps to bytecode and it runs in both MRI and Rubinius. Everything sounds copacetic—what's the trouble, you ask? Well, compiling Ruby is <em>hard</em>, too.</p>
<p>To make the process more tractable, Evan rewrote the compiler so that the sexps are processed into an abstract syntax tree (AST) of Ruby objects. Each object has a <code>#bytecode</code> method. To generate bytecode, start at the root of the tree and visit each node, calling the <code>bytecode</code> method for the node.</p>
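<p>A much simplified sketch of that idea follows. The <code>TinyAST</code> module and the instruction names are assumptions made for illustration and should not be read as the actual Rubinius node classes or instruction set.</p>

```ruby
# Hypothetical, much simplified AST nodes. The instruction names are
# illustrative only.
module TinyAST
  class FixnumLiteral
    def initialize(line, value)
      @line = line
      @value = value
    end

    def bytecode(stream)
      stream << [:push_int, @value]
    end
  end

  class LocalVariableAssignment
    def initialize(line, name, value)
      @line = line
      @name = name
      @value = value
    end

    def bytecode(stream)
      @value.bytecode(stream)        # emit the child first
      stream << [:set_local, @name]  # then store the result
    end
  end
end

root = TinyAST::LocalVariableAssignment.new(
  1, :a, TinyAST::FixnumLiteral.new(1, 1)
)
instructions = root.bytecode([])
```

<p>Each node only knows how to emit itself and ask its children to do the same, so there is no central case statement over node types.</p>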
<p>For perspective, let's enumerate all the stages in the compiler. I'm glossing over a few details here, but you'll see the basic picture:</p>
<ol>
<li><em>parse tree</em>: A tree of C data structs created by the MRI parser.</li>
<li> <em>sexp</em>: A recursive array of symbols and values created by processing the MRI parse tree.</li>
<li><em>rewritten sexp</em>: The form of the sexp after certain structures are normalized to make converting the sexp simpler.</li>
<li><em>AST</em>: The abstract syntax tree of Ruby objects created by processing the rewritten sexp.</li>
<li><em>bytecode</em>: A stream of instructions that the Rubinius VM can execute.</li>
<li><em>compiled method</em>: The bytecode packaged with some additional information like the names of local variables and the amount of stack space needed when the compiled method runs. A typical Ruby source code file compiles to a tree of compiled methods.</li>
<li><em>compiled file</em>: The Rubinius VM can execute the compiled methods directly. But to avoid having to recompile them, the tree of compiled methods is serialized to a compiled file on disk. Rubinius can read the compiled file to recreate the tree of compiled methods in memory.</li>
</ol>
<p>That's a significant number of stages and it's not hard to see where we can simplify to improve the process. Those sexp stages appear to just be passing data along. In fact, they require creating a lot of additional objects, time to process, and some seriously complex code to process them. The latter has a significant, negative impact on code quality.</p>
<p>S-expressions are just data, dumb and brittle. They only have form (e.g. <code>[:x, :y]</code> versus <code>[:x, [:y]]</code>) and position (e.g. <code>[:alias, :x, :y]</code> versus <code>[:alias, :y, :x]</code>) to encode information. If you wanted to add, for example, a line number to every sexp, you'd need to put the information in a form and at a particular position in every sexp, or you'd need to wrap every sexp in one that encodes the line number. The simplicity of sexps is beguiling, but you know what they say when all you have are s-expressions... Everything looks like function application.</p>
<p>On the other hand, we have this rich, object-oriented language that makes it trivially easy to create objects that conform to a consistent interface. We want to get from the Ruby text to Ruby objects as quickly, and simply, as possible.</p>
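<p>To make the contrast concrete, here is a small sketch. <code>AliasNode</code> is a hypothetical class and the line number is made up; the arrays are the same forms shown above.</p>

```ruby
# With a bare array, meaning depends entirely on position.
alias_sexp = [:alias, :x, :y]
swapped    = [:alias, :y, :x]  # same shape, different meaning

# A hypothetical node object names its parts and can carry extra data,
# such as a line number, without disturbing the existing fields.
AliasNode = Struct.new(:line, :to, :from)

node = AliasNode.new(12, alias_sexp[1], alias_sexp[2])
summary = "line #{node.line}: alias #{node.to} #{node.from}"
```

<p>Code that reads <code>node.to</code> keeps working if another field is added later, whereas code that reads <code>alias_sexp[1]</code> breaks as soon as the positions shift.</p>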
<p>One option would be to write the compiler so that it processes the MRI parse tree directly into bytecode. Unfortunately, that would require either rewriting the compiler in C (not gonna happen) or making the C data structs available in Ruby. The latter option sounds promising. We could translate the parse tree directly into an AST.</p>
<h3>Melbourne</h3>
<p>Enter Melbourne, a C extension that can run in MRI or Rubinius and process the MRI parse tree directly into an AST. In case you were wondering, it was named in honor of Evan's sydparse gem, and some rowdy Aussie developers we know.</p>
<p>In Melbourne, each node of the MRI parse tree is processed using a mechanism we all know and love: a method call. At each parse tree node, all children are processed by recursively calling the <code>process_parse_tree</code> function (see <em>lib/ext/melbourne/visitor.cpp</em>). At a leaf node, there are no children to create, so an appropriately named Ruby method is called on the <code>Rubinius::Melbourne</code> instance that is passed to <code>process_parse_tree</code>.</p>
<p>Once all of a node's children are created, a method is called for the parent node, passing along the objects that were already created for the node's children. You can see the result of this process by running the following command:</p>
<pre>$ bin/rbx -r compiler-ng -e '"a = 1".to_ast.ascii_graph'
LocalVariableAssignment
  @line: 1
  @name: a
  FixnumLiteral
    @line: 1
    @value: 1</pre>
<p>In <em>lib/melbourne/processor.rb</em>, you can see code like the following. Whenever a local variable assignment parse tree node (lasgn) is processed, this method is called and the LocalVariableAssignment AST node is created. Pretty straightforward.</p>
<pre>def process_lasgn(line, name, value)
  AST::LocalVariableAssignment.new line, name, value
end</pre>
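<p>The children-first order described above can be sketched in plain Ruby. In Melbourne the tree walk happens in C, so the <code>walk</code> method, the <code>ToyProcessor</code> class, and the <code>[type, line, *fields]</code> node layout here are assumptions made only for illustration.</p>

```ruby
# Toy stand-ins for AST node classes (not the Rubinius ones).
FixnumLiteral           = Struct.new(:line, :value)
LocalVariableAssignment = Struct.new(:line, :name, :value)

# Plays the role of the Ruby object whose process_* methods get called.
class ToyProcessor
  def process_lit(line, value)
    FixnumLiteral.new(line, value)
  end

  def process_lasgn(line, name, value)
    LocalVariableAssignment.new(line, name, value)
  end
end

# Hypothetical parse tree node shape: [type, line, *fields].
# Child nodes are converted first; then the parent's method is called
# with the objects that were already built for the children.
def walk(node, processor)
  type, line, *fields = node
  args = fields.map { |f| f.is_a?(Array) ? walk(f, processor) : f }
  processor.public_send("process_#{type}", line, *args)
end

ast = walk([:lasgn, 1, :a, [:lit, 1, 1]], ToyProcessor.new)
p ast
```

<p>By the time <code>process_lasgn</code> runs, its <code>value</code> argument is already a <code>FixnumLiteral</code> object, so no method ever has to look at raw parse tree data for its children.</p>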
<p>One of my goals for improving the code quality in the compiler was to replace the use of conditionals by creating more explicit nodes in the AST. Sometimes code suffers from what I'd call <em>conditionalitis</em>, or inflammation of your conditionals (sounds painful, huh?). That's where you maintain a lot of state and use conditionals to figure out what to do.</p>
<p>An alternative is to create different forms of things that just do what they are supposed to do. Simply avoiding using conditionals is not really an option, but <em>where</em> they're used can have a big impact on code quality. It can be hard to understand, just by looking at the code, why one branch or the other would be taken when the program is running. But when the conditional results in different forms being created, it's much easier to comprehend how the program will behave.</p>
<p>For example, in MRI <code>a.b = 1</code> and <code>a[b] = 1</code> are parsed into the same parse tree node. But there are enough things that need to be done differently in the two cases that just creating different forms can make these tasks simpler and more explicit. The code in <em>lib/melbourne/processor.rb</em> for processing an attrasgn node looks like this:</p>
<pre>def process_attrasgn(line, receiver, name, arguments)
  if name == :[]=
    AST::ElementAssignment.new line, receiver, arguments
  else
    AST::AttributeAssignment.new line, receiver, name, arguments
  end
end</pre>
<h2>B. The Compiler</h2>
<p>By this point we have a fully formed AST that faithfully represents the Ruby code we started with. Emitting bytecode now is almost anticlimactic. Not really, of course, because there is plenty of work left to do. Just keeping track of local variables in methods, blocks and evals could take up a whole post, but the basic idea is really simple. Start at the root node and call the <code>#bytecode</code> method, walking down the tree until every node has been visited. All the details (and there are a lot of them) can be found in the <code>#bytecode</code> methods on the AST nodes (see <em>lib/compiler-ng/ast</em>).</p>
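<p>As a rough sketch of that walk, here is a toy generator and two toy nodes. The <code>ToyGenerator</code> class and its <code>emit</code> and <code>local_slot</code> methods are invented for this example, and the instruction names are only loosely modeled on Rubinius operations. The real <code>#bytecode</code> methods handle far more detail.</p>

```ruby
# Toy generator: records instructions instead of encoding real bytecode.
class ToyGenerator
  attr_reader :stream

  def initialize
    @stream = []
    @locals = []
  end

  def emit(*instruction)
    @stream << instruction
  end

  # Returns the slot index for a local variable, allocating on first use.
  def local_slot(name)
    @locals.index(name) || (@locals << name).size - 1
  end
end

class FixnumLiteral
  def initialize(line, value)
    @line = line
    @value = value
  end

  def bytecode(g)
    g.emit :push_int, @value
  end
end

class LocalVariableAssignment
  def initialize(line, name, value)
    @line = line
    @name = name
    @value = value
  end

  def bytecode(g)
    @value.bytecode(g)                      # child first: value ends up on the stack
    g.emit :set_local, g.local_slot(@name)  # then the parent stores it
  end
end

g = ToyGenerator.new
LocalVariableAssignment.new(1, :a, FixnumLiteral.new(1, 1)).bytecode(g)
p g.stream
```

<p>Each node only knows how to emit its own instructions and asks its children to do the same, which is why the basic idea stays simple even as the number of node types grows.</p>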
<p>Besides generating bytecode, there are other interesting things you can do simply by defining methods on the AST nodes and visiting them. I'll give you two examples. First, let's look at <code>defined?</code>.</p>
<pre># defined.rb
class A
  class B
  end
end

def x
  puts "hey there"
  A
end

p defined? A::B
p defined? x::B</pre>
<p>If you run this code in Ruby, you should see the following output.</p>
<pre>$ ruby defined.rb
"constant"
hey there
"constant"</pre>
<p>This code illustrates something about <code>defined?</code>. It doesn't just check internal data structures. In some cases, like <code>x::B</code>, it must do some evaluation.</p>
<p>In Rubinius, we simply add a <code>defined(g)</code> method to any relevant AST node that takes an instance of the bytecode generator object and emits bytecode appropriate for evaluating the expression passed to <code>defined?</code>. Here is the full code for the <code>Defined</code> AST node.</p>
<pre>class Defined &lt; Node
  attr_accessor :expression

  def initialize(line, expr)
    @line = line
    @expression = expr
  end

  def bytecode(g)
    pos(g)

    @expression.defined(g)
  end
end</pre>
<p>Rather than emitting bytecode, it's possible to simply evaluate the AST, performing actions when visiting each node. Rubinius has an evaluator that lets us write bits of code for which there is no simple Ruby syntax. The evaluator makes it possible to write directly in Rubinius <em>assembly language</em> (i.e. in the operations that the VM directly executes). Consider this example:</p>
<pre># asm.rb
def hello(name)
  Rubinius.asm(name) do |name|
    push :self
    run name
    string_dup
    push_literal "hello, "
    string_dup
    string_append
    send :p, 1, true
  end
end

hello "world"</pre>
<p>If you run this in Rubinius, you should see the following output:</p>
<pre>$ bin/rbx asm.rb
"hello, world"</pre>
<p>Of course, this example is much easier to write in plain old Ruby. However, we use this facility in the Ruby core library. For example, in the <code>Class#new</code> method.</p>
<p>The basic idea is that a tree of Ruby objects representing Ruby source code is a powerful tool that can be easily extended to accomplish various tasks.</p>
<h2>Compiler, Part Deux</h2>
<p>Let's revisit the list we presented earlier of compiler stages. How do we coordinate those stages? What if we need to insert a stage? Well, by creating more things we can name, and on which we can define behavior.</p>
<p>You can see the various stages laid out in <em>lib/compiler-ng/stages.rb</em>. By default, each stage knows which stage follows it. You create an instance of the compiler by specifying the starting stage and the ending stage. Each stage has an interface with the preceding and following stages, with the compiler object itself, and with the object that performs the work for that stage. For example, if the compiler will compile a String to a CompiledMethod, it will consist of the StringParser, Generator, Encoder, and Packager stages.</p>
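<p>The idea of stages that know their default successor can be sketched as follows. Every name here (<code>Stage</code>, <code>next_stage</code>, <code>Tokenizer</code>, <code>Counter</code>, <code>Reporter</code>, <code>ToyPipeline</code>) is hypothetical and the stages do trivial text processing rather than compilation. The actual classes in <em>lib/compiler-ng/stages.rb</em> have a richer interface.</p>

```ruby
class Stage
  class << self
    attr_reader :next_stage_class

    # Declares which stage follows this one by default.
    def next_stage(klass)
      @next_stage_class = klass
    end
  end
end

class Reporter < Stage
  def run(counts)
    counts.map { |word, n| "#{word}=#{n}" }.join(" ")
  end
end

class Counter < Stage
  next_stage Reporter

  def run(words)
    words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
  end
end

class Tokenizer < Stage
  next_stage Counter

  def run(text)
    text.split
  end
end

# Built from a starting stage and an ending stage; the stages in between
# are discovered by following each stage's declared successor.
class ToyPipeline
  def initialize(first, last)
    @stages = [first]
    until @stages.last == last
      following = @stages.last.next_stage_class
      raise ArgumentError, "#{last} is not reachable from #{first}" unless following
      @stages << following
    end
  end

  def run(input)
    @stages.inject(input) { |data, stage| stage.new.run(data) }
  end
end

full    = ToyPipeline.new(Tokenizer, Reporter).run("a b a")
partial = ToyPipeline.new(Tokenizer, Counter).run("a b a")
p full
p partial
```

<p>Stopping at an earlier stage is just a matter of naming a different ending stage, and inserting a new stage only requires changing one <code>next_stage</code> declaration.</p>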
<p>The goal was to create an API for the compiler that made it simple to use programmatically in a variety of circumstances. For example, the compiler is used internally to compile a Ruby file when <code>#require</code> is called, and it is also used in a command line script like <em>lib/bin/compile-ng.rb</em>. The command line script can optionally show the data structures as the compiler is processing them. Here's an example:</p>
<pre>$ cat var.rb
a = 1
p a

$ bin/rbx compile-ng -AB var.rb
Script
  @name: __script__
  @file: "var.rb"
  Block
    @line: 1
    @array:
      LocalVariableAssignment
        @line: 1
        @name: a
        FixnumLiteral
          @line: 1
          @value: 1
      SendWithArguments
        @line: 2
        @name: p
        @privately: true
        Self
          @line: 2
        ActualArguments
          @line: 2
          @array:
            LocalVariableAccess
              @line: 2
              @name: a

============= :__script__ ==============
Arguments:   0 required, 0 total
Locals:      1: a
Stack size:  3
Lines to IP: 1: 0-4, 2: 4-14

0000:  meta_push_1
0001:  set_local                  0    # a
0003:  pop
0004:  push_self
0005:  push_local                 0    # a
0007:  allow_private
0008:  send_stack                 :p, 1
0011:  pop
0012:  push_true
0013:  ret
----------------------------------------

$ bin/rbx var.rbc
1</pre>
<p>Take a look at <code>bin/rbx compile-ng -h</code> for more options and try this out on your Ruby code. It opens up a pretty exciting world of understanding.</p>
<h2>A Final Note</h2>
<p>Melbourne and the new compiler will both be enabled in the upcoming 0.13 release, so anywhere I've used <em>compile-ng</em> or <em>compiler-ng</em> in this post will become just <em>compile</em> or <em>compiler</em>. Until then, all the code exists in the Rubinius <a href="http://github.com/evanphx/rubinius">GitHub repository</a>, so you don't have to wait to check it out.</p>
<p>Thoughts? Questions? Leave them here!</p>
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/improving-the-rubinius-bytecode-compiler/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>5 Things You&#8217;ll Love About Rubinius</title>
		<link>http://www.engineyard.com/blog/2009/5-things-youll-love-about-rubinius/</link>
		<comments>http://www.engineyard.com/blog/2009/5-things-youll-love-about-rubinius/#comments</comments>
		<pubDate>Thu, 17 Sep 2009 17:30:44 +0000</pubDate>
		<dc:creator>Brian Ford</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Garbage Collector]]></category>
		<category><![CDATA[LLVM]]></category>
		<category><![CDATA[Rubinius]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=2257</guid>
		<description><![CDATA[<p>When working on a project, contributors are constantly re-evaluating the pitch: "how do I explain to someone why what I'm doing is interesting?" The Rubinius team is no different. It's back to school season for a lot of you, so I've arranged my thoughts into a tidy back to school metaphor, looking at Rubinius through the eyes of its college roommate.</p>
<p><strong>1. We Take Out the Garbage</strong></p>
<p>No one likes cleaning up after a messy roommate, navigating around piles of junk, or restarting app servers constantly because memory use grows without bound.</p>
<p>The Ruby language has automatically managed memory. In other words, the programmer doesn't worry about manually deallocating the memory for objects. The memory manager is generally referred to as a garbage collector. However, not all garbage collectors are created equal.</p>
<p>Rubinius uses a generational garbage collector. Generational collectors are based on the idea that most objects live fast and die young. Generally, garbage collectors manage a collection (or <em>heap</em>) of objects. The generational garbage collector is a combination of two or more garbage collection algorithms.</p>
<p>An object is allocated in a heap based on the object's age. New objects are created in the young generation or nursery. If an object is still alive after the young generation collector runs a certain number of times, the object is promoted to the mature generation. The point of a generational collector is to reduce the amount of work the garbage collector has to do.</p>
<p>The Rubinius young generation uses a semi-space copying collector. The heap is split into two regions. Objects are allocated in one of the regions. When a region is full, all live objects are copied to the other region and new objects are allocated there until it is full. Allocations and collections continue in this flip-flop manner. The main advantage of this algorithm is that the collector only deals with live objects. If most of the objects are dying before the collector runs, it has much less work to do.</p>
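<p>The flip-flop behavior can be modeled in a few lines of Ruby. This is a toy simulation under simplifying assumptions: objects are hashes with a <code>:refs</code> list, a region is an Array, and the <code>ToySemiSpace</code> class and its methods are invented for the example. The real collector works on raw memory inside the VM.</p>

```ruby
class ToySemiSpace
  attr_reader :from_space

  def initialize(capacity)
    @capacity = capacity
    @from_space = []  # region currently used for allocation
    @to_space = []    # empty region that receives the survivors
  end

  # Allocates in the current region, collecting first if it is full.
  def allocate(name, roots)
    collect(roots) if @from_space.size >= @capacity
    raise "out of memory" if @from_space.size >= @capacity
    obj = { name: name, refs: [] }
    @from_space << obj
    obj
  end

  # Copies everything reachable from the roots, then flips the regions.
  # Dead objects are never visited; they vanish with the old region.
  def collect(roots)
    forwarded = {}.compare_by_identity  # old object => its copy
    roots.map! { |obj| copy(obj, forwarded) }
    @from_space, @to_space = @to_space, []
  end

  private

  def copy(obj, forwarded)
    return forwarded[obj] if forwarded.key?(obj)
    moved = { name: obj[:name], refs: [] }
    forwarded[obj] = moved  # record before recursing so cycles terminate
    @to_space << moved
    moved[:refs] = obj[:refs].map { |ref| copy(ref, forwarded) }
    moved
  end
end

heap  = ToySemiSpace.new(4)
roots = []
roots << heap.allocate(:a, roots)
heap.allocate(:garbage1, roots)
heap.allocate(:garbage2, roots)
roots[0][:refs] << heap.allocate(:b, roots)
heap.allocate(:c, roots)  # region is full, so a collection runs first
p heap.from_space.map { |obj| obj[:name] }
```

<p>Only <code>:a</code> and the <code>:b</code> it references are copied. The two garbage objects cost the collector nothing, which is the advantage the paragraph above describes.</p>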
<p>For the mature generation, Rubinius implements the <a href="http://bit.ly/immix-mark-region-collector">Immix Mark-Region</a> algorithm. The Immix collector has very fast allocation by simply incrementing a pointer rather than searching a free list. The Immix algorithm uses something called <em>opportunistic evacuation</em> for compaction.</p>
<p>Basically, it can move objects while marking the live objects. It uses the same incrementing-pointer allocation for objects that it is moving. The Immix paper is one of the most accessible academic papers on garbage collectors that you will find, so I highly recommend reading it.</p>
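<p>Incrementing-pointer (or "bump") allocation is simple enough to sketch directly. The <code>BumpAllocator</code> class below is a hypothetical model that hands out offsets within a fixed-size block. It leaves out the lines, blocks, and marking that Immix layers on top.</p>

```ruby
class BumpAllocator
  def initialize(size)
    @size = size
    @cursor = 0  # everything below the cursor is already in use
  end

  # Returns the start offset of the new object, or nil if it does not fit.
  # There is no free list to search: allocation is a compare and an add.
  def allocate(bytes)
    return nil if @cursor + bytes > @size
    address = @cursor
    @cursor += bytes
    address
  end
end

block  = BumpAllocator.new(32)
first  = block.allocate(16)
second = block.allocate(8)
third  = block.allocate(16)  # does not fit in the remaining 8 bytes
fourth = block.allocate(8)
p [first, second, third, fourth]
```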
<p>There are two main things we still need to implement in the Rubinius garbage collector. In the young generation, objects like Bignum that use unmanaged memory internally have to be tracked down when they are no longer live so the unmanaged memory can be freed. Compaction for the mature generation also needs to be implemented.</p>
<p><strong>2. We Take "Dynamic" to the Metal</strong></p>
<p>We all know that Ruby is an extremely dynamic language. So why implement Ruby by statically compiling huge chunks of code that can never be changed while your program is running?</p>
<p><em>Dynamic compilation</em>, or just-in-time (JIT) compilation, is a strategy for converting source code to machine code that defers the decision to compile until some point when the program is running. The JIT may run just before a method is executed for the first time or it may run <em>after</em> a method has executed many times. The latter case is called profile-based JIT and the trigger can be, for example, the number of times a method is called or the time it takes a method to run.</p>
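<p>A call-count trigger can be sketched like this. The <code>ToyMethod</code> class and the threshold of 3 are made up for illustration, and "compiling" here only flips a flag where a real JIT would generate machine code.</p>

```ruby
class ToyMethod
  THRESHOLD = 3  # hypothetical number of calls before compiling

  def initialize(&body)
    @body = body
    @calls = 0
    @compiled = false
  end

  def compiled?
    @compiled
  end

  def call(*args)
    @calls += 1
    compile if !@compiled && @calls >= THRESHOLD
    @body.call(*args)
  end

  private

  # A real JIT would generate machine code here; the toy only flips a flag.
  def compile
    @compiled = true
  end
end

double = ToyMethod.new { |x| x * 2 }
before = 2.times.map { |i| double.call(i) }
warm   = double.compiled?
double.call(10)
hot    = double.compiled?
p [before, warm, hot]
```

<p>The method behaves the same before and after the threshold. Only how it is executed changes, which is what lets the decision be deferred until the program is running.</p>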
<p>The primary advantage of dynamic compilation is that it can potentially use information from your running program to generate better machine code that executes faster. The whole point is to create a version of the code that does less to get the same task accomplished and thereby runs faster.</p>
<p>The way this is done is by trading flexibility for speed. There are two main pieces of runtime information the JIT uses: the type of the object being sent a message at a particular location, and the number of times a method is called.</p>
<p>The most powerful technique in the JIT toolbox is method inlining. Inlining is basically copying the body of a method directly into the code that is calling the method. However, it's not the copying that is the point. Once methods have been inlined, there is more code for the JIT optimizer to work on. Redundant computations can be eliminated. The more code the optimizer can see, the more effectively it can do its job.</p>
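<p>To see why inlining helps the optimizer, here is the transformation done by hand in Ruby. This is only an illustration of the idea. The JIT works on bytecode and machine code, not on Ruby source, and the method names are invented for the example.</p>

```ruby
def square(x)
  x * x
end

# Before inlining: two sends to square, and the optimizer cannot see
# that both of them compute the same thing.
def twice_square(a)
  square(a) + square(a)
end

# After inlining by hand: the body of square is copied into the caller.
def twice_square_inlined(a)
  (a * a) + (a * a)
end

# With the duplicate now visible, the redundant multiply can be removed.
def twice_square_optimized(a)
  t = a * a
  t + t
end

results = [1, 2, 7].map do |n|
  [twice_square(n), twice_square_inlined(n), twice_square_optimized(n)]
end
p results
```

<p>All three versions return the same values, but the last one does one multiplication instead of two and makes no method calls to <code>square</code> at all.</p>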
<p>In Rubinius, the JIT compiler takes bytecode that was generated from Ruby source code and compiles it to machine code using <a title="LLVM" href="http://llvm.org" target="_blank">LLVM</a>. Rubinius has a lot of Ruby code. In Rubinius, the Ruby core library is mostly implemented in Ruby code. This means there is a lot of room for the JIT to do its work and make your program run fast. If a part of your program uses Hash heavily, for instance, the Rubinius JIT can inline Hash methods directly into your code, potentially making those heavily used areas extremely fast.</p>
<p><strong>3. We Play Nice</strong></p>
<p>There are many C extensions written for Ruby. While Rubinius and JRuby have been pushing the idea of using FFI to work with C libraries from Ruby code, there are situations where a C extension can be a big help. With Rubinius, we realize it would be a barrier to adoption if your existing code did not just work. We want to play nice with existing code.</p>
<p>There are two main components for supporting C extensions. The first is the C-API. Rubinius is not implemented like MRI, so we have to provide special functions to shim the C-API (for example, <code>rb_ary_new</code>). A lot of these are implemented by just calling Ruby methods in our core library using <code>rb_funcall</code>.</p>
<p>The second component is a bit more complicated. In MRI, the garbage collector never moves objects. Some C extensions consequently assume that they can keep a reference to a Ruby object in some global data structure. These C extensions would not be able to work with Rubinius if raw memory pointers to objects were given to the C extension because the Rubinius garbage collector <em>does</em> move objects. Furthermore, the garbage collector cannot know everywhere the C extension may stash a pointer, so it cannot update the reference with the new address of the Ruby object after it is moved.</p>
<p>Rubinius implements an object handle abstraction. The C extension is given C++ pointers that never change. These pointers are actually handles to the real Ruby objects. When the garbage collector moves an object, the C extension doesn't need to know about it. The C-API functions take care of converting handles to object references before doing their work.</p>
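<p>The handle indirection can be modeled with a small table. This is a toy sketch: the <code>ToyHandleTable</code> class, its method names, and the use of a Hash as the "heap" are all assumptions for illustration. The real handles live in the VM's C++ code.</p>

```ruby
class ToyHandleTable
  def initialize
    @slots = []  # handle (an Integer index) => current object address
  end

  # Hands out a handle that stays valid for the life of the object.
  def handle_for(address)
    @slots << address
    @slots.size - 1
  end

  # Called by the collector when it moves an object.
  def object_moved(old_address, new_address)
    @slots.map! { |addr| addr == old_address ? new_address : addr }
  end

  # What a C-API shim would do before touching the object.
  def resolve(handle)
    @slots[handle]
  end
end

heap    = { 0x10 => "hello" }
handles = ToyHandleTable.new
handle  = handles.handle_for(0x10)  # the extension stashes this value

# The collector moves the object; only the table needs updating.
heap[0x80] = heap.delete(0x10)
handles.object_moved(0x10, 0x80)

puts heap[handles.resolve(handle)]
```

<p>The extension's stored value never changes, so it does not matter where the extension stashed it. The cost is the extra lookup on every C-API call.</p>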
<p>The trade-off is that some C extensions may be slightly slower, but the benefit is being able to run practically any C extension that exists. Having running code is a big step in migrating to using FFI when it is reasonable to do so.</p>
<p>Mongrel, BigDecimal, Readline, Digest, YAML/Syck are some MRI C extensions that currently run in Rubinius. In a future post, I'll discuss our new parser, which is a C extension based on the existing MRI parser.</p>
<p><strong>4. We Are Organized</strong></p>
<p>How a project's source code is organized, both at the file system and in the source itself, can be a help or hindrance to a programmer attempting to understand or contribute to the project.</p>
<p>The Rubinius project can be divided into roughly four main components. These divisions are reflected in the organization of source files in the file system.</p>
<p>The Ruby core library (Array, Bignum, Hash, etc.) is located in the <code>/kernel</code> directory. The <code>alpha.rb</code> file and the <code>bootstrap</code>, <code>common</code>, and <code>delta</code> directories are loaded in alphabetical order. The majority of the code is in the <code>common</code> directory. This structure is used to support bootstrapping and to make it possible to share the Ruby core library with other implementations. The implementation-specific code is kept in the <code>bootstrap</code> and <code>delta</code> directories, while the more general code is in the <code>common</code> directory.</p>
<p>The Ruby standard library is located in the <code>/lib</code> directory. Most of this code is imported from MRI. The C extensions are located in <code>/lib/ext</code>.</p>
<p>The virtual machine, which includes the bytecode interpreter, garbage collector, JIT compiler, parser, and C-API, is located in the <code>/vm</code> directory and its subdirectories.</p>
<p>The bytecode compiler is <em>presently</em> located in the <code>/kernel/compiler</code> directory, but this will be moving to <code>lib/compiler</code> in the near future. Again, I'll be telling you about the parser/compiler refactoring soon.</p>
<p>Even though the majority of the core library is implemented in Ruby, there are some things that cannot be done in Ruby. These primitive operations are implemented in C++. Every Ruby class that requires primitive support has a corresponding C++ class with the same name. For example, there is an Array C++ class that implements the primitives for Ruby Array. These C++ classes are located in the <code>/vm/builtin</code> directory.</p>
<p>The correspondence between class names makes it easy to find your way around the code and easy to understand the VM code that uses Ruby objects.</p>
<p><strong>5. We'll Try Anything Once</strong></p>
<p>The goal of Rubinius is to be a fully compliant Ruby implementation that is very fast, stable, and reliable. Applying cutting edge research to make this possible requires experimentation.</p>
<p>Over time, Rubinius has changed from using C to using C++, changed the way primitives are implemented, rewritten the bytecode compiler, changed the bytecode interpreter execution model from stackless to using the C stack, changed the way exceptions are handled, added a custom JIT compiler, and replaced that with an LLVM-based one--just to name a few things.</p>
<p>Each time we made a change, the benefits of changing significantly outweighed the benefits of not changing. However, changing is never easy. We have been continually restructuring the code during these changes to make it easier to implement new features because that is the best and fastest way to get innovation implemented.</p>
<p>Here's an anecdote about how easy it is to work with Rubinius: the other day Ari Brown, a 17-year-old high school senior from New Hampshire who first started contributing to Rubinius over a year and a half ago, thought it would be interesting to add <a href="http://en.wikipedia.org/wiki/Literate_programming">literate programming</a> support for Ruby. You can see his <a href="http://github.com/evanphx/rubinius/commits/literate">branch here</a>. It didn't really require that many modifications, but the cool thing is that Ari did it without any help from us. I think that speaks at least in part to the accessibility of the Rubinius codebase.</p>
<p>Do you have an idea you're itching to try out in Ruby? Clone the Rubinius repository and start hacking away. You get commit rights after your first patch is accepted. If you want to try something radical, you can push your work to a public branch. That way you can make your case by showing how it works in real code.</p>
<p>So that's it for my "good roommate" pitch. Feel free to get your parents' opinion before inviting Rubinius to stay the night. Once you get to know Rubinius, I'm sure it'll be a life-long friendship, and there's still time before Thanksgiving vacation to get your very own commit bit. As always, let us know what areas of Rubinius you'd like us to better explain by commenting or finding us online.</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/5-things-youll-love-about-rubinius/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Rubinius: The Book Tour</title>
		<link>http://www.engineyard.com/blog/2009/rubinius-the-book-tour/</link>
		<comments>http://www.engineyard.com/blog/2009/rubinius-the-book-tour/#comments</comments>
		<pubDate>Thu, 02 Jul 2009 16:00:11 +0000</pubDate>
		<dc:creator>Brian Ford</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Garbage Collector]]></category>
		<category><![CDATA[LLVM]]></category>
		<category><![CDATA[RSpec]]></category>
		<category><![CDATA[Rubinius]]></category>
		<category><![CDATA[Rubyspec]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=1424</guid>
		<description><![CDATA[<p>This year continues to be a hot one for the <a href="http://ruby-lang.org">Ruby programming language</a>. The <a href="http://bit.ly/eweek-ruby-use">use of Ruby is growing</a>, excitement is mounting for the release of Rails 3.0, and development of Ruby 1.9 and the alternative implementations is moving along quickly. It makes sense: bringing more value to your customers in less time with fewer resources is an obvious plus, and Ruby's a great way to make that happen.</p>
<p><a href="http://rubini.us">Rubinius</a>, which you've no doubt heard lots about over the last few years, is an implementation of the Ruby language written from scratch using cutting edge technology and the best industry research. Based on the questions we've received over the past few months, it's clear that a lot of folks are looking to learn more about the technologies behind the project. This is exciting because with so much written in Ruby, Rubinius positively <em>begs</em> Ruby developers to experiment and explore.</p>
<p>In this post I'll describe each of the basic parts of Rubinius, and provide some helpful links to books that I've found particularly useful in understanding how Rubinius is built.</p>
<p><br><br />
<span id="more-1424"></span></p>
<h2>Bytecode Virtual Machine (VM)</h2>
<p>Similar to Java, Rubinius runs your Ruby source code by first compiling it to bytecode and then executing that bytecode on the virtual machine. It's reasonable to think of a virtual machine as a CPU written in software. Virtual machines can be optimized to run very fast, which is one of the advantages over an interpreter like that used in Ruby 1.8.</p>
<p>A great deal of research on virtual machines has been done in the past 30 years, starting with Smalltalk and SELF and continuing with Java. <a href="http://blog.fallingsnow.net">Evan's</a> first prototype of Rubinius was written in Ruby and based on the Smalltalk-80 virtual machine. Understanding something about virtual machines is definitely the gateway to modern programming language implementations.</p>
<blockquote><p><strong>The Books:</strong></p>
<ul>
<li><a href="http://bit.ly/virtual-machines-book"><em>Virtual Machines: Versatile Platforms for Systems and Processes</em></a></li>
<li><a href="http://bit.ly/smalltalk-80"><em>Smalltalk-80: The Language and its Implementation</em></a></li>
</ul>
</blockquote>
<h2>Bytecode Compiler</h2>
<p>Compilers are one of the most useful tools in computer science and have probably been researched more than any other area since the 1950s.</p>
<p>The Rubinius bytecode compiler is written entirely in Ruby. Every method defined in your Ruby source code is compiled into an instance of the CompiledMethod class. The compiled method contains a list of bytecodes that essentially provides a blueprint for how to carry out the computation described in the source code.</p>
<blockquote><p><strong>The Books:</strong></p>
<ul>
<li><a href="http://bit.ly/engineering-a-compiler-book"><em>Engineering A Compiler</em></a></li>
<li><a href="http://bit.ly/dragon-book"><em>Compilers: Principles, Techniques, and Tools</em></a></li>
<li><a href="http://bit.ly/advanced-compiler-design"><em>Advanced Compiler Design and Implementation</em></a></li>
</ul>
</blockquote>
<h2>Ruby Core Library</h2>
<p>Almost all programming languages provide a library with useful data structures and other facilities. Some of the Ruby core library classes include Array, Hash, String, Regexp, Range, Float, Fixnum, Bignum, and Thread.</p>
<p>In Rubinius, this is again written almost entirely in Ruby with some VM-specific parts. For example, adding two Fixnums requires special VM support because Ruby as a language has no constructs for telling a CPU to access memory locations, treat them as machine integers, and add them together.</p>
<p>The <em>algorithm</em> is one of the most fundamental ideas in computer science. Basically, it is an ordered set of steps to perform to solve a problem or do some calculation. Data structures, like Array in the core library, are implemented using various algorithms. The algorithm used must provide the correct answer and must be reasonably efficient.</p>
<p>Working with the Rubinius core library is probably the easiest way to get involved since it's in Ruby! If you have experience with <a href="http://rspec.info">RSpec</a> in your Ruby or Rails projects, you'll feel right at home using <a href="http://rubyspec.org">RubySpec</a> to work on the core library BDD-style.</p>
<blockquote><p><strong>The Books:</strong></p>
<ul>
<li><a href="http://bit.ly/algorithm-design-manual"><em>The Algorithm Design Manual, 2nd Edition</em></a></li>
</ul>
</blockquote>
<h2>Garbage Collector</h2>
<p>Taking out the garbage is a fact of life. Rubinius includes a precise, generational garbage collector with a moving semi-space collector for the young generation and an implementation of the <a href="http://bit.ly/immix-garbage-collector">Immix Mark-Region garbage collector</a> for the mature generation.</p>
<p>The performance of the garbage collector can have a big impact on how fast your code runs. With the change to the Immix collector and some improvements to the young generation, the percentage of time Rubinius spends in the garbage collector during a full RubySpec run dropped from nearly 50% to around 10%. Further improvement in this area is possible.</p>
<blockquote><p><strong>The Books:</strong></p>
<ul>
<li><a href="http://bit.ly/garbage-collection-book"><em>Garbage Collection: Algorithms for Automatic Dynamic Memory Management</em></a></li>
</ul>
</blockquote>
<h2>JIT Compiler</h2>
<p>A just-in-time (JIT) compiler generates native machine code from your source code <em>while</em> your program is running. Typically, the JIT compiles the code based on feedback about which parts are getting used the most. The parts that have been JIT compiled often run quite a bit faster than the same code interpreted by the VM. In a language as dynamic as Ruby, the JIT compiler can usually produce much more efficient code after the VM has run and gathered information about the code.</p>
<p>Rubinius uses the <a href="http://llvm.org">LLVM Compiler Toolkit</a> to implement the JIT compiler. The basic concepts from the compiler books above all apply here. Also, see the detailed <a href="http://llvm.org/docs/">LLVM documentation</a>.</p>
<p>That's it for the book tour intro to Rubinius. I'll be giving a more detailed talk at OSCON 2009 titled <a href="http://en.oreilly.com/oscon2009/public/schedule/detail/8480">Rubinius 1.0: The Ruby VM That Could</a>. Hope to see you there, or visit us in the <strong>#rubinius</strong> channel on irc.freenode.net to get involved!</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/rubinius-the-book-tour/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>What is RubySpec?</title>
		<link>http://www.engineyard.com/blog/2009/what-is-rubyspec/</link>
		<comments>http://www.engineyard.com/blog/2009/what-is-rubyspec/#comments</comments>
		<pubDate>Thu, 11 Jun 2009 15:00:28 +0000</pubDate>
		<dc:creator>Brian Ford</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Duck Typing]]></category>
		<category><![CDATA[JRuby]]></category>
		<category><![CDATA[RSpec]]></category>
		<category><![CDATA[Rubinius]]></category>
		<category><![CDATA[Ruby on Rails]]></category>
		<category><![CDATA[Rubyspec]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=1205</guid>
		<description><![CDATA[<p>You might think that <em>What is the meaning of life?</em> is a tough question. But here's one that will give it a run for its money: <em>What is Ruby?</em></p>
<p>Ok sure, that comparison is hyperbole, but bear with me. Try this out in your irb session:</p>
<pre><code>&gt;&gt; Float("0.5") == "0.5".to_f
=&gt; true
</code></pre>
<p>That's reasonable enough. Imagine if it <em>weren't</em> true! But did you know that the Float() method converted the <code>"0.5"</code> text string to a Float object without calling the string's to_f() method?</p>
<p>What is the definition of Ruby in this situation? Is it that Float() returns a Float object for a validly formatted text string or that Float() does so without calling the string's to_f() method? Let's investigate the situation a bit more.</p>
<p>In Ruby, if you define an arbitrary object that you want to behave like a Float object, you define a to_f() method for your object. Then Float() will call that method on your object:</p>
<pre><code>&gt;&gt; s = "0.5"
=&gt; "0.5"
&gt;&gt; def s.to_f() 42 end
=&gt; nil
&gt;&gt; floaty = Object.new
=&gt; #&lt;Object:0x5eb190&gt;
&gt;&gt; def floaty.to_f() 0.5 end
=&gt; nil
&gt;&gt; Float(floaty) == Float(s)
=&gt; true
</code></pre>
<p>Now that <em>is</em> surprising. A lot of the elegance of Ruby comes from generally everything being an object. In some sense, <code>floaty</code> and <code>"0.5"</code> are just objects, so why does the Float() method treat them differently?</p>
<p>More importantly, should you rely on Float() <em>not</em> calling your string's to_f() method, or is that merely an implementation detail of MRI (Matz's Ruby Implementation)? This is the dilemma faced repeatedly by every alternative Ruby implementation.</p>
<p>Fortunately, we have a powerful tool to assist us.</p>
<p>The <a href="http://rubyspec.org">RubySpec project</a> is writing an <em>executable</em> definition of the Ruby programming language using <a href="http://rspec.info">RSpec-style</a> specs. The tremendous utility of the specs is that alternate Ruby implementations can run them to determine if they are building a compatible Ruby engine.</p>
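<p>To give a feel for what "executable definition" means, here is a minimal sketch of the <code>Float()</code> behaviors discussed above written as runnable checks. The <code>check</code> helper is a stand-in invented for this example so that it runs without any gems. Actual RubySpec files use the <code>describe</code>/<code>it</code> style and matchers from the spec framework.</p>

```ruby
# Minimal stand-in for a spec runner: prints and returns the outcome.
def check(description)
  passed = yield ? true : false
  puts "#{passed ? 'ok' : 'FAILED'}: #{description}"
  passed
end

results = [
  check("Float() converts a well-formed String") {
    Float("0.5") == 0.5
  },

  check("Float() does not call #to_f on a String") {
    s = "0.5"
    def s.to_f
      42.0
    end
    Float(s) == 0.5
  },

  check("Float() calls #to_f on an arbitrary object") {
    floaty = Object.new
    def floaty.to_f
      0.5
    end
    Float(floaty) == 0.5
  }
]
```

<p>Each check pins down one observable behavior, so an implementation that, say, did call <code>to_f</code> on strings would fail the second check rather than silently diverging from MRI.</p>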
<p>Presently, the specs contain over 33,000 precisely defined facets of Ruby behavior. The specs cover Ruby behavior across different platforms, operating systems, and versions of the Ruby language. The goal is to ensure that Ruby applications written to use the core Ruby features covered by the specs will run the same on any Ruby implementation.</p>
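<p>To give a feel for what "executable definition" means, here is a rough sketch of the Float() behavior written as runnable checks. The <code>TinySpec</code> class is a hypothetical stand-in invented for this example; it is not the RubySpec, MSpec, or RSpec API, and the real specs are organized differently.</p>

```ruby
# Hypothetical, minimal spec recorder used only for illustration.
class TinySpec
  attr_reader :results

  def initialize(subject)
    @subject = subject
    @results = []
  end

  # Runs the block and records whether it returned a truthy value.
  # Any StandardError raised inside the block counts as a failure.
  def it(behavior)
    passed = begin
      yield ? true : false
    rescue StandardError
      false
    end
    @results.push({ behavior: behavior, passed: passed })
  end

  def report
    @results.each do |result|
      status = result[:passed] ? "PASS" : "FAIL"
      puts "#{status} #{@subject} #{result[:behavior]}"
    end
  end
end

spec = TinySpec.new("Kernel#Float")

spec.it("returns a Float for a validly formatted String") do
  Float("0.5") == 0.5
end

spec.it("does not call to_f on a String argument") do
  s = "0.5".dup
  s.define_singleton_method(:to_f) { raise "to_f should not be called" }
  Float(s) == 0.5
end

spec.it("calls to_f on an object that is not a String or Numeric") do
  floaty = Object.new
  floaty.define_singleton_method(:to_f) { 0.5 }
  Float(floaty) == 0.5
end

spec.report
```

<p>An alternate implementation that runs checks like these and sees a failure has found a concrete place where it behaves differently from MRI.</p>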
<p>RubySpec has been well-known in the community of Ruby implementers for almost two years. Every major Ruby implementation is using it. However, many Ruby programmers are just learning about it. RubySpec has a lot to contribute to the larger Ruby community. Recently, I explained some ideas about this in an <a href="http://blog.rubybestpractices.com/posts/gregory/006-brian-ford-rubyspec.html">interview</a> with <a href="http://blog.rubybestpractices.com/about/gregory.html">Gregory Brown</a>. Greg is starting a project, called <a href="http://pledgie.com/campaigns/4640">Unity</a>, to make the information contained in the RubySpecs more accessible to everyone.</p>
<p>Contributing to RubySpec is a great way to learn more about the Ruby programming language. At the same time, your contribution helps the alternate Ruby implementations and the Ruby ecosystem. In the past couple of weeks, contributors have added tons of fixes to the specs for Ruby 1.8.7 and 1.9. Check out their excellent work at the RubySpec <a href="http://github.com/rubyspec/rubyspec/tree/master">GitHub repository</a>.</p>
<p>I'll be <a href="http://opensourcebridge.org/sessions/13">speaking about RubySpec</a> at the upcoming <a href="http://opensourcebridge.org/">Open Source Bridge</a> conference June 17-19. If you have questions about RubySpec that you'd like me to address, please leave a comment.</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/what-is-rubyspec/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Everything Has An Interface</title>
		<link>http://www.engineyard.com/blog/2009/everything-has-an-interface/</link>
		<comments>http://www.engineyard.com/blog/2009/everything-has-an-interface/#comments</comments>
		<pubDate>Tue, 05 May 2009 23:16:31 +0000</pubDate>
		<dc:creator>Brian Ford</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[RailsConf]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=426</guid>
		<description><![CDATA[<p>There is no technical topic closer to my heart than good user interface design. The reason is simple: making my way through the world often requires dodging many traps left by otherwise well-meaning folks who create very difficult interfaces. So I was thrilled to attend the <a href="http://en.oreilly.com/rails2009/public/schedule/detail/7073" rel="nofollow">UI Fundamentals for Programmers</a> talk at RailsConf by Ryan Singer of 37signals.</p>
<p>When we think of user interface design, we probably think first of digital artifacts: the web applications, phones, or computer applications we use. But user interface design is much broader than that.</p>
<p>For example, I was at a workshop recently and after lunch I wanted to ensure my bottle made it into the correct recycle bin of one of those three-part glass, plastic, trash combo things. Unfortunately, instead of just tossing my bottle into a well-labeled bin, I spent half a minute trying to discern from the contents which was the correct bin. There were recycling symbols and other markings but no clear labels. Maybe they had worn off. Regardless, those three bins were an interface for depositing waste.</p>
<p>While I'd like people to think as broadly as possible about user interface design, this talk focused on some important aspects of traditional digital interface design. According to Ryan, user interface design and application programming are closely related and interdependent. From the perspective of a programmer or designer, the user interface is simply another layer in the application. But the user interface is the only thing the user interacts with, so to them it <em>IS</em> the application.</p>
<p>Good application architecture emphasizes the importance of interfaces between components over the implementation details inside components. However, when beginning work on an application, Ryan suggests the most important step is to construct a <em>model</em> for the application that makes sense to the user and allows for its implementation. To understand the importance of this model, Ryan points to the book <em>Domain-Driven Design</em> by Eric Evans.</p>
<p>One of the main tasks for a UI designer is to determine how the application is split up into screens. A screen is the unit of conversation for the designer. Within the screen, the designer should start from the "inside-out", focusing first on the most important thing that needs to be displayed and adding the surrounding information.</p>
<p>Ryan selected Edward Tufte's books, particularly <em>The Visual Display of Quantitative Information</em>, as one of the best sources for understanding the principles of visual design, selection and use of color, and proportions for visual elements.</p>
<p>Ryan emphasized the importance of one principle in particular: the <em>least effective difference</em>. Essentially, this means using the smallest visual effect that will distinguish elements. By adding or lessening contrast between visual elements, we create a hierarchy of information on the screen to show the user what they need to focus on to accomplish a task.</p>
<p>This is just the tip of the iceberg of UI design. Hopefully it will encourage you to investigate further. Everything has an interface, and the better those interfaces are designed, the happier we'll all be.</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/everything-has-an-interface/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

