<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Engine Yard Blog &#187; Evan Phoenix</title>
	<atom:link href="http://www.engineyard.com/blog/author/evanphoenix/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.engineyard.com/blog</link>
	<description></description>
	<lastBuildDate>Tue, 07 Feb 2012 19:36:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Ruby, Concurrency, and You</title>
		<link>http://www.engineyard.com/blog/2011/ruby-concurrency-and-you/</link>
		<comments>http://www.engineyard.com/blog/2011/ruby-concurrency-and-you/#comments</comments>
		<pubDate>Fri, 14 Oct 2011 16:41:29 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[1.8]]></category>
		<category><![CDATA[1.9]]></category>
		<category><![CDATA[concurrency]]></category>
		<category><![CDATA[GIL]]></category>
		<category><![CDATA[implementations]]></category>
		<category><![CDATA[IronRuby]]></category>
		<category><![CDATA[JRuby]]></category>
		<category><![CDATA[MacRuby]]></category>
		<category><![CDATA[MagLev]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[parallelism]]></category>
		<category><![CDATA[Rubinius]]></category>
		<category><![CDATA[threads]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=10760</guid>
		<description><![CDATA[<table>
<caption>tl;dr</caption>
<thead>
<tr>
<th>Ruby Implementation</th>
<th>Concurrency</th>
<th>Parallelism</th>
</tr>
</thead>
<tbody>
<tr>
<td>MRI 1.8</td>
<td>✔</td>
<td></td>
</tr>
<tr>
<td>MRI 1.9</td>
<td>✔</td>
<td></td>
</tr>
<tr>
<td>Rubinius 1</td>
<td>✔</td>
<td></td>
</tr>
<tr>
<td>Rubinius 2</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>JRuby</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>MacRuby</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Maglev</td>
<td>✔</td>
<td></td>
</tr>
<tr>
<td>IronRuby</td>
<td>✔</td>
<td>✔</td>
</tr>
</tbody>
</table>
<p>A big topic in the world of Ruby this year has been how to get more out of Ruby, specifically, how to get more done in parallel. The topic of concurrency, though, is one fraught with misunderstanding. This is largely due not only to the complexity of thinking about multiple things at once, but also to the limitations of Ruby implementations and operating systems.</p>
<p>In this article, I’ll lay the groundwork for understanding the difference between concurrency and parallelism. Then, I’ll look at how a programmer experiences them.</p>
<h2 id="concurrencyvs.parallelism">Concurrency vs. Parallelism</h2>
<p>This has been discussed many times, but I sometimes still have difficulty with it. Let’s first break down the definitions of these two words:</p>
<ul>
<li><strong>Concurrent</strong>: existing, happening, or done at the same time</li>
<li><strong>Parallel</strong>: occurring or existing at the same time or in a similar way</li>
</ul>
<p>Hmm, ok. Well, that hasn’t improved our thinking about these two topics. We need to dig deeper into how the world of computing applies to these words. Rather than looking at the abstract, let’s instead consider some real world examples.</p>
<h3 id="arealworldexample">A “Real World” Example</h3>
<p>Let’s say you’ve sat down for the evening to complete tomorrow’s homework. This evening you’ve got both Math and History worksheets to fill out. Tonight for some reason, you decide to do one problem in Math, then one problem in History, then back to Math, etc until all the problems are done.</p>
<p>In the parlance of computing, you’re now doing your Math and History worksheets concurrently. This is because your <em>Current task list</em> includes 2 items: Math worksheet and History worksheet.</p>
<p>Now, clearly you the reader can see a problem here. By switching back and forth, completing your homework will probably take longer than if you did the complete Math worksheet then did the History worksheet. In other words, if you did the worksheets in serial.</p>
<p>So, if concurrent means “having multiple outstanding tasks at once”, then what is parallel? Parallel is the ability to make progress on multiple tasks simultaneously.</p>
<p>Let’s say you’ve been asked to read the book <em><a title="Amazon.com: One O'clock Jump (9780981944258): Lise McClendon: Books" href="http://www.amazon.com/One-Oclock-Jump-Lise-McClendon/dp/0981944256">One O’Clock Jump</a></em> by Lise McClendon. You also need to drive down to San Diego for Comic-Con. Thankfully you find that <em>One O’Clock Jump</em> is <a href="http://iambik.com/books/one-oclock-jump-by-lise-mcclendon/">available on audiobook</a>!</p>
<p>You can now listen to the book while driving. You’re simultaneously making progress on two separate tasks. This is the equivalent of parallelism in computing.</p>
<p>I hope that these real world examples help illustrate the difference between concurrency and parallelism. Now let's apply this newfound knowledge to Ruby.<span id="more-10760"></span></p>
<h2 id="backtoruby">Back to Ruby</h2>
<p>One reason this problem can be difficult to understand is that Ruby provides only a single mechanism for concurrency: the Thread class. Whether or not these Threads run in parallel depends on a number of factors.</p>
<h3 id="mri1.8">MRI 1.8</h3>
<p>Let’s look at MRI 1.8 (and MRI forks such as REE) to begin with, because it has the simplest model. MRI 1.8 uses a technique known as “green threads” to implement Threads. This means that every once in a while (around 100 milliseconds), the program says “oh, I should let another thread run now!” This saves the current info into the current thread and restores another thread. This is exactly like our homework example above. We can have as many things as we’d like in our task list, but we can only make progress on one of them at a time.</p>
<p>There is a wrinkle in the concurrency/parallelism game that I haven’t mentioned before now. This wrinkle is IO, namely how Threads interact when waiting for some external event. MRI 1.8.7 is quite smart, and knows that when a Thread is waiting for some external event (such as a browser to send an HTTP request), the Thread can be put to sleep and be woken up when data is detected. This simple optimization improves the usage of Threads so much that for a very long time the MRI 1.8.7 model was good enough for all Ruby programs.</p>
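<p>To make the IO wrinkle concrete, here is a minimal sketch. It assumes that <code>sleep</code> is a fair stand-in for a Thread blocked on an external event; the <code>fake_request</code> helper is made up for this illustration and is not part of any library.</p>
<pre lang="ruby" escaped="true">require 'benchmark'

# Each "request" spends its time waiting, not computing.
# sleep is only a stand-in for blocking IO such as a socket read.
def fake_request(id)
  sleep 0.2
  "response #{id}"
end

results = []
elapsed = Benchmark.realtime do
  threads = [1, 2, 3].map do |id|
    Thread.new { fake_request(id) }
  end
  # Thread#value waits for the thread and returns its block's result.
  results = threads.map { |t| t.value }
end

puts results.inspect
puts "elapsed: #{elapsed.round(2)}s"</pre>
<p>Because every Thread is asleep while it waits, the three waits overlap. I would expect the elapsed time to land near 0.2 seconds rather than 0.6, though the exact figure depends on the machine and the Ruby implementation.</p>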
<h3 id="mri1.9">MRI 1.9</h3>
<p>Switching back to Ruby implementations, let’s look at MRI 1.9. As has been previously reported, MRI 1.9 removes the “green threads” we had in MRI 1.8 and uses native threads to implement the Thread class. Now, what are these “native threads”? These are units of concurrency that the underlying operating system is aware of. A big reason to switch to native threads is that it vastly simplifies the implementation of Threading. The operating system handles the low level parts of saving and restoring Thread information in a completely transparent way. Additionally, letting the OS know what parts of a program should be concurrent allows it to use the full resources of the computer to make that happen. In this modern world, that means using multiple cores.</p>
<p>Up until now, all we’ve talked about with Ruby’s Threading model was about concurrency, the ability to have multiple outstanding tasks at once. Now when we add in the idea of multiple cores, we can finally talk about parallelism. When a computer includes multiple cores (which is pretty much every computer now), those cores can run different code simultaneously, providing true parallelism. When a computer only has one core, there is no true parallelism, instead there is just simple concurrency, even at the OS level. The OS manages all the processes and threads in the system the same way you handled your Math and History worksheets, doing one for a little while, then grabbing another one.</p>
<p>Back to multiple cores though. Now that there is the opportunity to run things truly in parallel, we have to look at whether Ruby can take advantage of that. Since MRI 1.9 uses OS threads, it can actually spread out your Ruby Threads to multiple cores!</p>
<p>Unfortunately, MRI 1.9 prevents the Ruby code itself from running in parallel by requiring that any thread running Ruby code hold a lock. This lock is commonly known as the GIL (Global Interpreter Lock) or GVL (Global VM Lock).</p>
<p>There are a few reasons the GIL exists, but for this discussion we will say that it’s because the non-Ruby parts of MRI 1.9 are not thread-safe. This means if data were manipulated by multiple threads at the same time, the data could become corrupt. The important thing for this post is how it applies to parallelism: the GIL inhibits parallelism within Ruby code.</p>
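<p>The idea of one lock guarding shared data can be sketched in plain Ruby. This is only an analogy at the Ruby level: MRI's internals are written in C and are not shown here, and the <code>counter</code> and <code>lock</code> names are made up for the illustration.</p>
<pre lang="ruby" escaped="true">require 'thread'

counter = 0
lock = Mutex.new

threads = 4.times.map do
  Thread.new do
    1000.times do
      # Only one thread at a time may run the read-modify-write below.
      lock.synchronize { counter += 1 }
    end
  end
end
threads.each { |t| t.join }

puts counter</pre>
<p>With the lock in place this prints 4000 on any implementation. The price is the same one the GIL charges: while one thread holds the lock, the others cannot make progress on that section.</p>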
<p>MRI 1.9 uses the same technique as MRI 1.8 to improve the situation, namely the GIL is released if a Thread is waiting on an external event (normally IO) which improves responsiveness. MRI 1.9 also includes an experimental API that C extensions can use to run some C code without the GIL locked to utilize parallelism. This API is very restrictive though because no Ruby object may be accessed in any way while the GIL is not held by the current thread.</p>
<p>That about sums up the situation with MRI 1.8 and 1.9 with regards to concurrency and parallelism. Both provide concurrency of Ruby code, but neither provides parallelism of Ruby code.</p>
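<p>If you want to see which camp your own Ruby falls into, a rough experiment is to time the same CPU-bound work serially and in Threads. The sketch below assumes a made-up <code>busy_sum</code> workload and an arbitrary size <code>N</code>; it is a quick probe, not a rigorous benchmark.</p>
<pre lang="ruby" escaped="true">require 'benchmark'

# A deliberately CPU-bound task: no IO, no sleeping.
def busy_sum(n)
  total = 0
  1.upto(n) { |i| total += i * i }
  total
end

N = 300_000

serial_results = nil
serial = Benchmark.realtime do
  serial_results = [busy_sum(N), busy_sum(N)]
end

threaded_results = nil
threaded = Benchmark.realtime do
  threads = [N, N].map { |n| Thread.new { busy_sum(n) } }
  threaded_results = threads.map { |t| t.value }
end

puts "same answers: #{serial_results == threaded_results}"
puts "serial:   #{serial.round(3)}s"
puts "threaded: #{threaded.round(3)}s"</pre>
<p>On an implementation with a GIL, I would expect the two timings to be roughly the same. On an implementation without one, running on a multi-core machine, the threaded run has the chance to finish sooner. Actual numbers will vary.</p>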
<h3 id="rubinius">Rubinius</h3>
<p>Let’s take a quick look at other Ruby implementations where things are a bit different than MRI. I’ll start with <a title="Rubinius : Use Ruby™" href="http://rubini.us">Rubinius</a>, since it’s the one I’m most familiar with. Rubinius 1.x also had a GIL and worked pretty much the same as MRI 1.9. With the upcoming 2.0 release though, the GIL will be removed, allowing Ruby code to run fully concurrent and fully parallel. We think this opens up a lot of uses for Ruby (parallel algorithms, etc) that Rubinius couldn’t handle well previously.</p>
<h3 id="jruby">JRuby</h3>
<p><a title="Home &amp;mdash; JRuby.org" href="http://jruby.org">JRuby</a> layers the Thread class on top of Java’s thread class, so the threading model is whatever the JVM supports. That being said, OpenJDK is the primary JVM; it puts a Java thread directly onto an OS thread with no GIL. Thusly, JRuby almost always has full concurrency and parallelism available to it.</p>
<h3 id="macruby">MacRuby</h3>
<p><a title="MacRuby &amp;raquo; Home" href="http://macruby.org">MacRuby</a> also uses Cocoa’s NSThread as its abstraction, which runs without a GIL. So, this is another fully parallel implementation.</p>
<h3 id="maglev">Maglev</h3>
<p><a title="MagLev - Ruby that scales" href="http://ruby.gemstone.com">Maglev</a> runs directly on top of a Smalltalk VM and thusly layers the Thread class on top of a concept called Smalltalk Processes. In this case, the GemStone VM implements Processes in the same way as MRI 1.8, namely via “green threads” that don’t expose concurrency to the OS, and therefore, have no parallelism.</p>
<h3 id="ironruby">IronRuby</h3>
<p>Lastly, <a title="IronRuby.net" href="http://ironruby.net">IronRuby</a> layers Thread directly on top of CLR’s threads without a GIL.</p>
<h2 id="conclusion">Conclusion</h2>
<p>I hope that this helps to clear up what concurrency and parallelism are and how the different Ruby implementations address them. Having this understanding is critical for discussing and understanding topics such as thread-safety of libraries and performance of applications.</p>
<p>In future posts, we’ll look to build on this knowledge to help you make the best use of Ruby!</p>
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2011/ruby-concurrency-and-you/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Rubinius on AppCloud</title>
		<link>http://www.engineyard.com/blog/2011/rubinius-on-appcloud/</link>
		<comments>http://www.engineyard.com/blog/2011/rubinius-on-appcloud/#comments</comments>
		<pubDate>Tue, 22 Mar 2011 18:06:18 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[Product]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[appcloud]]></category>
		<category><![CDATA[Engine Yard Beta Program]]></category>
		<category><![CDATA[Rubinius]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=8110</guid>
		<description><![CDATA[<p><!-- p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Arial} p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Arial; min-height: 15.0px} --><a href="http://www.engineyard.com/blog/wp-content/uploads/Screen-shot-2011-03-22-at-9.34.37-AM.png"><img class="alignleft size-full wp-image-8222" title="rubinius_logo" src="http://www.engineyard.com/blog/wp-content/uploads/Screen-shot-2011-03-22-at-9.34.37-AM.png" alt="" width="135" height="123" /></a>I'm extremely proud to announce that Rubinius on Engine Yard AppCloud is now available in Alpha. This has been the culmination of years of work by countless people inside and outside of Engine Yard. Way back in 2007, when I first started at Engine Yard we began talking about this day.</p>
<p>A lot has changed in the intervening years with both Engine Yard and Rubinius but nothing has changed the focus on building Rubinius as a first class Ruby environment that is easy to use. Engine Yard has seen significant movement into the cloud/platform services realm and Rubinius has meanwhile begun to push the boundaries of Ruby performance. We've been building software and services to get us to this exact point in time and I couldn't be more excited. Rubinius now has all kinds of tools for making a developer's life easier, from builtin profilers to external monitoring.</p>
<h3>Enabling Access to Rubinius on AppCloud</h3>
<p>Now for the nitty gritty. We're going to be making Rubinius available in Alpha (as a part of our <a href="http://docs.engineyard.com/beta/home">Beta Program</a>). This means it will only be available to users that specifically request it. To request to have Rubinius available as one of your Rubies, <a href="http://docs.engineyard.com/beta/signup-rubinius">fill in this form</a> for each AppCloud account email. We'll flip a bit in your account and you'll have Rubinius as an option when configuring your instances.</p>
<h3>Support for Rubinius on AppCloud</h3>
<p>As a trial program, we want Alpha users to report issues back to us as quickly as possible. This will help us get them fixed and help smooth out the rough edges before Rubinius is moved into Beta, and then finally made available to all our users.</p>
<p>Please initiate Support requests and feedback via the <a href="https://groups.google.com/forum/?hl=en#!forum/ey-beta-talk">EY Beta discussion group</a>.</p>
<h3>Thank You</h3>
<p>On a final personal note, I'd like to thank all the people who have contributed to Rubinius over the past many years. Your help, no matter the size, has helped us get to this point. I can't wait to see where users take Rubinius on AppCloud!</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2011/rubinius-on-appcloud/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Rubinius 1.2 Now Available!</title>
		<link>http://www.engineyard.com/blog/2010/rubinius-1-2-now-available/</link>
		<comments>http://www.engineyard.com/blog/2010/rubinius-1-2-now-available/#comments</comments>
		<pubDate>Wed, 22 Dec 2010 19:17:44 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Rubinius]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=4997</guid>
		<description><![CDATA[<p><a href="http://www.engineyard.com/blog/wp-content/uploads/logo-rubinius.png"><img class="alignright size-full wp-image-5004" src="http://www.engineyard.com/blog/wp-content/uploads/logo-rubinius.png" alt="" width="50" height="50" /></a></p>
<p>Rubinius 1.2 was just released and is available at <a href="http://rubini.us">http://rubini.us</a>. There are a number of new features and improvements since 1.0.</p>
<h3>LLVM 2.8</h3>
<p>We've upgraded to using LLVM version 2.8, the latest released version. LLVM powers the high performance compiler Rubinius uses to compile Ruby code all the way down to machine code. This brings some minor performance improvements related to better optimizations, but this mostly paves the way for future high-level optimizations that we'll be implementing with new LLVM 2.8 features.</p>
<h3>Bytecode Verifier</h3>
<p>More and more people are beginning to use Rubinius as a platform for <a href="http://rubini.us/projects/">other languages, not just Ruby</a>. This is a development we're all super excited about, but it meant we needed to get a little more serious about making sure that bytecode can't crash the VM. Previously, we got along fine with no verifier simply because there was only one piece of code that generated Rubinius bytecode, the Rubinius compiler. Now with others also doing so, we needed to make sure they're generating valid bytecode. The bytecode verifier is a VM operation that is performed lazily, when a method is first invoked. It makes sure that the bytecode is consistent, for instance, only using the amount of stack it has requested and only using the proper number of local variables. With this new safety net in place, people can feel much more confident about generating their own bytecode without causing any hard crashes in the system.</p>
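<p>To give a feel for the kind of check involved, here is a minimal sketch of a stack-depth verifier. The instruction names and the <code>EFFECTS</code> table are hypothetical and far simpler than real Rubinius bytecode; the sketch only shows the idea of confirming that code stays within the stack it requested.</p>
<pre lang="ruby" escaped="true"># Hypothetical instruction set: name maps to [values popped, values pushed].
EFFECTS = {
  push: [0, 1],
  pop:  [1, 0],
  add:  [2, 1],
  ret:  [1, 0]
}

# Returns true when the code never pops an empty stack and never
# needs more stack slots than it declared.
def verify(code, declared_stack)
  depth = 0
  code.each do |op|
    pops, pushes = EFFECTS.fetch(op)
    depth -= pops
    return false unless depth.between?(0, declared_stack)  # popped too much
    depth += pushes
    return false unless depth.between?(0, declared_stack)  # pushed too much
  end
  true
end

puts verify([:push, :push, :add, :ret], 2)   # a well-behaved method
puts verify([:push, :push, :add, :ret], 1)   # uses more stack than declared
puts verify([:add], 2)                       # pops values that are not there</pre>
<p>This prints true, false, false. A real verifier also has to follow branches and check local variable usage, which this sketch leaves out.</p>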
<h3>Memory Efficiency</h3>
<p>Ruby is being used for larger and larger software projects these days. This makes memory efficiency very important. There are two measures of memory efficiency: growth stability and memory usage per object. Growth stability is largely a feature of the garbage collector, and is something that Rubinius has done quite well for some time now, so we focused on improving the memory usage per object.</p>
<p>Specifically, we looked at how an object stores its instance variables in memory. Because Ruby does not require instance variable declaration, the simplest way to model instance variables is with a hash. This is precisely what Rubinius used to do. The issue is for classes that have a small number of instance variables. In this case, the size of the hash table is substantial, needing more than 100 bytes of memory just to store one word (either 4 or 8 bytes)! And so we set about to try and reduce this overhead. Because Rubinius uses so many Ruby classes internally, we knew that a fix would have immediate benefits.</p>
<p>The new code is based upon an easily observable assumption about a class, namely that it defines the vast majority (usually all) of its methods before an instance of the class is created. We exploit this by running some code the first time an instance of a class is created which looks at all methods available to instances of this class. This means all methods defined in the class itself, its superclasses, and any mixed in modules. From the methods, we build a table of all instance variables those methods use.</p>
<p>Now we can construct a very good picture of how memory should be laid out for instances of this class, allowing us to store the instance variables in memory without needing a hash table. The memory usage typically goes from 100 bytes to 8 bytes on a 64-bit machine. Quite a savings!</p>
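<p>The layout idea can be sketched in Ruby itself. The <code>PackedLayout</code> and <code>PackedObject</code> classes below are hypothetical names invented for this illustration; the real work happens inside the VM, and this sketch only shows the shape of the trick: one name-to-slot table per class, and a flat list of values per instance.</p>
<pre lang="ruby" escaped="true"># Hypothetical sketch: the names would come from scanning a class's methods.
class PackedLayout
  def initialize(ivar_names)
    # Map each name to a fixed slot index, decided once per class.
    @slots = {}
    ivar_names.each_with_index { |name, i| @slots[name] = i }
  end

  def size
    @slots.size
  end

  def index_of(name)
    @slots.fetch(name)
  end
end

class PackedObject
  def initialize(layout)
    @layout = layout
    # One flat array of values per instance; no per-instance hash.
    @values = Array.new(layout.size)
  end

  def get(name)
    @values[@layout.index_of(name)]
  end

  def set(name, value)
    @values[@layout.index_of(name)] = value
  end
end

layout = PackedLayout.new([:@x, :@y])
point = PackedObject.new(layout)
point.set(:@x, 3)
point.set(:@y, 4)
puts point.get(:@x) + point.get(:@y)</pre>
<p>This prints 7. The sketch simply raises an error for a name that is not in the layout; how Rubinius treats an instance variable it did not see in advance is not covered here.</p>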
<h3>Debugger</h3>
<p>A good debugger is invaluable when working on code of any kind. One of the big additions since 1.0 is the built-in debugger. Gems such as ruby-debug wouldn't compile, let alone work, because Rubinius doesn't share any internals with MRI.</p>
<p>We decided to take a different approach than most debuggers for languages. Typically, the debugger is delivered and used via some kind of command line interface only. We wanted a command line interface, but we didn't want it to be the only way into the system.</p>
<p>So, instead we built a <a href="http://rubini.us/doc/en/tools/debugger/">Debugging API</a> into the VM itself and built the CLI debugger on top of this API. This means it's available to be used by other projects that want to build new and innovative debuggers. In fact, the CLI interface we ship should be considered a kind of reference implementation. It's a bit short on features, but shows easily how to use the API and build upon it. We've already had people begin to port their debugger logic over to using the Rubinius API so that existing debuggers can be plugged into Rubinius simply.</p>
<p>Using the debugger is easy:</p>
<pre lang="ruby" escaped="true">require 'rubinius/debugger'
Rubinius::Debugger.here</pre>
<p>This will drop the code into the debugger at the .here method call, allowing you to inspect the call stack and objects on it. You can also use the <code>-Xdebug</code> option to rbx, which will start the debugger before loading the initial program, allowing you to set breakpoints before loading code.</p>
<p>For 1.2, we've introduced a special ruby-debug shim gem. This gem doesn't contain ruby-debug, but instead emulates the most common entrance point to it and invokes the built-in debugger. This means that projects such as Rails which have ruby-debug support integrated in work out of the box.</p>
<p>In future releases, we'll continue to improve on the debugging APIs as well as the CLI interface. So if you've got ideas for improvement, be sure to let us know!</p>
<p>Also we're looking for a list of projects that begin to add support for the debugging API. This includes frameworks like Rails and editors such as Emacs, VIM, Textmate, etc. If you're interested in adding support for your favorite project, let us know so we can help!</p>
<h3>Query Agent</h3>
<p>Query Agent (QA) is yet another tool that developers can use to debug and introspect their running programs. It provides the ability for the VM to export all kinds of low level data such as statistics about the garbage collectors. In addition to raw stats, it provides the ability to trigger functionality by reading and writing values.</p>
<p>For example, to get a live backtrace of all threads, simply read the system.backtrace variable. The values returned are calculated on request and thus reflect the current state of the system.</p>
<p>Implementation wise, Query Agent is a socket based API that is implemented directly by the VM. We opted to use BERT as the wire protocol, which allowed us to easily write a Ruby client using the existing BERT encoder/decoder gem.</p>
<p>We hope that people will begin incorporating Query Agent support into their monitoring tools, allowing them to get very rich data and control of their ruby processes.</p>
<h3>Heap Dump</h3>
<p>Lastly, we have integrated a memory debugging tool directly into the VM. Heap Dump provides the ability to write out the entire object graph to disk in a stable, portable format. That file can then be read back in and analyzed. A very common analysis that is performed is simply to find out how many objects of each class exist in the system. This knowledge alone can help developers figure out object leaks that might exist in their code. There are currently two interfaces to Heap Dump, one in Ruby and one via the Query Agent.</p>
<p>In Ruby:</p>
<pre lang="ruby" escaped="true">Rubinius::VM.dump_heap("/path/to/file")</pre>
<p>and via the Query Agent:</p>
<p><code> set system.memory.dump /path/to/file<br />
</code></p>
<p>By having access via Query Agent, it becomes possible to debug production processes offline.</p>
<p>We've only begun writing tools to analyze the dumps; they are available at<br />
<a href="https://github.com/evanphx/heap_dump">https://github.com/evanphx/heap_dump</a>.</p>
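<p>The per-class count mentioned above is a simple histogram. The sketch below works on an ordinary in-memory array rather than on a dump file, since the dump format and the analysis tool's API are not described here; <code>class_histogram</code> is a made-up helper name.</p>
<pre lang="ruby" escaped="true"># Count how many objects of each class appear in a collection.
def class_histogram(objects)
  counts = Hash.new(0)
  objects.each { |obj| counts[obj.class] += 1 }
  counts
end

sample = ["a", "b", :c, 1.5, [1, 2], "d"]
histogram = class_histogram(sample)
histogram.each { |klass, count| puts "#{klass}: #{count}" }</pre>
<p>A class whose count keeps climbing between two dumps is a good first suspect when hunting for a leak.</p>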
<h3>Beyond 1.2</h3>
<p>The team has been doing great since the 1.0 release, expanding Rubinius compatibility and improving performance. Coming up in the next few months, we've got three key features we're really excited about: 1.9 support, Microsoft Windows support, and true concurrency. These are big ticket items that we've been asked about a lot, and that will push Rubinius into more and more developers' hands.</p>
<p>I'd like to thank everyone for all the support this year. It's been a wonderful year full of great releases. Seeing Rubinius continue to grow and blossom in 2011 should be even better!</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/rubinius-1-2-now-available/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Engine Yard Alumni Grows; Bon Voyage Carlhuda</title>
		<link>http://www.engineyard.com/blog/2010/engine-yard-alumni-grows-bon-voyage-carlhuda/</link>
		<comments>http://www.engineyard.com/blog/2010/engine-yard-alumni-grows-bon-voyage-carlhuda/#comments</comments>
		<pubDate>Tue, 14 Sep 2010 16:09:53 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=4652</guid>
		<description><![CDATA[<p>Earlier this month we celebrated the long awaited release of Rails 3.  Mikel Lindsaar probably said it best when he called Rails 3 "the game changer [that] will revolutionize the web development industry".  Yehuda Katz and Carl Lerche have been tirelessly driving Ruby on Rails and its community for the past 20 months.  The entire Rails team really stepped up to the plate and gave us a great piece of software to continue on the Rails traditions.</p>
<p>Now it is time for Yehuda to broaden his horizons.  After 4 years working on Rails at Engine Yard he gets to focus on another one of his passions in jQuery and mobile development - an offer he couldn't refuse.  He will be a founding employee of a new startup in the mobile applications space, with Carl joining him.  They have both been passionate full-time members of the Rails team and we want to be the first to congratulate them on this new opportunity.  As always we are flushed with pride on the success of our Engine Yard Alumni!</p>
<p>We aren't saying goodbye to Yehuda though, he'll remain a part of the extended Engine Yard family.  Soon Engine Yard will be launching a Technical Advisory Board with Yehuda as a founding member. The Board will become a valuable forum for our customers, partners and other stakeholders to contribute feedback and perspective to the Engine Yard team to help drive the product features and capabilities. In addition to his adviser role on the new Board, Yehuda and Carl are going to continue working on Rails, beginning with the 3.1 release. We know that they'll continue to flourish, helping to push Rails forward as one of the best web development environments.</p>
<p>While we'll miss the daily presence of Yehuda and Carl, this change is well timed as we transition our focus from engineering to evangelism for Rails 3.0. We believe that Rails is the future of the web development industry and we want to ensure widespread and successful adoption of the framework. Additionally, Engine Yard will also continue to invest its Engineering talent into open source development throughout the Ruby/Rails ecosystem.</p>
<p>On a personal note, I'll deeply miss having Yehuda and Carl around as coworkers. I wish them all the best and know that they'll continue to find success in their new endeavor.</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/engine-yard-alumni-grows-bon-voyage-carlhuda/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Making Ruby Fast: The Rubinius JIT</title>
		<link>http://www.engineyard.com/blog/2010/making-ruby-fast-the-rubinius-jit/</link>
		<comments>http://www.engineyard.com/blog/2010/making-ruby-fast-the-rubinius-jit/#comments</comments>
		<pubDate>Tue, 09 Mar 2010 23:00:38 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[JIT]]></category>
		<category><![CDATA[Rubinius]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=3459</guid>
		<description><![CDATA[<p>In order to execute Ruby code as fast as possible, Rubinius has the ability to compile Ruby code all the way down to machine code when it detects that a method is heavily used. In Rubinius, the system that manages this process is its JIT.</p>
<p>In today's post, I'll be giving an overview of the various players involved in the path that code takes to get from source to machine code. Without further ado, I'll jump right in.</p>
<h3>Melbourne Parser</h3>
<p>This is the first step. The parser takes Ruby source code as input and, for each element it recognizes, makes a call that builds up an internal representation of the code: the AST. (lib/ext/melbourne)</p>
<h3>Compiler</h3>
<p>The compiler takes the AST that the parser created and analyzes it, creating bytecode in the form of a <code>CompiledMethod</code> and <code>InstructionSequence</code>. (lib/compiler)</p>
<h3>Bytecode</h3>
<p>A <code>CompiledMethod</code> object contains an <code>InstructionSequence</code> object, which is the raw bytecode which will perform the semantic actions of the Ruby source code. (lib/compiler/iseq.rb)</p>
<h3>Virtual Machine</h3>
<p>The VM itself then executes the bytecode using a simple interpreter. A key data structure used in this evaluation is the <code>VMMethod</code>, which is an internal mirror of a <code>CompiledMethod</code>, but translated into constructs that are easier to interpret. As the VM interprets a <code>VMMethod</code>, it uses <code>InlineCache</code> objects to speed up method dispatch. In addition, these <code>InlineCache</code> objects remember profiling information about what methods they have seen. This information is later used by the JIT.</p>
<p>The VM also increments a call counter on the <code>VMMethod</code> at a few critical points (on start and on backward branch). This call counter is what controls when the JIT kicks in. When the call counter reaches some predetermined value (controlled via <code>-Xjit.call_til_compile</code>), the first stage of the JIT kicks in.</p>
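<p>The counter mechanism is easy to picture with a small sketch. Everything here is hypothetical, including the <code>CountedMethod</code> class and its <code>tick</code> method; only the idea of counting up to a threshold and then firing once comes from the description above.</p>
<pre lang="ruby" escaped="true"># Hypothetical names throughout; only the counting idea comes from the text.
class CountedMethod
  attr_reader :name, :calls

  def initialize(name, threshold)
    @name = name
    @threshold = threshold
    @calls = 0
    @queued = false
  end

  # Called on method entry (and, in the VM, on backward branches).
  # Returns true exactly once: when the counter first reaches the threshold.
  def tick
    @calls += 1
    return false if @queued || @calls != @threshold
    @queued = true
  end
end

hot = CountedMethod.new(:fib, 3)
5.times { |i| puts "call #{i + 1}: request JIT? #{hot.tick}" }</pre>
<p>With a threshold of 3, only the third call reports true. Counting backward branches as well as calls means a method with one long-running loop can still become hot.</p>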
<h3>Method Chooser</h3>
<p>Now that a call counter has reached the proper level, the JIT is ready to kick in. The JIT could simply take the method whose counter has hit the level, but instead it starts a search. It's looking for a good method to JIT, which it finds by looking up the call stack.</p>
<p>The reason it does this is because the JIT has the ability to inline methods into methods that call them. We're exploiting the fact that the call stack shows not just one method heating up, but a whole chain of them (<code>vm/llvm/jit:compile_callframe</code>). So we walk up the call stack, looking at each method along the way, and asking ourselves: <em>could this method be inlined into the one that called it?</em> If the answer is yes, we move to the next method. By doing this, we're able to inline methods along hot paths in code, which yields better speeds.</p>
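<p>The walk up the stack can be sketched in a few lines of Ruby. This is only an illustration under assumptions: <code>Frame</code>, its <code>inlinable</code> flag, and <code>choose_jit_root</code> are made-up names, and the real decision about whether a method can be inlined is far more involved than a boolean.</p>

```ruby
# Hypothetical sketch; Frame and choose_jit_root are invented names.
Frame = Struct.new(:name, :inlinable)

# stack is ordered from the hot method outward to its callers.
def choose_jit_root(stack)
  root = stack.first
  stack.each_cons(2) do |callee, caller_frame|
    # Stop climbing as soon as a method cannot be inlined into its caller.
    break unless callee.inlinable
    root = caller_frame
  end
  root
end

stack = [
  Frame.new(:each_item, true),   # the method whose counter fired
  Frame.new(:process, false),    # cannot be inlined into its own caller
  Frame.new(:main, true)
]
p choose_jit_root(stack).name
```

<p>Here the search settles on <code>process</code>: <code>each_item</code> can be folded into it, but <code>process</code> itself cannot be folded into <code>main</code>.</p>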
<h3>Compiler Thread</h3>
<p>Now that we've picked a good method at which to start the JIT process, the method is placed into a queue. This queue is then automatically emptied by another native thread, which is always running. It's in this background thread that the rest of the JIT process takes place.</p>
<p>Using a background thread means that the Ruby code is free to continue to run while the JIT runs in the background. This means that the JIT imposes virtually <em>no</em> slowdown, because it never stands in the way of running Ruby code (<code>vm/llvm/jit:compile_soon</code>).</p>
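<p>The same producer/consumer shape is easy to try in plain Ruby with <code>Queue</code> and <code>Thread</code>. This is a sketch of the pattern only: the real compiler thread is a native thread inside the VM, and the <code>:stop</code> marker and the strings standing in for machine code are assumptions made for the example.</p>

```ruby
# Sketch of the queue-plus-background-thread pattern, not the VM's code.
jit_queue = Queue.new
compiled = []

compiler_thread = Thread.new do
  # Queue#pop blocks until work arrives; :stop is our shutdown marker.
  while (name = jit_queue.pop) != :stop
    compiled.push("machine code for #{name}")
  end
end

# The "interpreter" side just enqueues work and carries on.
jit_queue.push(:foo)
jit_queue.push(:bar)
jit_queue.push(:stop)

compiler_thread.join
p compiled
```

<p>The code that enqueues a method never waits for compilation to finish, which is the property the paragraph above relies on.</p>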
<p>The JIT thread pops an entry off the queue and begins compiling it. Because the JIT uses <a href="http://llvm.org/" target="_blank">LLVM</a> to perform low level optimizations and machine code generation, we need to translate the method into a structure that LLVM understands. This structure is the LLVM IR.</p>
<p>To convert the method, we walk through the bytecode and call methods on a <code>JITVisit</code> object (one for each kind of bytecode). The <code>JITVisit</code> class uses LLVM's <code>IRBuilder</code> class to build a big tree data structure that represents the actions that should be taken.</p>
<p>A simple example is that a <code>goto</code> bytecode instruction is translated into a <code>Branch</code> object and inserted into the IR. For most bytecode, the process is fairly straightforward, there being a simple set of IR objects to generate per bytecode.</p>
<h3>Method Inlining</h3>
<p>The most complicated bytecodes to generate IR for are the send instructions. This is because it's at these points that we have the opportunity to inline a method. When a method is inlined, the code to perform the method is inserted where the send instruction would normally be. This eliminates any calling overhead and allows LLVM to optimize more.</p>
<p>At this stage, control is handed to an <code>Inliner</code> object. The <code>Inliner</code> will only inline a method if it can see that the method was the primary method called at a particular <code>send</code> instruction. Because Ruby is a dynamic language, any method call can always invoke a brand new method. But in reality, that happens rarely. Instead, most method calls end up calling the same method over and over again. The profiling information that the <code>InlineCache</code> objects have been gathering allows us to see that this is the case and perform inlining.</p>
<p>One constraint that the <code>Inliner</code> has is that it needs to avoid over-inlining. Over-inlining causes the generated function to become extremely large and slower than it would be if there were no inlining. To do this, the <code>Inliner</code> keeps track of the cost of inlining. For every method that is inlined, the cost increases. When the cost reaches a threshold, no more inlining takes place.</p>
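<p>A cost budget like this can be modelled in a few lines. The <code>InlineBudget</code> class, the per-method costs, and the threshold of 10 below are all invented for illustration; the real <code>Inliner</code> computes its costs differently.</p>

```ruby
# Hypothetical sketch; the class name, costs, and threshold are made up.
class InlineBudget
  attr_reader :spent

  def initialize(threshold)
    @threshold = threshold
    @spent = 0
  end

  # Returns true and charges the budget while the threshold has not
  # been reached; returns false once it has.
  def try_inline(cost)
    return false if @spent >= @threshold
    @spent += cost
    true
  end
end

budget = InlineBudget.new(10)
decisions = [4, 4, 4, 2].map { |cost| budget.try_inline(cost) }
p decisions
```

<p>The first three candidates are inlined, which pushes the running cost to 12. The fourth is refused because the threshold has already been reached.</p>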
<p>Now that the <code>Inliner</code> has decided to go ahead and inline the method, it must insert a guard before the inlined code. This guard makes sure that the object is still of the type seen in the profiling information, and that therefore the inlined method code is the proper code to run. The generated IR for this looks something like:</p>
<pre>if(obj-&gt;class == profiled_class) {
 result = inlined_code_for_method_name;
} else {
 result = obj-&gt;send method_name, …;
}</pre>
<p>This allows Ruby code to continue to be dynamic, but exploits those points in the code where the dispatch is actually static.</p>
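<p>The same guard can be written out by hand in Ruby to show what the two branches do. This is a sketch under assumptions: <code>Celsius</code>, <code>Kelvin</code>, and <code>read_degrees</code> are invented for the example, and the JIT emits the guard as LLVM IR rather than Ruby.</p>

```ruby
# Hypothetical sketch of a guarded inline, written as plain Ruby.
class Celsius
  def initialize(deg); @deg = deg; end
  def degrees; @deg; end
end

class Kelvin
  def initialize(k); @k = k; end
  def degrees; @k - 273; end
end

PROFILED_CLASS = Celsius   # the class the profiling data saw at this call site

def read_degrees(obj)
  if obj.class.equal?(PROFILED_CLASS)
    # Fast path: the body of Celsius#degrees, pasted in place of the call.
    obj.instance_variable_get(:@deg)
  else
    # Slow path: the guard failed, so fall back to a normal dynamic send.
    obj.send(:degrees)
  end
end

p read_degrees(Celsius.new(20))
p read_degrees(Kelvin.new(300))
```

<p>Both calls return the right answer. Only the first one takes the inlined path.</p>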
<p>After the <code>JITVisit</code> class has finished, control is handed off to LLVM, which optimizes the IR and generates machine code.</p>
<h3>LLVM Optimization</h3>
<p>Up to now, we've simply been constructing information to feed to LLVM. Now we actually hand over the IR to LLVM. The first thing LLVM does is run a number of optimization passes over the IR. This cleans up the IR and makes it quite a bit more efficient. At this stage, the IR can be reduced in size by five to ten times by removing redundancies and reordering.</p>
<h3>LLVM Code Generation</h3>
<p>Finally, the optimized IR is run through LLVM's code generator. This code generator is fairly complex, but its API is extremely simple. When the generator finishes, it returns a function pointer that can be called to execute the code. We put this function pointer into a special slot on the original <code>CompiledMethod</code>.</p>
<p>By default, this slot in the object holds a pointer to the function that implements the interpreter. By swapping them, any future calling of the method automatically uses the new JIT'd version.</p>
<h3>Deoptimization</h3>
<p>Because Ruby is so dynamic, there are cases where JIT'd code must be discarded. The primary example is where a method that was inlined is redefined. To handle this, the VM keeps a table of every method that inlined the method being redefined. It resets those methods back to using the interpreter and tags the JIT'd code to be discarded. Because the method has been reset, it can now be JIT'd again later, incorporating the newly redefined method.</p>
<p>So those are the systems that interact to speed up Ruby. The process can achieve speedups of as much as 10x over non-JIT'd code. We've only begun to scratch the surface of the techniques we can use to strip even more dynamic aspects of the code away and make it faster. 2010 is going to be an exciting year.</p>
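<p>The bookkeeping behind deoptimization can be sketched as a reverse index from each inlined method to the methods that inlined it. Everything here is hypothetical (<code>DeoptTable</code>, its method names, and the use of symbols to stand for methods); the VM keeps this information in its own internal structures.</p>

```ruby
# Hypothetical sketch; the VM does not track this in a Ruby Hash.
class DeoptTable
  def initialize
    @inlined_into = Hash.new { |hash, key| hash[key] = [] }
    @mode = Hash.new(:interpreter)
  end

  def mode_of(name)
    @mode[name]
  end

  # Record that caller_name was JIT'd with inlined_names folded into it.
  def jit(caller_name, inlined_names)
    @mode[caller_name] = :jit
    inlined_names.each { |name| @inlined_into[name].push(caller_name) }
  end

  # Redefining a method resets every method that had inlined it.
  def redefine(name)
    @inlined_into.delete(name).to_a.each { |c| @mode[c] = :interpreter }
  end
end

table = DeoptTable.new
table.jit(:total, [:price])
table.jit(:report, [:price, :title])
table.redefine(:price)
p [table.mode_of(:total), table.mode_of(:report)]
```

<p>After the redefinition both callers are back on the interpreter, so either one can be picked up by the JIT again later.</p>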
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/making-ruby-fast-the-rubinius-jit/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Memoization and id2ref</title>
		<link>http://www.engineyard.com/blog/2010/memoization-and-id2ref/</link>
		<comments>http://www.engineyard.com/blog/2010/memoization-and-id2ref/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 18:00:35 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Newsletter]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=3608</guid>
		<description><![CDATA[<p><em>This article was originally included in the February issue of the Engine Yard Newsletter. To read more posts like this one, subscribe to the </em><a href="http://www.engineyard.com/newsletter"><em>Engine Yard Newsletter</em></a><em>.</em></p>
<p><em>In this series, Evan Phoenix, Rubinius creator and Ruby expert, presents tips and tricks to help you improve your knowledge of Ruby.</em></p>
<hr /><p>The performance of a library or application is one of the key factors in getting it accepted, so it should come as no surprise that Ruby programmers have many different tricks they use to squeeze more performance out of their code.</p>
<p>One of the most common is memoization. This is the technique of calculating a value once, then saving the result and transparently substituting it for the code that calculated the original value.</p>
<p>Here's a short example:</p>
<pre>def size_of_universe
 @size ||= Universe.find.size
end</pre>
<p>Here, we've calculated the size of the universe and then saved the result into the <code>@size</code> ivar. This way, the next time <code>size_of_universe</code> is called, the previously calculated value is returned.</p>
<p>We've already gone over one of the simplest and most basic techniques, above. This technique uses the <code>||=</code> operator to run the right hand side if, and only if, the left hand side is not truthy (that is, it is <code>nil</code> or <code>false</code>). It's short and sweet, and rarely confuses the reader.</p>
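<p>One consequence of that rule is worth seeing in code: if the computed value is itself <code>nil</code> or <code>false</code>, <code>||=</code> runs the right hand side again on every call. The <code>Probe</code> class below is a made-up example, and the <code>defined?</code> check is just one possible workaround, not something the technique above requires.</p>

```ruby
# Hypothetical example class showing when ||= recomputes.
class Probe
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def expensive_lookup
    @lookups += 1
    nil   # a legitimate but falsy result
  end

  # ||= treats the cached nil as "not computed yet" and runs again.
  def naive
    @naive ||= expensive_lookup
  end

  # Checking whether the ivar is defined caches nil and false too.
  def careful
    return @careful if defined?(@careful)
    @careful = expensive_lookup
  end
end

probe = Probe.new
2.times { probe.naive }
naive_lookups = probe.lookups
2.times { probe.careful }
careful_lookups = probe.lookups - naive_lookups
p [naive_lookups, careful_lookups]
```

<p>For values that are always truthy, such as the size of the universe, the plain <code>||=</code> form is all you need.</p>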
<p>Another technique that has been seen in production code uses <code>ObjectSpace._id2ref</code>. While this is becoming a common technique, it has a number of problems that we'll look at today.</p>
<p>Here is an example of using this technique:</p>
<pre>obj = Universe.find.size
&nbsp;
eval &lt;&lt;CODE
def size_of_universe
 ObjectSpace._id2ref(#{obj.object_id})
end
CODE</pre>
<p>This technique is used frequently with metaprogramming, when you want to embed a specific object directly into a generated method. People use it because, at first glance, it removes any kind of data dependency between the generated code and <code>obj</code>: there is no ivar that has to be in scope, no constant, and so on. But, in fact, this technique masks some rather terrible bugs.</p>
<p>This technique basically uses the whole Ruby process as a big table, leveraging the ability to easily get the table index for an object and convert that table index back into the object.</p>
<p>The primary issue stems from the fact that Ruby is a garbage collected language. Even though the code has requested the <code>object_id</code> for an object, that is not enough to keep the object alive. So if the only reference to the return value from <code>#size</code> was <code>obj</code>, then once <code>obj</code> goes out of scope, the object becomes garbage.</p>
<p>So what happens when you run <code>#size_of_universe</code> and <code>obj</code> has been garbage collected? Well, a few things can happen:</p>
<ol>
<li><code>_id2ref</code> will raise a <code>RangeError</code>, saying that the id no longer points to an object.</li>
<li>A random object will be returned.</li>
</ol>
<p>The second scenario is probably the strangest, but it can be observed. This bizarre <code>_id2ref</code> behavior occurs because the return value from <code>#object_id</code> is actually the address in memory of the object itself. This means that when the GC runs and collects the object, and then the allocator puts another object in the same place (which is exactly what a GC does), whatever object happens to be there is returned. This is essentially the same as a dangling pointer bug in C.</p>
<p>Lastly, the implementation of <code>#_id2ref</code> varies wildly between different Ruby implementations, each having different performance and different potential bugs. Due to these factors, using <code>#_id2ref</code> in production is even riskier.</p>
<p>So what's a simple alternative?</p>
<pre>UNIVERSE_SIZES = [ ]
&nbsp;
idx = UNIVERSE_SIZES.size
UNIVERSE_SIZES &lt;&lt; Universe.find.size
&nbsp;
eval &lt;&lt;-CODE
def size_of_universe
 UNIVERSE_SIZES[#{idx}]
end
CODE</pre>
<p>This seems silly if there is just a single value in <code>UNIVERSE_SIZES</code>, but the expectation here is that you might be generating many methods with values that need to be memoized. In the example above, we're storing values in an Array that is in a constant, which will keep each value alive from a GC standpoint. This avoids the bugs that <code>#_id2ref</code> has.</p>
<p>So hopefully if you need to memoize, you won't use <code>_id2ref</code>. There are a number of alternatives, and most of them are better than worrying about the bugs that <code>#_id2ref</code> can easily introduce.</p>
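<p>Another way to keep the value alive, offered here only as a sketch and not as the approach used above, is to let a closure hold it. <code>define_method</code> takes a block, and the block keeps a reference to any local variable it captures. <code>FakeUniverse</code> and <code>Cosmos</code> are stand-ins invented so the example runs on its own.</p>

```ruby
# Stand-in for Universe.find.size so the example is self-contained.
module FakeUniverse
  def self.size
    93_000_000_000
  end
end

class Cosmos
  size = FakeUniverse.size

  # The block closes over `size`, so the GC keeps the value alive for as
  # long as the method exists. No object ids and no eval are involved.
  define_method(:size_of_universe) { size }
end

p Cosmos.new.size_of_universe
```

<p>The trade-off is that methods defined from a block are typically a little slower to call than methods defined with <code>def</code>, which is one reason generated code sometimes prefers the constant-table approach.</p>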
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/memoization-and-id2ref/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Iteration Shouldn&#8217;t Spin Your Wheels!</title>
		<link>http://www.engineyard.com/blog/2010/iteration-shouldnt-spin-your-wheels/</link>
		<comments>http://www.engineyard.com/blog/2010/iteration-shouldnt-spin-your-wheels/#comments</comments>
		<pubDate>Wed, 27 Jan 2010 18:00:02 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Newsletter]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=3205</guid>
		<description><![CDATA[<p><em>This article was originally included in the September issue of the Engine Yard Newsletter. To read more posts like this one, subscribe to the </em><a href="http://www.engineyard.com/newsletter"><em>Engine Yard Newsletter</em></a><em>.</em></p>
<p><em>In this series, Evan Phoenix, Rubinius creator and Ruby expert, presents tips and tricks to help you improve your knowledge of Ruby.</em></p>
<hr /><p>Ruby is a rich language that believes there should be more than one way to express yourself, and the many ways of counting and iterating are no exception.</p>
<p>Most Ruby programmers are familiar with the most common one, <code>Integer#times</code>:</p>
<pre lang="ruby">  100.times { |i| p i }</pre>
<p><code>Integer#times</code> counts from 0 up to 99, yielding the current number to the block. This is a simple, expressive way to execute some code a number of times.</p>
<p>But there are cases where you want to start counting at a number other than 0. No problem, use <code>Integer#upto</code>:</p>
<pre lang="ruby">  10.upto(20) { |i| p i }</pre>
<p>This prints out 10, 11, 12, and so on until it hits 20. It increments by 1, and you'll notice it is inclusive, meaning that in this case we yield 11 items, not 10.</p>
<p>Going up is nice, but sometimes you need to go down, so use <code>#upto</code>'s sister, <code>Integer#downto</code>:</p>
<pre lang="ruby">  20.downto(10) { |i| p i }</pre>
<p>If you need a little more control over your iteration, you can use <code>Range#step</code>:</p>
<pre lang="ruby">  (10..20).step(2) { |i| p i }</pre>
<p>This will print 10, 12, 14, 16, 18, 20.</p>
<p>Now, in this case, we've introduced a Range, which most Ruby programmers are familiar with. It is basically an object that expresses a beginning and an end (in this case, 10 and 20). Range has another trick up its sleeve:</p>
<pre lang="ruby">  (10...20).step(2) { |i| p i }</pre>
<p>You'll notice the 3 dots instead of 2. This indicates that this range is exclusive of the end, not inclusive. So 20 is the terminator, but is not in the set of valid values itself.</p>
<p>Range also supports #each:</p>
<pre lang="ruby">  (10..20).each { |i| p i }</pre>
<p>This works exactly the same as Integer#upto. I personally prefer Integer#upto, because I feel it expresses the operation better.</p>
<p>Another domain is counting on a collection. Before 1.8.7 and 1.9, there was pretty much only one method to help you with doing that: <code>Array#each_with_index</code>.</p>
<pre lang="ruby">  [:foo, :bar, :baz].each_with_index { |sym, index| p [sym, index] }</pre>
<p>This prints out [:foo, 0], [:bar, 1], and [:baz, 2].</p>
<p>This is nice, but it's pretty limiting because the only place you've got that index is with simple iteration. Say you wanted to map the Array and take the position into account —  you'd have to do:</p>
<pre lang="ruby">  ary = [1, 3, 5]
  i = 0
  ary.map { |element| x = element * i; i += 1; x }</pre>
<p>It's kind of messy to just take the position into account. So with 1.8.7 and 1.9, Enumerator support was baked into most methods, which makes this much simpler!</p>
<pre lang="ruby">  ary = [1,3,5]
  ary.map.with_index { |element, index| element * index }</pre>
<p>For those that haven't seen Enumerators yet, you may be saying "Hey! Where did the block to map go?" Well, there isn't one. <code>Array#map</code>, when passed no block, returns an Enumerator object. This object, when you call #each, calls the original method on the original object and passes the block along. To begin with, this provides external iteration, but it also gives Ruby a place to add iteration alteration methods, such as <code>Enumerator#with_index</code>. Now you never need to use a while loop again!</p>
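<p>If you'd like to poke at this yourself, here is a small sketch you can paste into irb on Ruby 1.9 or later. It only uses core methods (<code>map</code>, <code>each</code>, <code>with_index</code>, and <code>next</code>); the variable names are arbitrary.</p>

```ruby
ary = [1, 3, 5]

# Without a block, map hands back an Enumerator instead of an Array.
enum = ary.map
p enum.class

# with_index feeds each element and its position to the block; the
# block's results become map's return value.
scaled = enum.with_index { |element, index| element * index }
p scaled

# External iteration: pull values one at a time with next.
e = ary.each
first = e.next
second = e.next
p [first, second]
```

<p>The last part is the external iteration mentioned above: the caller decides when to ask for the next element, rather than handing a block to the collection.</p>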
<p>See you next time!</p>
<h2>Update</h2>
<p>1.8.7 is a bit inconsistent about when Enumerators are returned. You can instead do:</p>
<pre lang="ruby">ary.dup.map!.with_index { |e,i| ... } </pre>
<p>Or, as a commenter pointed out:</p>
<pre lang="ruby">ary.to_enum(:map).with_index { |e,i| ... } </pre>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/iteration-shouldnt-spin-your-wheels/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Ruby Tips: Numeric Classes</title>
		<link>http://www.engineyard.com/blog/2010/ruby-tips-numeric-classes/</link>
		<comments>http://www.engineyard.com/blog/2010/ruby-tips-numeric-classes/#comments</comments>
		<pubDate>Thu, 07 Jan 2010 18:30:18 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Fixnum]]></category>
		<category><![CDATA[Newsletter]]></category>
		<category><![CDATA[Numeric Classes]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=3142</guid>
		<description><![CDATA[<p><em>This article was originally included in the September issue of the Engine Yard Newsletter. To read more posts like this one, subscribe to the </em><a href="http://www.engineyard.com/newsletter"><em>Engine Yard Newsletter</em></a><em>.</em></p>
<p><em>In this series, Evan Phoenix, Rubinius creator and Ruby expert, presents tips and tricks to help you improve your knowledge of Ruby.</em></p>
<hr /><p>Ruby's numeric classes form a full numeric tower, providing many kinds of numbers and numerical representations. At its core, the tower contains a very elegant pattern that allows new classes to participate easily.</p>
<p>Let's say we want to add a new numeric class called Money, which contains the number of dollars and cents:</p>
<pre lang="ruby">class Money
  def initialize(dollars, cents=0)
    @dollars = dollars
    @cents = cents
  end

  attr_reader :dollars, :cents
end</pre>
<p>Now, let's say we'd like to have Money be able to interact with all integers nicely, with an integer representing a number of whole dollars. It's not too hard to add a + method to do that:</p>
<pre lang="ruby">class Money
  def +(other)
    case other
    when Money
      Money.new(@dollars + other.dollars, @cents + other.cents)
    when Integer
      Money.new(@dollars + other.to_i, @cents)
    else
      raise ArgumentError, "Unknown type!"
    end
  end
end</pre>
<p>but we'd also like to be able to do:</p>
<pre lang="ruby">allowance = Money.new(5)
more = 1 + allowance</pre>
<p>Try this straight away and you'll receive an error saying that Money can't be coerced into Fixnum. This gives you a hint as to how to allow Money to interact with Fixnum better. We need to teach Money how to interact with the rest of the numeric tower, which we do with just one method:</p>
<pre lang="ruby">class Money
  def coerce(other)
    [self, Money.new(other.to_i)]
  end
end</pre>
<p>Now we can do <code>more = 1 + allowance</code> and we see that we get back a <code>Money</code> object holding 6 dollars.</p>
<p>Wonderful! Fixnum#+, seeing the argument isn't a Fixnum, uses the coerce protocol. This is a simple double dispatch protocol, which gives the argument the ability to change the values being operated on, then call the original method again. We simply return an array of the new values to use. Here we convert the argument to a Money object, and then + is called again on the first element in the Array, passing the second as the argument.</p>
<p>Let's say we'd like "1 + allowance" to return 6 instead. Easy!</p>
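<p>Putting the pieces from this post together gives a version you can run in one go. Nothing new is introduced here apart from the final <code>p</code> line used to inspect the result. One note for readers on current Rubies: Fixnum has been folded into Integer, so it is <code>Integer#+</code> that starts the coerce protocol there.</p>

```ruby
class Money
  attr_reader :dollars, :cents

  def initialize(dollars, cents = 0)
    @dollars = dollars
    @cents = cents
  end

  def +(other)
    case other
    when Money   then Money.new(@dollars + other.dollars, @cents + other.cents)
    when Integer then Money.new(@dollars + other, @cents)
    else raise ArgumentError, "Unknown type!"
    end
  end

  # Called by the integer's + when its argument is not a built-in numeric.
  # The integer then evaluates first + second on the returned pair.
  def coerce(other)
    [self, Money.new(other.to_i)]
  end
end

allowance = Money.new(5)
more = 1 + allowance
p [more.class, more.dollars, more.cents]
```

<p>The call sequence is <code>1 + allowance</code>, then <code>allowance.coerce(1)</code>, then <code>allowance + Money.new(1)</code>, which lands in the <code>when Money</code> branch.</p>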
<pre lang="ruby">class Money
  def coerce(other)
    [@dollars, other.to_i]
  end
end</pre>
<p>Now, for your homework, make Money also work with Floats! See you next time...</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/ruby-tips-numeric-classes/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Compiling Ruby: From Text to Bytecode</title>
		<link>http://www.engineyard.com/blog/2009/the-anatomy-of-a-ruby-jit-compile/</link>
		<comments>http://www.engineyard.com/blog/2009/the-anatomy-of-a-ruby-jit-compile/#comments</comments>
		<pubDate>Thu, 27 Aug 2009 17:04:49 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Compiler]]></category>
		<category><![CDATA[ParseTree]]></category>
		<category><![CDATA[Rubinius]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=2059</guid>
		<description><![CDATA[<p>The business of executing Ruby code is booming; with so many Ruby environments in development, there are just as many different ways of actually <em>running</em> code. We've been hard at work in the world of Rubinius, and over the last few months we've been focused on a new way of executing Ruby code by converting it to machine code at run-time—a Just-In-Time compiler.</p>
<p>In this post, I'll walk you through how Ruby code begins as text and is converted into bytecode. In a follow up post, I'll go from bytecode to the machine code itself.</p>
<p>So let's get started: in the beginning, there was Ruby code (and the compiler saw it was good)!</p>
<pre>def foo(a,b)
  a + b
end</pre>
<p>This is a simple dot-rb file written to disk, the currency of Ruby developers. So first things first. To get this code into the environment, the code is parsed into a tree of Node objects. In Rubinius, this is typically done using the parser we've imported from 1.8.6. This tree of Node structures is a C data structure, and it's what 1.8 uses for direct execution, simply reading the tree and running code based on the structure of this tree.</p>
<p>The issue with directly executing a tree structure is that it's fairly inefficient. There's no simple way to move around the tree, and you get a lot of redundant operations performed at the CPU level to execute even the most trivial of Ruby code.</p>
<p>So what we want to do instead is lower this tree to bytecode, which is an easier representation to execute efficiently. We start by translating the tree of Nodes in C into S-Expressions (sexps) using a variant of Ryan Davis's <a href="http://rubyforge.org/projects/parsetree/">ParseTree</a>. I'd be remiss if I didn't mention that sexps can also be produced by using Ryan's ruby_parser project, which is what Rubinius uses for bootstrapping itself within MRI.</p>
<p>The neat part about this transform is that we've now got a representation of the program as normal Ruby objects. This allows us to write our bytecode compiler in Ruby itself, which is the phase that takes the sexp and transforms it into an internal tree of AST instances.</p>
<p>An astute reader will notice that not two paragraphs ago, we already <em>had</em> a tree of Nodes representation. This is an inefficiency in the current compiler, and one that Brian Ford is actively working to eliminate. The upside of the transform is that while it was a tree of Node structures in C before, we've now got a tree of normal Ruby objects, which are much easier to operate on.</p>
<p>The Ruby AST is decorated with a lot of information concerning node position, effects on other nodes, etc. This information is used in the next phase, which uses the visitor pattern to flow downward from the root of the AST into its leaves. This is the bytecode generation phase.</p>
<p>Each node is responsible for generating bytecode for its representation, which it does by calling methods on a generator object. This has many advantages. It's trivial to pass down a special test generator, which keeps track of all of the calls it receives for later comparison with the results of another test generator. This is how we test the compiler's bytecode generation.</p>
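<p>The test-generator idea can be sketched with <code>method_missing</code>. The names below (<code>RecordingGenerator</code>, <code>LocalRead</code>, <code>Plus</code>) are hypothetical and much simpler than the real compiler's node and generator classes; the point is only that a generator which records calls lets you compare what two code paths emitted.</p>

```ruby
# Hypothetical sketch; these are not the real compiler classes.
class RecordingGenerator
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Record every emit call as [name, *operands] instead of producing bytecode.
  def method_missing(name, *operands)
    @calls.push([name, *operands])
    self
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

LocalRead = Struct.new(:slot) do
  def bytecode(g)
    g.push_local(slot)
  end
end

Plus = Struct.new(:left, :right) do
  def bytecode(g)
    left.bytecode(g)
    right.bytecode(g)
    g.meta_send_op_plus(:+)
  end
end

g = RecordingGenerator.new
Plus.new(LocalRead.new(0), LocalRead.new(1)).bytecode(g)
p g.calls
```

<p>A test can then build the expected calls by hand on a second <code>RecordingGenerator</code> and compare the two <code>calls</code> arrays.</p>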
<p>At this stage, a bytecode generation object has called all these methods, and that object has retained the information in a simple list. The list contains the name of the bytecode to emit and its operands (the arguments to a bytecode). This information contains an additional level of indirection to make generation easier, such as the use of goto targets as normal Label objects.</p>
<p>Because an InstructionSequence must be a list of Fixnums, these Label objects are resolved into bytecode locations, a simple offset from the beginning of the method. An InstructionSequence represents the full bytecode for a method as a Tuple of Fixnums.</p>
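<p>Label resolution is a classic two-pass job, and a toy version fits in a few lines. Treat this as a sketch under assumptions: the opcode numbers are placeholders rather than the real Rubinius instruction set, and <code>Label</code> and <code>resolve_labels</code> are invented names.</p>

```ruby
# Hypothetical sketch; the opcode numbers are placeholders.
OPCODES = { push_true: 10, goto_if_false: 20, push_nil: 30, ret: 40 }

Label = Struct.new(:name)

# stream holds [opcode, *operands] entries, with bare Label objects
# marking the positions that gotos may target.
def resolve_labels(stream)
  offsets = {}
  position = 0
  # Pass 1: note the offset at which each label lands.
  stream.each do |entry|
    if entry.is_a?(Label)
      offsets[entry] = position
    else
      position += entry.size   # one slot for the opcode, one per operand
    end
  end
  # Pass 2: emit integers, swapping each label operand for its offset.
  instructions = stream.reject { |entry| entry.is_a?(Label) }
  instructions.flat_map do |opcode, *operands|
    resolved = operands.map { |op| op.is_a?(Label) ? offsets.fetch(op) : op }
    [OPCODES.fetch(opcode)] + resolved
  end
end

done = Label.new(:done)
stream = [[:push_true], [:goto_if_false, done], [:push_nil], done, [:ret]]
p resolve_labels(stream)
```

<p>The <code>done</code> label turns into the offset 4, which is the position of the <code>ret</code> opcode in the flat list.</p>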
<p>And so after all those transforms and operations, we've turned</p>
<pre>def foo(a,b)
  a + b
end</pre>
<p>into</p>
<pre>#&lt;Rubinius::Tuple: 19, 0, 19, 1, 77, 0, 12&gt;</pre>
<p>or, rendered symbolically:</p>
<pre>[[:push_local, 0], [:push_local, 1], [:meta_send_op_plus, 0], [:ret]]</pre>
<p>It should be noted that because Rubinius uses normal Ruby objects for as much as possible, it's trivial to see this information for yourself. Here's my irb session to extract the previous values:</p>
<pre>&gt;&gt; cm = def foo(a,b); a + b; end
=&gt; #&lt;Rubinius::CompiledMethod foo file=(irb)&gt;
&gt;&gt; i = cm.iseq
=&gt; #&lt;InstructionSequence:0x142&gt;
&gt;&gt; i.decode
=&gt; [[:push_local, 0], [:push_local, 1], [:meta_send_op_plus, 0], [:ret]]
&gt;&gt; i.opcodes
=&gt; #&lt;Rubinius::Tuple: 19, 0, 19, 1, 77, 0, 12&gt;</pre>
<p>You can also get a more assembly-oriented view:</p>
<pre>&gt;&gt; cm = def foo(a,b); a + b; end
=&gt; #&lt;Rubinius::CompiledMethod foo file=(irb)&gt;
&gt;&gt; puts cm.decode
0000:  push_local                 0    # a
0002:  push_local                 1    # b
0004:  meta_send_op_plus          :+
0006:  ret
=&gt; nil</pre>
<p>So we've now got this list of Fixnums, but we're missing one part of the equation. A list of Fixnums is data poor—you can't represent much using that. So associated with each InstructionSequence within the CompiledMethod is a Tuple called the literals Tuple. It holds objects that can be directly referenced by the bytecode.</p>
<p>The simplest example of this is the <em>:meta_send_op_plus</em> opcode we see above. That 0 on the end says "the name of the method to send is located at position 0 in the literals Tuple." Let's take a look at the literals Tuple for this method:</p>
<pre>&gt;&gt; cm.literals
=&gt; #&lt;Rubinius::Tuple: :+&gt;</pre>
<p>There we go. One entry for the symbol <em>:+</em>. Let's look at one more example:</p>
<pre>&gt;&gt; cm = def foo(a); p a, "hello evan"; end
=&gt; #&lt;Rubinius::CompiledMethod foo file=(irb)&gt;
&gt;&gt; cm.iseq.decode
=&gt; [[:push_self], [:push_local, 0], [:push_literal, 0], [:string_dup],
[:allow_private], [:send_stack, 1, 2], [:ret]]
&gt;&gt; cm.literals
=&gt; #&lt;Rubinius::Tuple: "hello evan", :p&gt;
&gt;&gt; puts cm.decode
0000:  push_self
0001:  push_local                 0    # a
0003:  push_literal               "hello evan"
0005:  string_dup
0006:  allow_private
0007:  send_stack                 :p, 2
0010:  ret
=&gt; nil</pre>
<p>In this method, we see two literals being referenced. The string <em>"hello evan"</em> is pushed onto the stack directly by the <em>:push_literal</em> instruction, and the <em>:send_stack</em> instruction indicates that position 1 contains the method to send. In this case, it's <em>:p</em>.</p>
<p>So there you have it: we've taken the text of a program and turned it into a set of Ruby objects that the Rubinius VM can execute. In a later post, I'll cover how the VM interprets the bytecode to run the method, and we'll cover compiling this bytecode down to machine code that can be directly executed by the CPU.</p>
<p>We're hard at work, but always looking for new contributors and community members. If you like what you see, find us on IRC or leave a comment!</p>
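<p>To see how the instruction stream and the literals fit together, here is a toy stack interpreter for the first example's decoded bytecode. It is a sketch only: <code>run</code> is an invented helper, it handles just three opcodes, and it says nothing about how the real VM is implemented.</p>

```ruby
# Hypothetical sketch of a stack interpreter for three opcodes.
def run(decoded, literals, locals)
  stack = []
  decoded.each do |opcode, operand|
    case opcode
    when :push_local
      stack.push(locals[operand])
    when :meta_send_op_plus
      # The operand is an index into literals, where the method name lives.
      arg = stack.pop
      receiver = stack.pop
      stack.push(receiver.send(literals[operand], arg))
    when :ret
      return stack.pop
    else
      raise ArgumentError, "unhandled opcode #{opcode}"
    end
  end
end

decoded  = [[:push_local, 0], [:push_local, 1], [:meta_send_op_plus, 0], [:ret]]
literals = [:+]
p run(decoded, literals, [2, 3])
```

<p>Swapping the literal for another binary operator changes what the same instruction stream computes, which shows that the method name really does live outside the list of numbers.</p>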
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/the-anatomy-of-a-ruby-jit-compile/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Rubinius CPP Work Branch Change</title>
		<link>http://www.engineyard.com/blog/2008/rubinius-cpp-work-branch-change/</link>
		<comments>http://www.engineyard.com/blog/2008/rubinius-cpp-work-branch-change/#comments</comments>
		<pubDate>Wed, 29 Oct 2008 00:11:39 +0000</pubDate>
		<dc:creator>Evan Phoenix</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Rubinius]]></category>

		<guid isPermaLink="false">http://staging-wp.blog.engineyard.com/?p=108</guid>
		<description><![CDATA[<p>I'm super happy to announce that we've gotten the C++ branch stable enough that we're making it the default branch. This means that those of you with existing clones are going to likely do a little work to get them sane though.</p>
<p>Here is what was done:</p>
<p>* The old master branch was renamed shotgun.<br />
* The cpp branch was copied to the name master.<br />
* The cpp branch was then deleted.</p>
<p>Anyone that has up to now been working on the cpp branch has a couple of options.</p>
<p># Delete your clone and re-clone. This is the easiest. The default checkout will be code in the cpp branch and you're off and going.<br />
# Fix up your current repo. I did this by doing the following commands:<br />
** <code>git checkout master</code><br />
** <code>git reset --hard origin/master</code><br />
** <code>git branch -d cpp</code><br />
# This will get your local master branch repointed and properly checked out. In addition, the old cpp local branch can be deleted.</p>
<p>Hopefully no one experiences much pain due to this change. It's been a long time coming and I'm really excited.</p>
<p>If you do run into problems, post a comment or stop on by IRC and we'll work it out for you.</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2008/rubinius-cpp-work-branch-change/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

