<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Engine Yard Blog &#187; Kirk Haines</title>
	<atom:link href="http://www.engineyard.com/blog/author/kirkhaines/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.engineyard.com/blog</link>
	<description></description>
	<lastBuildDate>Tue, 07 Feb 2012 19:36:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>What Are Events, Why Might You Care, and How Can EventMachine Help?</title>
		<link>http://www.engineyard.com/blog/2011/what-are-events-why-might-you-care-and-how-can-eventmachine-help/</link>
		<comments>http://www.engineyard.com/blog/2011/what-are-events-why-might-you-care-and-how-can-eventmachine-help/#comments</comments>
		<pubDate>Thu, 08 Dec 2011 20:13:30 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=9468</guid>
		<description><![CDATA[<p>You have likely heard of event based/driven programming, which you may also see referred to as "evented" programming. It has been around and in use for a long time, but it has seen a growing swell of interest in recent years. Perl has had <a href="http://poe.perl.org/">POE</a> for years. Likewise, Python has had <a href="http://twistedmatrix.com/trac/">Twisted</a> for quite a few years. Graphics toolkits such as GTK use event loops to respond to user interface events. JavaScript has been getting a lot of attention lately because of the <a href="http://nodejs.org">Node.js</a> event driven framework. Ruby has had the <a href="https://github.com/eventmachine/eventmachine">EventMachine</a> library since 2006, with other event driven programming libraries such as <a href="http://rubygems.org/gems/rev">Rev</a> and <a href="http://coolio.github.com/">Coolio</a> being released since then.</p>
<p>Despite this relative wealth of libraries, and growing interest in the event driven programming paradigm, this realm of software design is still shrouded in mystery and unknowns for many developers. People who are new to it tend to misunderstand it, often assuming magic that does not exist, or simply misunderstanding what "event driven" really means. Even developers with experience writing event based software using one of the previously mentioned libraries are often fuzzy on the details. They may assume that their library of choice represents the "real" world of event based programming. Or simply because their use cases only take them to a few familiar neighborhoods of evented programming, they may retain some of that newbie fog for other parts.</p>
<p>This series of articles is going to attempt to rectify some of those situations. The precise course of the articles will be determined as they progress, but for today, let's start at the beginning. Event Based/Driven Programming. It's hot. All the cool kids are doing it. It's got EVENTS! Raise your hand if you truly know what that means, or why you should actually care.</p>
<blockquote>
<div style="font-weight: bold; font-size: 1.2em;">Event</div>
<div style="font-weight: bold;">–noun</div>
<ol>
<li>something that happens or is regarded as happening; an occurrence, especially one of some importance.</li>
<li>the outcome, issue, or result of anything: The venture had no successful event.</li>
<li>something that occurs in a certain place during a particular interval of time.</li>
</ol>
</blockquote>
<p>That quote was courtesy of <a href="http://dictionary.reference.com/browse/event">http://dictionary.reference.com/browse/event</a></p>
<p>Applying that definition to the world of software is a pretty direct thing, as it turns out. Event based programming is nothing more than letting the flow of the program be determined by some set of events. Hardware interrupts are an example of a ubiquitous source of events. On Unix systems, signals and the signal handlers that deal with them are a type of event based programming, as well. A typical pattern for windowing systems is to operate with an event based model; the software can't know when one is going to click on a menu item, or a dialog button, so the software instead runs in a loop, waiting for some event to happen. When an event occurs, the software calls into another piece of code to handle that event.</p>
<p>At its simplest, that general pattern of the system being divided into a dyad consisting of one part that detects or selects events, with a second part that handles them, is what event based programming is actually about. If you have been programming for any length of time, the odds are pretty good that at least in some small ways, you have engaged in event driven programming even if you didn't realize it.</p>
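<p>To make that two-part split concrete, here is a minimal sketch in plain Ruby. It is not taken from any of the libraries mentioned above; the <code>handlers</code> hash, the <code>pending</code> list, and the event names are all made up for illustration, and a real program would receive its events from the operating system or a toolkit rather than from a hard-coded list.</p>

```ruby
# Part one: the handlers, keyed by event name.
handlers = {
  greet: lambda { |name| "hello, #{name}" },
  quit:  lambda { |_| "goodbye" }
}

# A stand-in event source. A real program would get its events from
# the operating system, a GUI toolkit, or the network instead.
pending = [[:greet, "world"], [:unknown, nil], [:quit, nil]]

# Part two: the loop that picks up each event and dispatches it.
results = []
until pending.empty?
  name, payload = pending.shift
  handler = handlers[name]
  results.push(handler ? handler.call(payload) : "no handler for #{name}")
end

puts results
```

<p>The loop knows nothing about what the handlers do, and the handlers know nothing about where events come from. That separation is the whole idea.</p>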
<p>If you go back and look at any of those libraries that I mentioned at the top of the article, you will notice a trend. Each of those libraries is an event driven programming library, but there is a fair amount of variation across them. This is because <strong>event driven</strong> is a vague label, encompassing numerous patterns and feature sets. One of the most common of these patterns is the Reactor pattern.</p>
<p>The Reactor pattern describes a system that handles asynchronous events, but that does so with synchronous event callbacks. There are several Ruby implementations of this pattern, including the most common library for event based programming in Ruby today, EventMachine. A reactor is good at handling many concurrent streams of incoming or outgoing IO, but because the callbacks are invoked synchronously, callback handling can severely impact the concurrency, or apparent concurrency, of a reactor implementation. Nonetheless, reactors are easy to implement, and with a little care, can be used to drive high performance IO on a single threaded, single process application.</p>
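<p>The cost of synchronous callbacks is easy to see in a small sketch. The two lambdas below stand in for callbacks that a reactor found ready at the same moment; the names and the 0.2 second sleep are assumptions made purely for illustration, not part of any library.</p>

```ruby
started = Time.now
log = []

# Two callbacks that became ready at the same time. A reactor runs them
# one after the other, in the same thread.
callbacks = [
  lambda { sleep 0.2; log.push([:slow, Time.now - started]) },
  lambda { log.push([:fast, Time.now - started]) }
]

callbacks.each { |callback| callback.call }

log.each do |name, elapsed|
  puts format("%-4s finished after %.2fs", name, elapsed)
end
```

<p>The fast callback does almost no work, yet it cannot finish until the slow one has returned. Long-running work inside a callback stalls everything else the reactor is watching.</p>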
<p>As I mentioned, EventMachine is a Reactor implementation. And it is perfectly possible to install EventMachine, look at a few documents and a few examples, and start writing your own event based software that uses it without really having a good idea of how the machine is running under the hood. But there is value in understanding what a Reactor actually is, so that you better understand what a library like EventMachine is doing for you. Ruby gives us a lot of tools that make it reasonably easy to write a very simple pure Ruby reactor implementation, so let's do that. It should make this topic much clearer when you see how simple it actually is. All of the code shown below can also be found on GitHub at <a href="https://github.com/engineyard/khaines_blog_code_examples/tree/master/what_are_events">https://github.com/engineyard/khaines_blog_code_examples/tree/master/what_are_events</a>. All of these pure Ruby examples should work on every Ruby implementation. I have tried them on MRI 1.8.7_p352 and 1.9.3_p0, as well as JRuby 1.6.5 and Rubinius 2.0.0dev, though I did not extensively test on anything other than MRI 1.9.3, so quirks may exist.<span id="more-9468"></span></p>
<p>For our pure ruby reactor, we want only a few features.</p>
<ul>
<li>Unlike EventMachine, which also includes substantial support for managing the creation of network connections, servers, and other more sophisticated activities, our reactor is going to limit itself to only being a tool for handling events. Therefore, all that it needs is a way to attach and detach IO objects to/from the reactor. We'll use the built-in Ruby mechanisms for everything else.</li>
<li>Ruby has the select() call available to it on all platforms, so our reactor will be designed to use it. The select() call returns readable handles, writeable handles, and errors from a set of filehandles to operate on, so those three events (<code>:read, :write, :error</code>) will likewise be all that our reactor handles.</li>
<li>Timers are very useful, and are pretty easy to implement in a reactor, so it would be nice to have a timer implementation.</li>
</ul>
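<p>If you have not used <code>select()</code> from Ruby before, the following sketch shows what <code>IO.select</code> hands back. It uses a pipe so that it needs no network; the variable names and the 0.01 second timeout are arbitrary choices for illustration.</p>

```ruby
reader, writer = IO.pipe

# Nothing has been written yet: only the write end is ready.
before = IO.select([reader], [writer], [reader, writer], 0.01)

writer.write "ping"

# With data waiting in the pipe, the read end is now ready as well.
after = IO.select([reader], [writer], [reader, writer], 0.01)

[before, after].each do |readable, writable, errored|
  puts "readable: #{readable.length}, writable: #{writable.length}, errors: #{errored.length}"
end

# When nothing becomes ready before the timeout, select returns nil.
idle_reader, idle_writer = IO.pipe
idle = IO.select([idle_reader], nil, nil, 0.01)
puts "idle result: #{idle.inspect}"
```

<p>The three arrays that come back line up with the three events listed above, and the <code>nil</code> return on timeout is something any caller has to allow for.</p>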
<p>Even though it is not strictly necessary for a reactor implementation, I will start our implementation with the timer functionality. Timers are events which are time based. Their callback is triggered at some point after a given time threshold is reached. The difficulty with timers is in choosing a mechanism for storing them such that the ones which need to be triggered can be easily and efficiently detected. Time, however, is a sortable attribute, and there are some data structures that are great for storing sortable data where that sorted data order is important. There are tree based data structures which are very efficient at maintaining this sort of data. Ruby doesn't have one of those as a native data type, so for this example, I will just fake it. If I were writing a serious implementation, I would have to do more work to provide an efficient data structure for timer data. The following data structure is built on top of a hash, makes no claims to be efficient, and provides the bare minimum API for our reactor to have the tool that it needs to implement timers.</p>
<pre escaped="true">class SimpleReactor

  class TimerMap &lt; Hash
    def []=(k,v)
      super
      @sorted_keys = keys.sort
      v
    end

    def delete k
      r = super
      @sorted_keys = keys.sort
      r
    end

    def next_time
      @sorted_keys.first
    end

    def shift
      if @sorted_keys.empty?
        nil
      else
        first_key = @sorted_keys.shift
        val = self.delete first_key
        [first_key, val]
      end
    end

    def add_timer time, *args, &amp;block
      # Store every key as a Time so that keys sort and compare consistently.
      time = case time
      when Time
        time
      else
        Time.now + time.to_i
      end

      self[time] = [block, args] if block
    end

    def call_next_timer
      _, v = self.shift
      block, args = v
      block.call(*args)
    end
  end
end</pre>
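<p>For comparison, here is one way a slightly less wasteful timer store might look. This is only a sketch and is not part of the reactor built in this article: the <code>SortedTimerList</code> class and its method names are hypothetical, and it relies on <code>Array#bsearch_index</code>, which needs Ruby 2.3 or newer. It keeps the entries in sorted order as they are inserted instead of re-sorting every key on each change, and it allows two timers to share the same time. Plain integers and strings stand in for times and callbacks to keep the example short.</p>

```ruby
# Hypothetical alternative to TimerMap: [time, callback] pairs kept in an
# array that is always sorted by time.
class SortedTimerList
  def initialize
    @entries = []
  end

  def add(time, callback)
    # Binary search for the first entry that fires later than the new one.
    index = @entries.bsearch_index { |entry| entry.first > time }
    @entries.insert(index || @entries.length, [time, callback])
    self
  end

  def next_time
    @entries.empty? ? nil : @entries.first.first
  end

  # Remove and return the callbacks of every timer due at or before now.
  def pop_due(now)
    due = @entries.take_while { |entry| now >= entry.first }
    @entries = @entries.drop(due.length)
    due.map { |entry| entry.last }
  end
end

timers = SortedTimerList.new
timers.add(30, "c").add(10, "a").add(20, "b").add(10, "a2")
first_time = timers.next_time
due = timers.pop_due(20)
remaining = timers.next_time
puts "first: #{first_time}, due: #{due.inspect}, next: #{remaining}"
```

<p>Inserting into the middle of an array still means shifting elements, so a tree or heap would do better for very large numbers of timers.</p>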
<p>Ok. Now let's start writing a reactor! Since we started with timers, we'll just write enough to make timers work. So, first, an #initialize method, and a method to add timers.</p>
<pre escaped="true">class SimpleReactor

  def initialize
    @running = false

    @timers = TimerMap.new
    @block_buffer = []
  end

  def add_timer time, *args, &amp;block
    # An absolute Time becomes a number of seconds from now.
    time = time - Time.now if Time === time
    @timers.add_timer time, *args, &amp;block
  end</pre>
<p>There's nothing interesting in the initialization. It sets the <code>@running</code> instance variable to false, which will be used in an upcoming bit of code, and creates an empty TimerMap and an empty buffer for pending blocks. The method to add a timer also does nothing special; it just passes everything into a method of the same name in the TimerMap. The next part that is needed is the skeleton of our reactor. Here is what it looks like:</p>
<pre escaped="true">  def next_tick &amp;block
    @block_buffer &lt;&lt; block
  end

  def tick
    handle_pending_blocks
    handle_events
    handle_timers
  end

  def run
    @running = true

    yield self if block_given?

    tick while @running
  end

  def stop
    @running = false
  end

  def handle_pending_blocks
    @block_buffer.length.times { @block_buffer.shift.call }
  end

  def handle_events
  end

  def handle_timers
    now = Time.now
    while !@timers.empty? &amp;&amp; @timers.next_time &lt; now
      @timers.call_next_timer
    end
  end

  def empty?
    @timers.empty? &amp;&amp; @block_buffer.empty?
  end
end</pre>
<p>There you have it. A reactor skeleton, albeit one that only supports timers and next_tick right now. Here's an example that uses it:</p>
<pre escaped="true">require './simplereactor'

puts &lt;&lt;ETXT
This demo will add a sequence of numbers to a sum, via a timer, once a second,
for four seconds, with the fifth number immediately following the fourth.
1+2+3+4+5 == 15. Let's see if that's the answer that we get.
ETXT

n = 0
reactor = SimpleReactor.new

reactor.add_timer(1) do
  puts "one"
  n += 1
end

reactor.add_timer(2) do
  puts "two"

  reactor.add_timer(1, n + 2) do |sum|
    puts "three"

    reactor.add_timer(1, sum + 3) do |sum|
      puts "four"
      n = sum + 4

      reactor.next_tick do
        puts "five"
        n += 5
        puts "n is #{n}\nThe reactor should stop after this."
      end
    end
  end
end

reactor.tick until reactor.empty?</pre>
<p>There's no real magic here. The code shows that one can create timers, and can create new timers within the callback code of existing timers, leveraging Ruby's block syntax. If you run this, you will get output like this:</p>
<pre escaped="true">This demo will add a sequence of numbers to a sum, via a timer, once a second,
for four seconds, with the fifth number immediately following the fourth.
1+2+3+4+5 == 15. Let's see if that's the answer that we get.
one
two
three
four
five
n is 15
The reactor should stop after this.</pre>
<p>Take note of the last line in the example code -- <code>reactor.tick until reactor.empty?</code>. The reactor will not do anything until that line runs. That line sits in a loop, ticking our reactor repeatedly until there's nothing left for it to do. At that point, #empty? returns true, the loop terminates, and the program terminates.</p>
<p>The next step in this adventure is to add enough code to our reactor to do something useful with IO objects, as well. We need to be able to attach them to the reactor, detach them from the reactor, and put enough intelligence into the reactor to find events to respond to, and trigger the callbacks for those events.</p>
<p>First add a constant and some accessors, and change the #initialize method:</p>
<pre escaped="true">  Events = [:read, :write, :error].freeze
  attr_reader :ios

  def self.run &amp;block
    reactor = self.new

    reactor.run &amp;block
  end

  def initialize
    @running = false
    @ios = Hash.new do |h,k|
      h[k] = {
        :events =&gt; [],
        :callbacks =&gt; {},
        :args =&gt; [] }
    end

    @timers = TimerMap.new
    @block_buffer = []
  end</pre>
<p>This adds a hash for holding our IO objects. It has an initializer to hold an array of events that the IO object will respond to, a hash of callbacks (potentially one per event type), and some set of args which can be passed to an invoked callback. A hash is also created to hold unhandled events, should they occur.</p>
<p>Next, let's add some methods to attach an IO object to the reactor, set up callbacks, and detach an IO object from the reactor.</p>
<pre escaped="true">  def attach io, *args, &amp;block
    events = Events &amp; args
    args -= events

    @ios[io][:events] |= events

    setup_callback io, events, *args, &amp;block

    self
  end

  def setup_callback io, events, *args, &amp;block
    i = @ios[io]
    events.each {|event| i[:callbacks][event] = block }
    i[:args] = args
    i
  end

  def detach io
    @ios.delete io
  end</pre>
<p>The code takes an IO object, a set of args to pass into the callback, and a block. It adds it to the @ios hash, and sets up the callback for the given events.</p>
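<p>The first two lines of <code>#attach</code> are compact enough to be worth a second look. The sketch below pulls them out into a standalone method so that the splitting of event names from callback arguments can be seen on its own; <code>split_attach_args</code> and <code>EVENT_NAMES</code> are hypothetical names used only here.</p>

```ruby
EVENT_NAMES = [:read, :write, :error].freeze

# Hypothetical helper mirroring the first two lines of #attach.
def split_attach_args(args)
  events = EVENT_NAMES & args   # array intersection: keeps only known events
  [events, args - events]       # whatever is left is passed to the callback
end

events, extra = split_attach_args([:write, "log.txt", :read, 42])
puts "events: #{events.inspect}"
puts "extra:  #{extra.inspect}"
```

<p>Because the intersection takes its order from the constant, the events come back as <code>[:read, :write]</code> no matter what order the caller used, and anything that is not a known event name is treated as an argument for the callback.</p>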
<p>Next, we need to add a few small methods to enable triggering on IO events.</p>
<pre escaped="true">  def handle_events
    unless @ios.empty?
      pending_events.each do |io, events|
        events.each do |event|
          if @ios.has_key? io
            if handler = @ios[io][:callbacks][event]
              handler.call io, *@ios[io][:args]
            end
          end
        end
      end
    end
  end</pre>
<p>The #handle_events method is straightforward. If there are any attached IO objects, iterate through the events, calling the callbacks for each. In the existing code, we should never have unhandled events, but by adding that now, one could take this library and expand it more easily into a larger pure Ruby reactor that handles types of events other than just what <code>select()</code> uses.</p>
<pre escaped="true">  def pending_events
    # Trim our IO set to only include those which are not closed.
    @ios.reject! {|io, v| io.closed? }

    h = find_handles_with_events @ios.keys

    if h
      handles = Events.zip(h).inject({}) {|handles, ev| handles[ev.first] = ev.last; handles}

      events = Hash.new {|h,k| h[k] = []}

      Events.each do |event|
        handles[event].each { |io| events[io] &lt;&lt; event }
      end

      events
    else
      {} # No handles
    end
  end

  def find_handles_with_events keys
    select find_ios(:read), find_ios(:write), keys, 0.01
  end

  def find_ios(event)
    @ios.select { |io, h| h[:events].include? event }.collect { |io, data| io }
  end

  def empty?
    @ios.empty? &amp;&amp; @timers.empty? &amp;&amp; @block_buffer.empty?
  end</pre>
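<p>The reshaping that <code>#pending_events</code> performs, from three arrays of handles to one hash keyed by IO object, can be tried on its own. The following is a sketch of the same idea outside the reactor class; <code>events_by_io</code> and <code>EVENT_NAMES</code> are hypothetical names, and a pipe stands in for real IO.</p>

```ruby
EVENT_NAMES = [:read, :write, :error].freeze

# Hypothetical helper: turn the three arrays returned by IO.select into a
# hash that maps each IO object to the list of events that fired on it.
def events_by_io(select_result)
  by_io = Hash.new { |hash, io| hash[io] = [] }
  return by_io if select_result.nil?   # select timed out; nothing is ready

  EVENT_NAMES.zip(select_result).each do |event, ios|
    ios.each { |io| by_io[io].push(event) }
  end
  by_io
end

reader, writer = IO.pipe
writer.write "x"
ready = events_by_io(IO.select([reader], [writer], [reader, writer], 0.01))

ready.each { |io, events| puts "#{io.inspect}: #{events.inspect}" }
```

<p>Keying the result by IO object is what lets the event handler walk each handle once and call every callback that applies to it.</p>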
<p>This last bit of code just adds the nitty gritty methods that figure out if there are any events that need to be handled. It removes from @ios any handles that are closed, uses <code>select()</code> to find IO events, and then returns a hash of IO objects and the events that have been triggered on them. String all of this code together, and it is all that you need to have a basic working Reactor pattern for event based programming, with timer support. Here's a trivial example that uses pipes, to illustrate how it works.</p>
<pre escaped="true">require './simplereactor'

chunk = "01234567890" * 30

reactor = SimpleReactor.new

reader, writer = IO.pipe

reactor.attach writer, :write do |write_io|
  bytes_written = write_io.write chunk
  puts "Sent #{bytes_written} bytes: #{chunk[0..(bytes_written - 1)]}"

  chunk.slice!(0,bytes_written)
  if chunk.empty?
    reactor.detach write_io
    write_io.close
  end
end

reactor.attach reader, :read do |read_io|
  if read_io.eof?
    puts "finished; detaching and closing."
    reactor.detach read_io
    read_io.close
  else
    puts "received: #{read_io.read 200}"
  end
end

reactor.add_timer(2) do
  puts "Timer called; the code should exit after this."
end

reactor.tick until reactor.empty?</pre>
<p>All that code does is open a pipe. The reactor attaches to one end as a writer, and the other end as a reader. The callbacks are used to send data from one end of the pipe to the other, where it is received. The reactor will run for two seconds, then after the timer runs, the reactor will be empty, and it will exit. It looks like this:</p>
<pre escaped="true">Sent 330 bytes: 012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890
received: 01234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001
received: 2345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890
finished; detaching and closing.
Timer called; the code should exit after this.</pre>
<p>Here's another example. This one takes user input through the reactor via STDIN, and at the same time runs the stupidest, simplest web server possible. That web server will return a response that includes whatever the user has typed. The example code is also written to leverage the #run method defined in the library instead of cranking it ourselves.</p>
<pre escaped="true">require './simplereactor'
require 'socket'

server = TCPServer.new("0.0.0.0", 9949)
buffer = ''

puts &lt;&lt;ETXT
Type some text and press &lt;enter&gt; after each line. The reactor is attached to
STDIN and also port 9949, where it listens for any connection and responds with
a basic HTTP response containing whatever has been typed to that point. These
two dramatically different IO streams are being handled simultaneously. Type
&lt;ctrl-d&gt; to exit, or wait one minute, and a timer will fire which causes the
reactor to stop and the program to exit.
ETXT

SimpleReactor.run do |reactor|
  reactor.attach(server, :read) do |server|
    conn = server.accept
    conn.gets # Read only the request line; the rest of the request is ignored in this example
    conn.write "HTTP/1.1 200 OK\r\nContent-Length:#{buffer.length}\r\nContent-Type:text/plain\r\nConnection:close\r\n\r\n#{buffer}"
    conn.close
  end

  characters_received = 0
  reactor.attach(STDIN, :read) do |stdin|
    characters_received += 1
    data = stdin.getc # Pull a character at a time, just for illustration purposes
    if data
      buffer &lt;&lt; data
    else
      reactor.stop
    end
  end

  reactor.add_timer(60) do
    reactor.stop
  end
end</pre>
<p>When this is executed, it attaches to STDIN, allowing one to provide input which is buffered internally. Any connection to port 9949 returns a simple HTTP response that contains the buffer that was created through STDIN. The process will run for 60 seconds, then the reactor will stop. Bearing in mind that this is a ridiculously trivial example, it does perform pretty well, too. Below is an excerpt from a test run, done using Ruby 1.9.3_preview1, on one of my older Linux machines.</p>
<pre escaped="true">Concurrency Level:      25
Time taken for tests:   1.60504 seconds
Complete requests:      15000
Failed requests:        0
Write errors:           0
Total transferred:      1275000 bytes
HTML transferred:       75000 bytes
Requests per second:    14144.22 [#/sec] (mean)
Time per request:       1.768 [ms] (mean)
Time per request:       0.071 [ms] (mean, across all concurrent requests)
Transfer rate:          1173.97 [Kbytes/sec] received</pre>
<p>Of course, this is an absurdly trivial example. There may be bugs, and it doesn't really do a lot for you, but it is an event reactor, written in pure ruby, and if you went through the code and examples as you read, you should have a better feel for what a reactor truly is.</p>
<p>If you want to write more sophisticated event based software, you could continue using a simple hand-rolled pure Ruby reactor like this one, or you might choose to use one of the other common libraries for Ruby today. There are several of them, each with its own strengths and weaknesses, though the most common is EventMachine. Just like our simple reactor, EventMachine offers timers and asynchronous handling of events, though EventMachine's versions will scale better. This article's version uses the select() call, which typically limits code using it to 1024 open file descriptors. On the other hand, EventMachine, if used on a platform that supports epoll (Linux) or kqueue (various *BSD platforms), can readily support tens of thousands. EventMachine also offers a richer set of features for implementing event based code than this article's example reactor. As a parting example, here is the HTTP server example from above, written to use EventMachine.</p>
<pre escaped="true">require 'rubygems'
require 'eventmachine'

module ServerHandler
  def initialize(buffer)
    @buffer = buffer
    super
  end

  def receive_data data
    send_data "HTTP/1.1 200 OK\r\nContent-Length:#{@buffer.length}\r\nContent-Type:text/plain\r\nConnection:close\r\n\r\n#{@buffer}"
    close_connection_after_writing
  end

end

module KeyHandler
  def initialize buffer
    @counter = 0
    @buffer = buffer
    super
  end

  def receive_data data
    @counter += 1
    if data.chomp.empty?
      EM.stop
    else
      @buffer &lt;&lt; data
    end
  end
end

puts &lt;&lt;ETXT
Type some text and press &lt;enter&gt; after each line. The reactor is attached to
STDIN and also port 9949, where it listens for any connection and responds with
a basic HTTP response containing whatever has been typed to that point. These
two dramatically different IO streams are being handled simultaneously. Type
a blank line to exit, or wait one minute, and a timer will fire which causes the
reactor to stop and the program to exit.
ETXT

buffer = ''

EventMachine.run do
  EM.start_server('0.0.0.0',9949,ServerHandler,buffer)

  EM.attach(STDIN, KeyHandler, buffer)

  EM.add_timer(60) do
    EM.stop
  end
end</pre>
<p>And since I demonstrated the performance of the pure Ruby version, here's the performance of the EventMachine version (using EventMachine 0.12.10), running on the same system, using the same Ruby 1.9.3_preview1 installation.</p>
<pre escaped="true">Concurrency Level:      25
Time taken for tests:   0.696320 seconds
Complete requests:      15000
Failed requests:        0
Write errors:           0
Total transferred:      1275425 bytes
HTML transferred:       75025 bytes
Requests per second:    21541.82 [#/sec] (mean)
Time per request:       1.161 [ms] (mean)
Time per request:       0.046 [ms] (mean, across all concurrent requests)
Transfer rate:          1787.97 [Kbytes/sec] received</pre>
<p>The code is structured differently, but as you can see, it works similarly. In our simple reactor example, if you changed <code>data = stdin.getc</code> to <code>data = stdin.gets</code>, the STDIN handling would behave similarly to the EM example's STDIN handling. However, given the experience of writing a pure Ruby reactor, even if you have never used EventMachine before, I think you can now look at that piece of EventMachine code and generally understand how it works, both at the level of the Ruby code itself, and with a good idea of what EventMachine is handling for you. This basic understanding of how event driven software actually works is key to writing software that uses the event paradigm effectively.</p>
<p>I have focused on <a href="http://github.com/eventmachine/eventmachine">EventMachine</a> in this article because it is the most commonly used event reactor implementation in the Ruby world today. As I mentioned at the beginning of the article, however, there are other choices, such as <a href="http://coolio.github.com/">Coolio</a>. In future articles I will continue to focus on EventMachine, but I will try to include some examples from other frameworks, and also some examples focused on JRuby and Rubinius. Please let us know if there are particular topics that you would like to see discussed.</p>
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2011/what-are-events-why-might-you-care-and-how-can-eventmachine-help/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Updated Passenger Releases</title>
		<link>http://www.engineyard.com/blog/2011/updated-passenger-releases/</link>
		<comments>http://www.engineyard.com/blog/2011/updated-passenger-releases/#comments</comments>
		<pubDate>Tue, 30 Aug 2011 20:40:12 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Product]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Passenger]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=10203</guid>
		<description><![CDATA[<p>The other day while I was in the midst of a discussion about Linux kernel upgrades, one of the other engineers who works at Engine Yard, Scott Likens, sent me a GitHub URL:</p>
<blockquote><p><a href="https://github.com/FooBarWidget/passenger/commit/4d765d71ea689c42ade897fc93851b8a8797e9c7">https://github.com/FooBarWidget/passenger/commit/4d765d71ea689c42ade897fc93851b8a8797e9c7</a></p></blockquote>
<p>It turned out that this patch had hit the Passenger GitHub repo after our last set of ebuild releases, so I started investigating.</p>
<p>The patch fixes a concerning issue. Consider the simplest valid HTTP request for HTTP 1.0:</p>
<blockquote><p><code>GET / HTTP/1.0</code></p></blockquote>
<p>That request line alone is a simple, trivial HTTP request. With unpatched Passenger versions, a simple request of this nature returns a surprising response:</p>
<pre escaped="true"># curl -0 -H "Host:" http://MY.URL
curl: (52) Empty reply from server
</pre>
<p>The expected response would have been some sort of valid HTTP. An empty reply from the server is bad enough in itself, but if you are on the server and pay attention to the list of processes, you will also notice that this request causes a fault which kills the nginx worker process that was handling it. Nginx is an innocent victim in this case, because it is the Passenger code that is at fault.</p>
<p>For a more visual example than curl provides, you could use a tool such as <a href="http://web-sniffer.net/">http://web-sniffer.net/</a>. Send an HTTP/1.0 request without a Host header, and you'll see the same thing, regardless of the content of any other headers. Add the Host header back, and it works as expected.</p>
<p>It appears that the bug was caused by the SERVER_NAME patch, which was part of 3.0.8, and fixed a bug so that Rack::URLMap would work correctly.</p>
<p>You can use either of these tools to test your own applications.</p>
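<p>If you would rather script the check, the same bare request can be sent from Ruby with nothing but the standard <code>socket</code> library. This is only a sketch: <code>bare_http10_request</code> is a hypothetical helper name, and the tiny server started below is a stand-in so that the example runs on its own. Point the helper at your own host and port to test a real application.</p>

```ruby
require 'socket'

# Hypothetical helper: send a bare HTTP/1.0 request (no Host header) and
# return whatever the server sends back before it closes the connection.
def bare_http10_request(host, port, path = "/")
  socket = TCPSocket.new(host, port)
  socket.write "GET #{path} HTTP/1.0\r\n\r\n"
  socket.read   # an empty string here corresponds to curl's "Empty reply"
ensure
  socket.close if socket
end

# Stand-in server so the sketch is self-contained; it answers one request.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
Thread.new do
  client = server.accept
  client.gets("\r\n\r\n")   # consume the request
  client.write "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok"
  client.close
end

reply = bare_http10_request("127.0.0.1", port)
puts reply.empty? ? "empty reply from server" : reply.lines.first
```

<p>A healthy server prints its status line. An affected one would print the empty reply message instead.</p>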
<p>This seemed like an important bug to have fixed in our nginx+passenger ebuilds here at Engine Yard, so after confirming the bug, and the patch, I built new versions of all of our relevant ebuilds to incorporate this patch. As of the time that this blog post was published, these builds have been live in our build tree for several days. AppCloud customers, you can update your own application to use these builds simply by clicking on the Upgrade button in your application dashboard. You may do so at your convenience, and can test that the patch fixed the problem by using either <code>curl</code> from the command line, or a web based tool such as the <a href="http://web-sniffer.net/">http://web-sniffer.net/</a> mentioned above. xCloud customers, you can request this update by filing a ticket with support.</p>
<p>We'd like to thank Phusion for all of their work on Passenger, and for fixing this particular bug. We look forward to the next release of Passenger, and we look forward to continuing to provide our customers with the most reliable systems that we can.</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2011/updated-passenger-releases/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Concurrency, Real and Imagined, in MRI; Threads</title>
		<link>http://www.engineyard.com/blog/2010/concurrency-real-and-imagined-in-mri-threads/</link>
		<comments>http://www.engineyard.com/blog/2010/concurrency-real-and-imagined-in-mri-threads/#comments</comments>
		<pubDate>Sat, 15 May 2010 17:00:59 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[MRI]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=3649</guid>
		<description><![CDATA[<blockquote><p>In computer science, concurrency is a property of systems in which several computations  are executing simultaneously, and potentially interacting with each other. The computations may be executing on multiple cores in the same chip, preemptively time-shared threads on the same processor, or executed on physically separated processors.</p></blockquote>
<p>-- <a href="http://en.wikipedia.org/wiki/Concurrency_%28computer_science%29">Wikipedia Concurrency article</a></p>
<p>Simply put, concurrency is when you have more than one logical thread of execution occurring simultaneously, or at least appearing to occur simultaneously. When you write software that makes use of concurrency, you want your software to do two or more things at once.</p>
<p>The motivations for using concurrency are varied. Sometimes you may have architectural reasons for using concurrency -- your code makes more sense to you or is easier to write if you conceive it in more than one discretely executing unit. In other cases you may want to employ concurrency in order to make better use of the multiple cores that many modern computers have, enabling you to get better total throughput out of your code than you would have from a non-concurrent implementation.</p>
<p>Whatever the motivation for employing concurrency, the reality is that concurrency is a complex subject. There are many different ways to achieve concurrency in software, and they each have their own set of tradeoffs. Furthermore, if your platform is Ruby, your decisions about what kind of concurrency to employ will be influenced by the specific Ruby implementation you are targeting. Each provides a different set of concurrency options for you to consider.</p>
<p>This is the first installment in a new series of articles focusing on introducing and exploring the variety of concurrency options available in the Ruby ecosystem. Advantages and disadvantages will be discussed for each, and I'll leave you with a few examples of how you can leverage these different options in your code. It should be a fun subject to explore!</p>
<p>Concurrency is all about multitasking -- doing more than one thing at once. The building blocks of multitasking are processes, threads, and fibers. Each of these components is complex in itself, both because of the nuances in how they interact and can be combined, and because different platforms have variations in which capabilities they implement and in how they are implemented. Luckily, their overall description can be summarized in a useful way.</p>
<p><strong>Processes</strong> are independent units of execution that generally share nothing with other processes, except for resources which are intended to be shared (such as shared memory segments, shared IO resources, or memory mapped files). Processes carry a lot of state information with them and have their own address spaces. Communication between them has to be through an interprocess communication mechanism provided by the platform that the processes are running in. Processes running on the same machine will be scheduled by the kernel, which will typically use some sort of time slicing algorithm to spread CPU usage of all running processes across the available cores.</p>
<p><strong>Threads</strong> come in several different flavors, including kernel, user space, and green threads. On some platforms there are entities called light-weight processes that bring kernel threads into user space so they look somewhat like processes, but are less expensive. For our purposes, threads are contained within a process, and share the memory space and process state of the process with each other. Green threads differ in that they are not controlled or scheduled by the operating system. Rather, they are provided by the process itself. This has a portability advantage because it means that the threads will be available on every platform that the process can run on, and will work the same on each. The main disadvantage is that green threads, being managed by the process itself, are generally confined to sharing a single core, and are limited to the peculiarities of the process's threading implementation (which may vary substantially from the platform's own threading implementation). Regardless of the type of threading, context switching with threads is generally faster than it is with processes.</p>
<p><strong>Fibers</strong> are like user space threads, except the operating system doesn't handle scheduling for them. Instead, fibers must be explicitly yielded to allow other fibers to run. This can have performance advantages like the reduction of system scheduling overhead. Since multitasking with fibers is cooperative, the need to use locks on shared resources is reduced or eliminated. Programmers can also leverage fibers to their advantage with IO operations by allowing other things to run while waiting for a slow or blocking IO operation.</p>
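<p>To make the cooperative hand-off concrete, here is a minimal sketch of two fibers taking turns. It assumes Ruby 1.9 or later, where the <code>Fiber</code> class is available (it will not run on MRI 1.8), and the <code>make_worker</code> helper is a hypothetical name used only for illustration.</p>

```ruby
# Two fibers take turns; neither runs until it is explicitly resumed.
log = []

# Hypothetical helper: builds a fiber that records three steps,
# pausing after each one.
make_worker = lambda do |name|
  Fiber.new do
    3.times do |step|
      log.push("#{name} step #{step}")
      Fiber.yield # hand control back to whoever resumed this fiber
    end
  end
end

a = make_worker.call('a')
b = make_worker.call('b')

# The caller acts as the scheduler, deciding which fiber runs next.
3.times do
  a.resume
  b.resume
end

puts log
```

<p>Because each fiber only gives up control at <code>Fiber.yield</code>, the interleaving is fully determined by the order of the <code>resume</code> calls.</p>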
<p>Ruby concurrency isn't quite as simple as selecting one of the above and using it, however. In the beginning, there was just <strong>Ruby</strong>, a single implementation that everyone used. This Ruby implementation, now commonly called Matz's Ruby Interpreter (MRI), saw its usage explode with the 1.8.x versions. It's pretty old now. This is from the <a href="ftp://ruby-lang.org/pub/ruby/1.8">ftp://ruby-lang.org</a> FTP server:</p>
<pre>carbon:/home/ftp/pub/ruby/1.8$ ls -la | grep ruby-1.8.0
-rw-rw-r--  1 root     ftp   1979070 Aug  4  2003 ruby-1.8.0.tar.gz</pre>
<p>So, it has been around for a while, and offers a good starting point for discussing concurrency in Ruby.</p>
<p>MRI Ruby 1.8.x supports concurrency in a few ways. One of the first things newcomers to Ruby leap for are its threads. Depending on the language these newcomers were familiar with before arriving at Ruby, they may be in for a surprise. MRI Ruby 1.8.x provides a green thread implementation. As mentioned above, green threads do not make use of any threading system native to the platform. Instead, 1.8.x's threads are implemented within the interpreter itself. This leads to threads behaving consistently across any platform the interpreter runs on. Because they are green threads, however, they offer no advantages for CPU bound tasks.</p>
<p><strong>cpu_bound_threads.rb</strong></p>
<pre>require 'benchmark'
threads = []
thread_count = ARGV[0].to_i
iterations = ARGV[1].to_i
increment = iterations / thread_count.to_f
sum = 0

Benchmark.bm do |bm|
  bm.report do
    thread_count.times do |counter|
      threads &lt;&lt; Thread.new do
        my_sum = 0
        queue = (1 + (increment * counter).to_i)..(0 + (increment * (counter + 1)).to_i)
        queue.each do |x|
          my_sum += x
        end
        Thread.current[:sum] = my_sum
      end
    end

    threads.each {|thread| thread.join; sum += thread[:sum]}

    puts "The sum of #{iterations} is #{sum}"

  end
end
</pre>
<p>This is a simple program that takes a large range of numbers, divides them into smaller ranges, and hands each smaller range to a thread that calculates the sum of the range it was given. The results from each individual thread are then added together to arrive at a final answer.</p>
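<p>The range arithmetic is easier to follow with small numbers. The sketch below pulls the same splitting expression into a hypothetical <code>partition</code> helper (not part of the benchmark script) and applies it to 10 iterations and 3 workers.</p>

```ruby
# Hypothetical helper that mirrors the range arithmetic of the benchmark.
def partition(iterations, worker_count)
  increment = iterations / worker_count.to_f
  (0...worker_count).map do |counter|
    first = 1 + (increment * counter).to_i
    last = (increment * (counter + 1)).to_i
    (first..last)
  end
end

ranges = partition(10, 3)
puts ranges.inspect

# Sum each sub-range separately, then combine the partial sums.
partial_sums = ranges.map { |range| range.inject(0) { |acc, x| acc + x } }
total = partial_sums.inject(0) { |acc, s| acc + s }
puts total
```

<p>The three ranges cover 1 through 10 with no gaps or overlaps, so the partial sums add up to the same total a single loop would produce.</p>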
<p>All examples ran on an 8 core Linux machine. The numbers below are an average of the results of 100 runs for each set of inputs.</p>
<table>
<caption>Threads</caption>
<thead>
<tr>
<th>Threads \ Iterations</th>
<th>50000</th>
<th>500000</th>
<th>5000000</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0.01730298</td>
<td>0.17149276</td>
<td>1.70610744</td>
</tr>
<tr>
<td>2</td>
<td>0.01724724</td>
<td>0.17179465</td>
<td>1.70557474</td>
</tr>
<tr>
<td>4</td>
<td>0.01729293</td>
<td>0.17181384</td>
<td>1.70570264</td>
</tr>
<tr>
<td>8</td>
<td>0.01741591</td>
<td>0.17210276</td>
<td>1.71201153</td>
</tr>
</tbody>
</table>
<p>As demonstrated by the numbers, MRI 1.8 threads are no help at all for a CPU bound application. In fact, there is a small but measurable cost to managing them: with 8 threads, every workload ran slightly slower than it did with a single thread. If you are an MRI 1.8 user, do not despair; threads are but one concurrency option available to you.</p>
<p>An option that will better serve you for CPU bound tasks is process based concurrency. The idea is simple. In order to leverage multiple cores/CPUs, just create more than one process to handle the work load. Ruby provides a <code>fork()</code> method which, on platforms that support it, uses the underlying <code>fork()</code> call from the C standard library. This will create a new process, with a new process ID, that can be considered an exact copy of the parent process, except that its resource utilization counters will be reset to 0.</p>
<p>Since processes do not share memory spaces, you must utilize another system provided communication mechanism in order to pass work to or from processes; this avoids the potential pitfalls that arise when trying to correctly manage locks on shared resources, but it does force one to think more specifically about exactly how to achieve communication.</p>
<p><strong>cpu_bound_processes.rb</strong></p>
<pre>require 'benchmark'
processes = []
process_count = ARGV[0].to_i
iterations = ARGV[1].to_i
increment = iterations / process_count.to_f
sum = 0

def in_subprocess
  from_subprocess, to_parent = IO.pipe

  pid = fork do
    from_subprocess.close
    r = yield
    to_parent.puts [Marshal.dump(r)].pack("m")
    exit!
  end

  to_parent.close
  [pid,from_subprocess]
end

def get_result_from_subprocess(pid, from_subprocess)
  r = from_subprocess.read
  from_subprocess.close
  Process.waitpid(pid)
  Marshal.load(r.unpack("m")[0])
end

Benchmark.bm do |bm|
  bm.report do
    process_count.times do |counter|
      processes &lt;&lt; in_subprocess do
        my_sum = 0
        queue = (1 + (increment * counter).to_i)..(0 + (increment * (counter + 1)).to_i)
        queue.each do |x|
          my_sum += x
        end
        my_sum
      end
    end

   processes.each {|process| sum += get_result_from_subprocess(*process)}

   puts "The sum of #{iterations} is #{sum}"

  end
end
</pre>
<p>In this example, each child inherits its share of the work through the block passed to <code>fork</code>, and an IO pipe carries the child's marshaled result back to the master.</p>
<p>As earlier, testing was done on an 8 core Linux machine, with 100 runs of each test. The program is equivalent to the threaded version, and was changed only as necessary to enable it to be used in a multiprocess model instead of a multithread model.</p>
<table>
<caption>Worker Processes</caption>
<thead>
<tr>
<th>Processes \ Iterations</th>
<th>50000</th>
<th>500000</th>
<th>5000000</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0.01805432</td>
<td>0.17199047</td>
<td>1.70812685</td>
</tr>
<tr>
<td>2</td>
<td>0.0098329</td>
<td>0.08675517</td>
<td>0.85509328</td>
</tr>
<tr>
<td>4</td>
<td>0.00609409</td>
<td>0.0446612</td>
<td>0.43100698</td>
</tr>
<tr>
<td>8</td>
<td>0.00847991</td>
<td>0.05346145</td>
<td>0.25621009</td>
</tr>
</tbody>
</table>
<p>Take a good look at these numbers. Everything moves in the correct direction, until you get to the 8 process row. There, timing for both the 50000 and 500000 iteration columns is slower than in the 4 process row. Do you have any theories as to why?</p>
<p>Processes are, in many ways, a great way to handle concurrency. One of their drawbacks, though, is that they are heavy structures. They can take up significant time and resources to create. Linux uses copy-on-write semantics when creating forked processes. This means it doesn't actually duplicate the address space of the forked process until pages in that space start changing. Then, it duplicates what changes. This means that forked processes on Linux can be created fairly quickly. However, MRI 1.8 is not very friendly to copy-on-write semantics.</p>
<p>If you are unfamiliar with the way memory is managed and garbage is collected in MRI 1.8, you should check out <a href="http://www.engineyard.com/blog/2010/mri-memory-allocation-a-primer-for-developers/">my article on MRI Memory Allocation</a>. One key aspect is that objects carry all of their status bits with them. This means that when the garbage collector scans the object space to find objects it can collect, it touches every object in the address space. For a process forked with copy-on-write semantics, this forces the kernel to make copies of all of those pages. This takes time, and largely negates the fast-creation benefit of copy-on-write forked processes.</p>
<p>The times for the smaller iteration counts on the 8 process test reveal a cost to this form of concurrency. The overhead associated with creating the forked processes overwhelms the performance gains from the division of labor when the work to be done is brief enough. This is a reality for any form of concurrency -- there is always a performance tax from some amount of overhead. That tax is just higher when spawning something heavy like a process. Keep this in mind when you explore concurrency options for your task.</p>
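<p>One rough way to see that fixed cost on your own machine is to time a fork whose child does nothing at all. This is only a sketch: it assumes a platform where <code>fork</code> is available, <code>fork_round_trip</code> is a hypothetical helper name, and the result will vary with the machine and the size of the parent process.</p>

```ruby
require 'benchmark'

# Hypothetical helper: time how long it takes to fork a child that exits
# immediately, and to reap it. The child does no useful work.
def fork_round_trip
  Benchmark.realtime do
    pid = fork { exit!(0) }
    Process.waitpid(pid)
  end
end

samples = Array.new(20) { fork_round_trip }
average = samples.inject(0.0) { |sum, t| sum + t } / samples.size
puts format('average fork + wait: %.6f seconds', average)
```

<p>If the work handed to each process takes about as long as this round trip, adding processes cannot pay for itself.</p>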
<p>These first two examples both represent CPU bound problems. Many real world problems are not CPU bound, though. Rather, they are IO bound issues. Because an IO bound problem has latencies imposed on it by something outside of the program itself, IO bound problems can provide an excellent case for using MRI 1.8's green threads to improve performance.</p>
<p><strong>io_bound_threads.rb</strong></p>
<pre>require 'net/http'
require 'thread'
require 'benchmark'

def get_data(url)
  tries = 0
  response = nil
  if /^http/.match(url)
    m = /^http:\/\/([^\/]*)(.*)/.match(url)
    site = m[1]
    path = m[2]
    begin
      http = Net::HTTP.new(site)
      http.open_timeout = 30
      http.start {|h| response = h.get(path)}
    rescue Exception
      tries += 1
      retry if tries &lt; 5
    end
  end
  response.kind_of?(Array) ? response[1] : response.respond_to?(:body) ? response.body : ''
end

mutex = Mutex.new
signal = ConditionVariable.new
thread_count = ARGV[0].to_i
fetches = ARGV[1].to_i
url = ARGV[2]
threads = []
count = 0
active_threads = 0

Benchmark.bm do |bm|
  bm.report do
    while count &lt; fetches
      while count &lt; fetches &amp;&amp; active_threads &lt; thread_count
        mutex.synchronize do
          active_threads += 1
          count += 1
        end
        Thread.new do
          get_data(url)
          mutex.synchronize do
            active_threads -= 1
            threads &lt;&lt; Thread.current
            signal.signal
          end
        end
      end

      mutex.synchronize do
        signal.wait(mutex)
      end
      while th = threads.shift
        th.join
      end
    end
  end
end
</pre>
<p>This script makes many HTTP requests. For simplicity's sake, let's say it just makes the same request over and over again, but it could easily be expanded to take a list of URLs, and to do something useful with the returned data. The script uses threads much like the CPU bound example, except that it is a bit more sophisticated in how it counts the work it has assigned to generated threads, and how it waits for all the threads to be completed.</p>
<p>This table shows timing from it in action. The target URL used was not local to the testing machine. Each run used the indicated number of threads to gather the URL, either a "fast" URL, with an over-the-net response speed of about 35 requests per second, or a "slow" URL with an over-the-net response speed of about 3 requests per second, 400 times. There were 100 runs completed. The numbers below are an average from those runs.</p>
<table>
<caption>Worker Threads</caption>
<thead>
<tr>
<th>Threads \ Request speed</th>
<th>35/second</th>
<th>3/second</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>6.53462668</td>
<td>61.1016239</td>
</tr>
<tr>
<td>2</td>
<td>3.34861606</td>
<td>30.4514539</td>
</tr>
<tr>
<td>5</td>
<td>1.38942396</td>
<td>12.1620945</td>
</tr>
<tr>
<td>10</td>
<td>0.72804622</td>
<td>6.0968646</td>
</tr>
<tr>
<td>20</td>
<td>0.47964698</td>
<td>3.0411382</td>
</tr>
</tbody>
</table>
<p>Just a glance at these numbers clearly shows that Ruby threads are a big help with an IO bound activity like this. The relationship between number of threads and reduction in time to complete the task is not linear; but even with up to 20 threads there is a significant benefit to additional numbers of threads.  The benefit is more linear, and evident for slower requests because the requests spend more time waiting on IO, and less on CPU bound activities.</p>
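<p>Dividing the single-thread time by each row turns the table into speedup factors, which makes the difference between the two URLs easier to see. The sketch below only re-uses the averages from the table above.</p>

```ruby
# [threads, seconds for the 35/second URL, seconds for the 3/second URL]
timings = [
  [1,  6.53462668, 61.1016239],
  [2,  3.34861606, 30.4514539],
  [5,  1.38942396, 12.1620945],
  [10, 0.72804622, 6.0968646],
  [20, 0.47964698, 3.0411382]
]

# The single-thread row is the baseline for both URLs.
_, fast_base, slow_base = timings.first

speedups = timings.map do |threads, fast, slow|
  [threads, fast_base / fast, slow_base / slow]
end

speedups.each do |threads, fast, slow|
  puts format('%2d threads: fast %.2fx, slow %.2fx', threads, fast, slow)
end
```

<p>With 20 threads the slow URL finishes roughly 20 times faster than with one thread, while the fast URL gains a factor of less than 14.</p>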
<p>There are some caveats to be aware of with regard to Ruby threads.  First, even though they are green threads, as soon as one starts sharing resources between threads, threading becomes something that can be hard to get right. Share as little as possible, thoroughly think through your code, and use tests to support your reasoning, because threading problems can be hard to diagnose and solve.</p>
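<p>As a minimal sketch of what guarding a shared resource looks like, the example below has four threads increment one counter, with a <code>Mutex</code> around the read-modify-write step. The counter and the thread count are arbitrary choices for illustration.</p>

```ruby
require 'thread' # needed on Ruby 1.8; harmless on later versions

counter = 0
lock = Mutex.new

workers = Array.new(4) do
  Thread.new do
    1000.times do
      # Only one thread at a time may read-modify-write the shared counter.
      lock.synchronize { counter += 1 }
    end
  end
end

workers.each { |thread| thread.join }
puts counter
```

<p>The less state threads share, the fewer places like this you have to find and protect.</p>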
<p>Second, MRI 1.8 has a limit on the number of threads that it will manage. As a consequence of how the internals are implemented, on most systems (notably excluding win32 systems), total thread count is limited to 1024. Also, because of the way threads are implemented, the cost of managing them grows with the number of threads. Each thread consumes a significant amount of memory, so do not go crazy with threads or it will backfire on you.</p>
<p>Third, because of the way that Ruby threading is implemented, it is possible for a C extension to Ruby to take control of the process and prevent Ruby from allowing context switches to other threads. It is possible to write extensions so that they do not do this, but many are not written in this way. Where this bites most people is with code that interacts with a database. One can reasonably look at a database query as an IO bound activity -- all the Ruby process is really doing is sending a request to the DB and waiting for a response. However, most DB interaction libraries are implemented as C extensions, and some of them do not play well with Ruby threads. One of the most common offenders is Mysql-Ruby. It will block all of Ruby, and therefore the whole process, while waiting for the result from a long running query. On the other hand, Ruby-PG, the driver for Postgres, will context switch within <code>pgconn_block()</code>, the function that makes blocking calls to the database, thus permitting other MRI 1.8 threads to run even during a long running query.</p>
<p>Fourth, because MRI 1.8 threads are green threads, they all run inside the context of a single process and a single system thread. Thus, while they give the appearance of concurrency, there is actually only one thread running at once. This is okay, because it is the appearance of concurrency that matters. If you run <code>top</code> on your laptop or VM shell, you will see a large number of processes running on your system. This number will exceed the number of cores that you have by a large margin, but you rarely have to worry about which processes are actually running on one of the cores at any given time. Your kernel takes care of slicing up access to the CPU into fine enough grains that it appears that all the running processes are executing on a core at the same time (even though most of them probably are not actually running at any given time). Concurrency in computing doesn't strictly mean that two or more things are actually running at the same time. Rather, it means that there is an appearance that they are, and that one works with them on the assumption that they are, and lets the underlying scheduler deal with making reality fit that appearance.</p>
<p>An entire book could be written about concurrency in Ruby. I've just scratched the surface with this overview of process and thread based concurrency in Ruby. Hopefully this helped answer a few questions or suggested some techniques to consider. </p>
<p>Future installments in this series will cover Ruby 1.9.x (which uses system threads as opposed to green threads), JRuby, Rubinius, and using event systems like EventMachine to handle concurrency. So stay tuned! There is a lot more coming soon!</p>
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/concurrency-real-and-imagined-in-mri-threads/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>MRI Memory Allocation, A Primer For Developers</title>
		<link>http://www.engineyard.com/blog/2010/mri-memory-allocation-a-primer-for-developers/</link>
		<comments>http://www.engineyard.com/blog/2010/mri-memory-allocation-a-primer-for-developers/#comments</comments>
		<pubDate>Tue, 04 May 2010 08:00:45 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[1.8.6]]></category>
		<category><![CDATA[1.8.7]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[memprof]]></category>
		<category><![CDATA[MRI]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=3647</guid>
		<description><![CDATA[<p>Memory allocation in the MRI 1.8.x series of Ruby is seen by many developers to be a black box. A developer writes code and the interpreter just does some magic to make sure that the memory for the code is allocated, and more importantly, eventually garbage collected. You don't have to think about it, or even care about it all that much.</p>
<p>And generally... that attitude is a productive one. The less you have to actively worry about the little details&mdash;like memory management&mdash;the more you can concentrate on the parts of the code that do the actual work. At the same time, though, a developer who remains ignorant of what's going on <em>under</em> the covers does so at his or her own peril. </p>
<p>It's very useful to have a general understanding of the mechanics involved, as they can sometimes steer you towards making better design choices in applications where your memory footprint matters; it's also very useful if things start going wrong with the memory footprint of the code. If your carefully built Rails application works a little bit like <a href="http://en.wikipedia.org/wiki/Mr._Creosote">Mr Creosote</a>, repeatedly misbehaving until it blows up, you need to have a basic understanding of what's going on with memory management in the interpreter.</p>
<p>There are two types of memory allocations that occur in MRI 1.8.x. First, objects are allocated on a heap, which is really just  a collection of <strong>slots</strong> that Ruby uses to store information about an object. When Ruby runs out of slots, and it can't free up any slots by running a garbage collection cycle, it will allocate a new heap for additional space. </p>
<p>The second type of allocation is when Ruby allocates memory off of the <strong>C</strong> heap to provide storage for the actual data contained within an object. This second type of storage is the most direct, and is the easiest to understand:</p>
<pre lang="ruby">foo = 'x' * (1024 * 1024 * 10)</pre>
<p>What actually happens there is that Ruby uses a slot out of its heap to store a <code>String</code>. The <code>String</code> implementation allocates, via a function called <code>xmalloc()</code>, enough memory to hold that <code>x</code>. <code>xmalloc()</code> is actually an alias, set up via a <code>#define</code> in the defines.h file.</p>
<pre>&#35;define xmalloc ruby_xmalloc
&#35;define xcalloc ruby_xcalloc
&#35;define xrealloc ruby_xrealloc
&#35;define xfree ruby_xfree
&nbsp;
void *xmalloc _((long));
void *xcalloc _((long,long));
void *xrealloc _((void*,long));
void xfree _((void*));
</pre>
<p><code>ruby_xmalloc()</code> does some error checking and runs a garbage collection cycle if allocations have exceeded a hard coded threshold (8000000 bytes), or if an allocation fails (meaning that the system lacks the RAM to fulfill the allocation request).</p>
<p>Then the <code>String#*</code> method creates a new <code>String</code> object (using another slot on the Ruby heap), calculating the size of the buffer by multiplying its own length (1 byte) by the number of repetitions (10,485,760). This buffer is allocated, as before, via <code>xmalloc()</code>.</p>
<p>You see that allocation as an immediate increase in RSS.</p>
<p>Try it in irb. Here's a <strong>ps</strong> line immediately after starting irb (using Ruby 1.8.7 on an OS X laptop):</p>
<pre>wyhaines 35539 0.0 0.1 602836 3060 s007 S+ 9:21AM 0:00.02 irb</pre>
<p>And here's the same thing on a Linux instance:</p>
<pre>wyhaines 20720 1.0 0.1 18360 3956 pts/1 S+ 06:49 0:00 irb</pre>
<p>I execute the following line in irb:</p>
<pre>foo = 'x' * (1024 * 1024 * 10); nil</pre>
<p>And here's the <strong>ps</strong> output for that process immediately afterwards:</p>
<pre>wyhaines 35539 0.0 0.3 613080 13332 s007 S+ 9:21AM 0:00.11 irb # OSX </pre>
<pre>wyhaines 20800 1.4 0.6 28604 14212 pts/1 S+ 06:51 0:00 irb # Linux</pre>
<p>You can see that the jump in RSS is directly tied to the amount of data that needed to be stored, which is expected, given that the memory was directly allocated in the <code>String</code> implementation. Any class implementation that has to allocate space for its own data storage will behave similarly. Some may use the <code>xmalloc</code> function from Ruby, while others may make use of <code>malloc</code> or related functions directly, or may have their own xmalloc-like function.</p>
<p>This type of allocation is easy to understand because it's <em>expected</em>. When an object that needs to hold 10MB of data is created, there will be an allocation of 10MB to store it. It <em>does</em> get a little more tricky when dealing with deallocation, since that should not happen until the object is garbage collected by Ruby, and unless you explicitly invoke a garbage collection cycle, you can't really know when it is going to happen. Also, classes implemented in C or C++ can sometimes have bugs with deallocation, leading to RAM being left unexpectedly allocated. MRI Ruby's own <code>Array#shift</code> method once had a bug of this nature in it.</p>
<p>However, because this sort of allocation comes directly out of the C heap, when a deallocation occurs, you should immediately see it in your process size.</p>
<pre>irb(main):002:0&gt; foo = nil
=&gt; nil
irb(main):003:0&gt; GC.start
=&gt; nil
irb(main):004:0&gt;
</pre>
<p>A <strong>ps</strong> shows what happened:</p>
<pre>wyhaines 39596 0.0 0.1 602836 3092 s011 S+ 10:04AM 0:00.13 irb # OSX</pre>
<pre>wyhaines 20800 0.0 0.1 18360 3968 pts/1 S+ 06:51 0:00 irb # Linux</pre>
<p>The trickier allocation type to understand is Ruby's management of its own heap space. Ruby maintains a series of heaps, which are just presized collections of <code>RVALUE</code> structures referred to as <strong>slots</strong>. Each slot is a little table (an <code>RVALUE</code>) that's used for keeping track of fundamental object data. On MRI 1.8.x, a slot is about 20 bytes in size for a 32 bit build. I added a little instrumentation to a Ruby instance to show this:</p>
<pre>wyhaines$ /usr/local/rubyxxx/bin/irb
size of pointer to a heap: 12
length of the array that contains pointers to heaps: 10
  total size of the array of heap pointers: 120
Allocating heap of 10001 slots, each of 20 bytes
  malloc(200020)
Allocating heap of 18001 slots, each of 20 bytes
  malloc(360020)
</pre>
<p>On my 64 bit Linux instance, each <code>RVALUE</code> is 40 bytes, doubling the size of the Ruby heap.</p>
<p>In general conversation, when talking about Ruby's heap, we think of it as one big scratch space for storing object data. However, it's actually represented by a list of pointers to a collection of smaller spaces. Each of these individual spaces is a heap, and all of them together represent the process's total heap space.</p>
<p>By default, Ruby allocates a heap big enough to store 10000+1 slots on startup. After that first allocation, the number allocated on subsequent allocations is increased by a factor of 1.8 over the previous allocation. So the second heap allocation is for 18000+1 heap slots. The third is for 32400+1, and so on.</p>
<p>The theory is that as RAM usage grows, the likelihood of needing even more RAM increases, so allocating ever larger buckets hedges against needing to do a new allocation. As you can see in the above example, the initial chunk of 10k buckets isn't sufficient for running <code>irb</code>, so Ruby ends up allocating a second chunk of <code>10000 * 1.8 + 1 == 18001</code> object slots in the next chunk of heap.</p>
<p>The <code>RVALUE</code>s in the Ruby heap are a linked list. Ruby allocates space for them with a simple malloc:</p>
<pre>RUBY_CRITICAL(p = (RVALUE*)malloc(sizeof(RVALUE)*(heap_slots+1)));</pre>
<p>What that really does is to ask <code>malloc</code> to allocate a buffer that's the size of an <code>RVALUE</code> multiplied by heap_slots+1, and then cast the pointer that <code>malloc</code> returns to an <code>RVALUE</code> pointer. There is some additional code to deal with error conditions. If <code>malloc</code> cannot allocate the space, Ruby will set <code>heap_slots = HEAP_MIN_SLOTS</code>, which is normally hard coded to 10000, and then try again. If it fails again, it throws an error.</p>
<p>Once the space is allocated, Ruby does some housekeeping to make sure it stores the pointer in this new heap, and to increase the size of heap_slots for the next allocation, then it needs to go through the new heap space and to initialize the <code>RVALUE</code> structures.</p>
<pre>while (p &lt; pend) {
  p-&gt;as.free.flags = 0;
  p-&gt;as.free.next = freelist;
  freelist = p;
  p++;
}</pre>
<p>Even if you don't know C, you can probably figure out what's happening there. It's walking through each allocated struct, setting flags to 0 and establishing the linked list structure, with each slot pointing to the next one in the list. In doing so, it touches all of the heap space it just malloc'd. This has the side effect of forcing all of those pages into the resident memory of the process.</p>
<p>You can see that in operation with irb. To refresh your memory, here are a couple <strong>ps</strong> lines, from OSX, and Linux, for an irb process that has just been started:</p>
<pre>wyhaines 36996 0.0 0.1 602836 3060 s007 S+ 9:40AM 0:00.03 irb # OSX</pre>
<pre>wyhaines 20080 1.0 0.1 18360 3956 pts/1 S+ 07:48 0:00 irb # Linux</pre>
<p>Remember that just starting IRB creates a bunch of objects. It will already have gone through a couple heap allocations. So, we want to trigger a third. We also want to try to get close to catching it in action. So, in IRB, do this:</p>
<pre>a = []; 9000.times {a &lt;&lt; &#39;&#39;} # OSX w/ ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-darwin9]</pre>
<pre>a = []; 5000.times {a &lt;&lt; &#39;&#39;} # Linux w/ ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]</pre>
<p>There's no magic there. I just did some trial and error experiments to figure out how many objects I needed to create to be close to the threshold of a new allocation. This number will vary some, depending on which Ruby you are using. A <strong>ps</strong> of the process will now look something like this:</p>
<pre>wyhaines 38195 0.0 0.1 602876 3132 s007 S+ 9:56AM 0:00.03 irb # OSX</pre>
<pre>wyhaines 20379 0.0 0.1 18476 4028 pts/1 S+ 08:05 0:00 irb # Linux</pre>
<p>RAM usage has grown a tiny bit, basically to accommodate the individual in-object allocations that happened when creating a whole bunch of tiny objects, but there have been no allocations of significant chunks of memory.</p>
<p>Now, go back to irb, and do this:</p>
<pre>1000.times {a &lt;&lt; &#39;&#39;}</pre>
<p>When you look at <strong>ps</strong> again:</p>
<pre>wyhaines 38195 0.0 0.1 603528 3776 s007 S+ 9:56AM 0:00.04 irb # OSX</pre>
<pre>root 20379 0.0 0.2 19744 5300 pts/1 S+ 08:05 0:00 irb # Linux</pre>
<p>It jumped by quite a big chunk. Doing some quick math, this third heap allocation would be for 32401 heap slots (18000 * 1.8 + 1). If each heap slot is 20 bytes, then (32401 * 20) == 648020 bytes needed. That looks pretty spot on for the RSS size bump that we observed with OSX. For the Linux system, it was already established that each <code>RVALUE</code> takes 40 bytes, so (32401 * 40) == 1296040 bytes, which also is a match for the jump that is seen.</p>
<p>As more objects are created by your code, more heap slots will be used. Ruby <em>does</em> reuse heap slots when objects are garbage collected, and if all of the slots in a section of heap are freed, Ruby will free the entire section, but in most code that's pretty unlikely, meaning that the typical expectation is that when heap is allocated, it's going to <em>stay</em> allocated.</p>
<p>With the 1.8 scaling factor that's in MRI, here's a table to show you how much memory is allocated just for the heap as object counts increase:</p>
<table>
<tr>
<th>Threshold</th>
<th># of Slots</th>
<th>RAM w/ 20 byte RVALUEs</th>
<th>RAM w/ 40 byte RVALUEs</th>
</tr>
<tr>
<td>10000</td>
<td>10001</td>
<td>200020</td>
<td>400040</td>
</tr>
<tr>
<td>28000</td>
<td>18001</td>
<td>360020</td>
<td>720040</td>
</tr>
<tr>
<td>60400</td>
<td>32401</td>
<td>648020</td>
<td>1296040</td>
</tr>
<tr>
<td>118720</td>
<td>58321</td>
<td>1166420</td>
<td>2332840</td>
</tr>
<tr>
<td>223696</td>
<td>104977</td>
<td>2099540</td>
<td>4199080</td>
</tr>
<tr>
<td>412652</td>
<td>188957</td>
<td>3779140</td>
<td>7558280</td>
</tr>
<tr>
<td>752772</td>
<td>340121</td>
<td>6802420</td>
<td>13604840</td>
</tr>
<tr>
<td>1364988</td>
<td>612217</td>
<td>12244340</td>
<td>24488680</td>
</tr>
<tr>
<td>2466976</td>
<td>1101989</td>
<td>22039780</td>
<td>44079560</td>
</tr>
<tr>
<td>4450554</td>
<td>1983579</td>
<td>39671580</td>
<td>79343160</td>
</tr>
</table>
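<p>If you want to check the table yourself, it can be reproduced with a few lines of Ruby. This is only a sketch of the arithmetic described in this post (a first heap of 10000 slots, a 1.8 growth factor truncated to a whole number, and one extra slot per allocation); the constant and method names are made up for illustration and are not MRI's own.</p>
<pre>GROWTH_FACTOR = 1.8      # scaling factor applied after each heap allocation
INITIAL_SLOTS = 10_000   # size of the first heap, as in the first table row

# Returns one row per heap allocation:
# [threshold, slots allocated, bytes at 20 per RVALUE, bytes at 40 per RVALUE]
def heap_growth(allocations)
  slots = INITIAL_SLOTS
  threshold = 0
  rows = []
  allocations.times do
    threshold += slots      # running total, the Threshold column
    allocated = slots + 1   # each allocation asks for one extra slot
    rows.push([threshold, allocated, allocated * 20, allocated * 40])
    slots = (slots * GROWTH_FACTOR).to_i   # truncate to a whole number of slots
  end
  rows
end

heap_growth(10).each { |row| puts row.join("  ") }
</pre>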
<p>As you can see, while those first few allocations are pretty small, they get large fast. With an <code>RVALUE</code> size of 20 bytes, the 10th allocation is about 38MB, and if the <code>RVALUE</code> size is 40 bytes, that's about 76MB.</p>
<p>When talking about Ruby memory allocations, that's about all that there is to it. However, allocations without deallocations eventually make a developer sad. With Ruby, there's no way to specifically deallocate an object. Deallocations are the job of the garbage collection system.</p>
<p>MRI Ruby implements a conservative mark and sweep garbage collector. This means that it operates by walking through memory, marking every object that it can find which is accessible at the current point of execution. After it finishes marking everything, it takes a second pass, the sweep, collecting all of the objects that were <em>not</em> marked and returning their slots to the heap for reuse.</p>
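<p>To make the two passes concrete, here is a toy mark and sweep over a hand-built object graph. It is a sketch of the general technique only, not MRI's implementation: <code>Node</code>, <code>mark</code>, and <code>sweep</code> are hypothetical names, and the real collector finds its roots by scanning the stack and registers rather than being handed a list.</p>
<pre># A toy object: a name, references to other objects, and a mark bit.
Node = Struct.new(:name, :refs, :marked)

# Mark pass: flag everything reachable from the given node.
def mark(node)
  return if node.marked
  node.marked = true
  node.refs.each { |child| mark(child) }
end

# Sweep pass: anything left unmarked is garbage. Marks are cleared
# on the survivors so the next collection starts fresh.
def sweep(heap)
  live, garbage = heap.partition { |node| node.marked }
  live.each { |node| node.marked = false }
  [live, garbage]
end

a = Node.new("a", [], false)
b = Node.new("b", [], false)
c = Node.new("c", [], false)
a.refs.push(b)            # a refers to b; nothing refers to c

heap  = [a, b, c]
roots = [a]
roots.each { |root| mark(root) }
live, garbage = sweep(heap)
puts garbage.map { |node| node.name }.inspect   # prints ["c"]
</pre>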
<p>Garbage collection can be invoked manually, via <code>GC.start</code>, but is typically invoked by Ruby. All of the garbage collection triggers are connected to the allocation behaviors of Ruby, and there are two mechanisms to be aware of, as a developer:</p>
<p>First, as I referred to near the beginning of the article, when the <code>ruby_xmalloc()</code> function runs, it looks at the total size of allocations from the C heap, and if they exceed a hard coded threshold (which defaults to 8000000 bytes), it will trigger a GC cycle. This means that if you have code which does a lot of C heap allocations, or does <em>large</em> C heap allocations, you'll be triggering garbage collection often.</p>
<p>The other main trigger occurs when a new object is created. Remember that each slot in the Ruby heap is used to store data about a single object. So when a new Ruby object is created, a slot on the heap is necessary. </p>
<p>Ruby maintains a linked list of all unused slots in its heaps. This list is called the <code>freelist</code>. <code>rb_newobj()</code>, in <strong>gc.c</strong>, creates new objects. However, it first checks to see if there's anything left in the freelist. If there isn't, it will first invoke <code>garbage_collect()</code>. </p>
<p>The garbage collection code will attempt to add some slots to the freelist by collecting and deallocating unused objects. If it fails, meaning that every object currently allocated on the heap is deemed to be in use, it will call <code>add_heap()</code>, with the effects we discussed earlier.</p>
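<p>That sequence (check the freelist, collect, and only then grow) can be modeled in a few lines. This is a simplified sketch of the flow as described above, not a port of <strong>gc.c</strong>: the <code>ToyHeap</code> class and its counters are hypothetical, and liveness is faked by telling the heap how many objects have become garbage.</p>
<pre>class ToyHeap
  attr_reader :gc_runs, :heaps_added

  def initialize(slots)
    @free = slots                      # slots currently on the freelist
    @garbage = 0                       # used slots that are no longer reachable
    @next_heap = (slots * 1.8).to_i    # size of the next heap to add
    @gc_runs = 0
    @heaps_added = 0
  end

  # Allocation path: try the freelist, then GC, then grow the heap.
  def new_object
    garbage_collect if @free.zero?
    add_heap if @free.zero?
    @free -= 1
  end

  # Pretend that count existing objects have become unreachable.
  def drop(count)
    @garbage += count
  end

  private

  def garbage_collect
    @gc_runs += 1
    @free += @garbage    # reclaimed slots go back on the freelist
    @garbage = 0
  end

  def add_heap
    @heaps_added += 1
    @free += @next_heap
    @next_heap = (@next_heap * 1.8).to_i
  end
end

heap = ToyHeap.new(4)
4.times { heap.new_object }   # fills the first heap
heap.drop(2)                  # two of those objects become garbage
3.times { heap.new_object }   # first GC frees 2 slots; second GC frees none
puts "GC runs: #{heap.gc_runs}, heaps added: #{heap.heaps_added}"
</pre>
<p>The second collection finds nothing to free, so the heap grows, and that growth is the one way street described below.</p>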
<p>This second type of trigger is a very common cause of large processes. Imagine you have some code that queries a database with a query that pulls a large number of records. Maybe it does something cool, like pulling two sets of records, and then uses Ruby's set facilities to get a union of the two sets. It's all very slick, and works just fine. But then you notice that when the code runs, your process size immediately jumps by many megabytes, and it never goes down. What's happened is that your queries created a very large number of temporary objects, and they exceeded the available space in the Ruby heap, so a new heap allocation was performed.</p>
<p>If all that went into this new heap were those temporary objects, then once they were garbage collected, the new heap could be deallocated after it was emptied. But remember what I said earlier: it only removes heap spaces if they're <em>empty</em>. So any new, longer lived object in that heap will anchor the whole thing into your process forever.</p>
<p>This behavior is important to be aware of, because it's one of the easiest ways that a developer can inadvertently bump their Ruby process size up higher than they want. While you shouldn't be paranoid about object creation, you also never want to create thousands or tens of thousands of temporary objects when you could've gotten the job done with hundreds, because in practice, those allocation thresholds are a one way street.</p>
<p>Also, be aware that any object with a C/C++ implementation that allocates its own memory <i>should</i> be deallocating that memory when the object is garbage collected. Every C/C++ extension should define a <code>*_free</code> function, which will be called when the object is garbage collected, and which is responsible for freeing any allocations that took place inside the extension's code. </p>
<p>Memory management in C is an easy place for a programmer to make errors though, so if your code is using an extension, and you're seeing strange memory behavior, it's usually a good idea to double check it. At least make sure that you are on the latest version, and that there are no known memory management related bugs with it.</p>
<p>The original outline of this article was actually written as a response to a customer's trouble ticket here at Engine Yard. He was seeing a large jump in the RSS size of his processes after they ran for a while, and we were trying to figure it out. The customer was seeing a sudden jump of about 43Mb in a long running process.</p>
<p>At the time, it was difficult to really pin the cause down. A sudden jump, when not doing anything extraordinary, fits the MO of a Ruby heap allocation, and the 10th allocation, if you refer to the table above, is <i>almost</i> that large&mdash;but any real serious debugging was going to require substantial work. </p>
<p>This has changed some in the last few months. Don't misunderstand; it's still a lot of work if you have to try to understand, in depth, the memory allocation/deallocation behavior of a complex piece of Ruby code, but now, Joe Damato and Aman Gupta have brought us <a href="http://github.com/ice799/memprof">memprof</a>.</p>
<p>The next time you're trying to understand why your program's RAM usage is doing something that seems strange, arm yourself with the background knowledge from this post, then go grab memprof.</p>
<p>It'll give you detailed information about your process's memory behavior, allocations, deallocations, and in depth details about all of the objects currently in your Ruby process's heap. It will give you all of the details to follow exactly what's happening inside that black box of allocation and deallocation, and given my personal experience in looking for the source of memory leaks and strange memory behavior, it can turn an all day job into a job that takes a half an hour.</p>
<p>Understanding the basics of your Ruby implementation's memory management isn't necessary to write Ruby code, but it's a good idea if you're writing and deploying substantial pieces of software. So, dig in and enjoy! The basics aren't too hard to understand. As always, happy to help answer questions here!</p>
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/mri-memory-allocation-a-primer-for-developers/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Varnish: It&#8217;s Not Just For Wood Anymore</title>
		<link>http://www.engineyard.com/blog/2010/varnish-its-not-just-for-wood-anymore/</link>
		<comments>http://www.engineyard.com/blog/2010/varnish-its-not-just-for-wood-anymore/#comments</comments>
		<pubDate>Thu, 15 Apr 2010 17:00:52 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Varnish]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=3646</guid>
		<description><![CDATA[<p>In previous posts, I've routinely mentioned a piece of software called Varnish. Varnish is a caching reverse proxy for web traffic, and if your job or your interests lean toward production web applications at all, you definitely want to get familiar with it. </p>
<p>This post isn't going to try to make a case for <em>using</em> a caching reverse proxy, as I think that's already sufficiently covered. Instead, it'll focus specifically on an overview of Varnish: what you need to do with it out of the box to implement decent caching for a typical web application, and some of the more sophisticated capabilities you'll want to get familiar with.</p>
<p>Varnish was written from the ground up to be a high performance HTTP accelerator. It leverages its host operating system's own memory management abilities and threading abilities in order to provide a high capacity, high throughput caching system. It has many <a href="http://varnish-cache.org/wiki/VarnishFeatures">features</a> that make it a nice tool, but it does avoid the massive feature bloat of many other caching proxy implementations. </p>
<p>Among its features are load balancing and graceful handling of dead proxy back ends, a built-in Perl-esque <a href="http://varnish-cache.org/wiki/VCL">configuration language</a> that permits sophisticated behavior customization, URL rewriting, and support for the most useful parts of <a href="http://en.wikipedia.org/wiki/Edge_Side_Includes">ESI</a>.</p>
<p>As previously mentioned, Varnish is threaded. Specifically, it manages a thread pool, or a set of thread pools as determined by configuration, and it uses one thread for each connection. This generally works well, but it <em>does</em> mean you need to think about configuration a little bit. If you configure Varnish to accept 20,000 concurrent connections, then it'll be running 20,000 system threads. Make sure you're on a system that isn't going to have its manhood threatened by that situation before you do it in production.</p>
<p>Installing Varnish is straightforward. It's available in most operating system package management systems, though it may not always be the most recent version; definitely check that you're getting an acceptably recent version from your package manager. You can also easily build it from source if you want to ensure that you have the most recent version. Simply <a href="http://sourceforge.net/project/showfiles.php?group_id=155816">download</a> the source from <a href="http://sourceforge.net">Sourceforge</a>. To build:</p>
<pre>./autogen.sh
./configure
make
make install</pre>
<p>With Varnish, building it isn't quite the end of the story. While it'll run out of the box with its build defaults for all parameters, it doesn't actually run very <em>well</em> that way. It'll deliver great performance to a point, but the defaults allow it to be overwhelmed pretty easily. Running a durable Varnish instance requires a bit of configuration love, and the command line configuration options are legion.</p>
<pre>varnishd -a :80 \
-b 127.0.0.1:81 \
-T 127.0.0.1:6082 \
-s file,/var/lib/varnish,100GB \
-f /etc/varnish/default.vcl \
-u nobody \
-g nobody \
-p obj_workspace=4096 \
-p sess_workspace=262144 \
-p listen_depth=2048 \
-p overflow_max=2000 \
-p ping_interval=2 \
-p log_hashstring=off \
-h classic,5000009 \
-p thread_pool_max=1000 \
-p lru_interval=60 \
-p esi_syntax=0x00000003 \
-p sess_timeout=10 \
-p thread_pools=1 \
-p thread_pool_min=100 \
-p shm_workspace=32768 \
-p srcadd_ttl=0 \
-p thread_pool_add_delay=1
</pre>
<p>The command line is outrageously long, but don't hyperventilate. You won't be typing this by hand in a production deployment anyway, because your startups are all scripted, right? I am not going to go over every one of these settings&mdash;the <a href="http://varnish-cache.org/">Varnish web site</a> has lots of getting started documentation to guide you when things get confusing. However, let's take a look at a few of the more interesting parameters that you should know about when configuring.</p>
<p><code>-a :80</code></p>
<p>The -a option provides a host and port for Varnish to listen to. If the host is omitted, the given port is listened to on all interfaces.</p>
<p><code>-b 127.0.0.1:81</code></p>
<p>This provides a single default backend for Varnish to proxy to. It also accepts a HOST:PORT pair.</p>
<p><code>-s file,/var/lib/varnish,100GB</code></p>
<p>Varnish functions by allocating a system controlled area of memory to use to store the cached data. This can either be a malloc allocated area, specified by the keyword 'malloc', or an area backed by a file, specified by the keyword 'file'. The file variant uses mmap, while the malloc variant, with a large cache, will make use of swap space and the swapping subsystem.</p>
<p>If using the malloc type of storage, the only option that one provides is the amount of memory to allocate for the storage area. This is given as a number of bytes, or as a number followed by one of these suffixes:</p>
<p>* K or k for kibibytes<br />
* M or m for mebibytes<br />
* G or g for gibibytes<br />
* T or t for tebibytes</p>
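<p>If you script your startups, it can be handy to sanity check a size before handing it to Varnish. Here is a small Ruby sketch that converts the forms listed above into bytes. The <code>storage_bytes</code> helper is hypothetical and is not part of Varnish; it only understands a bare number or a number with one of the four suffixes.</p>
<pre>SUFFIXES = %w[k m g t]   # each step up is another factor of 1024

# Converts a size such as "512m" or "100G" into a number of bytes.
def storage_bytes(spec)
  match = /\A(\d+)([kmgt])?\z/i.match(spec)
  raise ArgumentError, "unrecognized size: #{spec}" if match.nil?
  exponent = match[2] ? SUFFIXES.index(match[2].downcase) + 1 : 0
  match[1].to_i * 1024**exponent
end

puts storage_bytes("4096")   # 4096
puts storage_bytes("512m")   # 536870912
puts storage_bytes("100G")   # 107374182400
</pre>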
<p>This is pretty straightforward. Tune your cache to the amount of content that you have and the amount of space you have at your disposal. These cache spaces only persist for the life of the Varnish process, which means that if Varnish is killed and restarted, the cache must be populated anew. However, as of version 2.1.0, which was released on March 24th, there is now experimental support for persistent caches.</p>
<p><code>-p overflow_max=2000</code></p>
<p>When there are more accepted requests than there are threads to handle them, Varnish sticks them into an overflow queue. If the overflow queue fills up, and the listen queue (the size of which can be controlled with the <code>listen_depth</code> option) is full, then requests start getting dropped. </p>
<p>Requests that are just sitting there waiting to be handled take up space, so this parameter shouldn't be set absurdly high with no reason. That said, it needs to have a bit of a ceiling to allow for traffic spikes to occur without detrimental effects. This also lets it survive things like someone pointing 'ab' at the proxy and saturating it with requests.</p>
<p>Increasing the size of this parameter is one of the crucial changes from the default configuration which is necessary to help ensure a production capable Varnish deployment. If you don't change it, it's quite easy to DoS Varnish with something as ubiquitous as Apache Bench.</p>
<p><code>-p thread_pool_max=1000</code></p>
<p><code>-p thread_pools=1</code></p>
<p><code>-p thread_pool_min=100</code></p>
<p><code>-p thread_pool_add_delay=1</code></p>
<p>Taken together, these four parameters describe the thread pooling behavior for Varnish.  The <code>thread_pools</code> parameter is self describing. Generally, you probably want one pool per core.</p>
<p>The <code>thread_pool_min</code> parameter gives a bottom limit for the number of threads to maintain, per pool, regardless of traffic.  Don't keep this <em>too</em> low, or you may limit Varnish's ability to rapidly respond to traffic spikes when it's otherwise not very busy. At the same time, setting it <em>too</em> high just increases the amount of time the OS spends babysitting threads that aren't doing anything, so practice moderation.</p>
<p>When Varnish doesn't have enough threads in its thread pool(s) to handle the traffic, it creates new ones. In order to avoid swamping the system, there's a delay between the launching of each thread. The parameter that controls this is <code>thread_pool_add_delay</code>. This defaults to 20 milliseconds, but that's far too long to handle load spikes. The prevailing wisdom right now is to set it at one to two milliseconds.</p>
<p>Finally, Varnish has a limit on the total number of threads that it'll spawn. This is <code>thread_pool_max</code>. Pay attention here, because the semantics of the two limits differ: <code>thread_pool_min</code> is the minimum <em>per thread pool</em>, while <code>thread_pool_max</code> is the maximum for all pools, collectively. So, for example, if one has <code>thread_pools=4</code> and also has <code>thread_pool_max=1000</code> then that means that the entire Varnish process is limited to 1000 threads; this is not a per pool attribute. At the same time, if one had <code>thread_pool_min=100</code>, then there would be a minimum of 400 threads running at all times; that is 100 per thread pool.</p>
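<p>Because the two limits are counted differently, it's worth doing the arithmetic before you deploy. This Ruby sketch simply encodes the semantics described above (minimum per pool, maximum overall); <code>thread_bounds</code> is a hypothetical helper, not something Varnish provides.</p>
<pre># Returns the fewest and the most threads Varnish would run,
# given the per-pool minimum and the overall maximum.
def thread_bounds(thread_pools, thread_pool_min, thread_pool_max)
  {
    min_threads: thread_pools * thread_pool_min,   # the minimum applies to each pool
    max_threads: thread_pool_max                   # the maximum covers all pools together
  }
end

bounds = thread_bounds(4, 100, 1000)
puts "always running: #{bounds[:min_threads]}, ceiling: #{bounds[:max_threads]}"
</pre>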
<p>These command line options just scratch the surface of what one can do with Varnish. Some of the best features of Varnish come from the Varnish Configuration Language. </p>
<p>This language, commonly called VCL, is a domain specific language that is used to customize Varnish's request handling and caching behaviors. In appearance, it is reminiscent of a Perl kept to the most C-like basics, but it is pretty easy to both read and use. VCL is also very fast because Varnish actually translates the VCL code into C and then compiles it into a shared object, on the fly, so even complicated logic expressed in VCL has little overall impact on Varnish performance.</p>
<p>Varnish runs with a simple, functional default VCL configuration, but you may add to the configuration by providing your own VCL file like so: <code>-f /etc/varnish/default.vcl</code></p>
<p>In the sample command line, above, Varnish is given a single back end to proxy to. However, backends can be defined in a VCL file, and when doing so, additional information about the behavior of a backend can be encoded.</p>
<pre>backend fast {
  .host = "fasthost.mydomain.com";
  .port = "http";
  .connect_timeout = 1s;
  .first_byte_timeout = 2s;
  .between_bytes_timeout = 1s;
  .probe = {
    .url = "/ping";
    .timeout = 1s;
    .window = 4;
    .threshold = 4;
  }
}
&nbsp;
backend slow {
  .host = "slowhost.mydomain.com";
  .port = "http";
  .connect_timeout = 6s;
  .first_byte_timeout = 8s;
  .between_bytes_timeout = 3s;
  .probe = {
    .request =
      "GET /ping HTTP/1.1"
      "Host: pinghost.mydomain.com"
      "X-Ping: true"
      "Connection: close";
    .timeout = 5s;
  }
}
</pre>
<p>Take note of those <code>.probe</code> sections. These are completely optional, but if provided, Varnish will use them to perform health checks on the backend. The <code>.window</code> and <code>.threshold</code> options can be used to provide a health tolerance. That is, given a certain number of checks (the <code>.window</code>), how many have to have been successful (the <code>.threshold</code>) for the backend to be considered healthy.</p>
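<p>The window and threshold rule is easy to express in code. The following Ruby sketch models only the rule as stated above, with the most recent probe results kept in an array of booleans; <code>healthy?</code> is a hypothetical name, and Varnish's own bookkeeping is certainly more involved than this.</p>
<pre># results holds probe outcomes, oldest first: true for success, false for failure.
# The backend counts as healthy when at least threshold of the
# last window probes succeeded.
def healthy?(results, window, threshold)
  results.last(window).count(true) >= threshold
end

probes = [true, true, false, true]
puts healthy?(probes, 4, 4)   # false: only 3 of the last 4 probes succeeded
puts healthy?(probes, 4, 3)   # true: a threshold of 3 tolerates one failure
</pre>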
<p>Varnish also has some support for load balancing between backends. It currently only supports round-robin and random selection, but this behavior can be controlled via VCL, as well.</p>
<pre>director plump round-robin {
  { .backend = fast; }
  { .backend = slow; }
  /&#42; Yep, you can define them inline, too &#42;/
  {
    .backend = {
      .host = "alternate.mydomain.com";
      .port = "8080";
    }
  }
}</pre>
<p>or</p>
<pre>director grasshopper random {
  .retries = 3;
  {
    .backend = fast;
    .weight = 9;
  }
  {
    .backend = slow;
    .weight = 1;
  }
}</pre>
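<p>With those weights, roughly nine out of every ten requests should land on the fast backend. Here is one common way to do weighted selection, sketched in Ruby. It illustrates the idea only and is not Varnish's implementation; <code>pick_backend</code> and the <code>roll</code> argument are hypothetical.</p>
<pre>BACKENDS = [["fast", 9], ["slow", 1]]   # name and weight, as in the director above

# roll must be an integer from 0 up to (but not including) the total weight.
# Each backend owns a slice of that range as wide as its weight.
def pick_backend(backends, roll)
  backends.each do |name, weight|
    return name if weight > roll
    roll -= weight
  end
  raise ArgumentError, "roll is outside the total weight"
end

total = BACKENDS.sum { |_, weight| weight }
puts pick_backend(BACKENDS, rand(total))
</pre>
<p>Rolls 0 through 8 select the fast backend and a roll of 9 selects the slow one, which gives the 9 to 1 split.</p>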
<p>As additional features, you can define both access control lists and grace periods for cached content in Varnish. A grace period is simply a period of time after an object in the cache has expired during which it can still be returned in response to a request. You'd use this if there are objects in the cache that take a long time to generate, in order to avoid having a bunch of requests piling up waiting for the generation of the new object.</p>
<p>These parts of VCL are just scratching the surface of the power of VCL, though. VCL offers the ability to define subroutines for grouping your VCL code, several useful built in functions for regular expression matching and cache manipulation, and a whole host of built in subroutines which serve as hooks into the entire request/response cycle for Varnish, allowing you to customize any point of that cycle.</p>
<p>For example, let's say that you are using a round robin director to load balance between backends, and you want to be sure that if a request for a given resource from <em>one</em> backend fails, no more attempts to that backend, for that resource, are made for a short period of time, to allow it to recover from whatever problem it is having. You can do that with VCL.</p>
<pre>sub vcl_recv {
  set req.grace = 60s;
}
&nbsp;
sub vcl_fetch {
  if (beresp.status == 500) {
    set beresp.saintmode = 15s;
    restart;
  }
  set beresp.grace = 60s;
}
</pre>
<p>As another example, consider the case from my <a href="http://www.engineyard.com/blog/2010/architecture-wins-varnish-and-more/">last post</a>, where I used a Ruby proxy to cache content from Redmine. Redmine isn't particularly cache friendly, returning cache control headers that normally don't allow any caching of content. If you wanted to, though, you could make Varnish do it, using VCL.</p>
<pre>sub vcl_fetch {
  /&#42; This just says that no matter what those cache control headers are saying, &#42;/
  /&#42; insert the content into the cache with a TTL of 60s &#42;/
  if (obj.ttl &lt; 60s) {
    set obj.ttl = 60s;
  }
}
&nbsp;
sub vcl_hash {
  /&#42; This causes varnish to use the cookie contents as part &#42;/
  /&#42; of the key for storing and looking up content from the &#42;/
  /&#42; cache. For Redmine, this would mean per-user contents &#42;/
  /&#42; in the Varnish cache. &#42;/
  set req.hash += req.http.cookie;
}
</pre>
<p>That's pretty nice. A few lines, and the behavior of the cache is significantly customized. However, remember earlier in the article when I mentioned that VCL is translated into C code and compiled on the fly into a dynamically linked shared object? Well, this means that you can embed arbitrary C code into your VCL:</p>
<pre>C{
  #include
  #include
}C
&nbsp;
sub vcl_mylibstuff {
  C{
    mylib_superfunction(VRT_r_req_request(sp));
  }C
}
</pre>
<p>The capabilities of VCL are far too expansive to cover well in a short post, but this should give you a taste of how flexible and powerful Varnish is. The Varnish wiki has expanded documentation (which should continue to improve) and a number of examples of using VCL to do useful real world work. Varnish is a true power tool for caching. Check it out, and leave questions and comments here!</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/varnish-its-not-just-for-wood-anymore/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Ruby Scales, AND It&#8217;s Fast &#8211; If You Do It Right!</title>
		<link>http://www.engineyard.com/blog/2010/architecture-wins-varnish-and-more/</link>
		<comments>http://www.engineyard.com/blog/2010/architecture-wins-varnish-and-more/#comments</comments>
		<pubDate>Fri, 19 Mar 2010 21:00:17 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Tips & Tricks]]></category>
		<category><![CDATA[Varnish]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=3561</guid>
		<description><![CDATA[<blockquote><p>"Why does everybody say that CPUs are fast nowadays and that 'it doesn't matter that language XYZ is slow'?</p>
<p>It <em>does</em> matter: web applications. If your applications can't serve all the visitors, then you're going to lose your customer or you'll have to learn some other language with better performance.</p>
<p>Once our application serves 200 million page views each day... the languange is really sensitive, so we go with C/C++."</p>
<p>—ruby-talk Thread</p></blockquote>
<p>Performance: it's a topic that comes up over and over again in the Ruby world, and everyone's got an opinion. Unfortunately, those opinions often focus on minutiae, and tend to miss the big picture.</p>
<p>On top of that, things in the Ruby world are far more complex, today, when discussing performance, because one really has to talk about Ruby performance in the context of a specific implementation. Are we talking about Matz Ruby 1.8.x, or 1.9.x?  Are we talking about <a href="http://rubini.us/">Rubinius</a> or <a href="http://jruby.org/">JRuby</a>? What about <a href="http://www.macruby.org/">MacRuby</a>? <a href="http://www.ironruby.net/">IronRuby</a>? <a href="http://maglev.gemstone.com/">MagLev</a>? Every one of these has a different performance profile and level of completeness.</p>
<p>For the purposes of this post, and for the purposes of the attention I paid to the two quotes above, I'm going to focus on Matz' Ruby 1.8.x (MRI). It's been <em>the</em> Ruby for many years, and it's what most people are pointing at when they complain about Ruby being slow. Don't just take my word for it though—check out <a href="http://shootout.alioth.debian.org/">The Computer Language Benchmarks Game</a> for a substantial set of flawed micro-benchmarks using a plethora of different languages. What they call "Ruby MRI" is, at this time, ruby 1.8.7 (2009-06-12 patchlevel 174). It's not even close to being the most recent version of 1.8.7, but that's OK. The benchmarks there have to be taken with a couple grains of salt, anyway.</p>
<div style="float: right; overflow: hidden; border: 1px solid #000; text-align: center; clear: right; font-size: 75%; color: #666666; padding: 0px; margin: 0px 0px 10px 10px;"><a href="http://shootout.alioth.debian.org/"><img style="margin: 0; padding: 0px;" src="http://eyweb-images.s3.amazonaws.com/blog_benchmarks.jpg" alt="" width="243" height="126" /></a></div>
<p>Here's why: Micro-benchmarks for languages have only a weak relationship to the performance of complex systems implemented in those languages, even when implemented well. Or, to put it another way, the speed at which a language can complete a simple, discrete task, is not necessarily a strong predictor of how fast a complicated application, composed of many tasks, will perform when <em>implemented</em> in that language. There are other factors which come into play that can strongly influence overall performance; factors like application architecture, and the ability to leverage higher-level built in capabilities, that simplify things which may be complex to implement in other languages.</p>
<p>Many of you probably know people who claim Ruby can't scale, or is too slow for business-critical web applications. Since you're reading this, you also know those people are wrong. In fact, it's usually far easier to scale a Rails application's web-facing aspect than it is to scale the data storage parts of the application. Nonetheless, scaling that web-facing aspect has costs, and if your application can return content to your customers more efficiently, reducing your hardware needs, you reduce your costs.</p>
<p>Returning to the ruby-talk thread that those quotes came from, my response included an assertion that I thought I could spin up a single <a href="http://www.engineyard.com/products/cloud">Engine Yard Cloud</a> instance, and that running it with an all Ruby stack, I could push 200,000,000 requests through it in less than a day. When I say an all Ruby stack, I'm not talking about the database layer, but rather, the application and anything above it (such as the web server). I wouldn't use Apache, nginx, or any other non-Ruby web server, and I'd use a real, complex application.</p>
<p>Since I already had a 64bit, 4ECU instance running that I use for testing Ruby 1.8.6 changes, I just used that existing instance.  I used Ruby 1.8.6 pl287 for this. I could've used any version, as <a href="http://rvm.beginrescueend.com/">RVM</a> makes it simple to pick and choose, but I selected that one because many sites have run on it for a long time (though if you are running on it now, you really should <a href="ftp://ruby-lang.org/pub/ruby/">upgrade</a>), and by being a less than current version, it serves my point well.</p>
<p>For generating test traffic, I used the venerable <a href="http://httpd.apache.org/docs/2.0/programs/ab.html">Apache Bench</a>. Even after all these years it's still got some buggy corner cases, but it's straightforward and easy to use, and its own performance is high enough that it takes some pretty fast test subjects before you start running into the performance limitations of the tool, instead of the test subjects. I ran it on the same machine as my application's stack because I wanted to eliminate the network as a factor in results, and just feed as many requests to my stack as quickly as possible.</p>
<p><strong><!-- more --></strong></p>
<p>The test application was <a href="http://www.redmine.org/">Redmine</a>, version 0.8.7. I selected Redmine because it's a complex application familiar to many people, and it's easy to install. It's also not yet optimized for speed. Development has been far more focused on features and function than on optimizing for resource usage efficiency. The Rails version that I used is 2.3.2.</p>
<p>So, after installing and configuring Redmine, I started it:</p>
<p><code>ruby script/server -e production -d</code></p>
<p>Note that I did not use <a href="http://mongrel.rubyforge.org/">Mongrel</a>, <a href="http://github.com/wyhaines/swiftiply/blob/master/src/swiftcore/evented_mongrel.rb">evented_mongrel</a>, <a href="http://code.macournoyer.com/thin/">Thin</a>, or <a href="http://unicorn.bogomips.org/">anything</a> else <a href="http://rainbows.bogomips.org">sophisticated</a> as the container for the application. It was just <a href="http://www.ruby-doc.org/stdlib/libdoc/webrick/rdoc/">webrick</a>, and it was just a single instance of webrick.</p>
<p>I then threw some random data into it just so that there was something other than the empty pages. So, let's see how it performed!</p>
<p><code>ab -n 10000 -c 1 http://127.0.0.1:3000/</code></p>
<p>Hmmm. I rode my exercise bike 1.3 miles while that ran... That didn't feel fast at <em>all</em>.</p>
<pre escaped="true">Requests per second:    33.98 [#/sec] (mean)
Time per request:       29.432 [ms] (mean)
Time per request:       29.432 [ms] (mean, across all concurrent requests)</pre>
<p>OK. I mean, that's not <em>horrible</em>. Redmine isn't a lightweight app, and that's over 2.5 million requests a day on a single process. What happens if there's some concurrency?</p>
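<p>The per-day figure is just the measured rate multiplied out, and the same arithmetic run in reverse gives the sustained rate that 200 million requests a day demands. A quick Ruby sketch (the method names are mine, purely for illustration):</p>
<pre>SECONDS_PER_DAY = 24 * 60 * 60

# Requests served in a day at a steady rate.
def requests_per_day(requests_per_second)
  (requests_per_second * SECONDS_PER_DAY).round
end

# Steady rate needed to serve a given number of requests in a day.
def required_rate(requests_per_day)
  requests_per_day / SECONDS_PER_DAY.to_f
end

puts requests_per_day(33.98)               # 2935872
puts required_rate(200_000_000).round(1)   # 2314.8
</pre>
<p>So a single webrick process is short of the target rate by a factor of roughly 68.</p>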
<pre escaped="true">ab -n 10000 -c 25 http://127.0.0.1:3000/
Requests per second:    31.11 [#/sec] (mean)
Time per request:       803.707 [ms] (mean)
Time per request:       32.148 [ms] (mean, across all concurrent requests)</pre>
<p>That was a 1.4 mile benchmark ride. Shoot; does that mean Ruby really <em>is</em> slow? That did <em>not</em> go in the direction we need, and let's be real here: in a real application deployment, there are going to be concurrent requests—many of them, if you're at all successful. It's pretty clear what direction everything was moving in, but I wanted to take it one step further.</p>
<pre escaped="true">ab -n 10000 -c 500 http://127.0.0.1:3000/
Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
apr_socket_recv: Connection reset by peer (104)</pre>
<p>Well, good to know. Clearly, Redmine running inside of webrick can scale, but there are limits that aren't too hard to hit on a single process. If we were spreading these requests over multiple processes on multiple instances, we could reasonably scale to many millions of requests per day, even running our code on webrick, assuming that the database layer could keep up with all of that. However, that's still a long way from two hundred million requests per day.</p>
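<p>The per-day figures follow directly from the mean rates that ab reports. A minimal sketch of that arithmetic, assuming the server could sustain the measured rate for a full 24 hours (the helper names here are made up for illustration):</p>

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Requests handled in a day if the given mean rate were sustained.
def requests_per_day(requests_per_second)
  (requests_per_second * SECONDS_PER_DAY).round
end

# ab's "Time per request (mean)" in milliseconds: with `concurrency`
# requests in flight at once, each waits about concurrency / rate seconds.
def mean_latency_ms(concurrency, requests_per_second)
  concurrency / requests_per_second.to_f * 1000.0
end

puts requests_per_day(33.98)              # webrick at -c 1
puts requests_per_day(31.11)              # webrick at -c 25
puts mean_latency_ms(25, 31.11).round(1)  # close to ab's 803.707 ms
```

<p>The small gap between the computed latency and ab's own figure comes from ab working with an unrounded rate.</p>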
<p>Even if we were running on a Ruby implementation that was 2x as fast, or 5x as fast, and even if the application were running in a faster container, the basic problem is still the same—we'd have to throw hardware at it until the problem went away. Even if you spent a lot of time laboriously building Redmine in C++ while focusing on performance, you still wouldn't escape the need, with this simple architecture, to throw hardware at the problem. So, what do you do if you need more throughput out of your application, but aren't excited about adding more hardware resources?</p>
<p>Consider these runs:</p>
<pre escaped="true">ab -n 10000 -c 1 -C '_redmine_session=9ec759408f1ae3c6f919e50baba5a3dc; path=/' http://127.0.0.1/
Requests per second:    2839.37 [#/sec] (mean)
Time per request:       0.352 [ms] (mean)
Time per request:       0.352 [ms] (mean, across all concurrent requests)</pre>
<pre escaped="true">ab -n 10000 -c 1000 -C '_redmine_session=9ec759408f1ae3c6f919e50baba5a3dc; path=/' http://127.0.0.1/
Requests per second:    3862.33 [#/sec] (mean)
Time per request:       258.911 [ms] (mean)
Time per request:       0.259 [ms] (mean, across all concurrent requests)</pre>
<pre escaped="true">ab -n 100000 -c 25 -k  -C '_redmine_session=9ec759408f1ae3c6f919e50baba5a3dc; path=/' http://127.0.0.1/
Requests per second:    7797.39 [#/sec] (mean)
Time per request:       3.206 [ms] (mean)
Time per request:       0.128 [ms] (mean, across all concurrent requests)</pre>
<p>I barely had time to turn the cranks on the exercise bike for those runs! It turns out that to get that performance, I needed to look at my architecture and rethink how I was positioning my application's web facing aspect. Most applications, even highly dynamic ones, show lots of the same stuff to the users. In many cases completely identical content is being displayed for many different users. It's senseless to regenerate this content over and over again. This is where caching enters the architecture picture.</p>
<p>Rails 2 has some built in support for caching. It'll do page caching, which basically writes a static copy of a dynamically generated page to a persistent location, so that on subsequent hits the web server can deliver the page. This works great, but it has limitations.</p>
<p>All content, for everyone, for a given URL must be identical, and you're responsible for providing a sweeper that clears old content. Also, requests will still fall down to your web server, which may mean that you still encounter some significant performance penalties when delivering your content in some situations. For example, nginx delivers static files quite quickly <em>if</em> it's sitting on top of a fast disk. Sit it on a slow disk, though, and page caching returns limited dividends. If it can work for your application though, use it.</p>
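<p>To make the idea concrete, here is a rough sketch of what page caching boils down to. This is not how Rails implements it; the <code>PageCache</code> class and its file layout are hypothetical, and in a real deployment the web server would serve the stored file directly without touching Ruby at all.</p>

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical page cache: one static file per URL path.
class PageCache
  def initialize(root)
    @root = root
  end

  # Return the stored copy if there is one; otherwise run the block (the
  # expensive dynamic render), write its result to disk, and return it.
  def fetch(path)
    file = file_for(path)
    return File.read(file) if File.exist?(file)
    body = yield
    FileUtils.mkdir_p(File.dirname(file))
    File.open(file, 'w') { |fh| fh.write(body) }
    body
  end

  # The "sweeper": remove a stale page so the next hit regenerates it.
  def expire(path)
    FileUtils.rm_f(file_for(path))
  end

  private

  def file_for(path)
    name = path == '/' ? 'index' : path.sub(%r{\A/}, '')
    File.join(@root, "#{name}.html")
  end
end

cache = PageCache.new(Dir.mktmpdir)
renders = 0
2.times { cache.fetch('/issues') { renders += 1; '<h1>Issues</h1>' } }
puts renders   # the block ran only for the first request
```

<p>Note that the cache key is the URL path alone, which is exactly why every visitor has to get identical content.</p>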
<p>Rails also supports partial caching in some different guises—to the file system, to memory, to memcached, etc. Partial caching can be a win <em>architecturally</em>, because it bypasses all of the heavy work involved in generating content; your app can just assemble pregenerated fragments into a complete page. If you haven't done so, look into that as well. It can be very helpful.</p>
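<p>Stripped of the Rails helpers, the pattern looks something like the following sketch. The <code>FragmentCache</code> class and <code>render_page</code> method are invented for illustration, and a plain Hash stands in for memcached or the file system.</p>

```ruby
# Hypothetical in-memory fragment store.
class FragmentCache
  def initialize
    @fragments = {}
  end

  # Return the stored fragment for key, generating and storing it on a miss.
  def fetch(key)
    @fragments.fetch(key) { @fragments[key] = yield }
  end

  def expire(key)
    @fragments.delete(key)
  end
end

FRAGMENTS = FragmentCache.new

# Only the greeting is generated per user; the sidebar is shared.
def render_page(user_name)
  sidebar = FRAGMENTS.fetch('sidebar') { '<ul><li>expensive sidebar</li></ul>' }
  "<h1>Hello, #{user_name}</h1>" + sidebar
end

puts render_page('kirk')
```

<p>Unlike page caching, the request still reaches the application, but the application does far less work per request.</p>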
<p>Along those same conceptual lines, there's also <a href="http://en.wikipedia.org/wiki/Edge_Side_Includes">edge side includes, or ESI</a>. ESI essentially lets one's application return a skeleton of a page, or an incomplete page with some special markup embedded. The proxy that receives that content, and that understands ESI markup can then insert content, either from its own cache, or from a subrequest that it issues to some other URL.</p>
<p>This lets a proxy cache a generated, but incomplete page, yet still fill it out with smaller pieces of dynamically generated content without pushing all of that work back into the dynamic application. So it's a bit like partial caching, but it's handled at a shallower level in the stack. I've heard that Rails 3 will have a plugin to facilitate the use of ESI, and that it may come built in with a later dot release. Not all reverse caching proxies support ESI, but many of them do.</p>
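<p>As a rough illustration of the proxy's side of that bargain, the sketch below expands <code>&lt;esi:include src="..."/&gt;</code> tags in a cached skeleton. It handles only that one self-closing form; the <code>expand_esi</code> name is made up, and the block stands in for the subrequest a real proxy would issue.</p>

```ruby
# Matches a self-closing include tag such as <esi:include src="/nav"/>.
ESI_INCLUDE = %r{<esi:include\s+src="([^"]+)"\s*/>}

# Replace each include tag with content for its src, taken from the
# proxy's own fragment cache or, on a miss, from the block.
def expand_esi(skeleton, fragment_cache)
  skeleton.gsub(ESI_INCLUDE) do
    src = Regexp.last_match(1)
    fragment_cache[src] ||= yield(src)
  end
end

skeleton = '<body><esi:include src="/fragments/nav"/><p>cached body</p></body>'
fragments = {}
puts expand_esi(skeleton, fragments) { |src| "<nav>fetched #{src}</nav>" }
```

<p>The skeleton itself can stay cached for a long time, while each fragment gets its own, possibly much shorter, lifetime.</p>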
<p>For Redmine, page caching doesn't work very well. It, like many applications, uses cookies. Applications can use cookies to identify users, to handle authentication, or to persist data on the user's browser, instead of on the server. When an application needs to deliver cookies in addition to content, simple page caching won't work. Redmine falls into this category. And besides... I promised to use a Ruby stack, so leveraging Nginx or Apache to serve files from a page cache would be cheating.</p>
<p>What I really needed was a caching reverse proxy that would sit in front of the application. It had to be smart enough to do the right thing with regard to caching content that has cookies attached (at least for some definition of <em>the right thing</em>), and it had to be stubborn enough to not-quite follow the <code>Cache-Control</code> headers that Redmine set. It needed to be implemented in Ruby, and it had to be fast enough to be worthwhile.</p>
<p>Most caching reverse proxies are implemented in <em>fast</em> languages. <a href="http://varnish-cache.org/">Varnish</a>, one of the fastest caching reverse proxies, is written in C. <a href="http://wiki.nginx.org/Main">Nginx</a>, which can be configured to provide a caching reverse proxy, is also implemented in C, as is <a href="http://www.squid-cache.org/">Squid</a>, one of the oldest proxy servers. <a href="http://cwiki.apache.org/TS/traffic-server.html">Traffic Server</a> is implemented in C++.</p>
<p>Refer back to the benchmarks site. <a href="http://shootout.alioth.debian.org/u32/benchmark.php?test=all&amp;lang=ruby&amp;lang2=gcc">C</a> is a lot faster than MRI Ruby. <a href="http://shootout.alioth.debian.org/u32/benchmark.php?test=all&amp;lang=ruby&amp;lang2=gpp">C++</a> is significantly faster, too. So, to borrow a phrase from my grandmother, how on God's green Earth do I expect to write a proxy in Ruby that can compete with one in a language that benchmarks 100x-200x faster than it is?</p>
<p>Bullheaded stubbornness in the face of ignorance? Well, yes, a little bit, combined with some specific architectural decisions. Most of those proxies try to do everything. I think there are probably configuration options in Squid that would get it to cook breakfast for me. Traffic Server probably won't cook breakfast for me, yet, but it will make the bed, and somewhere in the TODO, I'm sure they have plans to allow for it to make breakfast, too, if you can figure out how to configure it. Varnish is one of the fastest proxies, and it gets its speed, in large part, because it won't make the bed or cook my breakfast. It's like Charles Emerson Winchester III from M*A*S*H: <em>"I do one thing at a time, I do it very well, and then I move on."</em> Varnish does still take some configuration education to get it to work well, though.</p>
<p>And that is the secret to keeping things fast. Or, at least one of the secrets, anyway. I took it one step further. My approach was:</p>
<blockquote><p>Do one thing at a time, do it well enough, and then move on.</p></blockquote>
<p>A couple of years ago I wrote a very fast proxy and simple web server in Ruby that I called <a href="http://github.com/wyhaines/swiftiply">Swiftiply</a>. It leverages <a href="http://github.com/eventmachine/eventmachine">EventMachine</a> for handling network traffic, and then tries to squeeze the rest of the performance that it needs out of Ruby by not providing any more capability than is really needed to get the job done. <a href="http://www.engineyard.com/blog/author/ezra/">Someone</a> once said that "No code is faster than no code."</p>
<p>Swiftiply didn't provide enough capability for a caching reverse proxy, but it did have the capability to serve and cache static assets very quickly (on a lot of hardware my benchmarking efforts have run up against Apache Bench's own performance limits), and it did already function as a proxy, so much of the capability was there. One advantage to it being written in Ruby was that it was relatively straightforward for me to add additional capability to it. So I did.</p>
<p>To really handle Redmine properly requires the ability to cache different versions of the same URL, where the only differentiator is the cookies. Also, Redmine sets a Cache-Control header that looks like this:</p>
<p><code>Cache-Control: private, max-age=0, must-revalidate</code></p>
<p>Without digging into it deeply, this means that public caches should not cache the content, and private caches need to confirm with the server that it has valid content before using it. But we want to ignore that (unless Cache-Control is set to no-cache, in which case we'll pay attention), because we do want to keep private content cached, and we do not want to have to always go back to the application to revalidate on every request. My assumption is that it is OK if, for example, a new issue is added, but it takes a few seconds before a url which shows the issues is refreshed to display that new issue.</p>
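<p>Those two rules, one cache entry per URL and cookie combination, and a short fixed lifetime that ignores everything in <code>Cache-Control</code> except <code>no-cache</code>, can be sketched in a few lines. This is not Swiftiply's actual code; the class, its method names, and the TTL parameter are assumptions made for the example.</p>

```ruby
# Hypothetical cache for a reverse proxy: entries are keyed by URL plus
# the request's Cookie header and kept for a fixed number of seconds.
class CookieAwareCache
  Entry = Struct.new(:body, :stored_at)

  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @entries = {}
  end

  # Cached body for this URL/cookie pair, or nil if absent or too old.
  def lookup(url, cookie, now = Time.now)
    entry = @entries[[url, cookie]]
    entry.body if entry && now - entry.stored_at < @ttl
  end

  # Store the response unless the application explicitly said no-cache.
  # "private", "max-age=0" and "must-revalidate" are deliberately ignored.
  def store(url, cookie, body, cache_control, now = Time.now)
    return false if cache_control.to_s =~ /\bno-cache\b/i
    @entries[[url, cookie]] = Entry.new(body, now)
    true
  end
end

cache = CookieAwareCache.new(5)
cache.store('/issues', '_redmine_session=abc', '<html>issues</html>',
            'private, max-age=0, must-revalidate')
puts cache.lookup('/issues', '_redmine_session=abc').inspect
puts cache.lookup('/issues', '_redmine_session=xyz').inspect
```

<p>The TTL is the "few seconds" of staleness that I decided was acceptable; a different application might need to choose differently.</p>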
<p>The end result is a caching reverse proxy with very few tuning knobs, and behavior that's not quite HTTP 1.1 correct, but that is very fast, stable, and hackable. It's probably not actually as fast as it <em>could</em> be, since I piggy backed the implementation onto something that's doing more than I really need, but it's good enough. Ruby, as a "slow" language, delivers on something that runs very fast and is good enough for the goal that I had.</p>
<p>If you're wondering how many requests were pushed through my Ruby stack in 24 hours:</p>
<pre escaped="true">Requests per second:    3283.09 [#/sec] (mean)</pre>
<p>That's 283,659,084 requests in 24 hours (and none of them were keepalive requests). All handled in a Ruby stack. All with a completely browseable and useable Redmine installation that was still responsive while the test was running; I added issues, edited them, removed them, and did administrative actions with no perceptible delays.</p>
<p>I readily admit that this isn't a test that faithfully simulates real production loads; you probably aren't going to roll out a production web app servicing two or three hundred million requests a day on a single modestly sized EY Cloud instance. But if you were doing something that wasn't going to be bottlenecked by the data store, you just might be able to do it, all with slow, slow Ruby. Not bad.</p>
<p>It's no Varnish, and it never will be. Varnish does far more, more correctly, and all a little bit faster. Varnish also requires some careful tuning to run well, and is not nearly so hackable, so there are tradeoffs. If you need more performance out of your application, look closely at what a caching reverse proxy can do for you. In the larger view of your application's deployment architecture, it can make a tremendous difference in your users' experience. Varnish is a great piece of software, and deserves a post of its own covering configuration and usage.</p>
<p>And if you truly find that you need some specialized capability, don't be afraid to spike something out with Ruby. Paying a little attention to writing lean code that delivers just the capabilities that you need can result in surprisingly fast, capable code, even in a slow implementation of a slow language like Ruby ;)</p>
<p>Questions and comments welcome!</p>
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2010/architecture-wins-varnish-and-more/feed/</wfw:commentRss>
		<slash:comments>63</slash:comments>
		</item>
		<item>
		<title>Key-Value Stores in Ruby: The Wrap Up</title>
		<link>http://www.engineyard.com/blog/2009/key-value-stores-in-ruby-the-wrap-up/</link>
		<comments>http://www.engineyard.com/blog/2009/key-value-stores-in-ruby-the-wrap-up/#comments</comments>
		<pubDate>Tue, 17 Nov 2009 18:00:17 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[couchdb]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Key-Value Stores]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[S3]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=2541</guid>
		<description><![CDATA[<p>This last article in <a href="http://www.engineyard.com/blog/2009/key-value-stores-in-ruby/">our key-value series</a> will briefly cover a few interesting topics that could each have had full articles of their own. This means that if they seem interesting to you, follow the links that I provide to get more information on them. Lastly, I'll wrap up by introducing <a href="http://github.com/wycats/moneta/">Moneta</a>, written by Yehuda Katz, which provides a unified API for a wide variety of different Key-Value Stores. If you want to write code that allows the user to choose the store to use, you'll want to pay attention to Moneta.</p>
<p>The difficult part of discussing Key-Value Stores <em>today</em> is that it's a product area seeing rapid development and constant evolution. There are more interesting stores and libraries available than can easily be covered, even in a series like this. I could probably be writing posts every two weeks into next year without running out of subjects. So, alas, many things must be left <em>un</em>discussed or <em>under</em>discussed. But let's move on to the topics we <em>can</em> cover...</p>
<h2>CouchDB</h2>
<p>The first great Key-Value Store that isn't going to get its own article is <a href="http://couchdb.apache.org/">CouchDB</a>.  Apache's CouchDB is a document-oriented database, like <a href="http://www.engineyard.com/blog/2009/mongodb-a-light-in-the-darkness-key-value-stores-part-5/">MongoDB</a>.  It, however, exposes a RESTful JSON based API that you address with a built in HTTP interface. Like MongoDB, it offers a schema free data store. CouchDB offers solid, built-in replication, and uses JavaScript as its query language. It is a powerful tool.</p>
<p>There are several Ruby libraries which can be used to facilitate using CouchDB. In the examples below, I have used <a href="http://github.com/jchris/couchrest">CouchRest</a>, which is based on CouchDB's own <a href="http://svn.apache.org/repos/asf/incubator/couchdb/trunk/share/www/script/couch.js">couch.js</a> library:</p>
<pre lang="ruby">require 'rubygems'
require 'couchrest'
require 'yaml'

DBH = CouchRest.database!('exercise-log')

response = DBH.save_doc({
  :date => Time.now,
  :activity => ARGV[0],
  :duration => ARGV[1]})

stored_record = DBH.get(response['id'])
puts "Stored:\n#{stored_record.to_yaml}"</pre>
<pre lang="shell">wyhaines$ ruby /tmp/couch1.rb pedaling 97:34
Stored:
--- !map:CouchRest::Document
duration: "97:34"
_rev: 1-eb6f6e3a3e2eae0cd99f3fcbc63d29d6
_id: 0d9e71f44b3e0d3a2013c282bbccb5a0
activity: pedaling
date: 2009/11/12 21:07:45 +0000</pre>
<p>Like MongoDB, one can store any set of keys/values together as a document in CouchDB, and then retrieve it later.  CouchRest returns a response from the server that contains an <code>id</code> field, which can be used to retrieve the record that was just stored.</p>
<p>For more complex queries of the document store, one can use views.  Views have a lot of power, because they are ultimately defined using JavaScript, but they don't lend themselves to easy ad-hoc manipulation of the database.</p>
<pre lang="ruby">DBH.save_doc({
  "_id" => "_design/query",
  :views => {
    :allkeys => {
      :map => "function(doc) { " \
              "for (var word in doc) { " \
              "if (!word.match(/^_/)) emit(word,doc[word])}}"
    }
  }
})</pre>
<p>That inserts a view into the database that will be identified by <code>query/allkeys</code>.  What a view does is defined by the JavaScript code  it contains.  Once a view is inserted into CouchDB, using it is simple:</p>
<pre lang="ruby">puts DBH.view('query/allkeys').to_yaml</pre>
<p>That particular function was lifted shamelessly from the CouchRest README, and just has a couple terms renamed to make it a little more clear. The output:</p>
<pre lang="yaml">---
total_rows: 3
rows:
- id: 0d9e71f44b3e0d3a2013c282bbccb5a0
  value: pedaling
  key: activity
- id: 0d9e71f44b3e0d3a2013c282bbccb5a0
  value: 2009/11/12 21:07:45 +0000
  key: date
- id: 0d9e71f44b3e0d3a2013c282bbccb5a0
  value: "97:34"
  key: duration
offset: 0</pre>
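<p>If the JavaScript is hard to read inside a Ruby string, it may help to see roughly the same logic written in Ruby. This is only an emulation of what the map function emits for a single document; CouchDB itself runs the JavaScript, and <code>allkeys_map</code> is a name I made up.</p>

```ruby
# Ruby stand-in for the map function in the query/allkeys view: emit one
# row for every field whose name does not start with an underscore.
def allkeys_map(doc)
  rows = []
  doc.each do |field, value|
    next if field.to_s =~ /^_/
    rows << { 'id' => doc['_id'], 'key' => field.to_s, 'value' => value }
  end
  rows.sort_by { |row| row['key'] }   # CouchDB orders view rows by key
end

doc = {
  '_id'      => '0d9e71f44b3e0d3a2013c282bbccb5a0',
  '_rev'     => '1-eb6f6e3a3e2eae0cd99f3fcbc63d29d6',
  'activity' => 'pedaling',
  'date'     => '2009/11/12 21:07:45 +0000',
  'duration' => '97:34'
}

allkeys_map(doc).each { |row| puts "#{row['key']}: #{row['value']}" }
```

<p>The <code>_id</code> and <code>_rev</code> fields are skipped, which is why the view reports three rows for a document that holds five fields.</p>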
<p>This is really just the tip of the iceberg with CouchDB/CouchRest; there's a wealth of functionality. CouchDB views are implemented with map/reduce capability, which means you can use them to crunch some pretty complex problems on your data. Additionally, CouchRest provides a <code>CouchRest::ExtendedDocument</code>, which your own classes can inherit from. This lets you  easily create a Ruby model for your data, which is then transparently stored inside CouchDB.</p>
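<pre lang="ruby"># Minimal sketch; declare properties, views, etc. as your model needs them.
class Exercise &lt; CouchRest::ExtendedDocument
  use_database DBH
end

Exercise.new(:activity => "running", :date => Time.now, :duration => "23:44")</pre>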
<p>Dig into the CouchDB and CouchRest documentation if this looks interesting to you.</p>
<h2>S3</h2>
<p>I just wanted to briefly mention <a href="http://aws.amazon.com/s3/">Amazon's Simple Storage Service</a>. It is, fundamentally, a simple HTTP accessible Key-Value Store that Amazon has turned into a service.  Requests to S3 will have higher latency than requests to a locally hosted data store (and its <a href="http://www.engineyard.com/blog/2009/rails-in-the-wild-5-client-side-performance-observations/">response latency can be high too</a>), but if you want a simple, robust store that will scale to as much data as you have to push at it, you might seriously consider S3.</p>
<h2>Moneta</h2>
<p><a href="http://github.com/wycats/moneta/">Moneta</a> is a unified interface to a variety of different key-value type data stores. That is, the same code can be run against a variety of different backing stores, and it will just work. Moneta supports the following stores as of this posting:</p>
<ul>
<li>Basic File Store</li>
<li>BerkeleyDB</li>
<li>CouchDB</li>
<li>DataMapper</li>
<li>File store for xattr</li>
<li>In-memory store</li>
<li>Memcache store</li>
<li>Redis</li>
<li>S3</li>
<li>SDBM</li>
<li>Tokyo</li>
<li>Xattrs in a file system</li>
</ul>
<p>Consider this example, which, again, uses CouchDB:</p>
<pre lang="ruby">
require 'rubygems'
require 'yaml'
require 'moneta'
require 'moneta/couch'

cache = Moneta::Couch.new(:db => 'football')

cache['1a_final'] = {
  :where => 'Laramie; War Memorial Stadium',
  :when => "11:30 MST",
  :who => "Southeast Cyclones &amp; Lingle-Ft. Laramie Doggers",
  :prediction => "SE Cyclones by 14"}

puts cache['1a_final'].to_yaml</pre>
<pre lang="bash">wyhaines$ ruby /tmp/moneta1.rb
---
- prediction: SE Cyclones by 14
  when: 11:30 MST
  who: Southeast Cyclones &amp; Lingle-Ft. Laramie Doggers
  where: Laramie; War Memorial Stadium</pre>
<p>It works, very simply.  If I want to change the code to use something else, like a file based store, it's as simple as changing one line:</p>
<pre lang="diff">--- couch.rb    2009-11-19 15:00:07.000000000 -0700
+++ file.rb     2009-11-19 15:01:12.000000000 -0700
@@ -1,9 +1,9 @@
 require 'rubygems'
 require 'yaml'
 require 'moneta'
-require 'moneta/couch'
+require 'moneta/file'

-cache = Moneta::Couch.new(:db => 'football')
+cache = Moneta::File.new(:path => '/tmp/football')

 cache['1a_final'] = {
   :where => 'Laramie; War Memorial Stadium',</pre>
<p>The rest of the code works without alteration.  The Moneta API is designed to be very similar to that of <code>Hash</code>.  It has a limited feature set, but the features it provides work identically across all of the supported platforms. For example, it doesn't currently support iteration or partial matches. If your Key-Value Store needs are simple and you want something that can work with whatever store your <em>users</em> want to use, definitely check out Moneta; it's a well written tool.</p>
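<p>The design idea is easy to sketch without Moneta at all: give every store the same small, Hash-like surface, and the calling code stops caring which one it has. The two classes below are hypothetical stand-ins written for this example, not Moneta's own adapters.</p>

```ruby
require 'tmpdir'

# Two hypothetical stores exposing the same Hash-like subset.
class MemoryStore
  def initialize
    @data = {}
  end

  def [](key)
    @data[key]
  end

  def []=(key, value)
    @data[key] = value
  end

  def delete(key)
    @data.delete(key)
  end
end

class FileStore
  def initialize(dir)
    @dir = dir
  end

  def [](key)
    path = File.join(@dir, key)
    Marshal.load(File.binread(path)) if File.exist?(path)
  end

  def []=(key, value)
    File.binwrite(File.join(@dir, key), Marshal.dump(value))
  end

  def delete(key)
    value = self[key]
    path = File.join(@dir, key)
    File.delete(path) if File.exist?(path)
    value
  end
end

# Caller code never needs to know which store it was handed.
def record_prediction(store)
  store['1a_final'] = { :prediction => 'SE Cyclones by 14' }
  store['1a_final'][:prediction]
end

puts record_prediction(MemoryStore.new)
puts record_prediction(FileStore.new(Dir.mktmpdir))
```

<p>Keeping the shared surface this small is what makes it practical to support so many backends, and it is also why features like iteration are harder to offer everywhere.</p>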
<p>With that, we've reached the end of this series. It's been fun to explore the unique features of each of these stores, as well as the threads that unify their different approaches to the problem of non-SQL, key-value data storage. I hope that I've exposed you to new and useful tools.</p>
<p>The landscape of Key-Value Stores is changing rapidly, so it is difficult to stay fully informed all the time. For instance, just a couple days ago there was a blog post implementing a <a href="http://legitimatesounding.com/blog/NoSQL_meet_SQL.html">SQL front end for CouchDB</a>. It's done in Perl, but all it would take is an interested person and a little time, and you could have it in Ruby, too.</p>
<p>If you use a Key-Value Store system, or plan to, keep your eyes open for new developments, because you can bet that someone else will have something interesting next week or next month that may change the landscape again. As always, leave feedback in the comments, and thanks for reading!</p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/key-value-stores-in-ruby-the-wrap-up/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>MongoDB: A Light in the Darkness! (Key Value Stores Part 5)</title>
		<link>http://www.engineyard.com/blog/2009/mongodb-a-light-in-the-darkness-key-value-stores-part-5/</link>
		<comments>http://www.engineyard.com/blog/2009/mongodb-a-light-in-the-darkness-key-value-stores-part-5/#comments</comments>
		<pubDate>Thu, 24 Sep 2009 18:00:47 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Database Sharding]]></category>
		<category><![CDATA[Key-Value Stores]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[Redis]]></category>
		<category><![CDATA[Tokyo Cabinet]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=2393</guid>
		<description><![CDATA[<p>The universe was dark and chaotic. Bits of broken matter swirled everywhere, illuminated by flashes of explosive light, and the rare gleam of something brighter and more persistent. Those bright lights of persistence always seemed to be shrouded in a miasma of cosmic dust. Then it happened!</p>
<p>A twist of gravimetric interplay pulled two of these lights towards each other, where they swirled and danced for a time prior to crashing into each other.  That cosmic convergence showered the surrounding space with illumination as the resulting maelstrom of persistence coalesced towards stability and slashed through the miasma, shining a new light on the cosmos. That new light of persistence was good, and was called MongoDB.</p>
<p>MongoDB can be thought of as the goodness that erupts when a traditional key-value store collides with a relational database management system, mixing their essences into something that's not quite either, but rather something novel and fascinating.</p>
<p>MongoDB is a document-oriented database. If you haven't used one before, that may sound strange, but it's really pretty simple. A document is a set of keys and values that, together, represent a larger set of data. Conceptually, it's a lot like a table with a free form schema. If you have used Tokyo Cabinet tables, they are functionally similar. It's a very useful paradigm because it allows you to store and then access your data in a simple, direct, and flexible way.</p>
<p>Installing MongoDB is simple.  Just hit <a href="http://www.mongodb.org/display/DOCS/Downloads">http://www.mongodb.org/display/DOCS/Downloads</a>, and download the appropriate package for your platform. Then:</p>
<pre lang="bash">    mkdir -p /data/db
    tar -xvzf PACKAGE

    ./mongodb-xxxxxxx/bin/mongod &amp;</pre>
<p>At that point, you have a running instance of MongoDB. Now, try a simple interaction with it:</p>
<pre lang="bash">    ./mongodb-xxxxxxx/bin/mongo

     > db.foo.save( { a : 1 } )
     > db.foo.findOne()</pre>
<p>Awesome! You're off to the races.</p>
<p>MongoDB support is available in many languages, making it a good choice for a system that has to work in a polyglot environment; all of the major languages have support.  The Ruby package is a gem known as <em>mongodb-mongo</em>. To install it, first make sure rubygems knows that gems.github.com is a valid source for gems: <code>gem source --list</code></p>
<p>Add gems.github.com if it isn't shown in that list: <code>gem source --add http://gems.github.com</code></p>
<p>Then install: <code>gem install mongodb-mongo</code></p>
<p>Or, if you want to install the version that uses a C extension for better performance: <code>gem install mongodb-mongo_ext</code></p>
<h2>Using MongoDB is Simple<span id="more-2393"></span></h2>
<pre lang="irb">
>> require 'rubygems'; require 'mongo'
=> true
>> include XGen::Mongo::Driver
=> Object
>> db = Mongo.new.db('finance')
=> #<XGen::Mongo::Driver::DB:0x2a98da7038 @socket=#<TCPSocket:0x2a98da5be8>, @port=27017,
   @auto_reconnect=nil, @semaphore=#<Object:0x2a98da6ed0 @mu_waiting=[], @mu_locked=false>, @name="finance",
   @nodes=[["localhost", 27017]], @host="localhost",
   @strict=nil, @pk_factory=nil, @slave_ok=nil>

>> collection = db.collection('stocks')
=> #<XGen::Mongo::Driver::Collection:0x2a98d94208 @name="stocks", @hint=nil,
   @db=#<XGen::Mongo::Driver::DB:0x2a98da7038
   @socket=#<TCPSocket:0x2a98da5be8>, @port=27017,
   @auto_reconnect=nil, @semaphore=#<Object:0x2a98da6ed0 @mu_waiting=[], @mu_locked=false>, @name="finance",
   @nodes=[["localhost", 27017]], @host="localhost",
   @strict=nil, @pk_factory=nil, @slave_ok=nil>>

>> stock = {'ticker' => 'GOOG',
>> 'name' => 'Google Inc.',
>> 'cusip' => '38259P508',
>> 'reference' => 'http://www.google.com/finance?q=goog'}
=> {"reference"=>"http://www.google.com/finance?q=goog",
    "name"=>"Google Inc.", "cusip"=>"38259P508", "ticker"=>"GOOG"}

>> collection.insert stock
=> {"reference"=>"http://www.google.com/finance?q=goog",
    "name"=>"Google Inc.", "cusip"=>"38259P508", "ticker"=>"GOOG"}</pre>
<p>That's all there is to it.  Just insert your hash representation of your document, and it'll be stored for you.  To retrieve one or more documents, use the <code>#find</code> method:</p>
<pre lang="irb">
>> cursor = collection.find('ticker' => 'GOOG')
=> #<XGen::Mongo::Driver::Cursor:0x2a98d28940 @closed=false,
   @query=#<XGen::Mongo::Driver::Query:0x2a98d28af8
   @order_by=nil, @fields=nil, @number_to_return=0,
   @selector={"ticker"=>"GOOG"}, @hint=nil,
   @number_to_skip=0, @explain=nil>, @rows=nil, @cache=[],
   @query_run=false, @num_to_return=0, @can_call_to_a=true,
   @db=#<XGen::Mongo::Driver::DB:0x2a98da7038 @socket=#<TCPSocket:0x2a98da5be8>, @port=27017,
   @auto_reconnect=nil, @semaphore=#<Object:0x2a98da6ed0 @mu_waiting=[], @mu_locked=false>, @name="finance",
   @nodes=[["localhost", 27017]], @host="localhost",
   @strict=nil, @pk_factory=nil, @slave_ok=nil>,
   @collection=#<XGen::Mongo::Driver::Collection:0x2a98d94208 @name="stocks", @hint=nil,
   @db=#<XGen::Mongo::Driver::DB:0x2a98da7038
   @socket=#<TCPSocket:0x2a98da5be8>, @port=27017,
   @auto_reconnect=nil, @semaphore=#<Object:0x2a98da6ed0 @mu_waiting=[], @mu_locked=false>, @name="finance",
   @nodes=[["localhost", 27017]], @host="localhost",
   @strict=nil, @pk_factory=nil, @slave_ok=nil>>>

>> cursor.next_object.inspect
=> "{\"_id\"=>#<XGen::Mongo::Driver::ObjectID:0x2a98ce9448 @data=[74, 184, 252, 71, 34, 116, 195, 23, 83, 115, 44, 164]>,
   \"reference\"=>\"http://www.google.com/finance?q=goog\",
   \"name\"=>\"Google Inc.\", \"cusip\"=>\"38259P508\",
   \"ticker\"=>\"GOOG\"}"</pre>
<p>As you can see in the example above, <code>#find</code> is simple.  It takes a hash which describes keys to search, and the values in them to search for. It returns a cursor object that can be enumerated in order to retrieve the return results. So in a case where you have many records that were returned as the result of a <code>find</code> operation, you could do something like this:</p>
<pre lang="ruby">
collection.find('price_date' => '2009-09-21').each do |stock|
  # do stuff with stock
end</pre>
<p>If your query should only return a single data item, or you only care about the first of a set of data that might match, you can use <code>#find_first</code>, like this:</p>
<pre lang="irb">
>> collection.find_first('ticker' => 'GOOG')
=> {"_id"=>#<XGen::Mongo::Driver::ObjectID:0x2a98ccd090 @data=[74, 184, 252, 71, 34, 116, 195, 23, 83, 115, 44, 164]>,
   "reference"=>"http://www.google.com/finance?q=goog",
   "name"=>"Google Inc.", "cusip"=>"38259P508", "ticker"=>"GOOG"}</pre>
<p>Notice in the above set of returned data that there is one additional field that is added to the record.  MongoDB reserves all fields that start with the <strong>_</strong> character for internal use.  The <code>_id</code> field is a unique identifier for that row of data.  It receives special indexing and treatment by MongoDB in order to make many db operations more efficient.</p>
<p>So, if you're like me, you're looking at these examples and wondering how you move beyond <code>find_first(FIELD => VALUE)</code>, which is obviously limited to searching only for exact matches. MongoDB has you covered:</p>
<ul>
<li>Boolean searches: <code>collection.find({'price' => {'$gt' => 10.00}})</code></li>
<li>Regular expressions: <code>collection.find({'ticker' => /^MS/})</code></li>
<li>Sets: <code>collection.find({'ticker' => {'$in' => ['GOOG','YHOO']}})</code></li>
<li>Sorting and limiting: <code>collection.find({'cusip' => {'$gt' => '580'}}, {:limit => '100', :sort => 'ticker'})</code></li>
</ul>
<p>In this way, MongoDB provides much of the query capability of a SQL database.</p>
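<p>To get a feel for what those selector hashes mean, here is a small pure-Ruby sketch that applies the same three kinds of conditions to an array of hashes. It is only an illustration of the semantics, not how MongoDB or the driver evaluates queries, and the prices are made-up sample data.</p>

```ruby
# Does a single field value satisfy one condition from a selector?
def condition_matches?(value, condition)
  case condition
  when Regexp
    !value.nil? && !(value.to_s =~ condition).nil?
  when Hash
    condition.all? do |operator, operand|
      case operator
      when '$gt' then !value.nil? && value > operand
      when '$in' then operand.include?(value)
      else raise ArgumentError, "unsupported operator #{operator}"
      end
    end
  else
    value == condition
  end
end

# A document matches when every key in the selector matches.
def matches?(doc, selector)
  selector.all? { |key, condition| condition_matches?(doc[key], condition) }
end

stocks = [
  { 'ticker' => 'GOOG', 'price' => 500.0 },
  { 'ticker' => 'MSFT', 'price' => 25.0 },
  { 'ticker' => 'YHOO', 'price' => 5.0 }
]

[{ 'price' => { '$gt' => 10.00 } },
 { 'ticker' => /^MS/ },
 { 'ticker' => { '$in' => ['GOOG', 'YHOO'] } }].each do |selector|
  found = stocks.select { |doc| matches?(doc, selector) }
  puts found.map { |doc| doc['ticker'] }.join(', ')
end
```

<p>The appeal of this style is that a query is just data: it can be built, merged, and passed around like any other Ruby hash.</p>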
<p>While you can query the document store on any key, if there are keys that you expect to be doing a lot of queries with, you should create an index on that key.  Doing so dramatically increases the speed at which the data can be queried, especially when there is a lot of it.  To do so:</p>
<pre lang="irb">
>> collection.create_index('key')
=> "key_1"</pre>
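<p>Why does an index help so much? Without one, every query has to look at every document. The sketch below shows the difference using a plain Ruby Hash as the "index". That is a simplification I am making for the example; a real database index is a more elaborate structure that also supports range queries.</p>

```ruby
# Linear scan: every document is examined on every query.
def scan(documents, key, value)
  documents.select { |doc| doc[key] == value }
end

# Build the index once; each later lookup is a single hash access.
def build_index(documents, key)
  documents.group_by { |doc| doc[key] }
end

# Made-up sample documents.
documents = (1..50_000).map { |i| { 'ticker' => "T#{i}", 'price' => i } }
index = build_index(documents, 'ticker')

puts scan(documents, 'ticker', 'T49999').inspect
puts index.fetch('T49999', []).inspect
```

<p>Both lookups return the same document, but the second one does not get slower as the collection grows. The cost is paid up front, and again on every write that has to keep the index current.</p>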
<p>In addition to its key-value-like storage capabilities, MongoDB has one other interesting capability that I want to reveal.  It offers a GridFS storage system that lets people store complete files within the database.  The Ruby library for Mongo that provides access to this capability is called <code>mongo/gridfs</code>.  It essentially permits you to do file IO into and out of a MongoDB database.</p>
<pre lang="irb">
>> require 'rubygems'; require 'mongo'; require 'mongo/gridfs'
=> true
>>  include XGen::Mongo::Driver
=> Object
>> include XGen::Mongo::GridFS
=> Object
>> db = Mongo.new.db('finance')
=> #<XGen::Mongo::Driver::DB:0x2a98d41800 @auto_reconnect=nil, @host="localhost",
   @semaphore=#<Object:0x2a98d416c0 @mu_waiting=[],
   @mu_locked=false>, @name="finance",
   @nodes=[["localhost", 27017]], @strict=nil,
   @pk_factory=nil, @slave_ok=nil,
   @socket=#<TCPSocket:0x2a98d40928>, @port=27017>

>> GridStore.open(db,'testfile','w+') {|fh| fh.puts "This is a test."}
=> nil

>> GridStore.open(db,'testfile','r') {|fh| puts fh.read}
This is a test.
=> nil</pre>
<p>As you can see, MongoDB is very easy to use.  It is not a screaming speed demon like a simple key-value store (such as a <a href="http://www.engineyard.com/blog/2009/key-value-stores-for-ruby-part-4-to-redis-or-not-to-redis/" target="_blank">Redis</a> or <a href="http://www.engineyard.com/blog/2009/key-value-stores-for-ruby-part-2-tokyo-cabinet/" target="_blank">Tokyo Cabinet</a>), but it performs more than adequately.  On commodity Linux hardware, tests showed about 2,500 simple document insertions per second, and about 2,800 reads per second using the gem without the C extension.</p>
<p>MongoDB does not yet have production-quality sharding capabilities, but there is now alpha-level support for automatic sharding, so it's only a matter of time before MongoDB becomes a fully production-ready, horizontally scalable key-value document store.</p>
<p>MongoDB's charm is that it mixes a very powerful, expansive query model with a free-form key-value-like data store, while still giving adequate performance. It is ideal for storing documents in a database. Query syntax isn't the prettiest thing around, but with an ease of use that rivals that of <a href="http://www.engineyard.com/blog/2009/key-value-stores-for-ruby-part-4-to-redis-or-not-to-redis/" target="_blank">Redis</a>, MongoDB should be a strong contender if you have complex data storage needs.</p>
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/mongodb-a-light-in-the-darkness-key-value-stores-part-5/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>To Redis or Not To Redis? (Key-Value Stores Part 4)</title>
		<link>http://www.engineyard.com/blog/2009/key-value-stores-for-ruby-part-4-to-redis-or-not-to-redis/</link>
		<comments>http://www.engineyard.com/blog/2009/key-value-stores-for-ruby-part-4-to-redis-or-not-to-redis/#comments</comments>
		<pubDate>Thu, 10 Sep 2009 17:00:07 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Key-Value Stores]]></category>
		<category><![CDATA[Redis]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=2148</guid>
		<description><![CDATA[<p>Welcome to another post in our key-value series! This week, Redis! Redis is a persistent in-memory key-value store written in C by Salvatore Sanfilippo. It's currently at version 1.0. So let's get down to it: "To Redis or Not to Redis?" That's the question...</p>
<p>So, let's say you have a situation where...</p>
<ul>
<li>You want a key-value store that's blazingly fast</li>
<li>Your data set is small enough that it can fit in available RAM</li>
<li>It's OK if some recently updated records are lost in a catastrophic failure</li>
<li>Your life would be a lot easier if it was cheap and easy to do set and list operations atomically</li>
</ul>
<p>If this describes your situation, you should take a serious look at <a href="http://code.google.com/p/redis/">Redis</a>. It provides a very fast store in part because it keeps the data set in memory. It handles persistence by asynchronously writing changes after a configurable number of seconds or number of updates have occurred, which means that if the Redis server goes down unexpectedly, it is possible to lose some records. (Redis does offer a master-slave replication mode which mitigates this risk, though).  Finally, Redis provides storage for data structures other than strings.</p>
<p>With Redis, a value can also be a list or a set, and Redis provides atomic operations for manipulating those values. This feature eliminates the need for a lot of potentially troublesome locking antics if you need to maintain consistent lists or sets that are manipulated by multiple clients at the same time.</p>
<p>Furthermore, while Redis doesn't inherently support a sharded, horizontally scalable architecture  <a href="http://www.engineyard.com/blog/2009/cassandra-and-ruby-a-love-affair/" target="_blank">like Cassandra does</a>, some Redis clients, including the <a href="http://github.com/ezmobius/redis-rb/tree/master">Ruby</a> one (by our own Ezra Zygmuntowicz), support consistent hashing and distribution of data across multiple servers. So, at least when using a client library that supports it, like the Ruby library does, Redis offers a compelling combination of performance with scalability.</p>
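<p>Client-side consistent hashing can be sketched in plain Ruby. This illustrates the general technique only; it is not the code inside redis-rb. The <code>HashRing</code> class, its 100 virtual points per server, and the server addresses are all assumptions made for the example.</p>
<pre lang="ruby">
require 'digest/md5'

# Hypothetical consistent-hash ring; not the redis-rb implementation.
class HashRing
  def initialize(servers, replicas = 100)
    @ring = {}
    servers.each do |server|
      # Several virtual points per server even out the key distribution.
      replicas.times { |i| @ring[hash_for("#{server}:#{i}")] = server }
    end
    @sorted_points = @ring.keys.sort
  end

  # Walk clockwise to the first point at or after the key's hash,
  # wrapping around to the first point on the ring.
  def server_for(key)
    key_hash = hash_for(key)
    point = @sorted_points.find { |candidate| candidate >= key_hash } || @sorted_points.first
    @ring[point]
  end

  private

  # Use the first 32 bits of an MD5 digest as the position on the ring.
  def hash_for(string)
    Digest::MD5.hexdigest(string)[0, 8].to_i(16)
  end
end

servers = ['10.0.0.1:6379', '10.0.0.2:6379', '10.0.0.3:6379']
ring = HashRing.new(servers)
keys = (1..1000).map { |i| "key#{i}" }
before = keys.map { |key| ring.server_for(key) }

# Grow the cluster by one server and see which keys change owner.
bigger = HashRing.new(servers + ['10.0.0.4:6379'])
after = keys.map { |key| bigger.server_for(key) }
moved = keys.each_index.select { |i| before[i] != after[i] }
</pre>
<p>Because the existing points never move, the only keys in <code>moved</code> are the ones that now fall on the new server's points. Everything else stays where it was, which is what makes the scheme attractive for growing a cluster.</p>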
<p><span id="more-2148"></span></p>
<p>After you've installed Redis and started up an instance of <code>redis-server</code>, you're ready to use it.  If you haven't already, grab Ezra's <a href="http://github.com/ezmobius/redis-rb/tree/master">redis-rb</a> library and install it.</p>
<pre lang="irb">
>> require 'rubygems'; require 'redis'
=> true
>> redis = Redis.new
=> #<Redis:0x2a98943500 @sock=#<TCPSocket:0x2a98943348>,
   @host="127.0.0.1", @logger=nil, @password=nil,
   @timeout=5, @db=0, @port=6379>
>> redis['key'] = 'value'
=> "value"
>> redis['key']
=> "value"</pre>
<p>Functionally, this is a lot more like what you're probably used to when thinking about a key-value store (versus what you saw with Cassandra's data storage model). Redis does have a concept of multiple databases, where each database is a separate key-value namespace, but Redis keeps it simple. Databases are numbered, starting with 0, and if you don't tell Redis which database you want to use, it assumes you are using database 0.</p>
<pre lang="irb">
>> another_db = Redis.new(:db => 2)
=> #<Redis:0x2a988bbc68 @sock=#<TCPSocket:0x2a988bb920>,
   @host="127.0.0.1", @logger=nil, @password=nil,
   @timeout=5, @db=2, @port=6379>
>> puts another_db
Redis Client connected to 127.0.0.1:6379 against DB 2
=> nil
>> another_db['key'] = 'Altoids FTW!'
=> "Altoids FTW!"
>> redis['key']
=> "value"
>> another_db['key']
=> "Altoids FTW!"</pre>
<p>Redis supports several atomic operations on the data in a database, including moving data from one database to another, as well as incrementing and decrementing values.</p>
<pre lang="irb">>> redis['hits'] = 1
=> 1
>> redis['hits']
=> "1"
>> redis.incr('hits')
=> 2
>> redis.incr('misses')
=> 1</pre>
<p>Notice a few things in the above example: first of all, Redis value data types are either strings, lists, or sets. So, when a numeric 1 was assigned as the value for a key, the client actually stored the <code>to_s</code> version of that value, <code>"1"</code>.</p>
<p>Second, notice that you don't need to initialize a counter before using it. If you reference a key in an increment or decrement operation that doesn't exist, it will be automatically vivified for you. Finally, as mentioned just a moment ago, numbers don't appear anywhere in the list of Redis data types, so the increment/decrement operations work on a simple principle—try to interpret the value as a long, and then work with whatever you get.  So, be careful not to increment or decrement a key that has non-numeric data in it. Rather than throwing some sort of exception, Redis will happily attempt to do what you are asking, and clobber your data along the way.</p>
<pre lang="irb">
>> redis['not_a_counter'] = 'There be kittens!'
=> "There be kittens!"
>> redis.incr('not_a_counter')
=> 1
>> redis['not_a_counter']
=> "1"</pre>
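<p>The counter behavior described above can be imitated with a few lines of Ruby. This is a hypothetical in-memory stand-in (the <code>TinyStore</code> class is invented for the example), meant only to show why a missing key starts from zero and why text gets clobbered. It is not the Redis implementation.</p>
<pre lang="ruby">
# Hypothetical in-memory stand-in for the counter behavior described above.
class TinyStore
  def initialize
    @data = {}
  end

  # Everything is stored as a string, as the client does with to_s.
  def []=(key, value)
    @data[key] = value.to_s
  end

  def [](key)
    @data[key]
  end

  # Interpret whatever is stored as an integer, then add to it.
  # nil.to_i and 'some text'.to_i are both 0, so missing keys and
  # non-numeric values both restart the count from zero.
  def incr(key, by = 1)
    number = @data[key].to_i + by
    @data[key] = number.to_s
    number
  end
end

store = TinyStore.new
store['hits'] = 1
hits   = store.incr('hits')      # counter existed, so it becomes 2
misses = store.incr('misses')    # created on first use

store['not_a_counter'] = 'There be kittens!'
clobbered = store.incr('not_a_counter')   # the text is replaced by a count
</pre>
<p>If a key might hold non-numeric data, checking its contents before incrementing is the only protection this model offers.</p>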
<p>Dealing with a really fast, straightforward key-value store with atomic increment/decrement is pretty useful in itself, but Redis <em>really</em> starts to shine when you look at what can be done with list and set operations. Let's say that you want to keep an audit log of client sessions in your application. You might start with something like this:</p>
<p class="listing-title">audit_log.rb</p>
<pre lang="ruby">class AuditLog

  def initialize(args)
    @db = args[:db]
    @id = "audit_log_#{args[:id]}"
  end

  def >>(msg)
    @db.push_tail @id, "#{Time.now.to_s}: #{msg}"
  end

  def to_a
    @db.list_range(@id, 0, -1)
  end

  def method_missing(meth, *args)
    @db.send(meth,@id,*args)
  end

end</pre>
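<p>The class above relies on only two list calls, <code>push_tail</code> and <code>list_range</code>. As a rough sketch of their semantics, here is a hypothetical in-memory stand-in (<code>FakeListStore</code> is invented for this example and is not part of redis-rb). It assumes that ranges are inclusive at both ends and that negative indexes count back from the end of the list.</p>
<pre lang="ruby">
# Hypothetical in-memory stand-in for the two list calls AuditLog uses.
class FakeListStore
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # Append to the end of the list; the list is created on first use.
  def push_tail(key, value)
    @lists[key].push(value)
    'OK'
  end

  # Inclusive range; negative indexes count back from the end, so
  # (0, -1) returns the whole list.
  def list_range(key, first, last)
    return [] unless @lists.key?(key)
    @lists[key][first..last] || []
  end
end

store = FakeListStore.new
store.push_tail('audit_log_customer_x', 'opened account')
store.push_tail('audit_log_customer_x', 'saved preferences')
store.push_tail('audit_log_customer_x', 'logout')

everything = store.list_range('audit_log_customer_x', 0, -1)
last_two   = store.list_range('audit_log_customer_x', 1, 2)
</pre>
<p>Passing an instance of this stand-in as <code>:db</code> is one way to exercise <code>AuditLog</code> in a unit test without a running server.</p>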
<pre lang="irb">
>> require 'rubygems'; require 'redis'; require 'audit_log'
=> true
>> redis = Redis.new
=> #<Redis:0x2a9880acd8 @logger=nil, @host="127.0.0.1",
   @timeout=5, @password=nil, @port=6379, @db=0,
   @sock=#<TCPSocket:0x2a9880abe8>>
>> log = AuditLog.new(:db => redis, :id => 'customer_x')
=> #<AuditLog:0x2a987bd9b0 @id="audit_log_customer_x",
   @db=#<Redis:0x2a9880acd8 @logger=nil, @host="127.0.0.1",
   @timeout=5, @password=nil, @port=6379, @db=0,
   @sock=#<TCPSocket:0x2a9880abe8>>>
>> log >> "opened account"
=> "OK"
>> log >> "saved preferences"
=> "OK"
>> log >> "logout"
=> "OK"
>> log.to_a
=> ["Wed Sep 09 07:59:12 -0500 2009: opened account",
    "Wed Sep 09 07:59:36 -0500 2009: saved preferences",
    "Wed Sep 09 08:00:16 -0500 2009: logout"]
>> log.list_range(1,2)
=> ["Wed Sep 09 07:59:36 -0500 2009: saved preferences",
    "Wed Sep 09 08:00:16 -0500 2009: logout"]</pre>
<p>Sets in Redis are also easy to work with. Just like everything else, there's no special preparation necessary. You just open up the Redis database and start using them. Imagine that you are creating the world's next great dating site. You allow people to enter lists of keywords to describe themselves, and then you use the intersection of these keywords to help determine how well two people match each other.</p>
<p class="listing-title">date_keywords.rb</p>
<pre lang="ruby">class DateKeywords

  attr_reader :keyword_id

  def initialize(args)
    @db = args[:db]
    @keyword_id = "keywords_#{args[:id]}"
  end

  def insert_keyword_set(keywords)
    keywords.each { |word| add_keyword word }
  end

  def add_keyword(keyword)
    @db.set_add @keyword_id, keyword
  end   

  def find_commonalities(potential_date)
    @db.set_intersect @keyword_id, potential_date.keyword_id
  end   

end</pre>
<pre lang="irb">
>> require 'rubygems'; require 'redis'; require 'date_keywords'
=> true

>> redis = Redis.new
=> #<Redis:0x2a9893ae50 @sock=#<TCPSocket:0x2a9893ad60>,
   @host="127.0.0.1", @logger=nil, @password=nil, @timeout=5,
   @db=0, @port=6379>

>> gal_words = DateKeywords.new(:db => redis, :id => 'gal')
=> #<DateKeywords:0x2a988ce200 @keyword_id="keywords_gal",
   @db=#<Redis:0x2a9893ae50 @sock=#<TCPSocket:0x2a9893ad60>,
   @host="127.0.0.1", @logger=nil, @password=nil, @timeout=5,
   @db=0, @port=6379>>

>> guy_words = DateKeywords.new(:db => redis, :id => 'guy')
=> #<DateKeywords:0x2a98894758 @keyword_id="keywords_guy",
   @db=#<Redis:0x2a9893ae50 @sock=#<TCPSocket:0x2a9893ad60>,
   @host="127.0.0.1", @logger=nil, @password=nil, @timeout=5,
   @db=0, @port=6379>>

>> gal_words.insert_keyword_set(['adventurous','affectionate',
   'camping','church','cooking','country','dancing','faith',
   'farm','laughter','loyal','morals','movies','music',
   'outdoors','ranch','respect','sunsets walking'])
=> ["adventurous", "affectionate", "camping", "church",
   "cooking", "country", "dancing", "faith", "farm",
   "laughter", "loyal", "morals", "movies", "music",
   "outdoors", "ranch", "respect", "sunsets walking"]

>> guy_words.insert_keyword_set(['architecture','beach',
   'camping','carpenter','considerate','creative','family',
   'funny','genuine','giving','happy','historicalhouses',
   'ireland','italy','kids','kind','laughter','loyal',
   'music','roadtrip','smile','travel','trust'])
=> ["architecture", "beach", "camping", "carpenter",
    "considerate", "creative", "family", "funny", "genuine",
    "giving", "happy", "historicalhouses", "ireland", "italy",
    "kids", "kind", "laughter", "loyal", "music", "roadtrip",
    "smile", "travel", "trust"]

>> guy_words.find_commonalities(gal_words)
=> ["laughter", "camping", "loyal", "music"]</pre>
<p>It works like a charm. Given two long lists of values, Redis intersected the sets for us. By that measure, though, it doesn't look like those two people have a lot in common. Unless opposites really do attract, maybe they should keep looking.  Because each operation is atomic, this works in the typical web scenario where there can be multiple processes simultaneously inserting, removing, and intersecting data. No need to worry about locking.</p>
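<p>The same intersection can be reproduced locally with Ruby's standard <code>Set</code> class, which is handy for checking what to expect back from the server. The shortened keyword lists and the overlap score below are assumptions added for illustration; neither the post's <code>DateKeywords</code> class nor Redis computes such a score.</p>
<pre lang="ruby">
require 'set'

# Shortened keyword lists, for illustration only.
gal = Set.new(%w[camping church laughter loyal music outdoors])
guy = Set.new(%w[beach camping family laughter loyal music travel])

# The keywords both people chose.
common = gal.intersection(guy)

# Hypothetical match score: shared keywords divided by all distinct
# keywords (the Jaccard index). 0.0 means nothing shared, 1.0 identical.
score = common.size.to_f / gal.union(guy).size
</pre>
<p>A ratio like this makes two matches comparable even when one couple entered far more keywords than the other.</p>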
<p>The joy of using Redis is that it's simple to use, but there's considerable depth to the API. It's likely that any string, list or set value operation you can think of is there already. Keys can have TTL values, so that they time out of the data store. You can get and set in one operation. You can increment or decrement by more than one. You can pull random keys, or rename keys, or get the db size. You can push, pop, get ranges, etc... from lists, and do any set operation imaginable on sets. Complex sorting is supported.  And there's a lot more than that. The API really has impressive depth.</p>
<p>Given all of this, we come back to the title of this article: To Redis or Not to Redis. As an alternative, Tokyo Cabinet is very fast for a synchronous key-value store, and it <em>does</em> support some features that Redis does not, such as tables. Redis permits a master/slave setup, which can alleviate fears of data loss from failure, but it's not as certain as something like Tokyo Cabinet, which will write the data as soon as it gets it. On the other hand, Redis is blazingly fast, incredibly easy to use, and will support just about anything you can think of doing with your data.</p>
<p>If you have a large data set that <em>cannot</em> comfortably fit into RAM, Redis is not the key-value store for you to use, but if you have smaller sets, and if you can live with the asynchronous write behavior, then, for me, the answer is definitely "to Redis."</p>
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/key-value-stores-for-ruby-part-4-to-redis-or-not-to-redis/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Cassandra and Ruby: A Love Affair? (Key-Value Stores Part 3)</title>
		<link>http://www.engineyard.com/blog/2009/cassandra-and-ruby-a-love-affair/</link>
		<comments>http://www.engineyard.com/blog/2009/cassandra-and-ruby-a-love-affair/#comments</comments>
		<pubDate>Mon, 31 Aug 2009 17:00:50 +0000</pubDate>
		<dc:creator>Kirk Haines</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Tips & Tricks]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Key-Value Stores]]></category>

		<guid isPermaLink="false">http://www.engineyard.com/blog/?p=2126</guid>
		<description><![CDATA[<p>Most of today's up and coming key-value stores are more than <em>just</em> simple key-value stores. You saw this when we looked at <a href="http://www.engineyard.com/blog/2009/key-value-stores-for-ruby-part-2-tokyo-cabinet/">Tokyo Cabinet</a> which, in addition to simple key-value capabilities, adds more sophisticated abilities, such as database-like tables. In this post we'll look at <a href="http://incubator.apache.org/cassandra/">Cassandra</a> -- a modern key-value store that continues this trend. Cassandra was originally developed by Facebook and released to open source last year. The Facebook team describes Cassandra as (Google) BigTable running on top of an Amazon Dynamo-like infrastructure.</p>
<p><a href="http://incubator.apache.org/cassandra/">Cassandra</a> is implemented using Java, and unlike Tokyo Cabinet, is designed to be distributed. One key feature of its distributed architecture is that it is an  <a href="http://www.allthingsdistributed.com/2008/12/eventually_consistent.html">eventually consistent</a> design. For Cassandra, scalability isn't about absolute speed, but about adding system capacity at a reasonable cost, while retaining reasonable speed. A data store that promises immediate consistency <a href="http://portal.acm.org/citation.cfm?doid=564585.564601">sacrifices either availability or the ability to survive network partitioning</a>, and when you write internet applications that need to scale, those are the two properties that are generally the most desirable.</p>
<p>For example, consider Twitter. As a Twitter user, which of the following options would you select?</p>
<ol>
<li>When you view your timeline, it is always correct, BUT sometimes it can't be viewed at all</li>
<li>You always view your timeline, but it sometimes takes time before the timeline reflects new posts</li>
</ol>
<p>From the loud griping that the Twitter fail whale causes, I think most people prefer availability to immediate consistency. Eventual consistency within a reasonable time period is sufficient, and that's exactly what Cassandra provides.</p>
<p>With Cassandra, a write will always succeed, but a read will not always immediately reflect the result of that write. The benefit is that you can expand the capacity of your Cassandra based storage system just by adding more nodes to it.</p>
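<p>A toy model makes the trade-off easier to see. The sketch below is a deliberately simplified illustration of eventual consistency, not Cassandra's replication protocol: the <code>ToyCluster</code> class is hypothetical, a write is acknowledged by one replica, and the other replicas catch up only when <code>sync</code> runs.</p>
<pre lang="ruby">
# Hypothetical toy cluster: a write is acknowledged by one replica and
# copied to the others only when `sync` runs.
class ToyCluster
  def initialize(replica_count)
    @replicas = Array.new(replica_count) { {} }
    @pending = []
  end

  # The write succeeds as soon as one replica has it.
  def write(key, value)
    @replicas.first[key] = value
    @pending.push([key, value])
    true
  end

  # A read may be served by any replica, so it can return stale data.
  def read(key, replica_index)
    @replicas[replica_index][key]
  end

  # Background replication: deliver queued writes to every replica.
  def sync
    @pending.each do |key, value|
      @replicas.each { |replica| replica[key] = value }
    end
    @pending.clear
  end
end

cluster = ToyCluster.new(3)
cluster.write('status:1', 'Hey, what is Cassandra like?')
stale = cluster.read('status:1', 2)   # replica 2 has not caught up yet
cluster.sync
fresh = cluster.read('status:1', 2)   # now every replica agrees
</pre>
<p>The window between <code>write</code> and <code>sync</code> is the price paid for never refusing a write, and adding replicas widens capacity without changing that contract.</p>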
<p>In addition to being truly scalable and decentralized (which also means that your Cassandra installation can easily be built in such a way that it spans data centers, and keeps you up and running in the event of a large space rock hitting one of them), Cassandra also sports a few other neat features. It goes beyond a simple key-value data store to offer a table-like store. The schema for those tables, just like with Tokyo Cabinet, is flexible. You can add or remove fields (which are called columns in Cassandra parlance) on the fly. Cassandra also lets you do ranged queries on the keys, and permits the use of table columns as lists. It's packed with features that resonate for the implementer of large scale applications.</p>
<p>If you think that Cassandra might be worth a look, installing it is simple. You can find the source code or precompiled binaries to <a href="http://incubator.apache.org/cassandra/#download">download</a>. There is a simpler approach, however: <code>sudo gem install cassandra</code></p>
<p>If you're using OS X, there is an additional complication—Cassandra requires the 1.6 version of the JDK, and even if you have kept up on your Apple system updates and have 1.6 installed, it is still not the default. Make sure you have <a href="http://developer.apple.com/java/download/">jdk 1.6</a> installed, and then (assuming a bash-like shell):</p>
<pre escaped="true">export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6
export PATH=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Commands/:$PATH</pre>
<p>If you installed the Cassandra gem, you can start an instance of Cassandra with: <code>cassandra_helper cassandra</code></p>
<p>Cassandra uses a multidimensional data model. At the top there is a keyspace, which is referred to as a table in the <code>storage-conf.xml</code> file. The keyspace defines a high level grouping for the data, and there's typically one keyspace per application. Keyspaces must be defined in the <code>storage-conf.xml</code> file before startup.</p>
<p>Below the keyspace lies the column family, which is the basic unit of data organization within a keyspace. In a row oriented database, data is stored by row, with all columns grouped together. In a column oriented database, data is stored by column, with all rows grouped together. Cassandra's use of column families allows a hybrid approach. A column family allows a set of columns for a given row to be stored together. This allows you to optimize your column design in order to group commonly queried columns together. Like keyspaces, column families must be defined in the <code>storage-conf.xml</code> file before startup.</p>
<p>Next comes the key. This is the unique, permanent identifier for a record. Cassandra will index this for you.</p>
<p>Below this level Cassandra provides a couple of options that allow either one or two additional dimensions of data organization. The first of these is the column.</p>
<p>It is at the column level that Cassandra's kinship with simpler key-value stores becomes apparent. Columns are where a record's data is stored, and a column is expressed as a basic key-value relationship. <code>'birthday' =&gt; '1998-08-22'</code> Columns can be stored sorted alphabetically, or by timestamp (all column entries are timestamped). Columns can be defined on the fly.</p>
<p>The final tier of organization is an optional tier called the <em>super column</em>. This can be somewhat confusing, but a super column is really just a group of columns. Users cannot mix columns and super columns at the fourth tier of organization. Once again, users <em>must</em> define which column families contain super columns, and which contain standard columns, in <code>storage-conf.xml</code> before startup. Super columns allow you to group sets of related, sorted column data under a single name.</p>
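<p>One way to picture these tiers is as nested Ruby hashes. The sketch below is only a mental model, not how Cassandra stores data on disk. The family names follow the Twitter sample schema used later in this post, and the <code>uuid-0002</code> style strings are made-up placeholders for time-based UUIDs.</p>
<pre lang="ruby">
# Standard family: column family -> key -> column -> value
# Super family:    column family -> key -> super column -> column -> value
# The outer hash plays the role of the keyspace.
twitter = {
  :Users => {
    '12345' => { 'screen_name' => 'wyhaines' }
  },
  :UserRelationships => {
    '12345' => {
      'user_timeline' => { 'uuid-0002' => '2', 'uuid-0003' => '3' }
    }
  }
}

# Three lookups reach a value in a standard column family...
screen_name = twitter[:Users]['12345']['screen_name']

# ...while a super column family adds one more level of grouping.
timeline = twitter[:UserRelationships]['12345']['user_timeline']
</pre>
<p>Only the two innermost levels (columns and their values) can change freely at runtime; the outer levels correspond to what has to be declared in <code>storage-conf.xml</code>.</p>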
<p>But enough with the exposition: it's time to see how it works in code!</p>
<pre lang="irb" escaped="true">$ irb
&gt;&gt; require 'rubygems'
&gt;&gt; require 'cassandra'
&gt;&gt; include Cassandra::Constants
=&gt; Object
&gt;&gt; store = Cassandra.new('Twitter')
=&gt; # nil, :Users =&gt; nil,
   :StatusRelationships =&gt; nil, :UserAudits =&gt; nil,
   :Statuses =&gt; nil, :UserRelationships =&gt; nil,
   :StatusAudits =&gt; nil}, @host="127.0.0.1", @port=9160&gt;</pre>
<p>The Cassandra gem is brought to you by <a href="http://github.com/fauna/cassandra/tree/master">Evan Weaver</a>, of Twitter, so there is a certain bias in the default <code>storage-conf.xml</code> configuration that he bundles the gem with. He provides several good schemas, though, which we can look at to understand how Cassandra really works.</p>
<pre lang="irb" escaped="true">&gt;&gt; store.insert(:Users, '12345', {'screen_name' =&gt; 'wyhaines'})
=&gt; nil
&gt;&gt; store.insert(:Users, '67890', {'screen_name' =&gt; 'wayneeseguin'})
=&gt; nil
&gt;&gt; store.insert(:Statuses, '1', {'user_id' =&gt; '67890', 'text' =&gt;
?&gt; 'Hey, what is Cassandra like?'})
=&gt; nil
&gt;&gt; store.insert(:Statuses, '2', {'user_id' =&gt; '12345', 'text' =&gt;
?&gt; '@wayneeseguin, It is great!'})
=&gt; nil
&gt;&gt; store.insert(:Statuses, '3', {'user_id' =&gt; '12345', 'text' =&gt;
?&gt; 'It is a key/value store with a lime twist.'})
=&gt; nil</pre>
<p>Using the Twitter schema, a couple of users were created, and then some status messages were created, with one field containing the user id, and another containing the text of the status message.  Each status message has a unique ID.</p>
<pre lang="irb" escaped="true">&gt;&gt; store.insert(:UserRelationships,
?&gt; '67890', {'user_timeline' =&gt; {UUID.new =&gt; '1'}})
=&gt; nil
&gt;&gt; store.insert(:UserRelationships, '12345',
?&gt; {'user_timeline' =&gt; {UUID.new =&gt; '2'}})
=&gt; nil
&gt;&gt; store.insert(:UserRelationships, '12345',
?&gt; {'user_timeline' =&gt; {UUID.new =&gt; '3'}})
=&gt; nil</pre>
<p>Using a column based database like Cassandra takes a bit of a mental shift from a simple key-value store or a typical row-oriented relational database.  Recall the hierarchy of storage—Column Family, Key, Column/Value. If each status message has a unique key, I can't just ask for all keys where  column family == ':Statuses' and  column user_id == '12345'. UserRelationships is a super column.  It's defined like this in <code>storage-conf.xml</code>.</p>
<pre lang="xml" escaped="true"></pre>
<p>This says that <code>UserRelationships</code> is a super column, and that the sort order of its subcolumns is a TimeUUIDType; that is, a time based UUID. By inserting rows keyed by the user id into <code>UserRelationships</code>, with values that are a column, <code>user_timeline</code> and subcolumns composed of a time based UUID pointing to a message key, you build a structure that provides an easy path to query all of the messages from a given user, in time sorted order.</p>
<pre lang="irb" escaped="true">&gt;&gt; my_message_relationships = store.get(:UserRelationships,
?&gt; '12345', 'user_timeline', :reversed =&gt; true)
=&gt; #=&gt;"3",
   =&gt;"2"}&gt;</pre>
<p>This query asks for the <code>user_timeline</code> super column of the <code>UserRelationships</code> row with key 12345, in reverse order. What is returned is an ordered hash keyed by the UUID timestamps, with message ids as the values (i.e. exactly what was inserted earlier). You can use this to pull a list of recent messages.</p>
<pre lang="irb" escaped="true">&gt;&gt; my_message_relationships.values.each do |message_id|
?&gt; puts store.get(:Statuses, message_id).inspect
?&gt; end

#"It is a key/value store with a lime twist.",
  "user_id"=&gt;"12345"
}&gt;
#"@wayneeseguin, It is great!",
  "user_id"=&gt;"12345"
}&gt;
=&gt; ["3", "2"]</pre>
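<p>The pattern at work here is a hand-built, time-sorted index: one structure holds the records, and a second one maps timestamps to record keys for each user. A minimal pure-Ruby sketch of the idea follows. Plain hashes stand in for column families, integer timestamps stand in for time-based UUIDs, and the <code>recent_texts</code> helper is hypothetical.</p>
<pre lang="ruby">
# Stand-in for the Statuses column family: status key => columns.
statuses = {
  '1' => { 'user_id' => '67890', 'text' => 'Hey, what is Cassandra like?' },
  '2' => { 'user_id' => '12345', 'text' => '@wayneeseguin, It is great!' },
  '3' => { 'user_id' => '12345', 'text' => 'It is a key/value store with a lime twist.' }
}

# Stand-in for the user_timeline super column: one hash per user,
# mapping a timestamp to a status key.
user_timeline = Hash.new { |hash, user| hash[user] = {} }
user_timeline['67890'][1001] = '1'
user_timeline['12345'][1002] = '2'
user_timeline['12345'][1003] = '3'

# Newest first, like :reversed => true in the query above.
def recent_texts(timeline, statuses)
  timeline.sort_by { |timestamp, _| -timestamp }
          .map { |_, status_id| statuses[status_id]['text'] }
end

texts = recent_texts(user_timeline['12345'], statuses)
</pre>
<p>The index is written at insert time so that reading a user's timeline never requires scanning every status, which is the kind of query the column model is designed around.</p>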
<p>As you can see, using Cassandra is more complicated than using a simple key-value store, even one like Tokyo Cabinet which builds a table model on a row based key-value system.  However, just like the first time you tried to learn recursion, once your perspective shifts so that you can grok it, Cassandra's structure naturally lends itself to a whole class of otherwise tricky, high labor queries.</p>
<p>The other significant drawback to Cassandra is that although column schema is fluid, and can be changed at runtime, the higher levels of data organization—keyspaces, column families, and super columns—have to be defined in an XML configuration file, <code>storage-conf.xml</code> at startup. For example, if you wanted to start a new project using the Cassandra gem, you have to create your own set of configuration files (look at <code>gems/cassandra-0.5.5/conf</code> for your ruby installation to see how the sample packaged with the Cassandra gem is structured).</p>
<p>Consider the common example of a blog. Let's say you want to be fancy and allow your blog to have user accounts so that users can see threads of their own blog comments over time. Your <code>storage-conf.xml</code> config might look like this:</p>
<pre lang="xml" escaped="true">
</pre>
<p>Cassandra is fun to work with, and using the Cassandra gem eliminates most of the hassles of manually setting it up to run while you are getting your feet wet. It offers an interesting balance of performance (it's surprisingly fast!) and features an architecture that is truly horizontally scalable.</p>
<p>Cassandra has a lot of promise, but it's also quite young, and certainly isn't bug free. The Ruby API is in a state of flux, and if you have a non-standard setup, it can be a hassle. I had a heck of a time getting it to run right on my OS X laptop, despite apparently having all prerequisites installed correctly. If you like what you see, get involved, maybe even write a DataMapper adapter for it. I think Cassandra is going to be around in the Ruby community for a long time.</p>
<p><a href="http://www.engineyard.com/blog"><img height="98" width="61" title="logo-engineyard" alt="" class="attachment-post-thumbnail wp-post-image" src="http://www.engineyard.com/blog/wp-content/uploads/logo-engineyard.png"/></a></p>
]]></description>
		<wfw:commentRss>http://www.engineyard.com/blog/2009/cassandra-and-ruby-a-love-affair/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
	</channel>
</rss>

