Most of today’s up and coming key-value stores are more than just simple key-value stores. You saw this when we looked at Tokyo Cabinet which, in addition to simple key-value capabilities, adds more sophisticated abilities, such as database-like tables. In this post we’ll look at Cassandra — a modern key-value store that continues this trend. Cassandra was originally developed by Facebook and released to open source last year. The Facebook team describes Cassandra as (Google) BigTable running on top of an Amazon Dynamo-like infrastructure.
Cassandra is implemented using Java, and unlike Tokyo Cabinet, is designed to be distributed. One key feature of its distributed architecture is that it is an eventually consistent design. For Cassandra, scalability isn’t about absolute speed, but about adding system capacity at a reasonable cost, while retaining reasonable speed. A data store that promises immediate consistency sacrifices either availability or the ability to survive network partitioning, and when you write internet applications that need to scale, those are the two properties that are generally the most desirable.
For example, consider Twitter. As a Twitter user, which of the following options would you select?
- When you view your timeline, it is always correct, BUT sometimes it can’t be viewed at all
- You always view your timeline, but it sometimes takes time before the timeline reflects new posts
From the loud griping that the Twitter fail whale causes, I think most people prefer availability to immediate consistency. Eventual consistency within a reasonable time period is sufficient, and that’s exactly what Cassandra provides.
With Cassandra, a write will always succeed, but a read will not always immediately reflect the result of that write. The benefit is that you can expand the capacity of your Cassandra based storage system just by adding more nodes to it.
In addition to being truly scalable and decentralized (which also means that your Cassandra installation can easily be built in such a way that it spans data centers, and keeps you up and running in the event of a large space rock hitting one of them), Cassandra also sports a few other neat features. It goes beyond a simple key-value data store to offer a table-like store. The schema for those tables, just like with Tokyo Cabinet, is flexible. You can add or remove fields (which are called columns in Cassandra parlance) on the fly. Cassandra also lets you do ranged queries on the keys, and permits the use of table columns as lists. It’s packed with features that resonate for the implementer of large scale applications.
If you think that Cassandra might be worth a look, installing it is simple. You can find the source code or precompiled binaries to download. There is a simpler approach, however: sudo gem install cassandra
If you’re using OS X, there is an additional complication—Cassandra requires the 1.6 version of the JDK, and even if you have kept up on your Apple system updates and have 1.6 installed, it is still not the default. Make sure you have jdk 1.6 installed, and then (assuming a bash-like shell):
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6 export PATH=/System/Library/Frameworks/JavaVM.framework//Versions/1.6/Commands/:$PATH
If you installed the Cassandra gem, you can start an instance of Cassandra with: cassandra_helper cassandra
Cassandra uses a multidimensional data model. At the top there is a keyspace, which is referred to as a table in the storage-conf.xml file. The keyspace defines a high level grouping for the data, and there’s typically one key space per application. Keyspaces must be defined in the storage-conf.xml file before startup.
Below the keyspace lies the column family, which is the basic unit of data organization within a keyspace. In a row oriented database, data is stored by row, with all columns grouped together. In a column oriented database, data is stored by column, with all rows grouped together. Cassandra’s use of column families allows a hybrid approach. A column family allows a set of columns for a given row to be stored together. This allows you to optimize your column design in order to group commonly queried columns together. Like keyspaces, column families must be defined in the storage-conf.xml file before startup.
Next comes the key. This is the unique, permanent identifier for a records. Cassandra will index this for you.
Below this level Cassandra provides a couple of options that allow either one or two additional dimensions of data organization. The first of these is the column.
It is at the column level that Cassandra’s kinship with simpler key-value stores becomes apparent. Columns are where a record’s data is stored, and a column is expressed as a basic key-value relationship. 'birthday' => '1998-08-22' Columns can be stored sorted alphabetically, or by timestamp (all column entries are timestamped). Columns can be defined on the fly.
The final tier of organization is an optional tier called the super column. This can be somewhat confusing, but a super column is really just a group of columns. Users cannot mix columns and super columns at the fourth tier of organization. Once again, users must define which column families contain super columns, and which contain standard columns, in storage-conf.xml before startup. Super columns allow you to group sets or related, sorted column data under a single name.
But enough with the exposition: it’s time to see how it works in code!
$ irb
irb(main):001:0> require 'rubygems'; require 'cassandra'; include Cassandra::Constants
=> Object
irb(main):002:0> store = Cassandra.new('Twitter')
=> #<Cassandra:70225222080780, @keyspace="Twitter", @schema={:Usernames => nil, :Users => nil, :StatusRelationships => nil, :UserAudits => nil, :Statuses => nil, :UserRelationships => nil, :StatusAudits => nil}, @host="127.0.0.1", @port=9160>
The Cassandra gem is brought to you by Evan Weaver, of Twitter, so there is a certain bias in the default storage-conf.xml configuration that he bundles the gem with. He provides several good schemas, though, which we can look at to understand how Cassandra really works.
irb(main):003:0> store.insert(:Users, '12345', {'screen_name' => 'wyhaines'})
=> nil
irb(main):004:0> store.insert(:Users, '67890', {'screen_name' => 'wayneeseguin'})
=> nil
irb(main):005:0> store.insert(:Statuses, '1', {'user_id' => '67890', 'text' => 'Hey, what is Cassandra like?'})
=> nil
irb(main):006:0> store.insert(:Statuses, '2', {'user_id' => '12345', 'text' => '@wayneeseguin, It is great!'})
=> nil
irb(main):007:0> store.insert(:Statuses, '3', {'user_id' => '12345', 'text' => 'It is a key/value store with a lime twist.'})
=> nil
irb(main):008:0> store.insert(:UserRelationships, '67890',{'user_timeline' => {UUID.new => '1'}})
=> nil
irb(main):009:0> store.insert(:UserRelationships, '12345',{'user_timeline' => {UUID.new => '2'}})
=> nil
irb(main):010:0> store.insert(:UserRelationships, '12345',{'user_timeline' => {UUID.new => '3'}})
=> nil
storage-conf.xml.
<ColumnFamily CompareWith="UTF8Type" CompareSubcolumnsWith="TimeUUIDType" ColumnType="Super" Name="UserRelationships" />
UserRelationships is a super column, and that the sort order of its subcolumns is a TimeUUIDType; that is, a time based UUID. By inserting rows keyed by the user id into UserRelationships, with values that are a column, user_timeline and subcolumns composed of a time based UUID pointing to a message key, you build a structure that provides an easy path to query all of the messages from a given user, in time sorted order.
irb(main):011:0> my_message_relationships = store.get(:UserRelationships, '12345', 'user_timeline', :reversed => true)
=> #<OrderedHash {<Cassandra::UUID#70039670470560 time: Fri Aug 28 10:09:13 -0500 2009, usecs: 76042 jitter: 15593315074131154323>=>"3", <Cassandra::UUID#70039670467740 time: Fri Aug 28 10:08:40 -0500 2009, usecs: 756348 jitter: 15886038034184122516>=>"2"}>
UserRelationships for key 12345, sorted by user_timeline, in reverse order. What is returned is a ordered hash keyed by the UUID timestamps, and keyed by message ids (i.e. exactly what was inserted earlier). You can use this to pull a list of recent messages.
irb(main):012:0> my_message_relationships.values.each {|message_id| puts store.get(:Statuses, message_id).inspect}
#<OrderedHash {"text"=>"It is a key/value store with a lime twist.", "user_id"=>"12345"}>
#<OrderedHash {"text"=>"@wayneeseguin, It is great!", "user_id"=>"12345"}>
=> ["3", "2"]
storage-conf.xml at startup. For example, if you wanted to start a new project using the Cassandra gem, you have to create your own set of configuration files (look at gems/cassandra-0.5.5/conf for your ruby installation to see how the sample packaged with the Cassandra gem is structured).
Consider the common example of a blog. Let’s say you want to be fancy and allow your blog to have user accounts so that users can see threads of their own blog comments over time. Your storage-conf.xml config might look like this:
<Keyspace Name="Blogofmine"> <ColumnFamily CompareWith="TimeUUIDType" Name="Posts" /> <ColumnFamily CompareWith="TimeUUIDType" Name="Comments" /> <ColumnFamily CompareWith="UTF8Type" Name="Users" /> <ColumnFamily CompareWith="UTF8Type" CompareSubcolumnsWith="TimeUUIDType" ColumnType="Super" Name="UserComments" /> </Keyspace>

Cassandra looks great. I wonder how it compares to MongoDB, which has auto-sharding support that lets you add machines dynamically. http://www.mongodb.org/display/DOCS/Sharding
MongoDB doesn't have tables, but rather, it is schema-less, you define the schema as you put objects into it, and you can just add fields, or omit them as you please.
I have started a chef recipe for the install and config, if anyone is interested.. http://chef-cassandra.notlong.com
I'd be interested to know how it compares to MongoDB which just hit 1.0 recently.
Hi. I'm going to be doing a MongoDB article in a few weeks, so I can better address the comparison between the two of them there, and I am making a note to focus on that comparison as part of the article. I haven't had enough time really putting MongoDB through its paces to give a good, objective comparison just yet, though, and I don't want to throw out my vauge impressions at this point because I may well change my mind.
Last I looked MongoDB didn't actually support partitioned queries except by pkey; hopefully this has changed.
In mongodb you can do any query. If one of the terms of the query is the partition key, the query will be executed on the smallest possible set of shards. If the query is general it will go to all shards and collect the results.
MongoDB sharding is currently in alpha (the core database is full production now).
From mongo's website: "Mongo does not support full master-master replication. However, for certain restricted use cases master-master can be used. Generally, we recommend one does not use the database in a master-master mode."
That's a bit different than what Cassandra offers, isn't it?
FYI if you search for Java Preferences in Spotlight, you can pick which version it will load first. In Snow Leopard I only see Java SE 6 so it shouldn't be an issue.
Java Preferences is the Apple-supported way to go. If you're using Java Preferences, you also want to set your JAVA_HOME like so:
export JAVA_HOME=
/usr/libexec/java_home(note the backticks). Also, if you use Java Preferences to pick your JDK, you don't have to modify your PATH. More info:
http://lists.apple.com/archives/java-dev/2009/Jun...
MongoDB is more like a SQL database e.g. MySQL. Replication is master/slave and possibly master/master in the future, similar to MySQL (also goes for sharding). MongoDB supports rich querying compared to other schemaless, document stores. The main difference to classic models (SQL) is that MongoDB is schemaless.
Cassandra is a distributed storage. This scales much longer than "classic" replication. The downside: Cassandra has not as rich query capabilities.
Cheers Stephan http://www.codemonkeyism.com
I am on Windows 7 – is there a way to get the cassandra gem? Running "gem install cassandra" from command line console the command fails as it attempts to run nmake.
Thanks, Erez