Why Puppet Should Manage Your Infrastructure

By | March 7th, 2011 at 2:03PM

These days, when looking to automate the setup and maintenance of even the simplest infrastructure, you will typically end up deciding between two popular configuration management tools, Puppet and Chef. There are other tools, some of which have been around much longer. However, the recent pace of development on both Puppet and Chef indicates that both are growing rapidly.

This article will begin with a look at some of the benefits of using Puppet to manage your infrastructure. In the coming weeks we’ll also take a look at Chef and how it can benefit your infrastructure in its own unique ways.

Breaking Down Your Infrastructure into Manageable Components

Infrastructure can be relatively complex and only grows in complexity the more pieces you add to support your particular application(s). However, all infrastructure configurations can be broken down into components: individual pieces that are installed, configured and monitored. These components can be as simple as a user or a file, and as complex as multiple servers in a cluster supporting backend data.

Using a configuration management tool requires you to consider these individual pieces and their relationships. Relationships can be between servers or the individual components that make up a server.

For example, deploying a Ruby on Rails application typically requires a database, an application server and web server. These can be configured on a single machine or broken up into several machines, with one for each component. Thinking about these relationships and developing infrastructure that supports these enables you to view each component as a commodity, giving you the power to add, remove or replace each with ease and consistency.

An Overview of Puppet Resources

Puppet uses the term “resources” when describing these components. Resources in Puppet have a type, a name and attributes that define the configuration of that resource. Here is an example of a resource:

    file { "/etc/ntp.conf":
        owner  => root,
        group  => root,
        # quoted, because newer Puppet releases require the mode to be a string
        mode   => "0644",
        source => "puppet:///ntpd/ntp.conf"
    }

Resources begin with their type; here we are using the “file” type. After the type, we enclose our resource’s name and attributes in curly brackets. Here you can see our resource’s “name” is “/etc/ntp.conf”, followed by a colon. Our attributes are everything after the colon, up until we close the resource with a curly bracket.

This resource manages a component of our infrastructure, the file located at “/etc/ntp.conf”. We’ve established that we want this file to be owned by “root” and have permissions of “0644”. Any node that we choose to run this resource on will have this file configured with these attributes.
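
The same pattern of type, name and attributes applies to other resource types. As a minimal sketch, here is a “user” resource. The account name “deploy” and its attribute values are assumptions made up for illustration and do not come from the setup described in this article:

    # Hypothetical example: the "deploy" account and its values are illustrative only.
    user { "deploy":
        ensure     => present,        # create the account if it is missing
        shell      => "/bin/bash",
        home       => "/home/deploy",
        managehome => true            # also create the home directory
    }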

Defining Relationships Between Resources

Where Puppet shines, in relation to other tools, is that it empowers you to specify relationships between these resources and the modules they may be defined in. Modules can generally be thought of as a configuration containing each of our three core requirements: installation, configuration and monitoring. Modules in Puppet typically break down into classes, which are singleton collections of resources.

Continuing with our example above we create an ntpd class as follows:

    class ntpd {
        package { "ntp":
            ensure => installed,
        }
 
        file { "/etc/ntp.conf":
            owner => root,
            group => root,
            mode => "0644",
            source => "puppet:///ntpd/ntp.conf",
            require => Package["ntp"]
        }
 
        @service { "ntpd":
            ensure => running,
            enable => true,
            hasrestart => true,
            hasstatus => true,
            require => [Package["ntp"], File["/etc/ntp.conf"]],
            subscribe => File["/etc/ntp.conf"]
        }
    }

Here we’ve established a relationship between installing, configuring and monitoring the ntp daemon. You can see we’ve defined three resources, of the types “package”, “file” (our example from above) and “@service”. The ‘@’ is special syntax that marks a virtual resource. You can mostly ignore that for now, but note that a virtual resource is only applied to a node once it has been realized.
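
Because the service above is declared with ‘@’, it stays out of the node’s configuration until something realizes it. A minimal sketch of one way to do that, assuming the ntpd class above has been included on the node:

    include ntpd

    # Realize the virtual resource declared as @service { "ntpd": ... }
    realize Service["ntpd"]

    # An equivalent form using a collector:
    # Service <| title == "ntpd" |>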

As I described above, each resource has a type, a name and attributes. Most of these should be self-explanatory; however, the attributes that build the relationships are important.

You can see here that the file resource requires “Package["ntp"]”. That simply refers to our package resource. The same requirement is defined for the @service resource, though we add in a dependency upon the file resource.

The “subscribe” attribute tells our service resource to listen for any changes to our file resource so that it knows when to restart. If we make any changes to “/etc/ntp.conf”, our service will automatically restart on our next Puppet run.
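
The same relationship can also be written from the other direction. The sketch below is an alternative to the class above, not the original code: the file resource uses the “notify” metaparameter instead of the service using “subscribe”. The service is declared as a normal (non-virtual) resource here to keep the example short:

    package { "ntp":
        ensure => installed,
    }

    file { "/etc/ntp.conf":
        owner   => root,
        group   => root,
        mode    => "0644",
        source  => "puppet:///ntpd/ntp.conf",
        require => Package["ntp"],
        # notify is the mirror image of subscribe: a change to this file
        # triggers a restart of the service
        notify  => Service["ntpd"]
    }

    service { "ntpd":
        ensure => running,
        enable => true,
    }

Which side carries the relationship is mostly a matter of taste; either way Puppet applies the file before the service.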

From these relationships, Puppet builds a dependency graph in the background. This dependency graph offers you a lot of features that other tools don’t provide.

No Operation Mode

For example, with this dependency graph we can run our setup in “dry run” mode using the “--noop” flag. This allows us to see exactly how Puppet would configure our systems without changing anything, which is great for any production infrastructure, even if you have “test” or “staging” machines to test on. “Noop” allows you to develop new infrastructure faster before having to deploy. This feature is often overlooked when considering the power of Puppet compared to other tools.
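
Besides the command-line flag, Puppet also has a “noop” metaparameter that can be set on an individual resource. A minimal sketch, reusing the file resource from earlier: with this attribute, Puppet reports what it would change for this one resource without changing it, even during a normal run.

    file { "/etc/ntp.conf":
        owner  => root,
        group  => root,
        mode   => "0644",
        source => "puppet:///ntpd/ntp.conf",
        noop   => true    # report differences for this resource only, never apply them
    }

    # For a whole manifest, the command-line flag does the same for every resource:
    #   puppet apply --noop ntpd.pp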

Explicit Resource Dependencies

Another benefit of the dependency graph is that in Puppet, resource dependencies are always explicit. You can move resources around freely without worrying about the order of application. In our example above, if I were to move our package resource to another module, this would have no negative consequences on the process of installing and configuring ntpd, as long as that module is still applied to the node. Puppet resources can listen to the resources they depend on and notify other resources that might be interested in their changes. When order is important, the relationship between resources must be explicitly specified. Puppet is concerned with the state of your server, and works to bring its configuration into compliance.
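
As a sketch of that point, the package and the file can live in separate classes and the “require” still resolves, because a resource reference such as Package["ntp"] is looked up across the node’s whole configuration. The class names “ntp_package” and “ntp_config” are hypothetical:

    class ntp_package {
        package { "ntp":
            ensure => installed,
        }
    }

    class ntp_config {
        file { "/etc/ntp.conf":
            owner   => root,
            group   => root,
            mode    => "0644",
            source  => "puppet:///ntpd/ntp.conf",
            require => Package["ntp"]   # declared in a different class
        }
    }

    # The order of these two lines does not matter; the require decides
    # the order of application.
    include ntp_config
    include ntp_package

Both classes have to be included on the node. If the package resource is missing entirely, Puppet cannot resolve the reference and reports an error instead of guessing.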

These are only a few of the benefits of Puppet’s graph-based design. Other features, such as virtual resources (which our service resource above is defined as), offer more options in dealing with related resources across similar server structures.

Getting Started with Puppet

Puppet is built with a focus on client/server configuration within an infrastructure. However, Puppet also provides an extremely easy method for getting started: you can run any independent manifest with “puppet apply <manifest>”. Using our example above, if we were to place that ntpd configuration in a file called “ntpd.pp”, we could manage that service using only “puppet apply ntpd.pp” (provided the manifest also declares the class with “include ntpd” and realizes the virtual service, since defining a class does not by itself apply it). In my experience this is far superior to getting started with standalone clients of other tools. Growing from this single manifest into a larger, full-blown infrastructure configuration is relatively easy.
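
A minimal sketch of what a standalone “ntpd.pp” might look like. Defining a class only makes it available, so the file ends with an “include”. To keep the sketch short it contains only the package resource; the file and service resources from the class above would sit alongside it:

    # ntpd.pp
    class ntpd {
        package { "ntp":
            ensure => installed,
        }
        # ... file and service resources from the example above ...
    }

    # Declare the class so that "puppet apply ntpd.pp" actually applies it
    include ntpd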

Here is an example directory layout taken from a NICS presentation. Please note that this is an advanced configuration, meant only to give you an idea of such an implementation:

    /etc/puppet/
        auth.conf
        autosign.conf
        fileserver.conf
        puppet.conf
        tagmail.conf
        files/
            byhost/
                host1/
                host2/
                host3/
        manifests/
            nodes.pp
            site.pp
            classes/
                class1.pp
                class2.pp
        modules/
            module1/
                manifests/
                    init.pp
                files/
                templates/
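
To connect this layout to the earlier example, here is a sketch of what a module’s init.pp could contain, assuming a hypothetical module directory named “ntpd” under modules/ (the layout above only shows “module1”) and assuming /etc/puppet/modules is on the module path. Puppet loads the class “ntpd” from that module’s manifests/init.pp, and a source of the form puppet:///modules/ntpd/ntp.conf is served from the module’s files/ directory:

    # /etc/puppet/modules/ntpd/manifests/init.pp  (hypothetical path)
    class ntpd {
        file { "/etc/ntp.conf":
            owner  => root,
            group  => root,
            mode   => "0644",
            # served from /etc/puppet/modules/ntpd/files/ntp.conf
            source => "puppet:///modules/ntpd/ntp.conf"
        }
    }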

Generally you’ll start by defining a number of modules and importing those where necessary. Puppet refers to servers as “nodes”. We would define our “nodes” in the nodes.pp above. Here is an example of a node configuration:

    node "webserver" {
        include ntpd
    }

We have defined a node “webserver” and included our ntpd class. Each time Puppet runs on a server with the hostname “webserver”, our ntpd class will be applied. If there are discrepancies on the current system, Puppet will restore its state to match our configuration.
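
A sketch of how the files in manifests/ might fit together, with two files shown in one listing. The “import” keyword was how Puppet releases of this era pulled one manifest into another; it was removed in Puppet 4, so treat that line as specific to older versions. The node names “web1” and “web2” are hypothetical:

    # manifests/site.pp
    import "nodes.pp"

    # manifests/nodes.pp
    # One definition can cover several hostnames
    node "web1", "web2" {
        include ntpd
    }

    # Applied to any host that matches no other node definition
    node default {
        include ntpd
    }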

Conclusion

I have tried to cover some of the defining features of Puppet, along with some relatively simple examples, to explain how Puppet’s design allows you to build a consistent infrastructure. Puppet, like any software, has its warts, but you can find those types of discussions elsewhere.

I encourage you to continue exploring the documentation to expand upon what I’ve talked about here. You can also find a lot of assistance in the “#puppet” channel on the Freenode IRC network or on the Puppet mailing list.

  • http://twitter.com/chrisallnutt Chris Allnutt

    I’d be interested in a follow up post about Chef. Chef seems to be the goto for cloud deployments and opscode even provides a service for it. I don’t think anyone does a real good job differentiating the two of them.

  • http://twitter.com/fujin_ AJ Christensen

    What the hell? Doesn’t EY support Opscode anymore? Isn’t the EY Cloud platform all built around Chef? and EY Flex uses Chef in the background?

  • BraveNewCurrency

    > What the hell? Doesn’t EY support Opscode anymore?

    Wow, talk about jumping to conclusions!

  • http://darwinweb.net/ Gabe da Silveira

    Indeed, but the title really begs the question why you would choose Puppet over Chef, something which the article doesn’t address at all…

  • http://twitter.com/avivby Aviv Ben-Yosef

    Well, I found Chef to be a lot more modern and aware of the needs of the cloud: it has support for EC2, Rackspace and many others, while on Puppet one has to manage everything by himself. I wrote simple tutorials that help compare what it takes to create an EC2 instance with Apache on both:
    http://bit.ly/avivby-chef-ec2
    http://bit.ly/avivby-puppet-ec2

  • http://www.verticalsysadmin.com/ Aleksey Tsalolikhin

    Check out http://www.usenix.org/publications/login/2010-10/openpdfs/ConfigMgt10reports.pdf – it may help you differentiate Chef and Puppet.

  • http://twitter.com/asenchi Curt Micol

    AJ, perhaps you missed the paragraph where I mention that an article will be written about Chef’s benefits/features?

    EY still uses Chef in most products, this doesn’t mean we can’t look at the benefits of another technology.

  • http://twitter.com/asenchi Curt Micol

    Actually the title doesn’t suggest a comparison at all and rather addresses why Puppet could be a good fit for your infrastructure.

    A second article will come out soon regarding Chef’s benefits and why you should choose that for your setup.

  • http://twitter.com/asenchi Curt Micol

    Hello Chris,

    There are plans for a Chef article. I avoided a comparison since they really are such different implementations I could better display each tool’s features focusing on one per article.

    Stay tuned for the Chef one, I hope that provides some of the comparisons you were looking for.

  • http://darwinweb.net/ Gabe da Silveira

    Let me amend that. If you are familiar with EY’s stack and the fact that they have always used exclusively Chef, then the title does raise questions, and there’s no denying that Curt.

  • http://twitter.com/asenchi Curt Micol

    Gabe, not quite. EY currently uses Puppet for its xCloud product.

    Also, which questions does it raise? Are we unable to write/discuss/comment about different software?

    I hope you all enjoy the Chef, though now I feel a lot more pressure in writing it considering everyone’s feelings on the matter. :)

  • http://twitter.com/chrisallnutt Chris Allnutt

    Thanks Curt I’ll look forward to it.

  • http://darwinweb.net/ Gabe da Silveira

    Ah, see, I’ve been a customer for 4 years and I didn’t know that. I’m guessing the misconception is common.

    But specific implementations aside, I think the Chef vs Puppet debate is the interesting topic that lacks good coverage, because the number of people with truly deep experience in both is very limited.

  • http://twitter.com/asenchi Curt Micol

    Gabe, I agree with your statement, there is a lack of good coverage. However, I think “this vs. that” type of articles end up only addressing each tool’s features in a superficial manner. I hope, over the course of these two articles, to cover why each tool is awesome. They really are very different in their implementation.

  • Stephen Bannasch

    In this area there are two technologies that I am familiar with: Chef and Puppet.

    Given this context when I read the title of your article: “Why Puppet Should Manage Your Infrastructure” it’s very hard not to add on what appears to be implicit: “… instead of Chef.”

    This isn’t what the article is about however so the title is unintentionally misleading for a reasonable number of folks.

  • Melissa Sheehan

    Hey Chris. The Chef post is here if you want to check it out: http://www.engineyard.com/blog/2011/why-chef-should-manage-deploying-your-application/

  • Anonymous

    What implications or what is the best way that Chef could fit in with an architecture using virtualisation, such as Xen?

  • http://blog.gjunka.com/professional/2011/07/introduction Introduction | My Personal Blog

    [...] are other configuration management tools available, including Puppet, and there is an ongoing debate which one is better for particular types of configuration work. I [...]

  • thisisme

    This is my comment about Chef:

    One of the problems I have with Chef is not that I think it’s a bad idea, but the people who are using it have no idea about underlying operating system functions to be able to know what’s going on with the configs they put in place. The world of systems administration is in the hands of a bunch of developers that have gotten tired of being developers. So in order to compensate for not knowing anything they make things like Chef to make them cool. Example is this function that’s in place to create snapshots of EBS store. The Chef setup creates a bash script that’s executed to make the snapshot. I really fail to see the reason for this since you could just, I dunno, create the bash script to be useable already across all the different types of machines you have. Wow what a concept huh? Chef and other means again are great but do too much to reinvent the wheel at the same time.

  • Jai Singh

    I think you can use the right-aws gem and create a snapshot directly from your chef recipe. No need to goto bash script at all.