Using Amazon's CloudFormation, cloud-init, chef and fog to automate infrastructure

Amazon recently announced their new CloudFormation API. It allows you to create what they call "stacks," which bring up and provision various AWS resources together. You can configure security groups, EC2 instances, RDS instances, and elastic load balancers, just to name a few. With this level of support and ease of declaration, it makes you wonder how much longer services like RightScale are going to last.

In this post, I'm going to cover a couple of different topics. Since there's a lot going on here, I may be brief when explaining certain aspects. Hopefully I do a good job of getting the basic idea across, but just in case, here's a quick overview of what I'm attempting to accomplish:

  • create EC2 user data that can be used with cloud-init
  • have the user data kick off chef-solo to configure an instance
  • set up a simple CloudFormation template for easy launching
  • use fog to launch a CloudFormation template

EC2 user data and cloud-init

User data can be associated with EC2 instances at boot, and the data can be any arbitrary value. Normally it is up to you to handle what is done with this data, but the Ubuntu developers came up with a pretty cool concept: cloud-init. With the cloud-init package, you format your user data to their specs and the instance performs various actions based on it. This becomes extremely powerful. One of the most widely used features is the User-Data Script, which in cloud-init means starting your user data with "#!", the shebang. A user-data script runs around the same time as rc.local in the boot process, making it a great place to perform service configuration.

What if you want more control, rather than relying on scripts? Some of the really cool parts of cloud-init include being able to install packages, tell apt-get to update its package lists, and even upgrade the system on boot. With multi-part support, you use mime-types to combine multiple files into a single piece of user data. This allows you to have both a user-script and a cloud-config portion in your EC2 user data, amongst other things.
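For instance, a minimal cloud-config exercising those features might look like the following sketch. Hedging a bit here: apt_update and apt_upgrade are the key names in the cloud-init versions I've looked at, so check the cloud-config examples shipped with your release:

#cloud-config
# refresh the apt package lists on first boot
apt_update: true
# apply any pending upgrades
apt_upgrade: true
# then install packages
packages:
- nginx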

With cloud-init's user-data scripts in mind, I started off with this basic script:

#!/bin/bash
# log everything to one file so boot problems are easy to debug
log='/tmp/init.log'
apt-get update &>> $log
apt-get install -y ruby ruby1.8-dev build-essential wget libruby-extras libruby1.8-extras git-core &>> $log
# install rubygems from source, then chef and ohai
cd /tmp
wget http://production.cf.rubygems.org/rubygems/rubygems-1.3.7.tgz &>> $log
tar zxf rubygems-1.3.7.tgz &>> $log
cd rubygems-1.3.7
ruby setup.rb --no-format-executable &>> $log
gem install ohai chef --no-rdoc --no-ri --verbose &>> $log
# fetch the cookbooks and write out the chef-solo configuration
mkdir -p /var/chef/cache
git clone -b deploy git://github.com/crazed/cookbooks.git /var/chef/cookbooks &>> $log
mkdir -p /etc/chef
cat << EOF > /etc/chef/solo.rb
file_cache_path "/var/chef/cache"
cookbook_path "/var/chef/cookbooks"
json_attribs "/etc/chef/node.json"
log_location "/var/chef/solo.log"
verbose_logging true
EOF
cat << EOF > /etc/chef/node.json
{
  "www": {
    "document_root": "/srv/http",
    "server_name": "localhost"
  },
  "mysql": {
    "user": "user",
    "password": "user",
    "hostname": "localhost",
    "database": "drupal",
    "rootpw": "root"
  },
  "drupal": {
    "version": "7.0",
    "modules": [ "pathauto", "token" ]
  },
  "run_list": [ 
    "recipe[php-fpm]",
    "recipe[mysql::new_database]",
    "recipe[drupal::install]",
    "recipe[drupal::nginx-config]"
  ]
}
EOF
chef-solo &>> $log

This was cool and all, but I wanted to separate my node attributes and solo configuration from the user-script itself. So I stumbled upon the part-handler portion of cloud-init, which allows you to write a bit of python to handle additional mime-types inside the EC2 user data. Here's a very basic chunk that adds support for the text/chef-attributes and text/chef-solo mime types:

#part-handler
# vi: syntax=python ts=4
import os

def list_types():
    # return a list of mime-types that are handled by this module
    return(["text/chef-attributes", "text/chef-solo"])

def handle_part(data, ctype, filename, payload):
    # data: the cloudinit object
    # ctype: '__begin__', '__end__', or the specific mime-type of the part
    # filename: the filename for the part, or a dynamically generated one
    #           if no filename attribute is present
    # payload: the content of the part (empty for begin or end)
    if ctype == "__begin__":
        print "my handler is beginning"
        return
    if ctype == "__end__":
        print "my handler is ending"
        return

    # map the custom mime-types to the files chef-solo expects
    targets = {
        "text/chef-attributes": "/etc/chef/node.json",
        "text/chef-solo": "/etc/chef/solo.rb",
    }
    if ctype in targets:
        print "handling %s" % ctype
        path = targets[ctype]
        d = os.path.dirname(path)
        if not os.path.exists(d):
            os.makedirs(d)
        f = open(path, 'w')
        f.write(payload)
        f.close()

There are probably more efficient ways of doing the above, but this is all still a work in progress for me. Either way, continuing with the original script, I can also pull the package installation out into a cloud-config portion of my user data:

#cloud-config
packages:
- ruby
- ruby1.8-dev
- build-essential
- wget
- libruby-extras
- libruby1.8-extras
- git-core

Note the #cloud-config line at the beginning. This tells cloud-init to handle the data in the file as part of the cloud-config specification. Now that I have a cloud-config portion installing some required packages, and a part-handler to handle additional mime-types, I can put my chef node attributes and chef solo configuration into their own files.

# cat node.json
{
  "www": {
    "document_root": "/srv/http",
    "server_name": "localhost"
  },
  "mysql": {
    "user": "user",
    "password": "user",
    "hostname": "localhost",
    "database": "drupal",
    "rootpw": "root"
  },
  "drupal": {
    "version": "7.0",
    "modules": [ "pathauto", "token" ]
  },
  "run_list": [ 
    "recipe[php-fpm]",
    "recipe[mysql::new_database]",
    "recipe[drupal::install]",
    "recipe[drupal::nginx-config]"
  ]
}

# cat solo.rb
file_cache_path "/var/chef/cache"
cookbook_path "/var/chef/cookbooks"
json_attribs "/etc/chef/node.json"
log_location "/var/chef/solo.log"
verbose_logging true

I'm not going to go over the node.json part all that much, but basically my chef cookbooks read this data to determine how to configure the instance: set the http document root to /srv/http, set up a new mysql database, and install drupal 7.0 with the pathauto and token modules (a sketch of how a recipe consumes these attributes follows the script below). The solo configuration points out where I will be storing my cookbooks, which my stripped-down user-script checks out from my github repository:

#!/bin/bash
# packages are installed by the cloud-config part, so this script only
# bootstraps rubygems, chef and the cookbooks before running chef-solo
log='/var/log/cloud-init.log'
cd /tmp
wget http://production.cf.rubygems.org/rubygems/rubygems-1.3.7.tgz &>> $log
tar zxf rubygems-1.3.7.tgz &>> $log
cd rubygems-1.3.7
ruby setup.rb --no-format-executable &>> $log
mkdir -p /var/chef/cache
git clone -b deploy git://github.com/crazed/cookbooks.git /var/chef/cookbooks &>> $log
gem install ohai chef --no-rdoc --no-ri --verbose &>> $log
chef-solo &>> $log
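As mentioned above, here's a rough idea of how a recipe can consume those node attributes. This is a hypothetical snippet in the style of the drupal::nginx-config recipe; the resource and template names are made up for illustration, and the real versions live in the github repository above:

# hypothetical excerpt from a drupal::nginx-config style recipe
template "/etc/nginx/sites-available/drupal" do
  # drupal.conf.erb is a made-up template name for this example
  source "drupal.conf.erb"
  variables(
    :document_root => node[:www][:document_root],
    :server_name   => node[:www][:server_name]
  )
end
# the mysql attributes flow through the same way, e.g. node[:mysql][:database]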

Now that I have all these files, I need to combine them into a mime-multipart message. There's a script available in the cloud-utils package, but you won't really need to install the whole package on your workstation/laptop; just use bzr to check out the latest version:

# bzr branch lp:ubuntu/cloud-utils
# cd cloud-utils
# ./write-mime-multipart --output=combined-userdata.txt \
part-handler.py:text/part-handler \
node.json:text/chef-attributes \
solo.rb:text/chef-solo \
cloud-config.txt \
user-script.sh

Note that I don't specify mime-types for user-script.sh or cloud-config.txt: they begin with the appropriate first lines ("#!" and "#cloud-config"), which cause cloud-init to handle them properly.
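If you'd rather not pull in bzr at all, here's a rough ruby stand-in I sketched out. It mirrors write-mime-multipart's file:mime-type argument convention, but makes no attempt at the real script's edge cases, so treat it as illustration only:

#!/usr/bin/ruby
# hand-rolled stand-in for write-mime-multipart; prints the combined
# message to stdout, e.g.: ./multipart.rb part-handler.py:text/part-handler ...
boundary = '===============cloudinit=============='
parts = ARGV.map do |arg|
  file, ctype = arg.split(':', 2)
  ctype ||= 'text/plain' # cloud-init sniffs #! and #cloud-config itself
  part  = "--#{boundary}\n"
  part << "Content-Type: #{ctype}\n"
  part << "MIME-Version: 1.0\n"
  part << "Content-Disposition: attachment; filename=\"#{file}\"\n\n"
  part << File.read(file) << "\n"
end
puts "Content-Type: multipart/mixed; boundary=\"#{boundary}\""
puts "MIME-Version: 1.0"
puts
puts parts.join
puts "--#{boundary}--"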

CloudFormation templates

So now that you have this slick user-data file, you're going to want to put more than one of these together to create a full infrastructure. This is where CloudFormation "stacks" come into play. They are JSON-based templates; this example creates an EC2 security group and an EC2 instance.

{
  "AWSTemplateFormatVersion" : "2010-09-09",

  "Description": "Create a fully functioning Drupal installation." 
  "Parameters": {
    "InstanceType": {
      "Description": "Type of EC2 instance to launch",
      "Type": "String",
      "Default": "m1.small"
    },
    "WebServerPort": {
      "Description": "TCP/IP port fo the webserver",
      "Type": "String",
      "Default": "80"
    },
    "KeyName": {
      "Description": "Name of an existing EC2 KeyPair to enable SSH access to the instances",
      "Type": "String"
    }
  },

  "Mappings": {
    "AWSInstanceType2Arch": {
      "t1.micro"    : { "Arch": "64" },
      "m1.small"    : { "Arch" : "32" },
      "m1.large"    : { "Arch" : "64" },
      "m1.xlarge"   : { "Arch" : "64" },
      "m2.xlarge"   : { "Arch" : "64" },
      "m2.2xlarge"  : { "Arch" : "64" },
      "m2.4xlarge"  : { "Arch" : "64" },
      "c1.medium"   : { "Arch" : "32" },
      "c1.xlarge"   : { "Arch" : "64" },
      "cc1.4xlarge" : { "Arch" : "64" } 
    }, 
    "AWSRegionArch2AMI": {
      "us-east-1": { "32": "ami-3e02f257", "64": "ami-3202f25b" }
    }
  },

  "Resources": {
    "web1": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "InstanceType": { "Ref": "InstanceType" },
        "SecurityGroups": [ { "Ref": "web" }, "default"  ],
        "KeyName": { "Ref": "KeyName" },
        "ImageId": { "Fn::FindInMap" : [ "AWSRegionArch2AMI", { "Ref": "AWS::Region" },
                            { "Fn::FindInMap": [ "AWSInstanceType2Arch", { "Ref": "InstanceType" }, "Arch" ] } ] },
        "UserData": { "Fn::Base64": " SEE BELOW " }
      }
    },

    "web": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Web security group, allows port 80 in",
        "SecurityGroupIngress": [ {
          "IpProtocol": "tcp",
          "FromPort": "22",
          "ToPort": "22",
          "CidrIp": "0.0.0.0/0"
        },
        {
          "IpProtocol": "tcp",
          "FromPort": { "Ref": "WebServerPort" },
          "ToPort": { "Ref": "WebServerPort" },
          "CidrIp": "0.0.0.0/0"
        } ]
      }
    }
  }

}

Taking a look at the template, you'll see it gets a bit messy in the user data portion of the instance: the user data has to be embedded as a single JSON-escaped string. To produce that string, I used this simple hackish ruby script:

# cat json_encode.rb
#!/usr/bin/ruby
require 'rubygems'
require 'json'

if ARGV.size != 1
  puts "Usage: #{$0} <file>"
  exit 1
end

# read the file and print it back out as a single JSON string literal,
# ready to paste into the template's "Fn::Base64" value
data = File.read(ARGV[0])
puts data.to_json

# ./json_encode.rb combined-userdata.txt
...

Manipulating stacks with fog

Now you have a stack template that's fairly powerful. You get to configure a few things, such as which EC2 keypair to use and what instance type you want for the stack. You can even configure which web port will be opened on the security group. So how do you actually use this template? While Amazon provides command-line API tools, I decided to add CloudFormation support to the fog gem and use that to create a basic tool for myself. Realistically, you could take fog and build a nice interface to launch or build your own templates using your own chef cookbooks/node attributes. As of this writing, CloudFormation support is not included in the released fog gem, so you're going to have to check out the latest code from github:

# mkdir projects && cd projects
# git clone git://github.com/geemus/fog.git

With everything in place, here's a simple script that will allow you to launch a stack:

#!/usr/bin/env ruby
require 'rubygems'
require 'yaml'
require 'optparse'
require "#{ENV['HOME']}/projects/fog/lib/fog"

options = {}
parser = OptionParser.new do |opts|
  opts.banner = "Usage: #{$0} -n STACK_NAME -t TEMPLATE_FILE [options]"

  options[:name] = nil
  opts.on('-n', '--name NAME', 'Required name attribute') do |name|
    options[:name] = name
  end

  options[:parameters] = {}
  opts.on('-p', '--parameters KEY=VALUE', 'Set the parameters for a stack template', 
    '(can be comma separated key value pairs)') do |params|
    params.split(',').each do |param|
      k,v = param.split('=')
      raise "Invalid parameter definition" unless v
      options[:parameters][k] = v
    end
  end

  options[:template] = nil
  opts.on('-t', '--template FILE', 'Set the template file for this stack') do |file|
    options[:template] = file
  end
end

parser.parse!
[:name, :template].each do |p|
  raise "Missing required parameter: --#{p}" unless options[p]
end

config = YAML::load(File.open(ENV['HOME']+'/.fog'))[:default]
template_body = ''
File.open(options[:template]) {|f| template_body << f.read}

cf = Fog::AWS::CloudFormation.new(
  :aws_access_key_id => config[:aws_access_key_id],
  :aws_secret_access_key => config[:aws_secret_access_key]
)

def aws_params(hash)
  r = Hash.new
  c = 1
  hash.each do |k,v|
    r["Parameters.member.#{c}.ParameterKey"] = k
    r["Parameters.member.#{c}.ParameterValue"] = v
    c += 1
  end
  r
end
cf.create_stack(options[:name], {'TemplateBody' => template_body}.merge!(aws_params(options[:parameters])))

You can use it as follows:

# ./create-stack.rb --name drupal --parameters KeyName=allan,InstanceType=m1.large --template drupal_dev.template

This will upload your template definition and have Amazon read it and launch the various resources you have defined. As these resources are launched, you can get status messages using the DescribeStackEvents API call. Again, here's a simple script that uses fog to pull down these messages:

#!/usr/bin/env ruby
require 'rubygems'
require 'yaml'
require "#{ENV['HOME']}/projects/fog/lib/fog"

if ARGV.size != 1
  puts "Usage: #{$0} <stack name>"
  exit 1
end

config = YAML::load(File.open(ENV['HOME']+'/.fog'))[:default]
cf = Fog::AWS::CloudFormation.new(
  :aws_access_key_id => config[:aws_access_key_id],
  :aws_secret_access_key => config[:aws_secret_access_key]
)

events = cf.describe_stack_events('StackName' => ARGV[0]).body['Events']
events.each do |event|
  puts "Timestamp: #{event['Timestamp']}"
  puts "LogicalResourceId: #{event['LogicalResourceId']}"
  puts "ResourceType: #{event['ResourceType']}"
  puts "ResourceStatus: #{event['ResourceStatus']}"
  puts "ResourceStatusReason: #{event['ResourceStatusReason']}" if event['ResourceStatusReason']
  puts "--"
end

And what the output will look like:

# ./describe_stack_events.rb drupal
Timestamp: 2011-03-02T17:13:08Z
LogicalResourceId: drupal
ResourceType: AWS::CloudFormation::Stack
ResourceStatus: CREATE_COMPLETE
--
Timestamp: 2011-03-02T17:13:05Z
LogicalResourceId: web1
ResourceType: AWS::EC2::Instance
ResourceStatus: CREATE_COMPLETE
--
Timestamp: 2011-03-02T17:12:27Z
LogicalResourceId: web1
ResourceType: AWS::EC2::Instance
ResourceStatus: CREATE_IN_PROGRESS
--
Timestamp: 2011-03-02T17:12:27Z
LogicalResourceId: web
ResourceType: AWS::EC2::SecurityGroup
ResourceStatus: CREATE_COMPLETE
--
Timestamp: 2011-03-02T17:12:23Z
LogicalResourceId: web
ResourceType: AWS::EC2::SecurityGroup
ResourceStatus: CREATE_IN_PROGRESS
--
Timestamp: 2011-03-02T17:12:21Z
LogicalResourceId: drupal
ResourceType: AWS::CloudFormation::Stack
ResourceStatus: CREATE_IN_PROGRESS
ResourceStatusReason: User Initiated
--

Again, you could simply use the provided Amazon tools, as they are probably more feature-complete, but sometimes you want to be able to query for specific bits of data. Learning how to use fog will save you tons of time compared to parsing the output of the Amazon tools. Fog really lets you do some powerful things with the Amazon APIs and is always gaining more features.
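For example, building on the describe_stack_events script above, it only takes a couple of extra lines to pull out the events that haven't finished successfully (matching on COMPLETE is just my shorthand for a success status):

# reusing the 'events' array from the previous script: keep only the
# events that are still pending or have failed, and summarize them
problems = events.reject { |e| e['ResourceStatus'] =~ /COMPLETE/ }
problems.each do |event|
  puts "#{event['LogicalResourceId']}: #{event['ResourceStatus']}"
end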

Current shortcomings and summary

This setup has a few issues that I currently don't know how to solve. Let's say you were to launch a stack that included two EC2 instances, a load balancer, and an RDS instance. During the creation phase, you want to import a basic DB schema into your RDS instance. Since you can't get a shell on the RDS instance, you have to use one of your EC2 instances to import the schema. That's a race condition: how do you make sure your RDS instance is available before your EC2 instance attempts to import data? Right now the only thing I can think of is to have your chef recipe gracefully handle retries. It seems we need a way for resources of a stack to depend on other resources before being launched, or a notification system that could kick off the chef run (SNS?).
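To illustrate the retry idea, a recipe could poll the database endpoint until it accepts connections before importing anything. This is only a sketch of the approach, reusing the mysql attributes from node.json above:

# wait up to ~5 minutes for mysql to accept tcp connections before
# the schema import runs; purely illustrative
ruby_block "wait_for_database" do
  block do
    require 'socket'
    30.times do
      begin
        TCPSocket.open(node[:mysql][:hostname], 3306).close
        break
      rescue SystemCallError
        sleep 10
      end
    end
  end
end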

Either way, I think CloudFormation, chef, and fog can be used in conjunction with cloud-init to create quick deployments that are OS-independent. This allows you to maintain your automation separately from your images, which, as a consequence, promotes the use of vanilla images. With these basic ideas, you can easily swap out the node.json file for your own, and replace the user-script so that it checks out your own cookbooks. The possibilities are really endless. Currently cloud-init supports Puppet out of the box, and I'd like to see some chef support here as well.
