Recently, I've been standing up a lot of infrastructure in Amazon's EC2. While auto-scaling the infrastructure aspects is fairly simple using configuration automation utilities such as Chef or Puppet, getting developers to understand these concepts may be an interesting task. I will try to outline most of the common issues I see when uplifting applications to an elastic cloud environment.
Rules to live by
Infrastructure automation is critical
You need to know how to build the infrastructure that runs your application, and you need to know how to do this repeatedly. Using tools such as Chef or Puppet will allow you to do this efficiently and in a self-documenting way that also provides revision control for your software configurations. Being able to do this allows you to quickly bring up additional server resources as your application demands them.
Revision control is not optional
Choose your favorite source control system and learn how to use every aspect of it. These tools were meant to keep track of your code changes. When dealing with the cloud, you're going to need to keep track of what is in your production environment versus your staging environment.
One of the biggest changes with a cloud-based infrastructure is that the underlying resources will be changing. There will be monitors that detect high traffic, CPU usage, memory usage, or anything really. These will cause additional virtual machines (instances) to launch, configure themselves (see above), check out the latest version of your code from your repository, and finally add themselves to a load balancer. When your application's server infrastructure can build itself, you had better be able to check out the latest revision from your deployment branch and have it run with no human interaction.
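To make the "no human interaction" requirement concrete, here is a minimal sketch of the checkout step a freshly launched instance might run at boot. It is an illustration rather than a prescribed setup: the repository URL, the branch name "deploy", and the helper names checkout_commands and deploy_latest are all hypothetical, and it assumes git is installed and the instance already has read access to the repository.

```python
import subprocess
from pathlib import Path


def checkout_commands(repo_url, branch, target_dir):
    """Return the git commands that put the tip of `branch` into target_dir.

    Hypothetical helper: a fresh instance clones the repository, while an
    instance built from an image that already contains a clone just fetches
    the branch and resets to what was fetched.
    """
    target = Path(target_dir)
    if (target / ".git").is_dir():
        return [
            ["git", "-C", str(target), "fetch", "origin", branch],
            # FETCH_HEAD is the commit the fetch above just retrieved.
            ["git", "-C", str(target), "reset", "--hard", "FETCH_HEAD"],
        ]
    return [["git", "clone", "--branch", branch, repo_url, str(target)]]


def deploy_latest(repo_url, branch, target_dir, run=subprocess.run):
    """Run the checkout without prompting; any git failure raises."""
    for cmd in checkout_commands(repo_url, branch, target_dir):
        # check=True makes a failed git command abort the bootstrap
        # instead of letting a half-deployed instance join the pool.
        run(cmd, check=True)
```

A boot script could call something like deploy_latest("git@example.com:team/app.git", "deploy", "/srv/app") before the instance registers with the load balancer. Failing loudly matters here: an instance that could not fetch the code should never start serving traffic.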
Don't rely on traditional shared storage
I've seen many legacy applications rely on shared storage of some sort. This will not work in most cloud environments, and definitely will not work in Amazon. As of this writing, Amazon's Elastic Block Store (EBS) volumes can only be attached to one server instance at a time. This means you cannot utilize a clustered filesystem, and standing up your own NFS server leaves you with a single point of failure.
The solution? Put all of your site content that isn't revision controlled into an asset host or content delivery network (CDN). When on Amazon, this means using Amazon's Simple Storage Service (S3). By doing this you are storing your precious images, movies, and other content on storage that is designed for 99.999999999% durability and is accessible from anywhere over HTTP. As an added bonus, you can serve images faster and achieve better edge network performance by enabling CloudFront, Amazon's CDN service.
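On the application side, moving to an asset host mostly means that code stops referring to files on local disk and builds URLs instead. A minimal sketch of such a helper follows. The host name assets.example.com and the function name asset_url are made up for illustration; in practice the host would be whatever S3 bucket or CloudFront domain you configure.

```python
from urllib.parse import quote

# Hypothetical asset host; replace with your own bucket or CDN domain.
ASSET_HOST = "https://assets.example.com"


def asset_url(path, host=ASSET_HOST):
    """Map a site-relative asset path to its URL on the asset host."""
    key = path.lstrip("/")
    # quote() percent-encodes characters such as spaces and leaves "/" intact.
    return f"{host.rstrip('/')}/{quote(key)}"


print(asset_url("/images/logo.png"))
# prints "https://assets.example.com/images/logo.png"
```

Keeping the host in one place means that switching from a plain bucket to a CDN later is a configuration change rather than a search through every template.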
Relational databases aren't evil
There's a lot of hype over NoSQL databases such as MongoDB and CouchDB. They are touted as being "webscale" because you can add and remove instances from the database cluster as needed. These are great if you're starting a new application and have decided you need these features, but what if you have existing data in MySQL?
Amazon's Relational Database Service (RDS) supports MySQL databases that are both redundant and scalable to your needs. You can easily grow the database storage size, CPU, and memory resources. RDS also performs automatic backups in a time slot that you configure. This makes transitioning from traditional database infrastructure to the cloud fairly painless. Amazon has also announced plans to support Oracle 11g in the future on the same platform. So while scaling a relational database is not as trivial as scaling MongoDB, Amazon has really stepped up to the plate by taking care of this problem for you.
By utilizing asset hosts, maintaining an automated server build, and keeping strict revision control, you will be setting yourself up for success in today's cloud environment. However, I cannot stress enough the importance of having at least a staging environment. Write unit tests and force all commits to be tested, then deployed to your staging environment. This allows you to quickly catch any code issues. An added benefit of your staging environment is that you can test infrastructure changes before deciding to put them in the wild.
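The "tested, then deployed to staging" rule boils down to an ordered set of gates where a failure stops everything after it. The sketch below shows that idea in its simplest form. The function name run_pipeline and the stage names are hypothetical, and each step is assumed to be a callable that takes a commit identifier and returns True on success.

```python
def run_pipeline(commit, stages):
    """Run (name, step) pairs in order and stop at the first failure.

    Returns (completed_stage_names, failed_stage_name_or_None).
    """
    completed = []
    for name, step in stages:
        if not step(commit):
            # Later stages, such as the staging deploy, never run.
            return completed, name
        completed.append(name)
    return completed, None
```

With stages such as ("unit tests", run_tests) followed by ("deploy to staging", deploy_to_staging), a commit that fails its tests never reaches staging, and nothing reaches production without passing through staging first.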