May 13 2008
8 things you should know about Amazon EC2
I’ve been answering a lot of questions for a few friends who just got into the summer round of YC and are figuring a lot of things out. It reminds me of how much I learned over the past 11 months using Django and 3 months on Elastic Compute Cloud (EC2). There are a lot of little things that tend to be learned only the hard way or through a lot of research. I’ve listed a few of them here.
1. Other than an obnoxious sign-up process, it’s not that hard to get a basic app up and running on EC2 (1 - 5 hrs).
2. The HD space isn’t yet persistent (guaranteed to be around). You get about 16MB (Update: up to 10GB) of space that will stick with your instance, plus many GB of storage attached as your /mnt drive. It has happened (though never to us) that everything on a /mnt drive spontaneously disappears. This means you need to keep a DB backup, which you should be doing anyway. Also, Amazon is soon releasing tools to make an instant image of your /mnt drive, which might make the whole persistence thing a virtual non-issue.
4. Because of this you need to put all of your data on your /mnt drive, including .log files. This means you’ll have to change the default locations of your DB and your Apache log files. As you run your app, keep an eye on your root (/) partition with “df -h” and make sure it’s not growing. If it is growing, you’ve got a stray log file somewhere and need to get rid of it, or you’ll hit your 16MB max quickly. Your server will then go down because your apps can’t write anything anymore, and you will be confused for a long time. We learned this one the hard way :|.
5. Use the Firefox extension (Elasticfox) to handle your instances. It’s far from perfect. It doesn’t allow you to name the instances, and the ssh functionality doesn’t work on Macs as far as I’ve been able to figure out. But I think it’s a much better interface than the command line tools. http://developer.amazonwebservices.com/connect/entry.jspa?externalID=609
7. The promise of scaling up and down based on demand isn’t easy to deliver on. RightScale offers some expensive tools/services for this. There’s also Scalr, an open source system already bundled into Amazon Images for you. I have yet to try it out, but it sounds promising. If you’re just scaling to a handful of servers (most cases), it’s easy enough to set up a load balancer such as HAProxy or nginx. I previously wrote a cookbook for setting up HAProxy, with notes from my friend Justin.
8. Images taken from a Small instance cannot be loaded on an L or XL instance, and vice versa. This is because Small instances have a 32-bit architecture and L and XL instances are 64-bit.
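The “df -h” check from point 4 is easy to script if you’d rather not eyeball it. Here is a rough Python sketch, not something we actually ran: the function names are made up for illustration, and the fraction it reports is plain used/total, so it can differ a little from the Use% column that df prints. The second function hunts for the biggest files under a directory while skipping other mounted filesystems (such as /mnt), which is one way to track down a stray log file.

```python
import heapq
import os
import shutil


def usage_fraction(path="/"):
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is used."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def largest_files(top, limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs for the biggest files under `top`.

    Directories that are mount points (for example /mnt) are pruned, so the
    walk stays on the filesystem that `top` lives on.
    """
    sizes = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Editing dirnames in place tells os.walk not to descend into mounts.
        dirnames[:] = [
            d for d in dirnames
            if not os.path.ismount(os.path.join(dirpath, d))
        ]
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if not os.path.islink(full):
                    sizes.append((os.path.getsize(full), full))
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return heapq.nlargest(limit, sizes)
```

If usage_fraction("/") keeps creeping up between runs, something like largest_files("/var/log") is a reasonable first place to look. The /var/log path is only an assumption about where your stray logs might be landing.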
Good luck! I’ll hopefully be posting a few more guides soon.
Updates 5/15: This article was pretty much just some notes posted to the web for a few friends. I didn’t do a lot of fact checking and have had the following facts checked/corrected for me.
The “got about 16 MB” estimate was horribly wrong. You can specify a bundle size of up to 10GB. The point is still the same though: put all growing data in your /mnt dir.
RightScale should not be referred to as “expensive”. I’m sure they’re more than worth their cost.
The latest version of Elasticfox (which I need to get) now has ssh support for Macs.
I am an idiot and left out points 3 and 6. This post actually only has 6 things you should know about Amazon EC2. Never trust numbered list elements while editing a post.