Archive for the 'Deploying' Category

Jul 09 2008

Working from the Django Trunk

Published by Dave under Deploying, Django

I just finished debugging a confusing issue on socialbrowse.com and wanted to make a quick note about working out of the Django trunk. I’ve been happily working from the trunk for over a year now, ignorantly updating whenever I felt the fancy or found a new feature to use.

Django is under heavy development by its very active community, and awesome new features are added all the time. There hasn’t been an official “release” since March 2007, and there have been so many improvements since then that it is a huge disadvantage not to work from the trunk. A few people criticize the slow release process, but most are content with the historically very stable trunk.

The issue I had, however, was with backwards incompatibility. I had never before run into a case where my newly updated trunk would not run code that worked on an older version. I was so confident in the trunk, in fact, that it didn’t occur to me for a full hour that the issue might be with the Django update I had done the day before and not with the large chunk of my own code I had just loaded. It was a really brain-dead, but easy to make, mistake.

Too confident in Django, I updated mindlessly on the live server before I tested on the development server. Luckily it wasn’t a major issue. For about 2 hours, while I was cursing and testing, our users couldn’t change their profile pictures. After I realized it was nothing I had done, the #django IRC channel was quick to point out the new file upload changes.

Moral of the story:  Watch out for backward incompatibilities and make sure your development and live servers are on the same revision.
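
As a minimal sketch of that last check: assuming the trunk checkout is a Subversion working copy, `svn info` prints a "Revision: N" line, and the output from each server can be compared before deploying. The function names here are made up for illustration, and how you collect the output from each server is up to you.

```python
import re


def parse_svn_revision(svn_info_output):
    """Return the integer from the 'Revision:' line of `svn info` output, or None."""
    match = re.search(r"^Revision:\s*(\d+)\s*$", svn_info_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def same_revision(dev_info, live_info):
    """True only when both outputs contain a revision and the two are equal."""
    dev_rev = parse_svn_revision(dev_info)
    live_rev = parse_svn_revision(live_info)
    return dev_rev is not None and dev_rev == live_rev
```

Running `same_revision()` on the `svn info` output from the development and live checkouts gives a quick yes/no before you touch the live server.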


May 13 2008

8 things you should know about Amazon EC2

Published by Dave under Deploying, EC2

I’ve been answering a lot of questions for a few friends who just got into the summer round of YC and are figuring a lot of things out. It reminds me of how much I’ve learned over the past 11 months using Django and 3 months on Elastic Compute Cloud (EC2). There are a lot of little things that tend to be learned only the hard way or through a lot of research. I’ve listed a few of them here.

1.    Other than an obnoxious sign-up process, it’s not that hard to get a basic app up and running on EC2 (1 - 5 hrs).
2.    The HD space isn’t yet persistent (guaranteed to be around). You get about 16MB (Update: 10GB) of persistent space that will stick with your instance, and then many GB of storage attached as your /mnt drive. It has happened (though never to us) that everything on the /mnt drive spontaneously disappears. This means you need to keep a DB backup. You should be doing this anyway, though. Also, Amazon is soon releasing tools to make an instant image of your /mnt drive, which might make the whole persistence thing a virtual non-issue.
4.    Because of this you need to put all of your data on your /mnt drive, including .log files. This means you’ll have to change the default locations of your DB and your Apache log files. As you run your app, keep an eye on your root partition with “df -h” and make sure it’s not growing. If it is growing, you’ve got a stray log file somewhere and need to get rid of it, or you’ll quickly hit your 16MB (Update: 10GB) max, your server will go down because your apps can’t write anything anymore, and you will be confused for a long time. We learned this one the hard way :|.
5.    Use the Firefox extension (Elasticfox) to handle your instances. It’s far from perfect. It doesn’t allow you to name the instances, and the ssh functionality doesn’t work on Macs as far as I’ve been able to figure out. But I think it’s a much better interface than the command line tools. http://developer.amazonwebservices.com/connect/entry.jspa?externalID=609
7.    Scaling up and down based on demand is promising but isn’t easy. RightScale offers some expensive tools/services for this. There’s also Scalr, which is an open source system already bundled into Amazon images for you. I have yet to try it out, but it sounds promising. If you’re just scaling to a handful of servers (most cases), it’s easy enough to set up a load balancer such as HAProxy or nginx. With notes from my friend Justin, I previously wrote a cookbook for setting up HAProxy.
8.    Images taken from a Small instance cannot be loaded on an L or XL instance, and vice versa. This is because Small instances have a 32-bit architecture and L and XL instances are 64-bit.
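
The “keep an eye on it with df -h” advice from point 4 can also be automated. The following is only a sketch using Python’s standard library; the 80% threshold and the function names are assumptions for illustration, not anything EC2 prescribes.

```python
import shutil


def percent_used(path="/"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Guard against pseudo-filesystems that report zero size.
        return 0.0
    return 100.0 * usage.used / usage.total


def is_nearly_full(path="/", threshold_percent=80.0):
    """True when the filesystem holding `path` is at or above the threshold."""
    return percent_used(path) >= threshold_percent


if __name__ == "__main__":
    if is_nearly_full("/"):
        print("Root partition is filling up: look for a stray log file.")
```

Run from cron, something like this would flag a growing root partition before the server stops being able to write.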

Good luck! I’ll hopefully be posting a few more guides soon.

Updates 5/15:  This article was pretty much just some notes posted to the web for a few friends.  I didn’t do a lot of fact checking and have had the following facts checked/corrected for me.

The “get about 16MB” estimate was horribly wrong. You can specify a bundle size of up to 10GB. The point is still the same, though: put all growing data in your /mnt dir.

Rightscale should not be referred to as “expensive”.  I’m sure they’re more than worth their cost.

The latest version of Elasticfox (which I need to get) now has ssh support for Macs.

I am an idiot and left out points 3 and 6.  This post actually only has 6 things you should know about Amazon EC2.  Never trust numbered list elements while editing a post.


Mar 03 2008

Session Based Load Balancing with HAproxy

Published by Dave under Deploying, Django

A friend asked me how to install HAproxy since I did it just three weeks ago, so I’m publishing my install notes, most of which I stole from another friend.

HAproxy is a pretty sweet tool, but little has been written about it so far, so installation and configuration take reading a large text manual. Here, though, is a basic setup that should work for most cases.

mkdir /opt/haproxy
cd /opt/haproxy

#Download latest version from http://haproxy.1wt.eu/download/1.3/bin/ e.g.
wget http://haproxy.1wt.eu/download/1.3/bin/haproxy-1.3.14.2-pcre-40kses-splice-linux-i586.notstripped.gz -O haproxy.gz

gunzip haproxy.gz
chmod 700 haproxy

Now you have HAproxy installed but need a config file. Make a new file called /etc/haproxy.cfg with the following contents.

global
    log 127.0.0.1 daemon debug
    maxconn 4096
    pidfile /var/run/haproxy.pid
    daemon

defaults
    log global
    mode http
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen webfarm <HAproxy host ip>:80
    mode http
    cookie MYSITECOOKIE insert nocache
    balance roundrobin
    server app1 <server1 ip>:8080 cookie app1inst1 check
    server app2 <server2 ip>:8080 cookie app2inst2 check
    server app3 <server3 ip>:8080 cookie app3inst3 check

This config file sets up session-based balancing, meaning HAproxy will inject its own cookie into the browser of each person who visits the load balancer. Every request that person makes from that point on carries the cookie and gets directed to the same server! This is most important for apps with content heavily based around user information. Without it, it’s really tough to make use of local caching.
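
To make the idea concrete, here is a toy model of cookie-based stickiness in Python. It is not how HAproxy is implemented; the class name and the IP addresses are invented for illustration, and only the cookie name and cookie values mirror the config above.

```python
import itertools


class StickyBalancer:
    """Toy model: round-robin for new visitors, cookie lookup for returning ones."""

    def __init__(self, servers, cookie_name="MYSITECOOKIE"):
        # `servers` maps a cookie value (e.g. "app1inst1") to a backend address.
        self.servers = dict(servers)
        self.cookie_name = cookie_name
        self._round_robin = itertools.cycle(list(self.servers))

    def route(self, cookies):
        """Return (backend address, cookies the client holds after the response)."""
        value = cookies.get(self.cookie_name)
        if value not in self.servers:
            # No cookie yet (or an unknown one): pick the next server in turn.
            value = next(self._round_robin)
        updated = dict(cookies)
        updated[self.cookie_name] = value
        return self.servers[value], updated
```

A first request with no cookie gets the next server in the rotation plus a cookie naming it; any later request that sends the cookie back is routed to that same server.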

Make sure you change all of the parameters inside <> brackets. You can add or remove servers by editing the “server” lines. I think you can see the trend there :). You’ll also want to change MYSITECOOKIE, which is the name of the cookie that HAproxy will inject. The names app* and app*inst* are made up; you can call them whatever you want.
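
If the list of servers changes often, the “server” lines can be generated rather than typed. This is just a sketch that follows the naming pattern used in the config above; the function name is hypothetical and the IPs in the usage note are placeholders.

```python
def server_lines(backend_ips, port=8080):
    """Build one HAproxy 'server' line per backend, numbered from app1 upward."""
    lines = []
    for index, ip in enumerate(backend_ips, start=1):
        name = "app%d" % index
        cookie = "%sinst%d" % (name, index)
        lines.append("server %s %s:%d cookie %s check" % (name, ip, port, cookie))
    return lines
```

For example, `server_lines(["10.0.0.1", "10.0.0.2"])` produces two lines ready to paste under the `listen` section.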

You should also notice that in this setup HAproxy is listening on port 80 ( <HAproxy host ip>:80 ) and that it’s forwarding to port 8080 on the re-direct servers ( <server1 ip>:8080 ). This is because one of the re-direct servers is often the same machine as the HAproxy host. Also, it’s not a good idea to have random IPs that aren’t directly linked to your domain hosting your app on port 80.

So, next we have to change your app instance to listen on port 8080. I’ve only done this with Apache2 and will only go through those steps. It’s very simple and is probably just as simple with other servers. You need to edit your ports.conf (often found at /etc/apache2/ports.conf) and add or change the following line at the top.

#ports.conf
Listen 8080

Save it, restart Apache2, and that’s it.
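
A quick way to confirm the restart took effect is to try opening a TCP connection to the new port. This is a small sketch using Python’s standard library; the function name is made up, and you would substitute your own server’s address.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host was unreachable.
        return False
```

Calling `is_listening("127.0.0.1", 8080)` on the app server should return True once Apache2 is back up on port 8080.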

Lastly here are the directions for running the proxy. You could bundle these in a script but I use them so rarely I didn’t. There is one command for kicking off the proxy, another for verifying a new configuration file if you make an update (for when you add or remove a server) and finally a third that will dynamically reload your new configuration.

First time start:
/opt/haproxy/haproxy -f /etc/haproxy.cfg -p /var/run/haproxy.pid

Verify a valid configuration:
/opt/haproxy/haproxy -f /etc/haproxy.cfg -c

Dynamically reload configuration:
/bin/sh -c "/opt/haproxy/haproxy -f /etc/haproxy.cfg -p /var/run/haproxy.pid -st $(cat /var/run/haproxy.pid)"
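
For anyone who does want to bundle these into a script, here is one possible sketch in Python. It only wraps the commands and paths shown above; the function names are invented, and it assumes the `-c` check exits with a non-zero status when the configuration is invalid.

```python
import subprocess

HAPROXY = "/opt/haproxy/haproxy"
CONFIG = "/etc/haproxy.cfg"
PIDFILE = "/var/run/haproxy.pid"


def check_command():
    """Command that asks HAproxy to validate the config without starting."""
    return [HAPROXY, "-f", CONFIG, "-c"]


def reload_command(old_pids):
    """Command that starts a new HAproxy and passes the old PIDs to -st."""
    return [HAPROXY, "-f", CONFIG, "-p", PIDFILE, "-st"] + [str(pid) for pid in old_pids]


def read_pids(pidfile=PIDFILE):
    """Read the whitespace-separated PIDs that HAproxy wrote to its pid file."""
    with open(pidfile) as handle:
        return [int(token) for token in handle.read().split()]


def safe_reload():
    """Reload only if the new config validates; return True on success."""
    if subprocess.call(check_command()) != 0:
        return False
    return subprocess.call(reload_command(read_pids())) == 0
```

The point of `safe_reload()` is ordering: the running proxy is only replaced after the new config has passed the check.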

I’m no expert. Like I said I got most of this from my friend Justin (Thanks Justin!). But it was a pain sifting through the manual and I thought others might benefit from these notes. For more information visit the HAproxy docs.

Note: Interestingly, this post by Justin.tv promoting Nginx over Pound just made it to the top of Hacker News as I was finishing. I want to add my recommendation of HAproxy. I also tried Nginx but ended up going with HAproxy. There are certainly more articles on using Nginx, but it is more difficult to set up session-based balancing with it.
