Scaling Django Migrations on AWS

Django’s migration support provides developers with a fairly comprehensive toolset for managing and versioning schema and data changes. Writing migrations in code provides a helpful layer of abstraction and makes testing easier, while at the same time adding an extra layer of safety.

Running Django’s migrate command iterates through an application’s set of ordered migrations and runs each one that hasn’t already been executed. Fortunately, the migrate command is also idempotent: running it multiple times (but not simultaneously) has no ill effects. One particular area of contention, however, is how to automate running the migrate command at deployment time without executing it in parallel on every deployment target in a cluster.
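To illustrate why repeated runs are harmless while simultaneous runs are not, here is a deliberately simplified model of a migration runner. It is not Django's implementation (Django records applied migrations in its django_migrations table); the function name run_pending and the in-memory applied set are assumptions made for this sketch.

```python
def run_pending(ordered_migrations, applied, execute):
    """Run every migration not yet recorded in `applied`, in order.

    `applied` stands in for the record of executed migrations.
    Returns the names of the migrations that actually ran.
    """
    ran = []
    for name in ordered_migrations:
        if name in applied:
            continue  # already executed on a previous run, so skip it
        execute(name)      # apply the schema or data change
        applied.add(name)  # record it so later runs skip it
        ran.append(name)
    return ran


if __name__ == "__main__":
    applied = set()
    migrations = ["0001_initial", "0002_add_field"]
    print(run_pending(migrations, applied, print))  # first run applies both
    print(run_pending(migrations, applied, print))  # second run applies nothing
```

The check ("is it applied?") and the action ("apply it, then record it") are separate steps. Two runners starting at the same moment can both pass the check before either records anything, which is why the command must not run on every instance at once.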

A quick Google search reveals several solutions, though each requires a good bit of manual intervention, which we want to avoid. Since we run Elastic Beanstalk for our cluster management, most of those solutions aren’t a good fit: they require adding or removing cluster instances to act as a management host, which isn’t easy to do with Elastic Beanstalk. Plus, running outside the Elastic Beanstalk context means we lose all of our application’s environment variables, which hold our database credentials.

Behind the scenes, our Elastic Beanstalk environment uses ECS (Elastic Container Service) to manage our Docker containers. Task Definitions are used to describe how to run our containers by identifying attributes like the Docker image to run, volume mount points, memory allocations, links to other containers, the command to run, and so on. One particularly interesting feature of ECS is the ability to run both services and tasks.
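To make those attributes concrete, here is a rough sketch of the kind of task definition ECS accepts, written as a Python dictionary. The field names follow the ECS task definition format; every value (family, images, paths, memory sizes, the linked container, and the command) is a hypothetical placeholder, not our real configuration.

```python
def example_task_definition():
    """Return an illustrative ECS task definition (all values are placeholders)."""
    return {
        "family": "django-web",  # hypothetical task definition name
        "volumes": [
            # A host directory that containers can mount by name.
            {"name": "app-logs", "host": {"sourcePath": "/var/log/app"}},
        ],
        "containerDefinitions": [
            {
                "name": "web",
                "image": "example/django-app:latest",  # Docker image to run
                "memory": 512,  # hard memory limit in MiB
                "essential": True,
                "mountPoints": [
                    {"sourceVolume": "app-logs", "containerPath": "/app/logs"},
                ],
                "links": ["cache"],  # link to another container in this task
                "command": ["gunicorn", "project.wsgi"],  # command to run
            },
            {
                "name": "cache",
                "image": "redis:latest",
                "memory": 128,
                "essential": True,
            },
        ],
    }
```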

As you might imagine, a service can generally be considered a long-running task that should restart itself if it stops. A task, however, is executed only once across the entire ECS cluster. The instance it actually runs on is chosen by the ECS scheduler at random.

This model fits our migrate command’s needs exceptionally well.

Because Elastic Beanstalk is still a layer above ECS, and since we’ll be interacting directly with ECS, we need to provide it with our environment variables, create a task definition, and then run that task on the cluster.

To automate all of this we use Hubot in one of our Slack channels to do the following:

  • Run describeConfigurationSettings on our Elastic Beanstalk environment and extract the environment variables
  • Create an ECS task definition that matches our existing Django task definition, but also includes our environment variables and runs the command to ./ migrate
  • Run registerTaskDefinition to register the task definition with ECS
  • Run runTask with our registered task definition targeting our ECS cluster

(actual code is available in Part 2)
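As a rough illustration of those four steps (not the Hubot script itself), a Python sketch using boto3-style clients might look like the following. The function names, the parameters, and the command ["python", "manage.py", "migrate"] are assumptions for this example; the clients are passed in so the logic can be exercised without AWS access.

```python
# Elastic Beanstalk stores application environment variables in this namespace.
EB_ENV_NAMESPACE = "aws:elasticbeanstalk:application:environment"


def extract_env_vars(settings_response):
    """Collect environment variables from a describe_configuration_settings response."""
    env = {}
    for config in settings_response["ConfigurationSettings"]:
        for option in config["OptionSettings"]:
            if option.get("Namespace") == EB_ENV_NAMESPACE:
                env[option["OptionName"]] = option.get("Value", "")
    return env


def build_migrate_container(base_container, env_vars, command):
    """Copy the existing container definition, adding env vars and the migrate command."""
    container = dict(base_container)  # shallow copy; the original is left untouched
    container["environment"] = [
        {"name": name, "value": value} for name, value in sorted(env_vars.items())
    ]
    container["command"] = list(command)
    return container


def run_migration(eb, ecs, app_name, env_name, cluster, family, base_container,
                  command=("python", "manage.py", "migrate")):  # assumed command
    """Register a one-off migrate task definition and run it once on the cluster."""
    settings = eb.describe_configuration_settings(
        ApplicationName=app_name, EnvironmentName=env_name
    )
    container = build_migrate_container(
        base_container, extract_env_vars(settings), command
    )
    registered = ecs.register_task_definition(
        family=family, containerDefinitions=[container]
    )
    task_def_arn = registered["taskDefinition"]["taskDefinitionArn"]
    result = ecs.run_task(cluster=cluster, taskDefinition=task_def_arn, count=1)
    if not result.get("tasks"):
        raise RuntimeError("ECS did not start the task: %r" % result.get("failures"))
    return result["tasks"][0]["taskArn"]
```

With real credentials, eb and ecs would be boto3.client("elasticbeanstalk") and boto3.client("ecs"), and base_container would be the container definition from the existing Django task definition.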

ECS randomly chooses an instance in our cluster, fetches the latest version of our Docker image, and runs ./ migrate with the same environment variables Elastic Beanstalk provides to each container it manages.

The result is our migrate command running on a single instance in our cluster. Making this the first step of our deployment process allows us to avoid an error-prone multi-step deployment process.
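This post doesn't cover how the rest of the deployment learns whether the migration worked. One possible approach, sketched here as an assumption rather than a description of our setup, is to inspect the response of the ECS describe_tasks call once the task has stopped and continue only if every container exited with code 0. The function name migration_succeeded is hypothetical.

```python
def migration_succeeded(describe_tasks_response):
    """Return True only if every task has stopped and all containers exited with 0."""
    tasks = describe_tasks_response.get("tasks", [])
    if not tasks:
        return False  # nothing was found, so treat it as a failure
    for task in tasks:
        if task.get("lastStatus") != "STOPPED":
            return False  # still pending or running
        for container in task.get("containers", []):
            if container.get("exitCode") != 0:
                return False  # non-zero or missing exit code
    return True
```

With boto3, the response would come from ecs.describe_tasks(cluster=..., tasks=[task_arn]), typically after waiting on the client's tasks_stopped waiter.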

Continue to Part 2