Using Amazon S3 to Store your Django Site's Static and Media Files

Editor's note: This post was updated in September 2017.

Storing your Django site's static and media files on Amazon S3, instead of serving them yourself, can improve site performance. It frees your servers from handling static files themselves, lets you scale your servers more easily by keeping media files in a common place, and is a necessary step toward using Amazon CloudFront as a Content Delivery Network (CDN).

We'll describe how to set up an S3 bucket with the proper permissions and configuration, how to upload static and media files from Django to S3, and how to serve the files from S3 when people visit your site.

Versions

These instructions were tested in June 2017 using Django 1.11, django-storages 1.5.2, boto3 1.4.4, Python 3.6, and the AWS console as it was at that time. I used an S3 bucket in the us-east-2 region.

This is an update of a post originally published in November 2014. Since then, Django has moved on from version 1.7; django-storages went dormant, was forked, and the fork then took over the django-storages name; the AWS console has been completely redesigned; and we've switched from Python 2 to Python 3 wherever possible.

The parts about permissions and the AWS console have been completely rewritten. I've improved the approach to granting permissions on the bucket and added more thorough explanations of the settings. Some changes in Django and django-storages have required changes to other parts of the post.

Django file handling

To keep this post from being too long, it assumes knowledge of information in another post, Advanced Django File Handling.

S3 Bucket Access

One of the things I've always found tricky about using S3 this way is getting all the permissions set up so that the files are public but read-only, while allowing AWS users I choose to update the S3 files.

The approach I'm going to describe here isn't the simplest possible way to do that, but will be easier to maintain over the life of the site.

Here are the steps we'll take:

  1. Create a bucket
  2. Configure the bucket for public access from the web
  3. Create an AWS user group
  4. Add a policy to the user group that lets members of the group control the bucket
  5. Create a user and add it to that group.

Note: AWS has changed the interface for S3 in their web console between the original writing of this post and the current updated post, which may happen again. I'll try to explain what we're doing in enough detail to help find things anyway.

(Caveat: this approach assumes that you're using IAM users created in the same AWS account that owns the S3 bucket, which should almost always be the case. If for some reason that's not possible, this approach will not work and you'll need to do further research on granting access using object ACLs.)

Create bucket

Start by creating a new S3 bucket, using the "Create bucket" button on the S3 page in the AWS console.

Enter a name and choose a region. Leave everything else at the default settings, clicking through until the bucket has been created.

Enable web access

The first thing we'll do is enable the bucket for web access.

Click on the newly created S3 bucket, click on the Properties tab, and click on the big "Static website hosting" button. Select "Use this bucket to host a website", and fill in anything you want as the index and error documents; we won't be using them.

Click "Save".

Note that enabling web access for the bucket only hooks the bucket up to the AWS web servers to make HTTP access possible. You must also set the permissions so that anonymous users can actually read the files in your bucket. We'll do that soon.

CORS

Another thing you need to be sure to set up is CORS. CORS defines a way for client web applications loaded in one domain to interact with resources in a different domain. Since we're going to be serving our static files and media from a different domain, if you don't take CORS into account, you'll run into mysterious problems, like Firefox not using your custom fonts for no apparent reason.

Go to your S3 bucket properties, and under "Permissions", click on "CORS Configuration". Make sure that something like this is set. (At the time of writing, this was the default.)

<CORSConfiguration>
    <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <MaxAgeSeconds>3000</MaxAgeSeconds>
        <AllowedHeader>Authorization</AllowedHeader>
    </CORSRule>
</CORSConfiguration>

There are plenty of explanations elsewhere on the web, so I won’t go into this here. The tricky part is knowing you need to have CORS in the first place.

Public permissions

Now let's start setting up the right permissions.

In the AWS console, click the "Permissions" tab, then on the "Bucket policy" button. An editing box will appear. Paste in the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::example-bucket/*"]
    }
  ]
}

Don't hit save quite yet! Look just above the editing box and you should see something like this:

Bucket policy editor ARN: arn:aws:s3:::danblogpostbucket

Copy the part that starts with "arn:" and save it somewhere; we'll need it again later. This is the official complete name of our S3 bucket that we can use to refer to our bucket anywhere in AWS.

Replace the "arn:aws:s3:::example-bucket" in the editing box with the part starting with "arn:" shown above your editing box, but be careful to preserve the "/*" at the end of it. For example, change "arn:aws:s3:::example-bucket/*" to "arn:aws:s3:::danblogpostbucket/*".

Now you can save it.
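If you'd rather generate the policy text than edit it by hand, here is a minimal sketch in Python. The function name public_read_policy is my own invention for illustration; it only builds the same JSON shown above with your bucket name filled in, and you would still paste the result into the console.

```python
import json


def public_read_policy(bucket_name):
    """Return a bucket policy (as a JSON string) allowing anonymous reads."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                # The trailing "/*" applies the rule to every object in the bucket.
                "Resource": ["arn:aws:s3:::%s/*" % bucket_name],
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Print the policy for the example bucket used in this post.
print(public_read_policy("danblogpostbucket"))
```

Generating the text this way makes it harder to drop the "/*" by accident when swapping in your own bucket name.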

It's tempting at this point to upload a test file to your bucket and try to access it from the web, but resist for a moment. While everything is now set up, it can take a few minutes for the settings to take effect, and it's confusing if you've got everything right and yet your first test fails. We'll finish the setup, then test everything.

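To recap: once these settings have taken effect, anyone should be able to access any file in your bucket by visiting http://<BUCKETNAME>.s3-website-<REGION>.amazonaws.com/<FILENAME>. (Some regions, including us-east-2, use a dot instead of a dash before the region name: http://<BUCKETNAME>.s3-website.<REGION>.amazonaws.com/<FILENAME>. The "Static website hosting" panel shows the exact endpoint for your bucket.)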

Management permissions

Our next goal is to arrange for our own servers to be able to manage the files in our S3 bucket. We'll start by creating a new group in IAM. When we finish, we'll be able to add and remove users from this group to control whether those users can manage this bucket's files.

Group

In the AWS console, go to IAM, click "Groups" on the left, then "Create New Group". Assign a meaningful name to your group, like "manage-danblogpostbucket". Click through the rest of the process without changing anything else.

Policy

Next, we'll create a policy with rules allowing management of our S3 bucket. Still in IAM, click "Policies" on the left. Click "Create policy" at the top. On the next page, select "Copy an AWS Managed Policy". Search for "S3", then select "AmazonS3FullAccess".

Give your new policy a meaningful name, like "manage-danblogpostbucket". Then in the "Policy Document" window, look for the part that says "Resource": "*". Change the "*" to:

["arn:aws:s3:::danblogpostbucket",
 "arn:aws:s3:::danblogpostbucket/*"]

changing the bucket name, of course.
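As a rough sketch of what the edited policy document should end up looking like, the Python below builds it for a given bucket name. I'm assuming the copied AmazonS3FullAccess document is a single statement allowing "s3:*" on every resource, which is what it looked like when I did this; yours may contain more, in which case only the "Resource" value needs to change. The function name manage_bucket_policy is hypothetical.

```python
import json


def manage_bucket_policy(bucket_name):
    """Return an IAM policy (JSON string) granting full S3 access to one bucket."""
    bucket_arn = "arn:aws:s3:::%s" % bucket_name
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:*",
                # The first ARN covers the bucket itself (listing, settings);
                # the second covers every object inside it.
                "Resource": [bucket_arn, bucket_arn + "/*"],
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(manage_bucket_policy("danblogpostbucket"))
```

Both ARNs are needed: bucket-level actions such as listing apply to the bucket ARN, while reads and writes of individual files apply to the "/*" form.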

At the bottom, click "Validate Policy" to make sure there aren't any typos, then "Create Policy".

Now your policy exists. We just need to tell AWS which users to apply it to. Check the checkbox next to your new policy in the list, then pull down the "Policy actions" button at the top and select "Attach". On the next page, check the group we created, then click the "Attach policy" button at the bottom right.

We're almost done. We just need to create a new IAM user that our servers can use at runtime.

First User

Still in IAM, click "Users" on the left, then "Add user". Give the user a name, e.g. "blogpostbucketuser", and choose the "Programmatic" access type. Click "Next". Now you can check our group to add the new user to it, then click "Next" again, and finally "Create new user".

You'll need the user's access key ID and secret access key to configure the servers. The page you're on now should have a "Download .csv" button. Just click that and save the downloaded file, which will have the username, access key ID, and secret access key in it.

If you accidentally mess up in downloading the credentials or lose them, you can't fetch them again. But you can just delete this user, and create a new one the same way. Luckily, this part is pretty quick.

Expected results:

  • The site can use the user's access key ID and secret access key to access the bucket
  • The site will be able to do anything with that bucket
  • The site will not be able to do anything outside that bucket

A quick test

Just to be sure the web access is working, go back to the S3 bucket in the AWS console, upload a file, then try to access it. For example, upload "foo.html" and then try to access http://danblogpostbucket.s3-website.us-east-2.amazonaws.com/foo.html (changing that to your own bucket's URL, of course).
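If you'd like to script this check instead of using a browser, here is a small sketch using only the Python standard library. The helper names website_url and fetch_status are made up for this example, and the separator argument exists because some regions put a dot before the region name in the website endpoint while others use a dash.

```python
from urllib.request import urlopen


def website_url(bucket, region, filename, separator="."):
    """Build the static-website URL for a file in the bucket."""
    return "http://%s.s3-website%s%s.amazonaws.com/%s" % (
        bucket, separator, region, filename)


def fetch_status(url):
    """Return the HTTP status code for url.

    urlopen raises urllib.error.HTTPError for responses such as 403 or 404,
    so a permissions problem shows up as an exception, not a return value.
    """
    with urlopen(url, timeout=10) as response:
        return response.status


url = website_url("danblogpostbucket", "us-east-2", "foo.html")
print(url)
# Call fetch_status(url) yourself once the file is uploaded; it needs network access.
```

If fetch_status raises a 403 error right after you've set things up, wait a few minutes and try again before changing any settings.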

S3 for Django static files

So, how do we tell Django we want to keep our files on S3?

The simplest case is just using S3 to serve your static files.

Before continuing, you should be familiar with managing static files, the staticfiles app, and deploying static files in Django.

Also, your templates should never hard-code the URL path of your static files. Use the static tag instead:

{% load static %}
<img src="{% static 'images/rooster.png' %}"/>

That will use whatever the appropriate method is to figure out the right URL for your static files.

(Previously, you had to use {% load static from staticfiles %}. In Django 1.11, that is no longer necessary; {% load static %} is sufficient.)

Finally, you might want to take a look at another blog post, Advanced Django File Handling, which goes into more detail about how Django static and media file handling works. We'll be using those features here.

Moving your static files to S3

In order for your static files to be served from S3 instead of your own server, you need to arrange for two things to happen:

  1. When you serve pages, any links in the pages to your static files should point at their location on S3 instead of your own server.
  2. Your static files are on S3 and accessible to the website's users.

Part 1 is easy if you've been careful not to hardcode static file paths in your templates. Just change STATICFILES_STORAGE in your settings. We'll show how to do that in a second.

But you still need to get your files onto S3, and keep them up to date. You could do that by running collectstatic locally, and using some standalone tool to sync the collected static files to S3 at each deploy. But that won't work for media files, so we might as well go ahead and set up the custom Django storage we'll need now, and then our collectstatic will copy the files up to S3 for us.

We're going to change the file storage class for static files to a new class, storages.backends.s3boto3.S3Boto3Storage, that will do that for us. Instead of STATIC_ROOT and STATIC_URL, it'll look at a group of settings starting with AWS_ to know how to write files to the storage, and how to make links to files there.

To start, install two Python packages: django-storages (yes, that's "storages" with an "S" on the end), and boto3:

$ pip install django-storages boto3

For future reference, here's the complete doc on django-storages S3 settings.

Add 'storages' to INSTALLED_APPS:

INSTALLED_APPS = (
    ...,
    'storages',
)

If you want, add this to your common settings (optional):

AWS_S3_OBJECT_PARAMETERS = {
    'Expires': 'Thu, 31 Dec 2099 20:00:00 GMT',
    'CacheControl': 'max-age=94608000',
}

This will tell boto3 that when it uploads files to S3, it should set properties on them so that when S3 serves them, it'll include some HTTP headers in the response. Those HTTP headers, in turn, will tell browsers that they can cache these files for a very long time.

This setting is not a literal list of HTTP response headers. The list of allowed values is here.
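In case the max-age number looks arbitrary: it is three 365-day years expressed in seconds. Here is one way you might spell that out in your settings so the intent is visible. The constant names are my own; only AWS_S3_OBJECT_PARAMETERS is a real django-storages setting.

```python
# Cache lifetime arithmetic: 3 years x 365 days x 24 hours x 60 minutes x 60 seconds.
SECONDS_PER_DAY = 24 * 60 * 60
THREE_YEARS_IN_SECONDS = 3 * 365 * SECONDS_PER_DAY

AWS_S3_OBJECT_PARAMETERS = {
    'Expires': 'Thu, 31 Dec 2099 20:00:00 GMT',
    'CacheControl': 'max-age=%d' % THREE_YEARS_IN_SECONDS,
}
```

This produces exactly the same value as the literal version above, so use whichever you find easier to read.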

Now, add this to your settings, changing the first four values as appropriate:

AWS_STORAGE_BUCKET_NAME = 'BUCKET_NAME'
AWS_S3_REGION_NAME = 'REGION_NAME'  # e.g. us-east-2
AWS_ACCESS_KEY_ID = 'xxxxxxxxxxxxxxxxxxxx'
AWS_SECRET_ACCESS_KEY = 'yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy'

# Tell django-storages the domain to use to refer to static files.
AWS_S3_CUSTOM_DOMAIN = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME

# Tell the staticfiles app to use S3Boto3 storage when writing the collected static files (when
# you run `collectstatic`).
STATICFILES_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'

Add all this to your settings, then customize the first four lines for your own S3 storage.
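You probably don't want real credentials committed to version control. One common approach, sketched here as a suggestion and not something the rest of this post depends on, is to read the values from environment variables. The environment variable names are my assumption (I've simply reused the setting names); use whatever your deployment provides.

```python
import os

# Fall back to the placeholder values when a variable is not set.
AWS_STORAGE_BUCKET_NAME = os.environ.get('AWS_STORAGE_BUCKET_NAME', 'BUCKET_NAME')
AWS_S3_REGION_NAME = os.environ.get('AWS_S3_REGION_NAME', 'REGION_NAME')

# Secrets default to empty strings so a missing variable fails loudly at upload time.
AWS_ACCESS_KEY_ID = os.environ.get('AWS_ACCESS_KEY_ID', '')
AWS_SECRET_ACCESS_KEY = os.environ.get('AWS_SECRET_ACCESS_KEY', '')

# Same derived setting as above, built from whichever bucket name was loaded.
AWS_S3_CUSTOM_DOMAIN = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME
```

The STATICFILES_STORAGE line stays exactly as shown above.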

Try it

With all of this set up, you should be able to upload your static files to S3 using collectstatic:

$ python manage.py collectstatic

If you see any errors, double check all the steps above.

Once that's successful, you should be able to start your test site and view some pages:

$ python manage.py runserver --nostatic

Look at the page source and you should see that the images, CSS, and JavaScript are being loaded from S3 instead of your own server. Any media files should still be served as before.

Don't put this into production quite yet, though. We still have some changes to make.

Configuring Django media to use S3

We might consider changing DEFAULT_FILE_STORAGE to storages.backends.s3boto3.S3Boto3Storage, which is the same class we used for static files. Django file storage classes provide a standard interface that both static files and media files can use, like this:

# DO NOT DO THIS!
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'

Adding that setting would indeed tell Django to save uploaded files to our S3 bucket, and use our S3 URL to link to them.

Unfortunately, this would store our media files on top of our static files, which we're already keeping in our S3 bucket. That could let users overwrite our static files, leaving us wide open to security problems.

What we want to do is either enforce always storing our static files and media files in different subdirectories of our bucket, or use two different buckets. I'll show how to use the different paths first.

In order for our STATICFILES_STORAGE to have different settings from our DEFAULT_FILE_STORAGE, they need to use two different storage classes; there's no way to configure anything more fine-grained in Django. So, we'll start by creating a custom storage class for our static file storage, by subclassing S3Boto3Storage. We'll also define a new setting, so we don't have to hardcode the path in our Python code.

For our example, we'll create a file custom_storages.py in our top-level project directory, that is, in the same directory as manage.py:

# custom_storages.py
from django.conf import settings
from storages.backends.s3boto3 import S3Boto3Storage

class StaticStorage(S3Boto3Storage):
    location = settings.STATICFILES_LOCATION

Then in our settings:

STATICFILES_LOCATION = 'static'
STATICFILES_STORAGE = 'custom_storages.StaticStorage'

STATICFILES_LOCATION is a new setting that we've created so that our new storage class can be configured separately from other storage classes in Django.

Giving our class a location attribute of 'static' will put all our files into paths on S3 starting with 'static/'.

You should be able to run collectstatic again, restart your site, and then all of your static files should have '/static/' in their URLs. Now delete from your S3 bucket any files outside of '/static' (using the S3 console, or whatever tool you like).

We can do something very similar now for media files, adding another storage class in custom_storages.py:

class MediaStorage(S3Boto3Storage):
    location = settings.MEDIAFILES_LOCATION

and in settings:

MEDIAFILES_LOCATION = 'media'
DEFAULT_FILE_STORAGE = 'custom_storages.MediaStorage'

Now when a user uploads their avatar, it should go into '/media/' in our S3 bucket. When we display the image on a page, the image URL will include '/media/'.
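To make the effect of the two location values concrete, here is a simplified model of how a file name turns into an S3 key and a URL. This is not the actual django-storages code, just an illustration; the function names are hypothetical, and I'm assuming links use https with the custom domain from our settings.

```python
import posixpath


def object_key(location, name):
    """Join a storage's location prefix and a file name into an S3 key."""
    return posixpath.join(location, name)


def object_url(custom_domain, location, name):
    """Build the public URL for a file, assuming https and a custom domain."""
    return "https://%s/%s" % (custom_domain, object_key(location, name))


domain = "danblogpostbucket.s3.amazonaws.com"

# A collected static file and a user upload land under different prefixes.
print(object_url(domain, "static", "images/rooster.png"))
print(object_url(domain, "media", "avatars/dan.png"))
```

Because every key written by one storage class starts with its own prefix, an upload named "images/rooster.png" ends up under "media/" and can't replace the static file of the same name.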

Using different buckets

You could use different buckets for static and media files by adding a bucket_name attribute to your custom storage classes. You can see the whole list of available attributes by looking at the source for S3Boto3Storage.

Moving an existing site's media to S3

If your site already has user-uploaded files in a local directory, you'll need to copy them up to your media directory on S3. There are lots of tools available for doing this. If the command line is your thing, try the AWS CLI tools from Amazon. They worked for me.
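If you'd rather stay in Python, something like the following sketch could do the copy with boto3, which we already installed. The function names are my own, and I'm assuming boto3 can find credentials in its usual places (environment variables or its config files). boto3's upload_file doesn't guess content types on its own, so the sketch passes one along when Python can work it out from the file name.

```python
import mimetypes
import os


def media_uploads(local_root, prefix="media"):
    """Yield (local_path, s3_key) pairs for every file under local_root."""
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for filename in filenames:
            local_path = os.path.join(dirpath, filename)
            relative = os.path.relpath(local_path, local_root)
            # S3 keys always use forward slashes, whatever the local OS uses.
            key = "/".join([prefix] + relative.split(os.sep))
            yield local_path, key


def upload_media(local_root, bucket_name, prefix="media"):
    """Upload every file under local_root to bucket_name beneath prefix."""
    import boto3  # imported here so media_uploads can be tried without boto3

    s3 = boto3.client("s3")
    for local_path, key in media_uploads(local_root, prefix):
        content_type, _encoding = mimetypes.guess_type(local_path)
        extra = {"ContentType": content_type} if content_type else None
        s3.upload_file(local_path, bucket_name, key, ExtraArgs=extra)
```

You could call media_uploads on its own first and print the pairs to check that the keys look right before uploading anything.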

Summary

Serving your static and media files from S3 requires getting several different parts working together. But it's worthwhile for a number of reasons:

  • S3 can probably serve your files more efficiently than your own server.
  • Using S3 saves the resources of your own server for more important work.
  • Having media files on S3 allows easier scaling by replicating your servers.
  • Once your files are on S3, you're well on the way to using CloudFront to serve them even more efficiently using Amazon's CDN service.

Now that you know how to use Amazon S3 to store static and media files with Django, find out how to get started with hosting Django sites on Amazon Elastic Beanstalk, or using AWS load balancing.
