Custom JOINs with Django's query.join()

Django's ORM is great. It handles simple to fairly complex queries right out the box without having to write any SQL. If you need a complicated query, Django's lets you use .extra(), and you can always fallback to raw SQL if need be, but then you lose the ORM's bells and whistles. So it's always nice to find solutions that allow you to tap into the ORM at different levels.

Recently, we were looking to perform a LEFT OUTER JOIN through a Many to Many relationship. For a lack of a better example, let's use a Contact model (crm_contact), which has many Phones (crm_phones):

class Contact(models.Model):
    name = models.CharField(max_length=255)
    phones = models.ManyToManyField('Phone')
    addresses = models.ManyToManyField('Address')

class Phone(models.Model):
    number = models.CharField(max_length=16)

If we want to display each contact and corresponding phone numbers, looping through each contact in Contact.objects.all() and following the phones relationship will generate quite a few database queries (especially with a large contact table). select_related() doesn't work in this scenario either, because it only supports Foreign Key relationships. We can use extra() to add a select parameter, but tables=['crm_phones'] will not generate a LEFT OUTER join type. We need to explicitly construct the JOIN.

DISCLAIMER: The following method does work, but should not be considered best practice. That is, there may be a better way to accomplish the same task (please comment if so!). But after sparse Google results for similar scenarios, I figured it'd at least be useful to post what we discovered.

After digging around in django.db.models.sql for a bit, we found BaseQuery.join in query.py. Among the possible arguments, the most important is connection, which is "a tuple (lhs, table, lhs_col, col) where 'lhs' is either an existing table alias or a table name. The join corresponds to the SQL equivalent of: lhs.lhs_col = table.col". Further, the promote keyword argument will set the join type to be a LEFT OUTER JOIN.

Now we can explicitly setup the JOINs through crm_contact -> crm_contact_phones -> crm_phone:

contacts = Contact.objects.extra(
    select={'phone': 'crm_phone.number'}
).order_by('name')

# setup intial FROM clause
# OR contacts.query.get_initial_alias()
contacts.query.join((None, 'crm_contact', None, None))

# join to crm_contact_phones
connection = (
    'crm_contact',
    'crm_contact_phones',
    'id',
    'contact_id',
)
contacts.query.join(connection, promote=True)

# join to crm_phone
connection = (
    'crm_contact_phones',
    'crm_phone',
    'phone_id',
    'id',
)
contacts.query.join(connection, promote=True)

It's a little verbose, but it accomplishes our goal. I used hardcoded table names/columns in the connection tuple to make it easier to follow, but we can also extract this information from the objects themselves:

contacts = Contact.objects.extra(
    select={'phone': 'crm_phone.number'}
).order_by('name')

# setup intial FROM clause
# OR contacts.query.get_initial_alias()
contacts.query.join((None, Contact._meta.db_table, None, None))

# join to crm_contact_phones
connection = (
    Contact._meta.db_table, # crm_contact
    Contact.phones.field.m2m_db_table(), # crm_contact_phones
    Contact._meta.pk.column, # etc...
    Contact.phones.field.m2m_column_name(),
)
contacts.query.join(connection, promote=True)

# join to crm_phone
connection = (
    Contact.phones.field.m2m_db_table(),
    Phone._meta.db_table,
    Contact.phones.field.m2m_reverse_name(),
    Phone._meta.pk.column,
)
contacts.query.join(connection, promote=True)

This results in a row for each phone number (Cartesian product), but we can print out each contact and corresponding phone numbers (with a single SQL statement) quickly in a template using {% ifchanged %}:

<h1>Contacts</h1>

<p>{% for contact in contacts %} {% ifchanged contact.name %}</p>

<h2>{{ contact.name }}</h2>

<p>{% endifchanged %}</p>

<p>Phone: {{ contact.phone }}</p>

<p>{% endfor %}</p>
New Call-to-action
blog comments powered by Disqus
Times
Check

Success!

Times

You're already subscribed

Times