Django and Hudson CI (Day 1)

March 8th, 2010 by Colin Copeland

We’re always looking for new tools to make our development environment more robust here at Caktus. We write a lot of tests to ensure proper functionality as new features land and bug fixes are added to our projects. The next step is to integrate with a continuous integration system to automate the process and regularly check the status of the build.

After attending Dr. C. Titus Brown’s “Why not run all your tests all the time? A study of continuous integration systems.” talk at PyCon and seeing Django’s Hudson setup, I figured I’d take a look at Hudson CI.

Installing Hudson and basic setup

Hudson is very easy to set up. I started with a fresh Ubuntu 9.10 install on the smallest Rackspace cloud instance and had it running after a few commands. I followed the Debian setup instructions, which basically consist of:

$ wget -O - http://hudson-ci.org/debian/hudson-ci.org.key | sudo apt-key add -
$ echo "deb http://hudson-ci.org/debian binary/" | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update
$ sudo aptitude install hudson
$ sudo apt-get upgrade

That’s it! It’s already up and running on port 8080 using its own web server. Go ahead and pull it up in your browser.

As a test, let’s set up django-crm (a Caktus open-source community project) as our first Hudson job. Click “New Job”, type in a job name, click “Build a free-style software project”, and hit OK. django-crm contains a sample project that we’ll use to run the test suite. On the job configuration page, check Subversion in the Source Code Management section and type in the Repository URL (http://django-crm.googlecode.com/svn/trunk/sample_project).

Click Save, run the job by clicking “Build Now”, and check out the Console Output:

Started by user anonymous
Checking out a fresh workspace because /var/lib/hudson/jobs/django-crm/workspace/sample_project doesn't exist
Checking out http://django-crm.googlecode.com/svn/trunk/sample_project
A         manage.py
A         site_media
A         site_media/css
A         site_media/css/jquery.autocomplete.css
A         site_media/css/django-contactinfo.css
A         site_media/js
A         site_media/js/jquery-ui-1.7.2.custom.min.js
A         site_media/js/jquery-1.3.2.min.js
A         site_media/js/django-crm.js
A         site_media/js/jquery.autocomplete.min.js
...
Finished: SUCCESS

Cool, now let’s run some tests. To keep things simple, let’s install Django from its release tarball and grab a few dependencies using aptitude:

$ wget http://www.djangoproject.com/download/1.1.1/tarball/
$ tar xzvf Django-1.1.1.tar.gz
$ cd Django-1.1.1
$ sudo python setup.py install
$ sudo aptitude install python-dev python-imaging python-setuptools python-pip

To run the tests, add an “Execute shell” build step in the Build section with this command:

#!/bin/bash -ex
cd sample_project
python manage.py test crm

Run the job again and look for the test results in the console output:

[workspace] $ /bin/sh -xe /tmp/hudson6670261053226891793.sh
+ cd sample_project
+ python manage.py test crm
...
Finished: SUCCESS

XML Test output

To integrate Hudson with the Django test suite, I used unittest-xml-reporting. Just “pip install unittest-xml-reporting” and add the following lines to your settings file:

TEST_RUNNER = 'xmlrunner.extra.djangotestrunner.run_tests'
TEST_OUTPUT_VERBOSE = True
TEST_OUTPUT_DESCRIPTIONS = True
TEST_OUTPUT_DIR = 'xmlrunner'

Then check “Publish JUnit test result report” in the Post-build Actions section and add the path to the test XML output “sample_project/xmlrunner/*.xml”:

Run the job and you should see a new “Test Result” link in the navigation. Now you can view the test results right in your browser window.
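If the “Test Result” link doesn’t show up, it can help to check the XML files by hand before blaming Hudson. Here’s a minimal sketch that reads one JUnit-style report with the standard library. The sample report and its test names are made up for illustration; real files written by unittest-xml-reporting may carry extra attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical report, shaped like the JUnit-style XML that Hudson reads.
SAMPLE_REPORT = """\
<testsuite name="crm.tests.ContactTest" tests="3" failures="1" errors="0" time="0.42">
  <testcase classname="crm.tests.ContactTest" name="test_create" time="0.10"/>
  <testcase classname="crm.tests.ContactTest" name="test_update" time="0.12"/>
  <testcase classname="crm.tests.ContactTest" name="test_delete" time="0.20">
    <failure type="AssertionError">1 != 2</failure>
  </testcase>
</testsuite>
"""


def summarize(xml_text):
    """Return (tests, failures, errors) for one <testsuite> document."""
    suite = ET.fromstring(xml_text)
    return tuple(int(suite.get(key, 0)) for key in ("tests", "failures", "errors"))


def failed_cases(xml_text):
    """Return the names of test cases that contain a <failure> or <error>."""
    suite = ET.fromstring(xml_text)
    return [
        case.get("name")
        for case in suite.iter("testcase")
        if case.find("failure") is not None or case.find("error") is not None
    ]


print(summarize(SAMPLE_REPORT))
print(failed_cases(SAMPLE_REPORT))
```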

Coverage

To add coverage reports, I used Ned Batchelder’s coverage.py (pip install coverage). Navigate to Hudson’s plugin manager (Hudson -> Manage Hudson -> Manage Plugins), install the Cobertura Plugin, and restart Hudson when prompted. Then modify your shell script like so:

#!/bin/bash -ex
cd sample_project
coverage run manage.py test crm
coverage xml --omit=/usr/

This will generate an XML coverage report in the working directory, so we just need to tell Hudson where to look for it. Check “Publish Cobertura Coverage Report” in the Post-build Actions section and enter the path to the report:

Run the build again and you should have access to a new “Coverage Report” link.
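The Cobertura-style report that “coverage xml” writes is plain XML, so you can also peek at the numbers yourself. The sketch below pulls the overall and per-file line rates out of a report. The sample document is a trimmed, made-up stand-in (the file names and rates are assumptions), not real output from django-crm.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down Cobertura-style report for illustration only.
SAMPLE_COVERAGE = """\
<coverage line-rate="0.8125" branch-rate="0" version="3.3" timestamp="0">
  <packages>
    <package name="crm" line-rate="0.8125" branch-rate="0" complexity="0">
      <classes>
        <class name="views" filename="crm/views.py" line-rate="0.75" branch-rate="0" complexity="0"/>
        <class name="models" filename="crm/models.py" line-rate="0.875" branch-rate="0" complexity="0"/>
      </classes>
    </package>
  </packages>
</coverage>
"""


def line_rates(xml_text):
    """Return (overall_rate, {filename: rate}) from a Cobertura-style report."""
    root = ET.fromstring(xml_text)
    overall = float(root.get("line-rate"))
    per_file = {
        cls.get("filename"): float(cls.get("line-rate"))
        for cls in root.iter("class")
    }
    return overall, per_file


overall, per_file = line_rates(SAMPLE_COVERAGE)
print("overall: {:.1%}".format(overall))
for filename, rate in sorted(per_file.items()):
    print("{}: {:.1%}".format(filename, rate))
```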

More to come…

This was just a simple example of getting Hudson set up with a Django project, and I know a lot more can be done with Hudson (check out the large number of available plugins). The top items on my to-do list are: have Hudson set up environments with virtualenv and pip, integrate more closely with the test suite (possibly using nose), check for PEP compliance, and set up build failure notifications. I hope to write more as I continue to set up our Hudson environment!

Caktus Sends Team of Five to PyCon 2010 in Atlanta

February 17th, 2010 by tobias

Python and Django are tools we use on a daily basis to build fantastic web apps here at Caktus. I’m pleased to announce that Caktus is sending five developers–Colin, Alex, Mike, Mark, and myself–to PyCon 2010! PyCon is an annual gathering for users and developers of the open source Python programming language. This year the US conference is being held in Atlanta, GA. We’ll be driving down tomorrow (Thursday) from Chapel Hill, NC and staying for the conference weekend plus one day of the sprints.

Hope to see you there!

Caktus Consulting Group hosts Django sprint in Triangle, NC area

December 6th, 2009 by tobias

Django is a tool we use every day to build rock-solid web apps here at Caktus, and a development sprint is a concerted, focused period of time in which developers meet in the same space to get things done on a project.

We’re proud to announce that Caktus is hosting a local Django development sprint in the Triangle (Raleigh, Durham, and Chapel Hill/Carrboro) area of North Carolina. The sprint will be held the weekend of December 12th and 13th at Carrboro Creative Coworking, and the purpose of this sprint will be to help finish features and push out bug fixes in preparation for the upcoming Django 1.2 release.

If you’re interested in attending, no previous experience contributing to Django is necessary and the sprint will be a great opportunity to start. Work on other open source Django-based projects is welcome too. For more information, check out the corresponding wiki page and don’t forget to register for the event.

We’ll be there to open the doors at 9am both days. Courtesy of our sponsors there will be free drinks, snacks, and lunch to go around. Hope to see you there!

Custom JOINs with Django’s query.join()

September 28th, 2009 by Colin Copeland

Django’s ORM is great. It handles simple to fairly complex queries right out of the box without having to write any SQL. If you need a complicated query, Django lets you use .extra(), and you can always fall back to raw SQL if need be, but then you lose the ORM’s bells and whistles. So it’s always nice to find solutions that allow you to tap into the ORM at different levels.

Recently, we were looking to perform a LEFT OUTER JOIN through a many-to-many relationship. For lack of a better example, let’s use a Contact model (crm_contact), which has many Phones (crm_phone):

from django.db import models

class Contact(models.Model):
    name = models.CharField(max_length=255)
    phones = models.ManyToManyField('Phone')
    addresses = models.ManyToManyField('Address')
 
class Phone(models.Model):
    number = models.CharField(max_length=16)

If we want to display each contact and corresponding phone numbers, looping through each contact in Contact.objects.all() and following the phones relationship will generate quite a few database queries (especially with a large contact table). select_related() doesn’t work in this scenario either, because it only supports Foreign Key relationships. We can use extra() to add a select parameter, but tables=['crm_phone'] will not generate a LEFT OUTER JOIN. We need to explicitly construct the JOIN.

DISCLAIMER: The following method does work, but should not be considered best practice. That is, there may be a better way to accomplish the same task (please comment if so!). But after sparse Google results for similar scenarios, I figured it’d at least be useful to post what we discovered.

After digging around in django.db.models.sql for a bit, we found BaseQuery.join in query.py. Among the possible arguments, the most important is connection, which is “a tuple (lhs, table, lhs_col, col) where ‘lhs’ is either an existing table alias or a table name. The join corresponds to the SQL equivalent of: lhs.lhs_col = table.col”. Further, the promote keyword argument will set the join type to be a LEFT OUTER JOIN.

Now we can explicitly set up the JOINs through crm_contact -> crm_contact_phones -> crm_phone:

contacts = Contact.objects.extra(
    select={'phone': 'crm_phone.number'}
).order_by('name')
 
# set up initial FROM clause
# OR contacts.query.get_initial_alias()
contacts.query.join((None, 'crm_contact', None, None))
 
# join to crm_contact_phones
connection = (
    'crm_contact',
    'crm_contact_phones',
    'id',
    'contact_id',
)
contacts.query.join(connection, promote=True)
 
# join to crm_phone
connection = (
    'crm_contact_phones',
    'crm_phone',
    'phone_id',
    'id',
)
contacts.query.join(connection, promote=True)

It’s a little verbose, but it accomplishes our goal. I used hardcoded table names/columns in the connection tuple to make it easier to follow, but we can also extract this information from the objects themselves:

contacts = Contact.objects.extra(
    select={'phone': 'crm_phone.number'}
).order_by('name')
 
# set up initial FROM clause
# OR contacts.query.get_initial_alias()
contacts.query.join((None, Contact._meta.db_table, None, None))
 
# join to crm_contact_phones
connection = (
    Contact._meta.db_table, # crm_contact
    Contact.phones.field.m2m_db_table(), # crm_contact_phones
    Contact._meta.pk.column, # etc...
    Contact.phones.field.m2m_column_name(),
)
contacts.query.join(connection, promote=True)
 
# join to crm_phone
connection = (
    Contact.phones.field.m2m_db_table(),
    Phone._meta.db_table,
    Contact.phones.field.m2m_reverse_name(),
    Phone._meta.pk.column,
)
contacts.query.join(connection, promote=True)

This results in one row per contact/phone pair, so a contact with several phone numbers appears several times, but we can print out each contact and corresponding phone numbers (with a single SQL statement) quickly in a template using {% ifchanged %}:

<h1>Contacts</h1>
 
{% for contact in contacts %}
    {% ifchanged contact.name %}
        <h2>{{ contact.name }}</h2>
    {% endifchanged %}
    <p>Phone: {{ contact.phone }}</p>
{% endfor %}
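To make the shape of that result set concrete without a Django project, here’s a rough sketch that runs the equivalent SQL against an in-memory SQLite database. The table and column names follow the example above; the sample contacts and numbers are made up, and the hand-written SQL is my approximation of what the promoted joins produce, not output captured from Django.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crm_contact (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE crm_phone (id INTEGER PRIMARY KEY, number TEXT);
    CREATE TABLE crm_contact_phones (
        id INTEGER PRIMARY KEY, contact_id INTEGER, phone_id INTEGER
    );
    -- Hypothetical data: Alice has two phones, Bob has none.
    INSERT INTO crm_contact VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO crm_phone VALUES (1, '555-0100'), (2, '555-0101');
    INSERT INTO crm_contact_phones VALUES (1, 1, 1), (2, 1, 2);
""")

# crm_contact -> crm_contact_phones -> crm_phone, both joins LEFT OUTER.
rows = conn.execute("""
    SELECT crm_contact.name, crm_phone.number
    FROM crm_contact
    LEFT OUTER JOIN crm_contact_phones
        ON crm_contact.id = crm_contact_phones.contact_id
    LEFT OUTER JOIN crm_phone
        ON crm_contact_phones.phone_id = crm_phone.id
    ORDER BY crm_contact.name, crm_phone.number
""").fetchall()

# Collapse repeated contacts, much like {% ifchanged %} does in the template.
grouped = {
    name: [number for _, number in group if number is not None]
    for name, group in groupby(rows, key=lambda row: row[0])
}

for name, numbers in grouped.items():
    print(name, numbers)
```

Bob still shows up, with an empty list, because the outer joins keep contacts that have no phone rows.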

Web Developer for Hire

September 23rd, 2009 by Colin Copeland

We’re pleased to announce that Caktus is looking for a developer to join our team on a contract basis!

What do we do? We build custom web applications for local and remote clients using a variety of open-source technologies. We are a small team founded in the Chapel Hill/Carrboro area (currently residing in Carrboro Creative Coworking) who believe in face-to-face contact and employ agile development techniques that emphasize teamwork and collaboration.

We’re looking for a strong software developer who enjoys working on a team and is excited to learn and experiment with new technologies. We do have a preference for local candidates, but will consider all submissions. Initial work will focus on maintaining small Django-powered websites. This will involve HTML/CSS (including converting Photoshop designs), Django Templates, and writing Unit Tests. Later work will involve creating and integrating Django apps into larger projects, deployment, and database work.

You will be working in Linux (Debian-flavor) production environments with Apache and WSGI. Python/Django experience is not required, but will be used on a daily basis. Relational database experience is a must. HTML/CSS and JavaScript experience are also a must, and jQuery is a plus.

If you’re interested in this position, please send us your resume, some example code, links to any open-source projects you’ve contributed to, and expected compensation. We’re excited to bring on a new team member!

Open Source Django Projects from Caktus Consulting Group

September 7th, 2009 by tobias

At Caktus we’re big fans of reusing code. We leverage many open source projects–especially Django apps–to accomplish a variety of tasks. In addition, we’ve written quite a few pluggable apps over the past two years that we reuse over and over again for different projects. As a way of giving back to the community, we’ve polished and released a portion of that code as open source ourselves. While some of the projects have been available on Google Code for a while now, we just put together a consolidated list of open source Django projects on our web site to serve as a jumping off point for all the projects we like, we contributed to, and we created. Enjoy!

Caktus Consulting Group, LLC sponsors DjangoCon 2009

September 5th, 2009 by tobias

Django is a tool we use on a daily basis to build fantastic web apps here at Caktus, and DjangoCon is the annual conference for Django developers and other community members. We are proud to announce that Caktus Consulting Group, LLC is sponsoring DjangoCon 2009!

This year, the conference is being held the week of September 7th in the beautiful city of Portland, Oregon. Two Caktus partners, Colin and myself, will be attending. We hope to see you there!

Creating recursive, symmetrical many-to-many relationships in Django

August 14th, 2009 by tobias

In Django, a recursive many-to-many relationship is a ManyToManyField that points to the same model in which it’s defined ('self'). A symmetrical relationship is one in which, when a.contacts = [b], a is also in b.contacts.

In changeset 8136, support for through models was added to the Django core. This allows you to create a many-to-many relationship that goes through a model of your choice:

from django.db import models

class Contact(models.Model):
    contacts = models.ManyToManyField(
        'self',
        through='ContactRelationship',
        symmetrical=False,
    )
 
 
class ContactRelationship(models.Model):
    types = models.ManyToManyField(
        'RelationshipType',
        related_name='contact_relationships',
        blank=True,
    )
    from_contact = models.ForeignKey('Contact', related_name='from_contacts')
    to_contact = models.ForeignKey('Contact', related_name='to_contacts')
 
    class Meta:
        unique_together = ('from_contact', 'to_contact')

According to the Django docs, you must set symmetrical=False for recursive many-to-many relationships that use a through model. Sometimes–for a recent case in django-crm, for example–what you really want is a symmetrical, recursive many-to-many relationship.

The trick to getting this working is understanding what symmetrical=True actually does. From what we can tell after a brief look through the Django core, symmetrical=True is simply a utility that (a) creates a second, reverse relationship in the many-to-many table, and (b) hides the field in the related model (in this case the same model) from use by appending a ‘+’ to its name.

Since you normally have to create many-to-many relationships manually when a through model is specified, the solution is simply to leave symmetrical=False (otherwise it’ll raise an exception) and create the reverse relationship manually yourself via the through model:

crm.ContactRelationship.objects.create(
    from_contact=contact_a,
    to_contact=contact_b,
)
crm.ContactRelationship.objects.create(
    from_contact=contact_b,
    to_contact=contact_a,
)

Additionally, you’ll have to do a little cleanup to make sure both sides of the relationship are removed when one is removed, but otherwise this should achieve the same effect as setting symmetrical=True in other many-to-many relationships.
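The bookkeeping is easier to see without the ORM in the way. Below is a plain-Python sketch where a set of (from_contact, to_contact) pairs stands in for the ContactRelationship table. The class and method names are hypothetical and not part of django-crm; in a real project the same two-sided create and delete would go through the through model’s manager.

```python
class RelationshipTable:
    """In-memory stand-in for the ContactRelationship through table."""

    def __init__(self):
        # Each row is a (from_contact, to_contact) pair; using a set mirrors
        # the unique_together constraint on the through model.
        self.rows = set()

    def relate(self, a, b):
        """Create both directions, like the two create() calls above."""
        self.rows.add((a, b))
        self.rows.add((b, a))

    def unrelate(self, a, b):
        """Remove both directions so neither side is left dangling."""
        self.rows.discard((a, b))
        self.rows.discard((b, a))

    def contacts_of(self, contact):
        """Return everyone the given contact points to."""
        return sorted(to for frm, to in self.rows if frm == contact)


table = RelationshipTable()
table.relate("contact_a", "contact_b")
print(table.contacts_of("contact_a"), table.contacts_of("contact_b"))

# Removing from either side cleans up the mirror row as well.
table.unrelate("contact_b", "contact_a")
print(table.rows)
```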

To hide the other side of the related manager, you can append a ‘+’ to the related_name, like so:

class Contact(models.Model):
    contacts = models.ManyToManyField(
        'self',
        through='ContactRelationship',
        symmetrical=False,
        related_name='related_contacts+',
    )

Good luck and feel free to comment with any questions!

Setting PostgreSQL’s SHMMAX in Mac OS X 10.5 (Leopard)

August 13th, 2009 by Colin Copeland

If you’ve ever tried to increase the shared_buffers setting in your postgresql.conf to a value that exceeds the amount of shared memory supported by your operating system kernel, then you’ll see an error message like this:

copelco@montgomery:~$ /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data
2009-07-10 10:14:04 EDTFATAL:  could not create shared memory segment: Invalid argument
2009-07-10 10:14:04 EDTDETAIL:  Failed system call was shmget(key=5432001, size=142516224, 03600).
2009-07-10 10:14:04 EDTHINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter.  You can either reduce the request size or reconfigure the kernel with larger SHMMAX.  To reduce the request size (currently 142516224 bytes), reduce PostgreSQL's shared_buffers parameter (currently 16384) and/or its max_connections parameter (currently 23).
	If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.
	The PostgreSQL documentation contains more information about shared memory configuration.

The shared_buffers default value is low (for legacy reasons). If you increase it, PostgreSQL may request a shared memory segment that exceeds your kernel’s SHMMAX parameter. You can see the current values like so:

copelco@montgomery:~$ sysctl kern.sysv.shmmax
kern.sysv.shmmax: 4194304
copelco@montgomery:~$ sysctl kern.sysv.shmall
kern.sysv.shmall: 1024

Section 17.4, “Managing Kernel Resources”, of the PostgreSQL documentation outlines methods to set the values permanently, but you can play around with the values temporarily (until restart) on the command line like so:

copelco@montgomery:~$ sudo sysctl -w kern.sysv.shmmax=1073741824
kern.sysv.shmmax: 4194304 -> 1073741824
copelco@montgomery:~$ sudo sysctl -w kern.sysv.shmall=1073741824
kern.sysv.shmall: 1024 -> 1073741824
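For a feel of how much headroom you actually need, the numbers in the error message can be worked through directly. This is only back-of-the-envelope arithmetic, and it rests on two assumptions: PostgreSQL’s default 8 kB block size, and shmall being counted in 4 kB pages, which is how the PostgreSQL documentation describes it for Mac OS X.

```python
BLOCK_SIZE = 8192  # assumed: PostgreSQL's default block size, in bytes
PAGE_SIZE = 4096   # assumed: shmall is measured in 4 kB pages

shared_buffers = 16384   # "shared_buffers parameter (currently 16384)"
requested = 142516224    # size passed to shmget() in the DETAIL line
shmmax = 4194304         # kern.sysv.shmmax before the change
shmall_pages = 1024      # kern.sysv.shmall before the change

# Memory used by the buffer pool alone, and what PostgreSQL adds on top.
buffer_bytes = shared_buffers * BLOCK_SIZE
overhead = requested - buffer_bytes

# The request fails because it is larger than the biggest allowed segment.
fits = requested <= shmmax

# Smallest shmall (in pages) that would cover the request, rounded up.
min_shmall_pages = -(-requested // PAGE_SIZE)

print("buffer pool: {} bytes".format(buffer_bytes))
print("overhead:    {} bytes".format(overhead))
print("fits in shmmax: {}".format(fits))
print("old shmall covers {} bytes".format(shmall_pages * PAGE_SIZE))
print("shmall needs at least {} pages".format(min_shmall_pages))
```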

Once you have working values, you can fire up PostgreSQL (I’ve been happy with the kyngchaos distribution) with a LaunchDaemon file and launchd:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.postgresql.postgres</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/pgsql/bin/postmaster</string>
        <string>-D</string>
        <string>/usr/local/pgsql/data</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>UserName</key>
    <string>copelco</string>
</dict>
</plist>

And the launchd commands:

copelco@montgomery:~$ sudo launchctl unload /Library/LaunchDaemons/org.postgresql.postgres.plist
copelco@montgomery:~$ sudo launchctl load /Library/LaunchDaemons/org.postgresql.postgres.plist

Towards a Standard for Django Session Messages

June 19th, 2009 by tobias

Django needs a standard way in which session-specific messages can be created and retrieved for display to the user. For years we’ve been surviving using user.message_set to store messages that are really specific to the current session, not the user, or using the latest and greatest Django snippet, pluggable app, or custom crafted middleware to handle messages in a more appropriate way.

While this has been discussed at length in Ticket #4604 as well as on Django Snippets, here are a few reasons that user.message_set is the wrong implementation:

  • No message_set exists for AnonymousUsers in Django, so you can’t display any messages to them.
  • What happens when the same user is logged in from two different browsers and completing two different tasks, simultaneously? When using user.message_set to store feedback for the user, the messages will be distributed on a first come first served basis, with no regard for what session actually generated what feedback. For this reason it’s bad to get in the habit of using user.message_set for messages like “Article updated successfully,” or other messages that really have no context outside the current session.

I’ve outlined a few characteristics below that I believe would make up a solid session messaging contrib app. Please feel free to comment if I missed anything, or if you’ve got beef with any of my points. This is in many ways a work in progress, so I’ll update it as often as I can.

  • Standards. The implementation ought to make it clear how multiple messages are to be stored and retrieved for display to the user. Maybe you need to push multiple messages onto the stack from a single view, or your app performs multiple redirects through different views.
  • Persistence. In the case where your app redirects through multiple views, it’s not acceptable for session messages to disappear. The implementation needs to provide facilities for determining whether or not the messages were actually displayed, and delay purging the message list if necessary.
  • Flexibility. Support the case where a large number of independent, pluggable apps do messaging in the same project (sometimes for the same request), but don’t require it. Display all the messages created by all the apps, but don’t break (or lose messages) if one of the apps doesn’t happen to use the messaging implementation.
  • Efficiency. Avoid storing messages in the database (or another persistent store) if possible. While memcache can serve as a session backend, it isn’t always available. One potential implementation would be to store shorter messages directly in a cookie, but provide a fallback to session-based storage for longer messages.

Here’s the implementation we use at Caktus. It’s far from complete, but it does address some of these points. This code is based on a number of snippets as well as attachments to the above referenced ticket. It could be improved by purging each message independently when it is actually retrieved and adding facilities for cookie-based storage. While I haven’t used it yet, django-notify looks a lot better than this and I’m excited about trying it out.

from django.utils.encoding import StrAndUnicode
from django.contrib.sessions.backends.base import SessionBase
 
MESSAGES_NAME = '_messages'
 
SessionBase.get_messages = lambda self: self[MESSAGES_NAME]
 
def _session_get_and_delete_messages(self):
    messages = self.pop(MESSAGES_NAME, [])
    self[MESSAGES_NAME] = []
    return messages
SessionBase.get_and_delete_messages = \
  _session_get_and_delete_messages
 
def _session_create_message(self, message):
    self[MESSAGES_NAME].append(message)
    self.modified = True
SessionBase.create_message = _session_create_message
 
class SessionMessagesMiddleware(object):
    """
    To store messages or other user feedback in the session, add this
    class to your middleware.
 
    In your views, call request.session.create_message('the message') to
    add a message to the session.
 
    In your template(s), do this:
 
        {% if request.messages %}
            {% for message in request.messages %}<li>{{ message|escape }}</li>{% endfor %}
        {% endif %}
 
    Messages will NOT be erased from the session if you never access request.messages.
    """
 
    class LazyMessages(StrAndUnicode):
        """
        A lazy proxy for session messages.
        """
        def __init__(self, session):
            self.session = session
            super(SessionMessagesMiddleware.LazyMessages, self).__init__()
 
        def __iter__(self):
            return iter(self.messages)
 
        def __len__(self):
            return len(self.messages)
 
        def __nonzero__(self):
            return bool(self.messages)
 
        def __unicode__(self):
            return unicode(self.messages)
 
        def __getitem__(self, *args, **kwargs):
            return self.messages.__getitem__(*args, **kwargs)
 
        def _get_messages(self):
            if not hasattr(self, '_messages'):
                self._messages = self.session.get_and_delete_messages()
            return self._messages
        messages = property(_get_messages)
 
    def process_request(self, request):
        if not hasattr(request, 'session'):
            raise AttributeError('Request has no attribute "session".  Make sure session middleware is running before SessionMessages middleware.')
 
        if MESSAGES_NAME not in request.session:
            request.session[MESSAGES_NAME] = []
 
        request.messages = \
          SessionMessagesMiddleware.LazyMessages(request.session)
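The key behavior here is that messages are only purged when something actually reads them. The sketch below shows that idea on its own, with a plain dict standing in for request.session and no Django involved. It is a simplified illustration written for current Python, not a drop-in replacement for the middleware above.

```python
MESSAGES_NAME = "_messages"


class LazyMessages:
    """Pops messages out of the session only on first access."""

    def __init__(self, session):
        self.session = session

    @property
    def messages(self):
        # The first read empties the stored list and caches the result;
        # later reads on the same object reuse the cached copy.
        if not hasattr(self, "_messages"):
            self._messages = self.session.pop(MESSAGES_NAME, [])
            self.session[MESSAGES_NAME] = []
        return self._messages

    def __iter__(self):
        return iter(self.messages)

    def __len__(self):
        return len(self.messages)


# A plain dict stands in for request.session in this sketch.
session = {MESSAGES_NAME: []}
session[MESSAGES_NAME].append("Article updated successfully")

# Request 1: a redirect, so the template never touches the messages.
untouched = LazyMessages(session)
still_stored = list(session[MESSAGES_NAME])

# Request 2: the template iterates over the messages and they are purged.
displayed = LazyMessages(session)
shown = list(displayed)
remaining = session[MESSAGES_NAME]

print(still_stored, shown, remaining)
```

Because the first wrapper is never read, the message survives the redirect and is shown, then purged, on the following request.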