May 26 2009 by Tobias McNulty
At Caktus, we rely heavily on automated testing for web app development. We create tests for all the code we write, ideally before the code is written. We create tests for every bug we find and, resources permitting, ramp up the test suite with lots of random input and boundary testing.
Debugging concurrency issues or race conditions has long been a nightmare. There are only so many times you can double-click the link in your web app that is generating some bizarre failure.
Using the Django test client, I created a little decorator that you can use in your unit tests to make sure a view doesn't blow up when it's called multiple times with the same arguments. If it does blow up, and you happen to be using PostgreSQL, chances are you can fix the issues by using Colin's previously posted require_lock decorator.
Here's the decorator for testing concurrency:
def test_concurrently(times):
    """
    Add this decorator to small pieces of code that you want to test
    concurrently to make sure they don't raise exceptions when run at the
    same time.  E.g., some Django views that do a SELECT and then a subsequent
    INSERT might fail when the INSERT assumes that the data has not changed
    since the SELECT.
    """
    def test_concurrently_decorator(test_func):
        def wrapper(*args, **kwargs):
            exceptions = []
            import threading
            def call_test_func():
                try:
                    test_func(*args, **kwargs)
                except Exception as e:
                    # record the failure so the calling thread can report it
                    exceptions.append(e)
                    raise
            threads = []
            for i in range(times):
                threads.append(threading.Thread(target=call_test_func))
            # start every thread before joining any, so the calls overlap
            for t in threads:
                t.start()
            for t in threads:
                t.join()
            if exceptions:
                raise Exception('test_concurrently intercepted %s exceptions: %s' % (len(exceptions), exceptions))
        return wrapper
    return test_concurrently_decorator
To use this in a test, create a small function that includes the thread-safe code inside your test. Apply the decorator, passing the number of times you want to run the code simultaneously, and then call the function:
class MyTestCase(TestCase):
    def testRegistrationThreaded(self):
        url = reverse('toggle_registration')
        @test_concurrently(15)
        def toggle_registration():
            # perform the code you want to test here; it must be thread-safe
            # (e.g., each thread must have its own Django test client)
            c = Client()
            c.login(username='user@example.com', password='abc123')
            response = c.get(url)
        toggle_registration()
May 26 2009 by Colin Copeland
By default, Django doesn't do explicit table locking. This is OK for most read-heavy scenarios, but sometimes you need guaranteed, exclusive access to the data. Caktus uses PostgreSQL in most of our production environments, so we can use the various lock modes it provides to control concurrent access to the data. Once we obtain a lock in PostgreSQL, it is held for the remainder of the current transaction. Django provides transaction management, so all we need to do is execute a SQL LOCK statement within a transaction, and Django and PostgreSQL will handle the rest.
Below is an example decorator we came up with to provide easy table-locking access in Django:
from django.db import transaction

LOCK_MODES = (
    'ACCESS SHARE',
    'ROW SHARE',
    'ROW EXCLUSIVE',
    'SHARE UPDATE EXCLUSIVE',
    'SHARE',
    'SHARE ROW EXCLUSIVE',
    'EXCLUSIVE',
    'ACCESS EXCLUSIVE',
)

def require_lock(model, lock):
    """
    Decorator for PostgreSQL's table-level lock functionality

    Example:
        @transaction.commit_on_success
        @require_lock(MyModel, 'ACCESS EXCLUSIVE')
        def myview(request):
            ...

    PostgreSQL's LOCK Documentation:
    http://www.postgresql.org/docs/8.3/interactive/sql-lock.html
    """
    def require_lock_decorator(view_func):
        def wrapper(*args, **kwargs):
            if lock not in LOCK_MODES:
                raise ValueError('%s is not a PostgreSQL supported lock mode.' % lock)
            from django.db import connection
            cursor = connection.cursor()
            # the lock is held until the surrounding transaction ends
            cursor.execute(
                'LOCK TABLE %s IN %s MODE' % (model._meta.db_table, lock)
            )
            return view_func(*args, **kwargs)
        return wrapper
    return require_lock_decorator
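The statement the decorator sends can be checked without a database connection. Here is a minimal sketch that pulls the mode check and the string building into one function; `build_lock_sql` and the table name `myapp_mymodel` are hypothetical names used only for illustration, not part of the decorator above.

```python
LOCK_MODES = (
    'ACCESS SHARE',
    'ROW SHARE',
    'ROW EXCLUSIVE',
    'SHARE UPDATE EXCLUSIVE',
    'SHARE',
    'SHARE ROW EXCLUSIVE',
    'EXCLUSIVE',
    'ACCESS EXCLUSIVE',
)


def build_lock_sql(table_name, lock):
    """Return the LOCK TABLE statement, or raise ValueError for an unknown mode."""
    if lock not in LOCK_MODES:
        raise ValueError('%s is not a PostgreSQL supported lock mode.' % lock)
    # table_name is assumed to come from trusted code (e.g. model._meta.db_table),
    # never from user input, since it is interpolated directly into the SQL
    return 'LOCK TABLE %s IN %s MODE' % (table_name, lock)


print(build_lock_sql('myapp_mymodel', 'ACCESS EXCLUSIVE'))
```

A misspelled mode such as `'EXCLUSIVE ACCESS'` raises `ValueError` before any SQL is built.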
This is, by no means, a perfect solution. Feel free to comment below.
May 26 2009 by Tobias McNulty
There's currently no way to accept microsecond-precision input through a Django form's DateTimeField. This is an acknowledged bug, but the official solution might not come very soon, because the real fix is non-trivial.
In the meantime, here's one approach that will work in most cases:
from django import forms

class DateTimeWithUsecsField(forms.DateTimeField):
    def clean(self, value):
        if value and '.' in value:
            value, usecs = value.rsplit('.', 1) # rsplit in case '.' is used elsewhere
            usecs += '0'*(6-len(usecs)) # right pad with zeros if necessary
            try:
                usecs = int(usecs)
            except ValueError:
                raise forms.ValidationError('Microseconds must be an integer')
        else:
            usecs = 0
        # let the stock field validate everything before the '.'
        cleaned_value = super(DateTimeWithUsecsField, self).clean(value)
        if cleaned_value:
            cleaned_value = cleaned_value.replace(microsecond=usecs)
        return cleaned_value
To use this in a model form, you can override the field like so:
class MyForm(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        super(MyForm, self).__init__(*args, **kwargs)
        self.fields['date'] = DateTimeWithUsecsField()
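The split-and-pad step is easy to try outside of Django. Below is a minimal sketch using only the standard library; `split_usecs` and `parse_with_usecs` are hypothetical helper names, and the fixed `'%Y-%m-%d %H:%M:%S'` format is an assumption (the form field accepts whatever its input formats allow). The sketch also rejects more than six fractional digits, which the field above does not check.

```python
from datetime import datetime


def split_usecs(value):
    """Split 'YYYY-MM-DD HH:MM:SS.ffffff' into (base_string, microseconds)."""
    if value and '.' in value:
        value, usecs = value.rsplit('.', 1)
        if not usecs.isdigit() or len(usecs) > 6:
            raise ValueError('Microseconds must be an integer of at most 6 digits')
        usecs = int(usecs.ljust(6, '0'))  # right pad: '5' becomes 500000
    else:
        usecs = 0
    return value, usecs


def parse_with_usecs(value, fmt='%Y-%m-%d %H:%M:%S'):
    """Parse the part before the '.' normally, then attach the microseconds."""
    base, usecs = split_usecs(value)
    return datetime.strptime(base, fmt).replace(microsecond=usecs)


print(parse_with_usecs('2009-05-26 12:30:45.5'))
```

Right padding matters because `.5` means half a second (500000 microseconds), not 5 microseconds.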
May 25 2009 by Tobias McNulty
In preparation for migrating the EveryWatt database from one machine to another, I wrote this little WSGI script to easily disable the site while I copy the data. Since it doesn't depend on Django or really anything else (other than a functioning WSGI server), you can use it for other upgrades, too.
This is useful for preventing updates to the database while you, for example, dump the database on one machine and load it on another. With everything else already in place on either side, the user should only see the "Upgrade in progress" message for a few minutes.
Since EveryWatt includes a number of data logger clients that upload utility meter readings to the site through its Open API, I wanted to make sure any POST attempts received a temporary failure message (the data logger will store the data and retry the POST every minute)--hence the 405 Method Not Allowed for all non-GET requests.
Here's the script:
import os
import sys

UPGRADING = False

# Calculate the project path based on the location of the WSGI script.
project_dir = os.path.dirname(__file__)
sys.path.append(project_dir)

def upgrade_in_progress(environ, start_response):
    upgrade_file = os.path.join(project_dir, 'media', 'html', 'upgrade.html')
    if os.path.exists(upgrade_file):
        response_headers = [('Content-type','text/html')]
        response = open(upgrade_file).read()
    else:
        response_headers = [('Content-type','text/plain')]
        response = 'Application upgrade in progress...please check back soon.'
    if environ['REQUEST_METHOD'] == 'GET':
        status = '503 Service Unavailable'
    else:
        status = '405 Method Not Allowed'
    start_response(status, response_headers)
    return [response]

if UPGRADING:
    application = upgrade_in_progress
else:
    os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'
    import django.core.handlers.wsgi
    application = django.core.handlers.wsgi.WSGIHandler()
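Before flipping UPGRADING on a live site, it can be reassuring to call the handler directly and look at the status codes. Here is a minimal sketch in Python 3 syntax using the standard library's `wsgiref` helpers; the trimmed-down `upgrade_in_progress` below always returns the plain-text message (as bytes, which Python 3 WSGI servers expect), and `call_app` is a hypothetical helper, not part of the script above.

```python
from wsgiref.util import setup_testing_defaults

MESSAGE = b'Application upgrade in progress...please check back soon.'


def upgrade_in_progress(environ, start_response):
    # Same status logic as the script: 503 for GET, 405 for everything else
    if environ['REQUEST_METHOD'] == 'GET':
        status = '503 Service Unavailable'
    else:
        status = '405 Method Not Allowed'
    start_response(status, [('Content-type', 'text/plain')])
    return [MESSAGE]


def call_app(app, method):
    """Call a WSGI app with a fake request and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ['REQUEST_METHOD'] = method
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status

    body = b''.join(app(environ, start_response))
    return captured['status'], body


for method in ('GET', 'POST'):
    print(method, call_app(upgrade_in_progress, method)[0])
```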
And in case you need it, here's one way to dump a PostgreSQL database on one machine while you load it on another, to be run on the new host, as the database superuser:
Good luck and please post your questions/comments.
May 21 2009 by Tobias McNulty
I finally got around to updating my Eclipse, PyDev, and Subclipse environment today, which I use for Django development.
Formerly I was using the SvnKit (pure-Java) libraries. SvnKit "felt" slow to me, compared to my command line SVN client, so this time I tried to get the JavaHL (JNI) libraries working.
For the record, I'm using Ubuntu (jaunty) with Eclipse 3.4 (Ganymede). This version of Ubuntu comes with Subversion 1.5, so I needed to install Subclipse 1.4. See:
http://subclipse.tigris.org/servlets/ProjectProcess?pageID=p4wYuA
I installed everything through the Eclipse update manager (minus SvnKit), but JavaHL didn't show up under Preferences -> Team -> SVN. The error message was: JavaHL (JNI) not available.
I had installed Eclipse manually (not through apt-get), so the solution was to install the JavaHL libraries:
apt-get install libsvn-java
and add the following line to my eclipse.ini (usually in the top-level eclipse directory):
-Djava.library.path=/usr/lib/jni
Restart Eclipse, and you should be good to go!