[melbourne-pug] django db race conditions

Brian May brian at microcomaustralia.com.au
Wed Oct 16 03:27:22 CEST 2013


Hello All,

I have a reasonable amount of Django code that follows this general model:

try:
   object = Model.objects.get(name="woof")
except Model.DoesNotExist:
   object = Model(name="woof")
   init_object(object)
   object.save()

Or, in some cases:

object, created = Model.objects.get_or_create(name="woof")

In both cases the resultant code is very similar.

In both cases there is a race condition. Depending on the flow of
execution, I can end up with two or more db objects with name="woof". There
are many forum posts discussing this race condition.

As an example, the first case happens when displaying a web page. Let's
assume init_object() is relatively slow. As the web page takes a while to
load, the user clicks reload. This results in two (or more) objects being
created with name="woof" in error.

Another example: the second case occurs when a JavaScript app makes
concurrent calls to the web service.
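
The interleaving behind both examples can be reproduced without a
database. The following is a minimal sketch, not Django code: a plain list
stands in for the table, and a threading.Barrier (added purely for
illustration) forces both requests to finish their lookup before either
one saves.

```python
import threading

rows = []  # stand-in for the database table (illustration only)
both_checked = threading.Barrier(2)  # forces the unlucky interleaving


def get_or_create(name):
    # The "get": look for an existing row with this name.
    existing = [r for r in rows if r["name"] == name]
    # Both requests complete their lookup before either one saves.
    both_checked.wait()
    if existing:
        return existing[0]
    # The "create": neither request saw a row, so both insert one.
    row = {"name": name}
    rows.append(row)
    return row


threads = [threading.Thread(target=get_or_create, args=("woof",))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(rows))  # 2: each request created its own "woof" row
```

Nothing in the check itself is wrong; the problem is that the check and
the save are two separate steps with a gap between them.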

Some people have suggested that if I want name to be unique, I should
make it a database constraint. However, it is not always the case that I
want these values to be strictly unique; I just want to reuse an existing
entry or create it if it doesn't exist. Also, the database constraint would
mean the code fails instead of committing two objects, which is not really
helpful.
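
For the cases where a constraint is acceptable, the failure can be caught
and turned into a lookup, so the loser of the race reuses the winner's
row. The following is a rough sketch of that idea using the standard
library's sqlite3 module rather than Django; the table name "model" and
the helper get_or_create() are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint is what makes a duplicate insert fail.
conn.execute("CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")


def get_or_create(conn, name):
    """Return (row_id, created), inserting first and looking up on conflict."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute("INSERT INTO model (name) VALUES (?)", (name,))
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # Somebody else already created the row; reuse it.
        row = conn.execute(
            "SELECT id FROM model WHERE name = ?", (name,)
        ).fetchone()
        return row[0], False


first = get_or_create(conn, "woof")
second = get_or_create(conn, "woof")
count = conn.execute("SELECT COUNT(*) FROM model").fetchone()[0]
print(first, second, count)
```

This only helps where name really is meant to be unique; it does nothing
for the cases where duplicates are legitimate.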

Other people have suggested locking the db table while doing the
get_or_create. This seems to require possibly db-specific SQL code, so I
am a bit reluctant to do this.

Django's select_for_update method is interesting; however, as the object
doesn't actually exist yet, it is not really applicable.

Another solution I have considered, at least for some cases, is
moving init_object to a celery task. This would provide the user with
faster feedback as to what is happening, and for some slow tasks is
probably a good thing. Ideally I would only want one task to initialize
the object, but I am not sure how I would check this without introducing
new race conditions very similar to the one I am trying to remove, e.g.:

if task not created:
    create task

In theory create task could be called multiple times.
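
One way to close that gap is to make the check and the claim a single
atomic step, so only the first caller gets to create the task. The
following is a minimal in-process sketch: claim(), request_init() and the
lock-protected set are hypothetical stand-ins, and a lock like this only
works inside one process. Across several web workers the same role would
have to be played by some shared store that offers an atomic "add if
absent" operation.

```python
import threading

claimed = set()  # stand-in for shared state (illustration only)
claim_lock = threading.Lock()
created_tasks = []


def claim(key):
    """Atomically record key; return True only for the first caller."""
    with claim_lock:
        if key in claimed:
            return False
        claimed.add(key)
        return True


def request_init(name):
    # Check and claim happen as one step, so there is no gap to race in.
    if claim(name):
        created_tasks.append(name)  # stands in for "create task"


threads = [threading.Thread(target=request_init, args=("woof",))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(created_tasks)  # ['woof']: only one caller created the task
```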

Another solution, which would work in some places, is to make sure that
the object exists by some other means beforehand, so I can safely do a get
instead of a get_or_create.

Any other ideas?

Quite possibly I will have to try and find a solution on a case-by-case
basis :-(.

Shame we didn't realize this before we wrote this code.
-- 
Brian May <brian at microcomaustralia.com.au>