Keeping python code and database in sync

Chris Angelico rosuav at gmail.com
Fri Aug 29 09:01:50 EDT 2014


On Fri, Aug 29, 2014 at 10:42 PM, Frank Millman <frank at chagford.com> wrote:
> It is a simple matter to write a program that updates the database
> automatically. The question is, what should trigger such an update? My first
> thought is to use a version number - store a version number in the working
> directory, and have a matching number in the code. If someone downloads the
> latest version, the numbers will no longer match, and I can run the upgrade
> program.

This is a well-known problem, and there's no really perfect solution.
The first thing to consider is: What happens if someone back-levels
the program? If you can afford to say "never back-level past a schema
change", then you can simply version the schema, independently of the
code. A simple incrementing integer will do - you don't need a
multipart version number. Then you just have code like this:

# Get the current schema version, or 0 if there's nothing yet
version = db.query("select schema_version from config")
if version < 1:
    # Brand new database
    db.query("create table blah blah")
    db.query("create table spam")
    # etc
if version < 2:
    db.query("alter table spam add whatever")

    # Add new patch levels here
    db.query("update config set schema_version = 2")
else:
    throw_really_noisy_error("YOU BACKLEVELLED!")

To add a new patch level, you add a new condition with the next
number, add its code, and change the update statement at the end. So
it'd look like this:

 if version < 2:
     db.query("alter table spam add whatever")
+if version < 3:
+    db.query("create table brand_new_table")

     # Add new patch levels here
-    db.query("update config set schema_version = 2")
+    db.query("update config set schema_version = 3")
else:
     throw_really_noisy_error("YOU BACKLEVELLED!")

It's fairly straight-forward and readable. You'll sometimes need to go
back and defang old patch code (if you simplify or break stuff), and
you might prefer to keep your patch 0 handling up-to-date (so it
doesn't then have to do all the rest of the patches - have that one
immediately set version and bail out), but that's a basic structure
that's been proven in real-world usage. (Note that the exact code
above might be buggy. I'm recreating from memory and porting to Python
at the same time. But the design intent is there.)

Ideally, you want to minimize churn. Don't do heaps of schema changes
in a short period of time. But this can handle plenty of changes
fairly easily, and it'll handle either incremental changes or big
blocks of them just the same way (if you upgrade from patch level 10
to patch level 35 all at once, it'll just grind through all those
changes one after another).

ChrisA



More information about the Python-list mailing list