OO, complexity and toons...

Alex Martelli aleax at aleax.it
Mon May 13 18:08:29 EDT 2002


Shagshag wrote:
        [snip]
Your problem looks like it's made to order for a relational database.  Don't
shoehorn into OO something that wants to be architected in a different
paradigm!

You can create a relational database describing your data (e.g. with the
new rc1 of gadfly -- other RDBMS's are interfaced to Python quite
similarly, but gadfly has the distinction of being written in Python 
itself):

import os, shutil
from gadfly import gadfly
from gadfly.store import StorageError

connection = gadfly()
if os.path.exists('prova'):
    shutil.rmtree('prova')
os.makedirs('prova')
connection.startup('carpto', 'prova')
cursor = connection.cursor()

tables = ( "toon (name VARCHAR, kind VARCHAR)",
           "cartoon (title VARCHAR)",
           "features (title VARCHAR, name VARCHAR, kind VARCHAR)",
         )
indices = ( "key_toon ON toon(name, kind)",
            "key_cartoon ON cartoon(title)",
            "key_features ON features(title, name, kind)",
          )
for table in tables: cursor.execute("CREATE TABLE "+table)
for index in indices: cursor.execute("CREATE UNIQUE INDEX "+index)
cursor.execute("INSERT INTO toon(name, kind) VALUES(?, ?)", [
    ('Daffy', 'duck'),
    ('Bugs', 'bunny'),
    ('Babs', 'bunny'),
    ('Mickey', 'mouse'),
    ('Minnie', 'mouse'),
    ('Bugs', 'bug'),
    ('Jerry', 'mouse'),
    ('Tom', 'cat'),
    ] )
cursor.execute("INSERT INTO cartoon(title) VALUES(?)", [
    ('title1',), ('title2',), ('title3',), ('titlem',) ] )
cursor.execute("INSERT INTO features(title, name, kind) VALUES(?, ?, ?)", [
    ('title1', 'Daffy', 'duck'),
    ('title1', 'Bugs', 'bunny'),
    ('title1', 'Babs', 'bunny'),
    ('title2', 'Mickey', 'mouse'),
    ('title2', 'Minnie', 'mouse'),
    ('title3', 'Bugs', 'bunny'),
    ('titlem', 'Jerry', 'mouse'),
    ('titlem', 'Tom', 'cat'),
    ] )

connection.commit()

Actually, you probably need only the 'features' table here, though the 
others may be useful when you need to add toons not featuring in any
cartoon, check relational integrity (not in gadfly, but in other RDBMSs),
etc.  You could add other indices (non-unique) on the features table --
this needs some benchmarking of typical queries vs data insertion, to
check which indices actually help performance, depending on your RDBMS.

Then you can query the database:

import gadfly

conn = gadfly.gadfly('carpto', 'prova')
curs = conn.cursor()

curs.execute("""
        SELECT title FROM features
        WHERE name<>'Bugs' AND kind='bunny'
    """)
print curs.pp()

curs.execute("""SELECT DISTINCT title FROM features 
    WHERE name='Bugs' AND kind='bunny' AND title IN (
        SELECT title FROM features
        WHERE name<>'Bugs' AND kind='bunny'
    )
    """)
print curs.pp()

'''
print curs.fetchall()
'''

This is the hardest one of the queries you asked for -- and, as you
see, a nested select makes short work of it.  The others are easier.


Alex




More information about the Python-list mailing list