database persistence with mysql, sqlite

Bryan Olson fakeaddress at nowhere.org
Thu Sep 27 02:15:49 EDT 2007


Lawrence D'Oliveiro wrote:
> Bryan Olson wrote:
> 
>> Lawrence D'Oliveiro wrote:
>>> In Bryan Olson wrote:
>>>
>>>> coldpizza wrote:
>>>>> It turned out that the method above ('SELECT * FROM TABLE LIMIT L1,
>>>>> L2') works ok both with mysql and sqlite3, therefore I have decided to
>>>>> stick with it until I find something better. With Sqlite3 you are
>>>>> supposed to use LIMIT 10 OFFSET NN, but it also apparently supports
>>>>> the mysql syntax (LIMIT NN, 10) for compatibility reasons.

>>>> A more reliable form is along the lines:
>>>>
>>>>      SELECT keyfield, stuff
>>>>      FROM table
>>>>      WHERE keyfield > ?
>>>>      ORDER BY keyfield
>>>>      LIMIT 10
>>>>
>>>> With the right index, it's efficient.

>>> But that involves keeping track of the right starting keyfield value for
>>> the next batch of records, which is complicated and nontrivial.

>> We write the link so that the browser will send back the parameter we
>> need. If the largest keyfield value on the page is
>> "Two-Sheds" the link might read:
>>
>> <A HREF="http://rfh.uk/tablnext.cgi?start=Two-Sheds">Next</A>
> 
> That's assuming keyfield is a) unique, 

Exactly; that was was the idea behind the name choice. The
method extends to multi-column keys, so it is generally
applicable.

> and b) a relevant ordering for
> displaying to the user.

That's a nice-to-have, but not required.

>> Keeping a cursor with pending data across HTTP requests is
>> a world of hurt.
> 
> "limit offset, count" avoids all that.

It can be stateless, but then it is unreliable. Here's an
example with Python 2.5:


import sqlite3

db = sqlite3.connect(":memory:")

# Simple table, an integer and the hex for that integer
db.execute(
   "CREATE TABLE numbers (num INTEGER PRIMARY KEY, hex TEXT)")

# Start with 10-29 in the table
for i in range(10, 30):
     db.execute("INSERT INTO numbers VALUES (?, ?)", (i, hex(i)))


# Print 4 records starting at offset
def next4(offset):
     cur = db.execute(
       "SELECT * FROM numbers LIMIT 4 OFFSET ?",
       (offset,))
     for x in cur:
         print x


# Walk the table with LIMIT and OFFSET

next4(0)   # Good, prints 10-13
next4(4)   # Good, prints 14-17

# Another transaction inserts new records
for i in range(0, 4):
     db.execute("INSERT INTO numbers VALUES (?, ?)", (i, hex(i)))

next4(8)   # Bad, prints 14-17 again

# Another transaction deletes records
for i in range(0, 4):
     db.execute("DELETE FROM numbers WHERE num = ?", (i,))

next4(12)  # Bad, we're missing 18-21



The method I advocated is still not the same as doing the
whole thing in a serializable transaction, but it will
return any record that stays in the table the whole time,
and will not return any record multiple times.


-- 
--Bryan



More information about the Python-list mailing list