Is it just me, or is Sqlite3 goofy?

Mon Sep 4 21:15:42 EDT 2006

Probably just me. I've only been using Access and SQL Server
for 12 years, so I'm sure my opinions don't count for anything.

I was, nevertheless, looking forward to Sqlite3. And now
that gmpy has been upgraded, I can go ahead and install
Python 2.5.

So I open the manual to Section 13.13 where I find the first
example of how to use Sqlite3:

<code>
conn = sqlite3.connect(':memory:')

c = conn.cursor()

# Create table
c.execute('''create table stocks
(date timestamp, trans varchar, symbol varchar,
 qty decimal, price decimal)''')

# Insert a row of data
c.execute("""insert into stocks
          values ('2006-01-05','BUY','RHAT',100,35.14)""")
</code>

Seems pretty simple, yet I was befuddled and bewildered
by this example. So much so that I added a query to see exactly
what was going on.

<code>
# added by me
c.execute('select * from stocks')
d = c.fetchone()
for t in  d:
    print type(t),t
</code>

Original code - what's wrong with this picture?
    <type 'unicode'> 2006-01-05
    <type 'unicode'> BUY
    <type 'unicode'> RHAT
    <type 'int'> 100
    <type 'float'> 35.14

Why is the date returning as a string? Aren't the built in
converters supposed to handle that? Yes, if you enable
detect_types.

Added detect_types=sqlite3.PARSE_DECLTYPES

Traceback (most recent call last):
  File "C:\Python25\sqlite_first_example.py", line 30, in <module>
    c.execute('select * from stocks')
  File "C:\Python25\lib\sqlite3\dbapi2.py", line 66, in
convert_timestamp
    datepart, timepart = val.split(" ")
ValueError: need more than 1 value to unpack

Aha, that explains why they weren't enabled.

This failed because
 - the value inserted was wrong format?
 - and the builtin converter can't split it cuz it has no spaces?
when it worked it was because
 - detect_types was not set, so converter not invoked when queried?

Yes, the format in the example was datetime.date and the field type
should have been cast [date], not [timestamp] which needs HH:MM:SS
for the converter to work properly (but only if detect_types
enabled).

If a correct format was inserted, converter would have worked

  <type 'datetime.datetime'> 2006-09-04 13:30:00
  <type 'unicode'> BUY
  <type 'unicode'> RHAT
  <type 'int'> 100
  <type 'float'> 35.14

Or, had the field been properly cast as [date] instead of [timestamp]
it would also have worked.

  <type 'datetime.date'> 2006-09-04
  <type 'unicode'> BUY
  <type 'unicode'> RHAT
  <type 'int'> 100
  <type 'float'> 35.14

Ah, this now partly explains the original result, since detect_types
is off by default, the field cast, not being a native sqlite3 type
was ignored and the data queried back as TEXT.

  <type 'unicode'> 2006-09-04
  <type 'unicode'> BUY
  <type 'unicode'> RHAT
  <type 'int'> 100
  <type 'float'> 35.14

Ok, next issue, what the fuck are [varchar] and [decimal]?
They are certainly not Sqlite3 native types. If they are
user defined types, where are the converters and adapters?
Does Sqlite3 simply ignore a cast that isn't registered?

Note that although both qty and price were cast as [decimal]
they queried back as int and float.

Note also that it's "obvious" from the query example on page 13.13
that the example is FUBAR
 - the date is a unicode string, not a datetime.date type
 - the price is binary floating point, not decimal

<quote>
    This example uses the iterator form:
    >>> c = conn.cursor()
    >>> c.execute('select * from stocks order by price')
    >>> for row in c:
    ...    print row
    ...
    (u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
    (u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
    (u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
    (u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
</quote>

Here we have an example of things apparently working
for the wrong reason. A classic example of the programmer
who *thinks* he knows how it works because he wrote it.
This kind of sloppiness wouldn't last 5 minutes in a production
environment.

But why did Sqlite3 make qty an int and price a float?
Hard to say since THE FURTHER EXAMPLES IN THE DOCS don't
even bother to cast the field types. I guess I'm supposed
to guess how things are supposed to work. Can I trust that
default settings will be what I want?

Ha! Can I trust the baby with a hammer?

First, note that the script example in section 13.13.3

<quote>
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    create table person(
        firstname,
        lastname,
        age
    );

     create table book(
        title,
        author,
        published
    );

    insert into book(title, author, published)
    values (
        'Dirk Gently''s Holistic Detective Agency
        'Douglas Adams',
        1987
    );
    """)
</quote>

contains not one but TWO syntax errors! A single quote after
the word Agency is missing as is the comma that should be at
the end of that line. Seems that no one actually ever tried
this example.

That's easily fixed. But I was curious about why the fields
don't have type casts. After the previous debacle and
knowing that this code was never tested, I am not going to
assume it works. Better add a query to make sure.

cur.execute("select title, author, published from book")
d = cur.fetchall()
for i in d: print i
print

(u"Dirk Gently's Holistic Detective Agency", u'Douglas Adams', 1987)

Ok, so if not cast, the fields must default (and probably also when
a cast is made that hasn't been defined).

But watch this: being clueless (but not stupid) is a gift I have
for troubleshooting. I tried (incorrectly) to insert another record:

cur.execute("insert into book(title, author, published) values ('Dirk
Gently''s Holistic Detective Agency','Douglas Adams','1987')")

(u"Dirk Gently's Holistic Detective Agency", u'Douglas Adams', 1987)
(u"Dirk Gently's Holistic Detective Agency", u'Douglas Adams', u'1987')

Uhh...how can a database have a different field type for each record?

Simple, without a cast when the table is created, the field type is
whatever you insert into it. That's how the default must work,
each record has a data structure independent of every other record!

Wow. Just think of the kind of bugs *that* must cause.

Bugs?

Here's MY example, creating a Cartesian Product

<code>
import sqlite3
letters = [(2,),('10',),('20',),(200,)]
con = sqlite3.connect(":memory:")
con.text_factory = str
con.execute("create table letter(c integer)")
con.executemany("insert into letter(c) values (?)", letters)
print 'Queried: ',
for row in con.execute("select c from letter"):
    print row,
print
print
print 'Sorted: ',
for row in con.execute("select c from letter order by c"):
    print row[0],
print
print
print 'Cartesian Product: ',
for row in con.execute("select a.c, b.c, c.c from letter as a, letter
as b, letter as c"):
    print row[0]+row[1]+row[2],
</code>

Note that the list of data to be inserted contains both strings and
ints. But because the field was correctly cast as [integer], Sqlite3
actually stored integers in the db. We can tell that from how the
"order by" returned the records.

Queried:  (2,) (10,) (20,) (200,)

Sorted:  2 10 20 200

Cartesian Product:  6 14 24 204 14 22 32 212 24 32 42 222 204 212
222 402 14 22 32 212 22 30 40 220 32 40 50 230 212 220 230 410 24
32 42 222 32 40 50 230 42 50 60 240 222 230 240 420 204 212 222
402 212 220 230 410 222 230 240 420 402 410 420 600

Because if I cast them as [text] the sort order changes (and my
Cartesian Product becomes concatenation instead of summation).

Queried:  ('2',) ('10',) ('20',) ('200',)

Sorted:  10 2 20 200

Cartesian Product:  222 2210 2220 22200 2102 21010 21020 210200
2202 22010 22020 220200 22002 220010 220020 2200200 1022 10210
10220 102200 10102 101010 101020 1010200 10202 102010 102020
1020200 102002 1020010 1020020 10200200 2022 20210 20220 202200
20102 201010 201020 2010200 20202 202010 202020 2020200 202002
2020010 2020020 20200200 20022 200210 200220 2002200 200102
2001010 2001020 20010200 200202 2002010 2002020 20020200 2002002
20020010 20020020 200200200

But if I omit the cast altogether, then the db stores the input
exactly as it was inserted, so the c field contains both
text and integers wreaking havoc with my sort order, making
records un-queryable using "where" and causing my Cartesian
Product to crash.

Queried:  (2,) ('10',) ('20',) (200,)

Sorted:  2 200 10 20

Cartesian Product:  6

Traceback (most recent call last):
  File "C:\Python25\user\sqlite_test2.py", line 225, in <module>
    print row[0]+row[1]+row[2],
TypeError: unsupported operand type(s) for +: 'int' and 'str'

Yeah, I know, I've heard it before.

"This behavior is by design."

It's still fuckin' goofy.