Is it just me, or is Sqlite3 goofy?

Magnus Lycka lycka at carmen.se
Wed Sep 13 05:19:04 EDT 2006


First of all, anyone with extensive experience in database systems
understands that validating and cleaning input is an unavoidable task.

Static typing can help identify some of the problems, but far from
all, and there is often data processing done before the data enters
the database, so it's often too late to do the cleaning at that time
anyway.

Once you are dealing with data within your system, it's a bug in the
software if data doesn't fulfill the intended constraints. Such
problems should be identified by unit tests.

Mike Owens wrote:
> Next, as far as transferring your data, you most likely have to resort
> to some delimited format, or INSERT statements, which is no different
> than any other database.

You can always write a Python script which reads from one
database, cleans up the data and inserts into another, one
row at a time. This isn't just a porting activity. I've written
a Python app that moved data from a DB2 mainframe production
database to a DB2 testing database on Windows. It moved data
from a hierarchy of tables, starting in a given table, and
reading foreign keys from system tables (very slow in mainframe
DB2!) to figure out what data to bring from which tables.
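
A minimal sketch of such a row-at-a-time copy script, using two
in-memory SQLite databases as stand-ins for the real source and
target. The people table, its columns and the clean_row() rules are
made up for illustration; they are not from the DB2 application
described above.

```python
import sqlite3

def clean_row(row):
    """Hypothetical cleaning rules: trim names, coerce ages to int or None."""
    name, age = row
    if isinstance(name, str):
        name = name.strip()
    if name == "":
        name = None
    try:
        age = int(age)
    except (TypeError, ValueError):
        age = None
    return name, age

def copy_people(src, dst):
    """Read every row from src, clean it, and insert it into dst."""
    copied = 0
    for row in src.execute("SELECT name, age FROM people ORDER BY rowid"):
        dst.execute("INSERT INTO people (name, age) VALUES (?, ?)",
                    clean_row(row))
        copied += 1
    dst.commit()
    return copied

# Source database with untyped columns and messy data (assumed example).
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE people (name, age)")
src.executemany("INSERT INTO people VALUES (?, ?)",
                [("  Alice ", "30"), ("Bob", "n/a"), ("", 41)])

# Target database with declared column types.
dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE people (name TEXT, age INTEGER)")

copied = copy_people(src, dst)
result = dst.execute("SELECT name, age FROM people ORDER BY rowid").fetchall()
```

In a real port the two connections would come from different DB-API
modules, and the cleaning rules would follow whatever constraints the
target schema is supposed to enforce.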

Since the system used abstract keys and moved data to a populated
database, it had to generate new keys and adjust the foreign keys
in the dependent tables. It also had to work around bugs and
quirks in Windows NT, DB2 and ADODBAPI etc. These things are
relatively easy, but never trivial. There are always a lot
of details to deal with.
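
The key adjustment can be sketched roughly like this, assuming a
simple parent/child pair of tables. The customer and orders tables
and the copy_with_new_keys() helper are hypothetical; the real
application discovered the table hierarchy from the DB2 system
tables instead of hard-coding it.

```python
import sqlite3

def make_schema(conn):
    """Create the assumed parent/child tables in a database."""
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, "
                 "customer_id INTEGER REFERENCES customer(id), item TEXT)")

def copy_with_new_keys(src, dst):
    """Copy parents first, remember old -> new keys, then fix up children."""
    key_map = {}
    parents = src.execute(
        "SELECT id, name FROM customer ORDER BY id").fetchall()
    for old_id, name in parents:
        # Let the target database generate a fresh abstract key.
        cur = dst.execute("INSERT INTO customer (name) VALUES (?)", (name,))
        key_map[old_id] = cur.lastrowid
    children = src.execute(
        "SELECT customer_id, item FROM orders ORDER BY id").fetchall()
    for old_customer_id, item in children:
        # Translate the foreign key to the key generated in the target.
        dst.execute("INSERT INTO orders (customer_id, item) VALUES (?, ?)",
                    (key_map[old_customer_id], item))
    dst.commit()
    return key_map

src = sqlite3.connect(":memory:")
make_schema(src)
src.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Ben")])
src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 2, "book"), (2, 1, "pen")])

# The target is already populated, so keys 1 and 2 are taken.
dst = sqlite3.connect(":memory:")
make_schema(dst)
dst.executemany("INSERT INTO customer VALUES (?, ?)",
                [(1, "Existing"), (2, "Other")])

key_map = copy_with_new_keys(src, dst)
joined = dst.execute(
    "SELECT c.name, o.item FROM orders o "
    "JOIN customer c ON c.id = o.customer_id ORDER BY o.id").fetchall()
```

With deeper hierarchies the same idea applies table by table, parents
before children, with one key map per table.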

For big volumes, you typically need to use some kind of bulk
loading facility. Inserts are generally much too slow. (I'm
talking about general data porting woes here--in the case of
SQLite it's not likely that you have many millions of rows of data.)

Efficient bulk loading means that you have a different data format
for each database system, and also that you need to validate
your data before insertion, so the problems directly related to
SQLite don't seem very odd.
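
As a rough illustration of validating before loading, here is a
sketch that splits rows into accepted and rejected ones and then
loads the accepted rows in a single transaction. The stock table and
the validate() rules are assumptions for the example, and
executemany() is only a stand-in here, not a real bulk loading
facility.

```python
import sqlite3

def validate(rows):
    """Split rows into (good, rejected) using hypothetical rules:
    name must be a non-empty string, qty a non-negative integer."""
    good, rejected = [], []
    for row in rows:
        name, qty = row
        if (isinstance(name, str) and name
                and isinstance(qty, int) and qty >= 0):
            good.append(row)
        else:
            rejected.append(row)
    return good, rejected

rows = [("bolt", 10), ("nut", "ten"), ("", 3), ("washer", 0)]
good, rejected = validate(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (name TEXT, qty INTEGER)")
with conn:  # one transaction for the whole batch
    conn.executemany("INSERT INTO stock VALUES (?, ?)", good)

loaded = conn.execute("SELECT COUNT(*) FROM stock").fetchone()[0]
```

The rejected list is what you would write to a log or an error file
for someone to look at, rather than silently dropping it.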


