[DB-SIG] DB-API 1.1

Greg Stein (Exchange) gstein@exchange.microsoft.com
Sun, 14 Jun 1998 20:26:38 -0700


Some historical rationale:

> From: M.-A. Lemburg [mailto:mal@lemburg.com]
> Sent: Thursday, June 04, 1998 2:24 AM
> 
> Ok, I'll try to summarize what's been proposed so far [and my
> comments on them]:
> 
> Ted Horst would like to see dictionaries used for connection
> parameters and column descriptions.
> ...
> Jim wanted to know what kind of exception to raise for interface-
> related errors. I suggested InterfaceError with Error as base class.
> ...
> Stored Procedures. As expected, this causes trouble.
> 
> 	AFAIK, it is possible to call stored procedures via the
> 	standard execute method. Input parameters get passed in
> 	via the parameter tuple, output goes into a result set.
> 
> 	Since databases tend to use different calling syntaxes
> 	the callproc() could take care of mapping the procedure
> 	to the database specific syntax and then pass the parameters
> 	on to the execute method to have the procedure call itself
> 	executed. Results would then be available via the
> 	result set. IN OUT parameters are not possible using this
> 	scheme, but then: which Python type would you use for them
> 	anyway...

The difference between calling stored procs and the "normal" use of execute()
is why the two were separated. If there is a way to weasel execute() into
doing it for you... great. callproc() was always a hand-wave.
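
For illustration only, a rough sketch of how a module could do that
weaseling; the "{call ...}" escape below is the ODBC-style syntax and
just one example, and the execute() machinery is assumed to already
exist on the cursor:

    class Cursor:
        def callproc(self, procname, params=()):
            # Map the procedure onto this database's calling syntax
            # (ODBC escape syntax shown as one example), then let
            # execute() do the real work. Output lands in an ordinary
            # result set; IN OUT parameters are not handled here.
            placeholders = ', '.join(['?'] * len(params))
            self.execute('{call %s(%s)}' % (procname, placeholders),
                         params)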

> Things to be considered obsolete:
> 	arraysize
> 	setinputsize()
> 	setoutputsize()
> 	
> 	Completely agree on punting these... fetchmany() would then
> 	have a mandatory argument instead of an optional one.

See my other note. These should be considered optional, but should not be
punted. Python can be used for high-performance DB mgmt operations (I proved
this back in '96, against a guy who took two days to implement in C++
something that I did in 15 minutes with comparable execution speed).
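
Keeping arraysize costs a module almost nothing. A sketch (fetchone()
assumed to exist) of how the attribute keeps fetchmany()'s argument
optional while doubling as a buffer-sizing hint:

    class Cursor:
        arraysize = 1     # default batch size; bulk apps crank this up

        def fetchmany(self, size=None):
            # arraysize supplies the default, so the argument stays
            # optional; a module can also read the attribute ahead of
            # time to size its internal fetch buffers.
            if size is None:
                size = self.arraysize
            rows = []
            for i in range(size):
                row = self.fetchone()
                if row is None:
                    break
                rows.append(row)
            return rows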

> begin() method. Jim mentioned this without too many details attached
> to it. Michael responded with a transaction model.
> 
> 	Michael's model is easily implemented in Python on top
> 	of database cursors, as are some other techniques for getting
> 	the OO look&feel into database handling. This could well
> 	define a new project for the DB-SIG.

Many of these higher-level operations were omitted from the design. The
intent was to allow module implementors to expose "enough" so that Python
modules could do the rest.
[ the dtuple module is a good example of punting useful functionality up a
level ]
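
As a sketch of that kind of layering (the class and method names here
are invented, not part of any proposal), a transaction wrapper needs
nothing from the module beyond connections, cursors, commit() and
rollback():

    class Transaction:
        # Python-level transaction policy; the module underneath
        # stays slim and only supplies the primitives.
        def __init__(self, conn):
            self.conn = conn

        def run(self, statements):
            # statements: a sequence of (sql, params) pairs, applied
            # all-or-nothing.
            cursor = self.conn.cursor()
            try:
                for sql, params in statements:
                    cursor.execute(sql, params)
            except:
                self.conn.rollback()
                raise
            self.conn.commit()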

> Return values for DDL, DML and DQL statements.
> 
> 	The only really useful return value defined is the
> 	number of rows affected for DMLs. mxODBC provides two
> 	new attributes for cursors:
> 	rowcount - number of rows affected or in the result set
> 	colcount - number of columns in the result set
> 	We drop all the special return values and add these two
> 	(or maybe just rowcount) attributes instead.

Having the execute() function return an integer allows execute() to be
an atomic operation (for DMLs). Placing return values into attributes breaks
this.
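
Concretely (a sketch, not spec text): with a return value the count
arrives with the call itself, while an attribute leaves a window on
the cursor between the two steps:

    def purge(cursor):
        # Atomic: the affected-row count comes back with the call.
        count = cursor.execute("DELETE FROM log WHERE level < 2")

        # Attribute style: any other execute() on this cursor between
        # the next two statements silently clobbers rowcount.
        cursor.execute("DELETE FROM log WHERE level < 2")
        count = cursor.rowcount
        return count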

I don't recall the origin of the 1 for a DDL statement. You'd have to ask
Lorton for that one...

> Multiple results sets.
> 
> 	Same problem as with stored procedures. There doesn't
> 	seem to be much of a standard and most DBs probably don't
> 	even support these.

I think these snuck in with 1.1; I don't recall the nextset() method at all.
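
For the record, the draft's nextset() would presumably be driven by a
loop like this (hedged; behavior varies wherever multiple result sets
are supported at all):

    def drain(cursor):
        # Collect every result set a call may have produced.
        # nextset() is assumed to return a true value while another
        # set is available and None once the statement is exhausted.
        sets = []
        while 1:
            sets.append(cursor.fetchall())
            if not cursor.nextset():
                break
        return sets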

> Mapping of database types to Python types.
> 
> 	Since different databases support different types (including
> 	some really exotic ones, like PostgreSQL's), I think the
> 	general idea should be: preserve as much accuracy as you
> 	can, e.g. if a money database type doesn't fit Python's integers,
> 	use longs. The only debatable type, IMHO, is what to use
> 	for long varchar columns (LONGS or BLOBS). The most common
> 	usage for these is probably storing raw data, just like you
> 	would in a disk file. Two Python types come into play here:
> 	arrays (mutable) and strings (immutable). Strings provide
> 	a much better access interface and are widely supported, so
> 	I'd vote for strings.

Raw columns were returned as dbiRaw instances so that they could be fed
right back into an INSERT statement. In dbiRaw.value, however, you would
find the native string object.
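
A minimal sketch of the idea; the real dbi module differed in detail:

    class dbiRaw:
        # Thin marker wrapping raw/binary data. On input, a module
        # recognizes instances and binds the contents as a raw column;
        # on output, the native string sits in .value.
        def __init__(self, value):
            self.value = value

    # A fetched raw column can be fed straight back into an INSERT:
    #   blob = cursor.fetchone()[0]             # a dbiRaw instance
    #   cursor.execute("INSERT INTO t VALUES (?)", (blob,))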

The thing that we did want to avoid, however, was an exhaustive mapping
within the API definition (currencies and the like). These would probably
follow the dbi model, but each database handles these specialty types a bit
differently. It becomes very cumbersome for the module implementor if you
try to unify these types within the API. We felt it was a bit better to punt
unification up to the Python level if it was required.
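
For example (purely hypothetical), a few lines of Python above the
module can unify a specialty type that two backends return
differently, without touching either module:

    def as_cents(value):
        # Naive unification of a 'money' column, for illustration:
        # one backend returns an integer count of cents, another a
        # string such as '12.34'; neither module needs to know.
        if isinstance(value, str):
            dollars, cents = value.split(".")
            return int(dollars) * 100 + int(cents)
        return value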

> Having coltype compare '==' instead of 'is'. Bill doesn't like it.
> ...
> dbiDate. I suggested using DateTime types, Bill agrees, Jim probably
> has his own set of types ;-) which he'd like to use.
> 
> 	I'm biased, of course, but the DateTime types are readily
> 	available and as easy to use on the user side as on the
> 	interface programmer side. They provide a rich set of API functions
> 	which they export via a CObject (meaning: no linking problems).
> 	mxODBC uses them already. Though it also allows you to
> 	choose two other ways of passing date/time values: as strings
> 	and as tuples.

This wasn't around when the DBAPI first came up. 'nuf said :-)

> In general having more of the API return dedicated types with a
> nice OO interface would be a really nice thing, but I fear that
> most interface programmers are not willing to hack up two or
> three other types in addition to connections and cursors.

Major, big-time point. We had three modules that we wanted to build
ourselves, not to mention modules that others wanted to build. Keeping the
module slim and "close to the metal" was easier on us, and more in tune with
Python's viewpoint on exposing APIs through modules (e.g. expose the natural
interface, then place Python layers on that as necessary).

-g