[DB-SIG] New take on PostgreSQL bindings for Python

Wed Sep 6 10:42:33 CEST 2006

Cristian Gafton wrote:
> I have seen the recent discussions about the DB API 2.0 shortcomings 
> when it comes to working with modern databases. I tend to agree that the 
> current API serves more of a guideline - it specifies the bare minimum, on 
> top of which everybody keeps reinventing the same extensions.
> 
> So, what I would like to see covered by a future DB API spec:
> 
> - bind parameters. The current pythonesque %(name)s specification is not 
> ideal. It requires various levels of pain in escaping the arguments passed 
> to a query. Most database backends accept bind parameters, albeit with 
> different syntaxes. For code portability, I'd rather parse the SQL query 
> and rewrite it to use the proper element for the bound parameter than do 
> the crazy escaping on the user input that is currently being done.
> 
> (As a side note, my bindings use PostgreSQL's native support to do things 
> like cu.execute("select * from foo where id = $1", x), without having to 
> worry about escaping x)

I'm not sure I understand your comment on escaping things - you
normally pass the statement (with the binding parameter markers)
and the binding parameters separately to the database.

This allows the database to create an query plan for the statement
and then apply the parameters to this query plan one or more times.

The main benefit is that you don't have to do any escaping in the
SQL statement, which as a side-effect, also prevent the typical
SQL injection vulnerabilities.

> - parsed statements. On large loops this is a real gain. For lack of a 
> better specification, I currently use something like:
>  	prepcu = db.prepare("select * from foo where id = $1")
>  	prepcu.execute(2)
> The prepared cursor statement has the same fetchone(), fetchall() and 
> other properties of the regular cursors, except it only accepts parameters 
> to its execute() and executemany() calls.

The DB API specifies that the driver should try to cache the
prepared statement based on the statement string.

Since in some cases, you may need to do the prepare
step without actually executing anything, I've added
the following in mxODBC which is in line with the DB API
spec:

cursor.prepare(command)
    Prepare the statement for execution and set
    the cursor.command attribute to command.

The programmer can then pass cursor.command to the
.executexxx() methods, e.g.

cursor.execute(cursor.command, params)

which the interface will notice and then use the prepared
statement.

> - server side cursors. Currently, most bindings for most databases have to 
> decide what to do after an cursor.execute() call - do they automatically 
> retrieve all the resulting rows in the client's memory, or do they 
> retrieve it row by row, pinging the server before every retrieval to get 
> more data (hey, not everybody using Oracle ;^). DB API has no support 
> for controlling this in a consistent fashion, even though Python has 
> solved the issue of dict.items() vs dict.iteritems() a long time ago. 
> The application writers should have a choice on how the cursors will 
> behave.
> 
> (Again, in my bindings, I have added support for db.iteritems() to get a 
> cursor that will retrieve rows from the server side in chunks instead of 
> all at once. I left db.cursor() to return a cursor which will download all 
> results in the client memory after an execute - which seems to be the 
> prevailing default)

Server side cursors vs. client side cursors is usually something
that's implemented and managed by the database driver - why should
the Python programmer have to think about this detail ?

The Python programmer can use .fetchone() or .fetchmany() to
indicate whether she wants to read rows in chunks or one-by-one.
The Python interface can then map these requests to whatever
the database driver has to offer.

Something that's missing from the DB API spec is a way to
define the cursor's name. In mxODBC I've added an optional
name parameter to connection.cursor([name]) which predefines
the name of the cursor.

While cursors usually automatically get a name assigned by
the database, it is sometimes useful to know this name in
advance and then use server side cursors by explicitly coding
the SQL statements to refer to the opened cursor, e.g. for
updates based on the cursor position.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 06 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::