From carl at personnelware.com Sun Apr 1 22:52:13 2007 From: carl at personnelware.com (Carl Karsten) Date: Sun, 01 Apr 2007 15:52:13 -0500 Subject: [DB-SIG] dbf memo access Message-ID: <46101B7D.90000@personnelware.com> I need to read (not write) some dbf memo data (memo is dBase's datatype to store 'unlimited' text, much like a blob or text field.) to make things worse, various implementations of the dbf engine have different formats for storing memos - I forget what the dBaseIII file name was, but I currently need VFP's, which is filename.FPT (fox pro text) The few hits I got on google didn't make it clear if they supported any sort of dbf memo, let alone VFP's. I do most of my work in linux, so hoping for something other than the odbc way, but I can use that if it is my only choice. Carl K From mal at egenix.com Sun Apr 1 23:43:46 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 01 Apr 2007 23:43:46 +0200 Subject: [DB-SIG] dbf memo access In-Reply-To: <46101B7D.90000@personnelware.com> References: <46101B7D.90000@personnelware.com> Message-ID: <46102792.1050003@egenix.com> On 2007-04-01 22:52, Carl Karsten wrote: > I need to read (not write) some dbf memo data (memo is dBase's datatype to store > 'unlimited' text, much like a blob or text field.) to make things worse, > various implementations of the dbf engine have different formats for storing > memos - I forget what the dBaseIII file name was, but I currently need VFP's, > which is filename.FPT (fox pro text) > > The few hits I got on google didn't make it clear if they supported any sort of > dbf memo, let alone VFP's. > > I do most of my work in linux, so hoping for something other than the odbc way, > but I can use that if it is my only choice. If you only need to do this once, then using the MS FoxPro ODBC on Windows is the best and easiest way to extract the data. 
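For reference, the .FPT container format itself is simple enough to read directly. The sketch below assumes the commonly documented Visual FoxPro layout (big-endian block size at header offset 6, then per-block type and length words); it is untested against real FoxPro files, and the function name is made up here:

```python
import struct

def read_fpt_memo(fpt_bytes, block_index):
    """Return one memo's data from the raw bytes of a VFP .FPT file.

    Assumed layout (standard Visual FoxPro memo file):
      file header: bytes 6-7 hold the block size (big-endian)
      memo block:  bytes 0-3 record type (1 = text), bytes 4-7 data length
    The block index itself comes from the memo field in the .DBF record.
    """
    block_size = struct.unpack('>H', fpt_bytes[6:8])[0]
    offset = block_index * block_size
    rec_type, length = struct.unpack('>II', fpt_bytes[offset:offset + 8])
    if rec_type != 1:  # 0 = picture/object data, 1 = text
        raise ValueError('block %d is not a text memo' % block_index)
    return fpt_bytes[offset + 8:offset + 8 + length]
```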
Some other references that might help: Read dBase3 in Python: http://cwashington.netreach.net/depo/view.asp?Index=102&ScriptType=python Read dBase and xBase files: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/362715 Not sure whether those two help with memos. Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 01 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From carl at personnelware.com Mon Apr 2 01:01:50 2007 From: carl at personnelware.com (Carl Karsten) Date: Sun, 01 Apr 2007 18:01:50 -0500 Subject: [DB-SIG] dbf memo access In-Reply-To: <46102792.1050003@egenix.com> References: <46101B7D.90000@personnelware.com> <46102792.1050003@egenix.com> Message-ID: <461039DE.5090600@personnelware.com> M.-A. Lemburg wrote: > On 2007-04-01 22:52, Carl Karsten wrote: >> I need to read (not write) some dbf memo data (memo is dBase's datatype to store >> 'unlimited' text, much like a blob or text field.) to make things worse, >> various implementations of the dbf engine have different formats for storing >> memos - I forget what the dBaseIII file name was, but I currently need VFP's, >> which is filename.FPT (fox pro text) >> >> The few hits I got on google didn't make it clear if they supported any sort of >> dbf memo, let alone VFP's. >> >> I do most of my work in linux, so hoping for something other than the odbc way, >> but I can use that if it is my only choice. 
> > If you only need to do this once, then using the MS FoxPro ODBC on Windows > is the best and easiest way to extract the data. > > Some other references that might help: > > Read dBase3 in Python: > http://cwashington.netreach.net/depo/view.asp?Index=102&ScriptType=python > elif type == 'M': # We ignore the memo field pass > Read dBase and xBase files: > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/362715 > M for ascii character memo data (real memo fields not supported) Looks like I am going to have to figure out the odbc thing. Thanks for confirming that I wasn't missing something. Carl K From phd at phd.pp.ru Wed Apr 11 17:37:15 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 11 Apr 2007 19:37:15 +0400 Subject: [DB-SIG] SQLObject 0.7.5 Message-ID: <20070411153715.GC21003@phd.pp.ru> Hello! I'm pleased to announce the 0.7.5 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.7.5 News and changes: http://sqlobject.org/docs/News.html What's New ========== News since 0.7.4 ---------------- * Fixed a bug in DateValidator caused by datetime being a subclass of date. * Fixed test_deep_inheritance.py - setup classes in the correct order (required for Postgres 8.0+ which is strict about referential integrity). For a more complete list, please see the news: http://sqlobject.org/docs/News.html Oleg. 
-- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From phd at phd.pp.ru Wed Apr 11 17:53:19 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 11 Apr 2007 19:53:19 +0400 Subject: [DB-SIG] SQLObject 0.8.2 Message-ID: <20070411155319.GD21492@phd.pp.ru> Hello! I'm pleased to announce the 0.8.2 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.8.2 News and changes: http://sqlobject.org/News.html What's New ========== News since 0.8.1 ---------------- * Fixed ConnectionHub.doInTransaction() - if the original connection was processConnection - reset processConnection, not threadConnection. * Fixed a bug in DateValidator caused by datetime being a subclass of date. * Fixed test_deep_inheritance.py - setup classes in the correct order (required for Postgres 8.0+ which is strict about referential integrity). For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From phd at phd.pp.ru Wed Apr 11 18:27:18 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 11 Apr 2007 20:27:18 +0400 Subject: [DB-SIG] SQLObject 0.9.0b1 Message-ID: <20070411162718.GD22672@phd.pp.ru> Hello! 
I'm pleased to announce the 0.9.0b1 release of SQLObject, the first beta of the upcoming 0.9 release. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.9.0b1 News and changes: http://sqlobject.org/News.html What's New ========== News since 0.8 -------------- Features & Interface -------------------- * Support for Python 2.2 has been declared obsolete. * Removed actively deprecated attributes; lowered deprecation level for other attributes to be removed after 0.9. * SQLite connection got columnsFromSchema(). Now all connections fully support fromDatabase. There are two versions of columnsFromSchema() for SQLite - one parses the result of "SELECT sql FROM sqlite_master" and the other uses "PRAGMA table_info"; the user can choose one over the other by using "use_table_info" parameter in DB URI; default is False as the pragma is available only in the later versions of SQLite. * Changed connection.delColumn(): the first argument is sqlmeta, not tableName (required for SQLite). * SQLite connection got delColumn(). Now all connections fully support delColumn(). As SQLite backend doesn't implement "ALTER TABLE DROP COLUMN" delColumn() is implemented by creating a new table without the column, copying all data, dropping the original table and renaming the new table. * Versioning_. ..
_Versioning: Versioning.html * MySQLConnection got new keyword "conv" - a list of custom converters. * Use logging if it's available and is configured via DB URI. * New columns: TimestampCol to support MySQL TIMESTAMP type; SetCol to support MySQL SET type; TinyIntCol for TINYINT; SmallIntCol for SMALLINT; MediumIntCol for MEDIUMINT; BigIntCol for BIGINT. Small Features -------------- * Support for MySQL INT type attributes: UNSIGNED, ZEROFILL. * Support for DEFAULT SQL attribute via defaultSQL keyword argument. * Support for MySQL storage ENGINEs. * cls.tableExists() as a shortcut for conn.tableExists(cls.sqlmeta.table). * cls.deleteMany(), cls.deleteBy(). Bug Fixes --------- * idName can be inherited from the parent sqlmeta class. For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From unixdude at gmail.com Tue Apr 17 03:26:57 2007 From: unixdude at gmail.com (Jim Patterson) Date: Mon, 16 Apr 2007 21:26:57 -0400 Subject: [DB-SIG] dbf memo access Message-ID: Carl, In the past, I needed to read a FoxPro file and the following worked: http://www.garshol.priv.no/download/software/python/dbfreader.py Your mileage may vary, Jim Patterson -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/db-sig/attachments/20070416/240f3aa8/attachment.html From unixdude at gmail.com Tue Apr 17 06:05:16 2007 From: unixdude at gmail.com (Jim Patterson) Date: Tue, 17 Apr 2007 00:05:16 -0400 Subject: [DB-SIG] Controlling return types for DB APIs Message-ID: All, Over on the cx_Oracle list we have been discussing adding support for returning native Unicode strings and decimal objects. We have so far been talking about using a settable attribute on the connection and the cursor with the cursor inheriting the value from the connection by default.
This is very similar to the existing technique used by cx_Oracle for the "numbersAsString" and the technique used by mxODBC for the "stringFormat" and "datetimeFormat". Anyone have any thoughts/feelings/opinions about moving towards standardizing how we do this kind of thing across the different database modules? Thanks in advance, Jim Patterson -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/db-sig/attachments/20070417/a72b0878/attachment.html From fog at initd.org Tue Apr 17 09:44:19 2007 From: fog at initd.org (Federico Di Gregorio) Date: Tue, 17 Apr 2007 09:44:19 +0200 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: Message-ID: <1176795859.3630.6.camel@mila> Il giorno mar, 17/04/2007 alle 00.05 -0400, Jim Patterson ha scritto: > Over on the cx_Oracle list we have been discussing adding support > for returning native Unicode strings and decimal objects. We have > so far been talking about using a settable attribute on the connection > and the cursor with the cursor inheriting the value from the > connection > by default. This is very similar to the existing technique used > by cx_Oracle for the "numbersAsString" and the technique used > by mxODBC for the "stringFormat" and "datetimeFormat". > > Anyone have any thoughts/feelings/opinions about moving towards > standardizing how we do this kind of thing across the different > database modules? psycopg's type system is one of its best features (and one loved by users I was told). At any time you can create a new "type" as nt = psycopg2.new_type((oid1, oid2, ...), "name", typecast_func) and then register it using "psycopg2.register_type(nt)". This has 2 effects: 1. data described by listed oids (this is PostgreSQL-specific, I know) is converted using the function "typecast_func"; and 2. you can use "nt" as a type object in comparisons, just like other type objects in the dbapi (STRING, NUMERIC, etc...)
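In code, that registration pattern looks roughly like this. A sketch: the oid 790 (PostgreSQL's MONEY type) and the cast function are illustrative, and the registration is guarded so the converter itself can be read standalone:

```python
from decimal import Decimal

def cast_money(value, cursor):
    # typecast_func signature: (raw string from the backend or None, cursor)
    if value is None:
        return None
    return Decimal(value.replace('$', '').replace(',', ''))

try:
    import psycopg2
    import psycopg2.extensions
    # 790 is assumed here to be the oid of PostgreSQL's MONEY type
    MONEY = psycopg2.extensions.new_type((790,), 'MONEY', cast_money)
    psycopg2.extensions.register_type(MONEY)
except ImportError:
    pass  # psycopg2 not installed; cast_money still works on its own
```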
federico -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux Developer fog at debian.org INIT.D Developer fog at initd.org Se consideri l'uso del software libero una concessione tu stesso, come potrai proporla agli altri? -- Nick Name -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Questa =?ISO-8859-1?Q?=E8?= una parte del messaggio firmata digitalmente Url : http://mail.python.org/pipermail/db-sig/attachments/20070417/72dee7c3/attachment.pgp From mal at egenix.com Tue Apr 17 10:16:46 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 17 Apr 2007 10:16:46 +0200 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: Message-ID: <4624826E.5080703@egenix.com> On 2007-04-17 06:05, Jim Patterson wrote: > All, > > Over on the cx_Oracle list we have been discussing adding support > for returning native Unicode strings and decimal objects. We have > so far been talking about using a settable attribute on the connection > and the cursor with the cursor inheriting the value from the connection > by default. The is very similar to the existing technique used > by cx_Oracle for the "numbersAsString" and the technique used > by mxODBC for the "stringFormat" and "datetimeFormat". > > Anyone have any thoughts/feelings/opinions about moving towards > standardizing how we do this kind of thing across the different > database modules? While mxODBC does use this kind of approach, I don't think it's all that flexible, e.g. we have now added a new attribute .decimalformat for specifying whether you want floats or decimals for decimal database columns. Ideally, it should be possible to set converters for all kinds of output types as well as ones for input parameters. In some situations, it's also desirable to be able to do this based on the output variable or parameter position. 
This would, of course, only apply to cursors with already prepared statements. While this can be solved using a registry of type conversions, I see problems in standardizing the way to define the type mappings since different database backends tend to have or need different types. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 17 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From fog at initd.org Tue Apr 17 10:29:34 2007 From: fog at initd.org (Federico Di Gregorio) Date: Tue, 17 Apr 2007 10:29:34 +0200 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <4624826E.5080703@egenix.com> References: <4624826E.5080703@egenix.com> Message-ID: <1176798574.3630.26.camel@mila> Il giorno mar, 17/04/2007 alle 10.16 +0200, M.-A. Lemburg ha scritto: > > While this can be solved using a registry of type conversions, > I see problems in standardizing the way to define the type > mappings since different database backends tend to have > or need different types. I can see an API that leverages the introspection abilities of the drivers, to abstract the different type representations of the various backends. Let's suppose that a driver "knows" the type of a DB column, then we can ask it for an abstract "dbtype": dbtype = connection_object.getdbtype("SELECT 1 AS foo") where the query _must_ return a scalar from which the driver infers the type. Then the type can be used as a key in the registry.
Obviously the conversion function will be backend-specific but I suppose the signature could be the same for all functions. Given the fact that the conversion happens inside a cursor and that the connection is available from the cursor object itself, something like: py_data = conversion_function(backend_data, cursor_object) Then we can at least make a standard for the registry methods. federico -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux Developer fog at debian.org INIT.D Developer fog at initd.org All programmers are optimists. -- Frederick P. Brooks, Jr. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Questa =?ISO-8859-1?Q?=E8?= una parte del messaggio firmata digitalmente Url : http://mail.python.org/pipermail/db-sig/attachments/20070417/f1865f52/attachment.pgp From aprotin at research.att.com Tue Apr 17 16:51:38 2007 From: aprotin at research.att.com (Art Protin) Date: Tue, 17 Apr 2007 10:51:38 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1176798574.3630.26.camel@mila> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> Message-ID: <4624DEFA.30501@research.att.com> Folks, This conversation is excellent. I have also experienced the need to extend our driver in non-standard ways in support of datatypes (as well as in other dimensions.) I added an attribute to cursors .datetime_type to indicate whether to leave them as strings (as converted by our DBMS on output) or to convert them to datetime objects, indicated by the values 'string' and 'object' respectively. The cursor objects inherit their initial setting of this attribute from the attribute .default_datetime_type on the driver module. I also added another attribute on the cursor which, after any query, has a list of strings, one per column, with the type names as they were reported by the DBMS.
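A minimal sketch of that inheritance scheme, with hypothetical names following the description above (the driver-module default feeding each new cursor, and a per-cursor override):

```python
import datetime

# module-level default, as described: cursors inherit it on creation
default_datetime_type = 'string'

class Cursor:
    def __init__(self):
        # hypothetical: each cursor starts from the module default
        self.datetime_type = default_datetime_type

    def _convert_datetime(self, raw):
        # raw is the DBMS's string form, e.g. '2007-04-17'
        if self.datetime_type == 'string':
            return raw
        return datetime.datetime.strptime(raw, '%Y-%m-%d').date()

cur = Cursor()
cur.datetime_type = 'object'  # override on this cursor only
```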
Federico Di Gregorio wrote: >Il giorno mar, 17/04/2007 alle 10.16 +0200, M.-A. Lemburg ha scritto: > > >>While this can be solved using a registry of types conversions, >>I see problems in standardizing the way to define the type >>mappings since different database backends tend to have >>or need different types. >> >> > >I can see an API that leverages on the introspection abilities of the >drivers, to abstract to the different type representations of the >various backends. Let's suppose that a driver "knows" the type of a DB >column, then we can ask it for an abstract "dbtype": > >dbtype = connection_object.getdbtype("SELECT 1 AS foo") > >where the query _must_ return a scalar from which the driver infers the >type. > I do not see reasons for (1) why this is a connection level method, and (2) why the query would need to be limited to returning a scalar. This seems to be getting the same information that my second extension provides. What am I missing here? > Then the type can be used as a key in the registry. > Yes, something nice and simple, like a dict using the string name of the DBMS native datatype as the index. However, this might not work out after all. Our database system has a nearly unbounded set of types. The types have two components, say a major and minor, or a main type and subtype. The main type "STRING" alone has 65535 subtypes (one for each allowable size). Other main types may have a few subtypes or even none. Some of the subtypes make a major difference in the conversion function behavior (like those for DATE) and some make nearly none. My conversion routines are called based on the main type but need both the data value and the subtype as arguments. Do any of the other systems have such a multi-level type scheme as this? > Obviously the >conversion function will be backend-specific but I suppose the signature >could be the same for all functions.
Given the fact that the conversion > > >happens inside a cursor and than the connection is available from the >cursor object itself, something like: > >py_data = conversion_function(backend_data, cursor_object) > >Then we can at least make a standard for the registry methods. > >federico > > > However, another issue is discovering the available conversion functions and determining their arguments. Thank you all, Art Protin >------------------------------------------------------------------------ > >_______________________________________________ >DB-SIG maillist - DB-SIG at python.org >http://mail.python.org/mailman/listinfo/db-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/db-sig/attachments/20070417/fcfadaf3/attachment.html From fumanchu at amor.org Tue Apr 17 19:35:10 2007 From: fumanchu at amor.org (Robert Brewer) Date: Tue, 17 Apr 2007 10:35:10 -0700 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1176798574.3630.26.camel@mila> Message-ID: <435DF58A933BA74397B42CDEB8145A860B13BFD4@ex9.hostedexchange.local> Federico Di Gregorio wrote: > Il giorno mar, 17/04/2007 alle 10.16 +0200, M.-A. Lemburg ha scritto: > > > > While this can be solved using a registry of types conversions, > > I see problems in standardizing the way to define the type > > mappings since different database backends tend to have > > or need different types. > > I can see an API that leverages on the introspection abilities of the > drivers, to abstract to the different type representations of the > various backends. Perhaps, but "type" and "representation" are two different concepts. For example, I've got an SQL Server DB (whose schema I can't change) which stores most dates in proper DATETIME columns, but there are 3 or 4 which store dates in CHAR(8) columns in 'YYYYMMDD' format. 
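The scalar transformers for such a CHAR(8) date column are the easy half; a sketch of a hypothetical inbound/outbound pair:

```python
import datetime

def date_to_yyyymmdd(d):
    # outbound: datetime.date -> 'YYYYMMDD' string for the CHAR(8) column
    return d.strftime('%Y%m%d')

def yyyymmdd_to_date(s):
    # inbound: 'YYYYMMDD' string -> datetime.date
    return datetime.datetime.strptime(s, '%Y%m%d').date()
```

The hard part, as the rest of the message argues, is everything around these two functions: knowing which columns need them and what the SQL comparisons should look like.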
Although dbtype may *imply* pytype and vice-versa, there will always be cases where the adaptation layer between the two is context-dependent. Solving for arbitrary SQL (like "WHERE mytable.yyyymmdd_birthdate > now()") therefore means: 1. Knowing the desired Python type for each column (datetime.date), 2. Knowing the actual database type for each column or subexpression (SQL Server's CHAR type), 3. Having inbound and outbound scalar transformers (datetime.date-to-YYYYMMDD and datetime.date-from-YYYYMMDD), 4. Knowing which binary and comparison operations have implicit conversions, and 5. Special-casing binary and comparison operations between types which have no implicit conversions. For example, my YYYYMMDD adapter/converter has the method: def compare_op(self, op1, op, sqlop, op2): if isinstance(op2.dbtype, sqlserver.DATETIME): # Cast the YYYYMMDD string to a DATETIME. sql = ("((CASE WHEN ISDATE(%s)=1 " "THEN CAST(%s AS DATETIME) " "ELSE NULL END) %s %s)" % (op1.sql, op1.sql, sqlop, op2.sql)) return sql return "(%s %s %s)" % (op1.sql, sqlop, op2.sql) binary_op = compare_op This is where Geniusql is headed. I'm not for a moment saying the DBAPI should go that far, but there needs to be a clear understanding of exactly how far the DBAPI is going to go down this rabbit hole (because however far you go, your user base will forever pester you for the next level of flexibility ;). > Let's suppose that a driver "knows" the type of a DB > column, then we can ask it for an abstract "dbtype": > > dbtype = connection_object.getdbtype("SELECT 1 AS foo") > > where the query _must_ return a scalar from which the driver > infers the type. Then the type can be used as a key in the > registry. Given the extremely small number of datatypes that each commercial database exposes, this seems to be both more work and less accurate results than simply modeling each concrete dbtype directly. 
All SQL92-compliant types can be fully described with a handful of attributes (bytes, precision, scale, whether each of those is user-specifiable, and if so the maximum allowed value for each, whether a numeric type is signed or unsigned, and finally the CHAR vs VARCHAR distinction). [1] > Obviously the conversion function will be backend- > specific but I suppose the signature could be the same for > all functions. Given the fact that the conversion happens > inside a cursor and than the connection is available from the > cursor object itself, something like: > > py_data = conversion_function(backend_data, cursor_object) > > Then we can at least make a standard for the registry methods. There should be some provision for custom converters (which forces you to stick the converters on each column object instead of in a registry, since there can be several different converters for e.g. datetime.date-to-CHAR). But even if you decide not to go that far, the registry of default converters will need to be keyed by (pytype, backend-specific dbtype). For example, Postgres has a hard time comparing FLOAT4 and FLOAT8 [2], not to mention that the concrete precision of SQL92 REAL and DOUBLE are "implementation defined". It's not entirely hopeless; some base classes for converters can be constructed [3]. Robert Brewer System Architect Amor Ministries fumanchu at amor.org [1] See http://projects.amor.org/geniusql/browser/trunk/geniusql/dbtypes.py for my mostly-finished crack at this, plus any module in http://projects.amor.org/geniusql/browser/trunk/geniusql/providers for concrete DB types. Note I stick default_pytype directly on both abstract and concrete dbtype objects, but an external registry would be just as easy. [2] ...because the implicit conversion isn't always what you want; see http://archives.postgresql.org/pgsql-bugs/2004-02/msg00062.php for an example. 
[3] See http://projects.amor.org/geniusql/browser/trunk/geniusql/adapters.py From mike_mp at zzzcomputing.com Tue Apr 17 19:53:31 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Tue, 17 Apr 2007 13:53:31 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <4624DEFA.30501@research.att.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> Message-ID: <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> On Apr 17, 2007, at 10:51 AM, Art Protin wrote: > Yes, something nice and simple, like a dict using the string name > of the DBMS native > datatype as the index. However, this might not work out after > all. Our database > system has a nearly unbounded set of types. The types have two > components, say a > major and minor, or a main type and subtype. The main type > "STRING" alone > has 65535 subtypes (one for each allowable size). Other main types > may have a few > subtypes or even none. Some of the subtypes make a major > difference in > the conversion function behavior (like those for DATE) and some > make nearly none. > My conversion routines are called based on the main type but need > both the data > value and the subtype as arguments. Do any of the other systems have > such a multi-level type scheme as this? > I'm not following this thread so closely, but SQLAlchemy does have a configurable type system which can represent both the "major" type as you call it, plus any number of arguments for each type (which you'd call the "minor" type), for any given result set column. The "major" part is represented by the particular subclass of TypeEngine used, such as SLDateTime (a date-time type as represented in SQLite), and the "minor" part by the state of that particular TypeEngine instance (such as the length of a string column, or its encoding).
Of course SQLAlchemy is a significant layer on top of DBAPI, so as far as the "registry"-like functionality of what types map to what columns, its achieved via the presence of Table objects which are comprised of collections of Columns each with their own TypeEngine instance. if DBAPI contained its own type-registry like system (which would likely be per-connection, since thats the highest-level object DBAPI provides which is still stateful with regards to a particular database connection), SA could probably modify TypeEngine to move its type-conversion code into this layer, instead of having to piggyback the translation onto result set objects. However i might suggest that this whole thread, "controlling return types", perhaps be expanded to include "controlling *input* types and return types", since to me (as well as to SQLAlchemy) being able to send an arbitrarily-typed python object into a bind parameter is just the mirror image of receiving a result column as an arbitrarily-typed python object. I think it would be unfortunate if only one half of the coin were addressed. From Chris.Clark at ingres.com Tue Apr 17 20:43:30 2007 From: Chris.Clark at ingres.com (Chris Clark) Date: Tue, 17 Apr 2007 11:43:30 -0700 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> Message-ID: <46251552.1030709@ingres.com> Michael Bayer wrote: > ........However i might suggest that this whole thread, "controlling return > types", perhaps be expanded to include "controlling *input* types and > return types"........ The pysqlite register_adapter and register_converter do this (in that order). 
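pysqlite ships with Python 2.5 as the sqlite3 module, so the adapter/converter pair can be shown self-contained. A sketch of a date round-trip using the declared-type mapping (the converter is keyed by the column type name, matched case-insensitively):

```python
import datetime
import sqlite3

# adapter: Python object -> value stored in the database
sqlite3.register_adapter(datetime.date, lambda d: d.isoformat())

# converter: stored bytes -> Python object, keyed by the declared column type
def convert_date(b):
    return datetime.date(*map(int, b.split(b'-')))

sqlite3.register_converter('date', convert_date)

con = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute('CREATE TABLE t (d date)')
cur.execute('INSERT INTO t VALUES (?)', (datetime.date(2007, 4, 17),))
row = cur.execute('SELECT d FROM t').fetchone()
# row[0] comes back as a datetime.date, not a string
```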
I've not used it in anger but I like the simplicity of the API, see http://initd.org/pub/software/pysqlite/doc/usage-guide.html#converting-sqlite-values-to-custom-python-types for an example. pysqlite approach does have implications for the SQL that is then used but that isn't relevant to other databases so the api/approach could be appropriated :-) I'm not familiar with psycopg's approach to be able to compare with pysqlite without research. The big question is when should mapping take place? On the connection level, statement level, or column level. E.g. mapping varchar to a python varchar type that contains length and data:

connection level - all varchars in DBMS get returned as varchar python type
statement level - all varchars in specific .execute() get returned as varchar python type
column level - only specific varchars in specific .execute() get returned as varchar python type, i.e. some varchars could come back as regular python strings

The pysqlite approach is to use strings as type identifiers for the converter, I get the impression Art is not in favor of this approach but the dbms type param is likely to be DBMS specific so whether it is a string, a tuple of major/minor, etc. may not matter. If the driver offers constants for basic types that would be fine. Here is an example following the pysqlite approach, using connection level mapping.
It is usually easy to discuss what is good/bad about an idea if there is an example to critique:

import mydbapi

# Register the python2db adapter
mydbapi.register_adapter(mydbapi.varcharClass, mydbapi.adapt_varchar)
# Register the db2python converter
mydbapi.register_converter(mydbapi.varcharType, mydbapi.convert_varchar)

con = mydbapi.connect(".......", use_type_mapping=True)
cur = con.cursor()
cur.execute("select myvarchar from mytable")
cur.fetch()[0]  ## type would be Python type "mydbapi.varcharClass"

Chris

From fog at initd.org Tue Apr 17 22:08:47 2007
From: fog at initd.org (Federico Di Gregorio)
Date: Tue, 17 Apr 2007 22:08:47 +0200
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <435DF58A933BA74397B42CDEB8145A860B13BFD4@ex9.hostedexchange.local>
References: <435DF58A933BA74397B42CDEB8145A860B13BFD4@ex9.hostedexchange.local>
Message-ID: <1176840527.3735.14.camel@mila>

On Tue, 17/04/2007 at 10.35 -0700, Robert Brewer wrote:
[...]
> This is where Geniusql is headed. I'm not for a moment saying the DBAPI
> should go that far, but there needs to be a clear understanding of
> exactly how far the DBAPI is going to go down this rabbit hole (because
> however far you go, your user base will forever pester you for the next
> level of flexibility ;).

Rewriting the SQL is, imho, not in the scope of the DBAPI. When I talk about casting types from SQL to Python and back I mean exactly that, not some (probably very useful) way to automatically generate SQL to patch the shortcomings of the backend/previous programmer.

> > > Let's suppose that a driver "knows" the type of a DB
> > column, then we can ask it for an abstract "dbtype":
> >
> > dbtype = connection_object.getdbtype("SELECT 1 AS foo")
> >
> > where the query _must_ return a scalar from which the driver
> > infers the type. Then the type can be used as a key in the registry.
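The getdbtype() idea quoted above can be sketched with a toy driver. Connection, getdbtype, and the canned catalog of queries are all invented here for illustration; no real DB-API module defines them:

```python
import decimal

class Connection:
    # A toy "driver" that infers the abstract dbtype of a scalar query
    # from a canned catalog instead of asking a real backend.
    _catalog = {"SELECT 1 AS foo": "INTEGER",
                "SELECT price FROM t": "DECIMAL"}

    def __init__(self):
        self.converters = {}          # dbtype -> Python cast function

    def getdbtype(self, query):
        # The query must return a scalar; the driver reports its dbtype.
        return self._catalog[query]

    def register(self, dbtype, cast):
        # The abstract dbtype is the key in the registry.
        self.converters[dbtype] = cast

    def cast(self, query, raw):
        # Apply the registered converter for the query's dbtype, if any.
        return self.converters.get(self.getdbtype(query), lambda v: v)(raw)

con = Connection()
con.register("DECIMAL", decimal.Decimal)
value = con.cast("SELECT price FROM t", "12.34")
# value is decimal.Decimal('12.34')
```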
> > Given the extremely small number of datatypes that each commercial
> database exposes, this seems to be both more work and less accurate
> results than simply modeling each concrete dbtype directly. All
> SQL92-compliant types can be fully described with a handful of
> attributes (bytes, precision, scale, whether each of those is
> user-specifiable, and if so the maximum allowed value for each, whether
> a numeric type is signed or unsigned, and finally the CHAR vs VARCHAR
> distinction). [1]

Yes, but there should be a way to obtain the "type" from the backend. Especially for backends like PostgreSQL that allow for user defined types of any complexity.

> There should be some provision for custom converters (which forces you
> to stick the converters on each column object instead of in a
> registry,
> since there can be several different converters for e.g.
> datetime.date-to-CHAR).
>
> But even if you decide not to go that far, the registry of default
> converters will need to be keyed by (pytype, backend-specific dbtype).
> For example, Postgres has a hard time comparing FLOAT4 and FLOAT8 [2],
> not to mention that the concrete precision of SQL92 REAL and DOUBLE are
> "implementation defined". It's not entirely hopeless; some base classes
> for converters can be constructed [3].

You're talking about two different things here. I am not interested at all in solving at the Python level problems that inherently live at the backend level, like the FLOAT4/FLOAT8 problem. Imho all that the DBAPI needs is a way to specify type-casting functions (at global, connection and maybe column level) that allow the programmer to obtain exactly the Python type they need, given a backend type.

federico -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux Developer fog at debian.org INIT.D Developer fog at initd.org - What's wrong with your goldfish, orchitis? - Yes, it has only one eye, a hoarse voice, and it eats the other fish.
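Federico's three casting scopes (global, connection, column) might be sketched as a lookup chain, most specific scope first; every name below is hypothetical, not an actual driver API:

```python
# Module-wide default casts, keyed by backend type name.
GLOBAL_CASTS = {"NUMBER": int}

class Cursor:
    # A stand-in cursor: a cast for a result column is looked up
    # per-column first, then per-connection, then globally.
    def __init__(self, connection_casts):
        self.connection_casts = connection_casts
        self.column_casts = {}        # result-column position -> cast

    def cast(self, position, dbtype, raw):
        if position in self.column_casts:
            return self.column_casts[position](raw)
        if dbtype in self.connection_casts:
            return self.connection_casts[dbtype](raw)
        return GLOBAL_CASTS.get(dbtype, lambda v: v)(raw)

cur = Cursor(connection_casts={"VARCHAR": lambda b: b.decode("utf-8")})
cur.column_casts[0] = bytes.upper                 # column-level override

assert cur.cast(0, "VARCHAR", b"abc") == b"ABC"   # column level wins
assert cur.cast(1, "VARCHAR", b"abc") == "abc"    # connection level
assert cur.cast(2, "NUMBER", "42") == 42          # global fallback
```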
From fog at initd.org Tue Apr 17 22:12:18 2007
From: fog at initd.org (Federico Di Gregorio)
Date: Tue, 17 Apr 2007 22:12:18 +0200
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com>
References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com>
Message-ID: <1176840738.3735.18.camel@mila>

On Tue, 17/04/2007 at 13.53 -0400, Michael Bayer wrote:
> However i might suggest that this whole thread, "controlling return
> types", perhaps be expanded to include "controlling *input* types and
> return types", since to me (as well as to SQLAlchemy) being able to
> send an arbitrarily-typed python object into a bind parameter is just
> the mirror image of receiving a result column as an arbitrarily-typed
> python object. I think it would be unfortunate if only one half of
> the coin were addressed.

Sure. ;) The Python->SQL part is perfect for adaptation and, for example, psycopg has a micro-protocols implementation to help with the adaptation of any Python object into a valid ISQLQuote one. Why two different systems? Because translating Python to SQL and SQL to Python are two very different operations, IMHO.

federico -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux Developer fog at debian.org INIT.D Developer fog at initd.org Happiness is a cup of hot chocolate. Always. -- Me

From fumanchu at amor.org Tue Apr 17 22:24:07 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Tue, 17 Apr 2007 13:24:07 -0700
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <4624DEFA.30501@research.att.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A860B1BCE17@ex9.hostedexchange.local>

Art Protin wrote:
> Our database system has a nearly unbounded set of types.
> The types have two components, say a major and minor,
> or a main type and subtype. The main type "STRING" alone
> has 65535 subtypes (one for each allowable size).
> Other main types may have a few subtypes or even no subtypes.
> Some of the subtypes make a major difference in the
> conversion function behavior (like those for DATE)
> and some make nearly none. My conversion routines are
> called based on the main type but need both the data
> value and the subtype as arguments. Do any of the other
> systems have such a multi-level type scheme as this?

Geniusql does, but it's implemented by using classes to represent each "main type" and instances (with varying attributes) for each subtype. Each Column object gets its own DatabaseType instance (and so does each expression when generating SQL). For example, the firebird.py provider supplies a concrete VARCHAR class (a subclass of the abstract dbtypes.SQL92VARCHAR class), and instances of that class can have a "bytes" attribute anywhere from 1 to 32767 (the default is 63):

class VARCHAR(dbtypes.SQL92VARCHAR):
    synonyms = ['CHARACTER VARYING', 'CHAR VARYING', 'VARYING']
    max_bytes = 32767
    _bytes = 63

Given sufficient parameterization, there's usually no need to statically model all 65535 STRING types in systems like these.
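The class-per-main-type, instance-per-subtype idea can be sketched without Geniusql itself; the class and attribute names below loosely follow the firebird VARCHAR example, but this is an illustration, not Geniusql's actual code:

```python
class SQL92VARCHAR:
    # The "main type" is a class; a "subtype" is just an instance
    # with a particular byte length.
    max_bytes = 65535

    def __init__(self, bytes=63):
        if not 1 <= bytes <= self.max_bytes:
            raise ValueError("bytes out of range for this backend")
        self.bytes = bytes

    def ddl(self):
        return "VARCHAR(%d)" % self.bytes

class FirebirdVARCHAR(SQL92VARCHAR):
    max_bytes = 32767      # backend-specific limit overrides the default

# Art's 65535 "STRING subtypes" collapse into one class plus a parameter:
name_col = FirebirdVARCHAR(bytes=100)
assert name_col.ddl() == "VARCHAR(100)"
```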
Robert Brewer System Architect Amor Ministries fumanchu at amor.org From mike_mp at zzzcomputing.com Wed Apr 18 02:50:32 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Tue, 17 Apr 2007 20:50:32 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1176840738.3735.18.camel@mila> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> Message-ID: <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> On Apr 17, 2007, at 4:12 PM, Federico Di Gregorio wrote: > > The Python->SQL part is perfect for adaptation and, for example, > psycopg > has a micro-protocols implementation to help with the adaptation of > any > Python object into a valid ISQLQuote one. Why two different systems? > Because transating Python to SQL and SQL to Python are two very > different operations, IMHO. they are different operations, but their semantics are mirror images of one another. it follows very closely if you are binding a datetime object to a bind param which is in a SQL expression being compared against a particular column, that the rules which convert the datetime to a SQL value are the same rules in reverse that would convert the selection of that column in the result. this is why i think the "type" of a column, both its bind param adaptation as well as its result row adaptation, can be expressed by the same rule object in most cases. and I know it works since this is how sqlalchemy has been doing it for quite a while now. From anthony.tuininga at gmail.com Wed Apr 18 19:35:36 2007 From: anthony.tuininga at gmail.com (Anthony Tuininga) Date: Wed, 18 Apr 2007 11:35:36 -0600 Subject: [DB-SIG] cx_Oracle 4.3.1 Message-ID: <703ae56b0704181035j7d165c45pa1bd591af9ba2a20@mail.gmail.com> What is cx_Oracle? 
cx_Oracle is a Python extension module that allows access to Oracle and conforms to the Python database API 2.0 specifications with a few exceptions. Where do I get it? http://starship.python.net/crew/atuining What's new? 1) Ensure that if the client buffer size exceeds 4000 bytes that the server buffer size does not as strings may only contain 4000 bytes; this allows handling of multibyte character sets on the server as well as the client. 2) Added support for using buffer objects to populate binary data and made the Binary() constructor the buffer type as requested by Ken Mason. 3) Fix potential crash when using full optimization with some compilers. Thanks to Aris Motas for noticing this and providing the initial patch and to Amaury Forgeot d'Arc for providing an even simpler solution. 4) Pass the correct charset form in to the write call in order to support writing to national character set LOB values properly. Thanks to Ian Kelly for noticing this discrepancy. Anthony Tuininga From anthony.tuininga at gmail.com Fri Apr 20 18:31:52 2007 From: anthony.tuininga at gmail.com (Anthony Tuininga) Date: Fri, 20 Apr 2007 10:31:52 -0600 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> Message-ID: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> I've been following this thread and it would appear that no real consensus has been reached as of yet. I've looked at the api used by sqlite and knowing its storage and type definition system it makes good sense. I am considering adding the following to cx_Oracle, following some of the examples given so far with modifications needed for Oracle, and I'd appreciate any input you might have. 
cursor.setdefaulttype(databaseType, type)
connection.setdefaulttype(databaseType, type)

What this method would do is specify that whenever an item that is represented on the database by the given database type is to be retrieved, the specified type should be used instead of the default. This would allow for a global or local specification that numbers are to be returned as strings or decimal.Decimal objects, or that strings are to be returned as unicode objects, for example.

cursor.settype(position, type)

This would allow specification of the type to use for a particular column being fetched.

registeradapter(type, databaseType, fromPythonMethod, toPythonMethod)

This would specify that whenever an object of the given type is bound to a cursor, the fromPythonMethod method would be invoked with the value and would expect a return value that can be directly bound to the databaseType. The toPythonMethod method would be invoked when columns are retrieved and would accept the databaseType value and expect back a value of the given type. Some help on the names would be appreciated as well -- it's the worst part of programming. :-) I've tried to use the DB API style of naming -- all lower case without any underscores even though it isn't my personal favorite. Any comments?

On 4/17/07, Michael Bayer wrote:
> > On Apr 17, 2007, at 4:12 PM, Federico Di Gregorio wrote:
> >
> > > The Python->SQL part is perfect for adaptation and, for example,
> > psycopg
> > has a micro-protocols implementation to help with the adaptation of
> > any
> > Python object into a valid ISQLQuote one. Why two different systems?
> > Because transating Python to SQL and SQL to Python are two very
> > different operations, IMHO.
>
> they are different operations, but their semantics are mirror images
> of one another.
it follows very closely if you are binding a > datetime object to a bind param which is in a SQL expression being > compared against a particular column, that the rules which convert > the datetime to a SQL value are the same rules in reverse that would > convert the selection of that column in the result. this is why i > think the "type" of a column, both its bind param adaptation as well > as its result row adaptation, can be expressed by the same rule > object in most cases. and I know it works since this is how > sqlalchemy has been doing it for quite a while now. > > > _______________________________________________ > DB-SIG maillist - DB-SIG at python.org > http://mail.python.org/mailman/listinfo/db-sig > From Chris.Clark at ingres.com Fri Apr 20 19:28:03 2007 From: Chris.Clark at ingres.com (Chris Clark) Date: Fri, 20 Apr 2007 10:28:03 -0700 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> Message-ID: <4628F823.2020009@ingres.com> Anthony Tuininga wrote: > I've been following this thread and it would appear that no real > consensus has been reached as of yet. I've looked at the api used by > sqlite and knowing its storage and type definition system it makes > good sense. I am considering adding the following to cx_Oracle, > following some of the examples given so far with modifications needed > for Oracle, and I'd appreciate any input you might have. 
> > cursor.setdefaulttype(databaseType, type) > connection.setdefaulttype(databaseType, type) > > What this method would do is specify that whenever an item that is > represented on the database by the given database type is to be > retrieved, the specified type should be used instead of the default. > This would allow for a global or local specification that numbers are > to be returned as strings or decimal.Decimal objects or that strings > are to be returned as unicode objects, for example. > > cursor.settype(position, type) > > This would allow specification of the type to use for a particular > column being fetched. > > registeradapter(type, databaseType, fromPythonMethod, toPythonMethod) > > This would specify that whenver an object of the given type is bound > to a cursor, that the fromPythonMethod method would be invoked with > the value and would expect a return value that can be directly bound > to the databaseType. The toPythonMethod method would be invoked when > columns are retrieved and would accept the databaseType value and > expect back a value of the given type. > > Some help on the names would be appreciated as well -- its the worst > part of programming. :-) I've tried to use the DB API style of naming > -- all lower case without any underscores even though it isn't my > personal favorite. > > Any comments? My initial reaction is that I like it! I like registeradapter() and that it is easy to set at the connection or cursor level. I'm guessing that the cursor.settype() call is for a result set only and that the adapter would be reset on a new cursor.execute() call? I'm also wondering if setdefaulttype() should have param 2 as an optional param (i.e. if there is only one registered adapter the driver can work out the 2nd param). It is probably worth defining a conflict resolution approach, even if the approach is in documentation and says, "the behavior of conflicting types in adapters is undefined"! E.g. 
sending to db conflict (note fairly artificial):

registeradapter(str, DECIMAL, pyStr2dbDec, dbDec2pyStr)
registeradapter(str, SPATIAL, pyStr2dbSpa, dbSpa2pyStr)
cursor.setdefaulttype(DECIMAL, str)
cursor.setdefaulttype(SPATIAL, str)
cursor.execute('select x from mytable where mytable.col1 = ?', ('12.34',))
## is the input supposed to be decimal or a spatial type?

The alternatives are:
1. for the database driver to do some sort of DESCRIBE INPUT and work out which adapter to use
2. to raise an error when registeradapter() is called with conflicting types

Any comments? Should this be driver dependent? As for names, I've a few suggestions but I don't feel strongly about the names:

setdefaulttype --> coercetype
settype --> coercecolumn

registeradapter() is clear; I wondered about setadapter() instead, but registeradapter() is probably the most clear.

Chris

From anthony.tuininga at gmail.com Fri Apr 20 21:57:03 2007
From: anthony.tuininga at gmail.com (Anthony Tuininga)
Date: Fri, 20 Apr 2007 13:57:03 -0600
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <4628F823.2020009@ingres.com>
References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <4628F823.2020009@ingres.com>
Message-ID: <703ae56b0704201257x23a0eecfta09365a241078f5b@mail.gmail.com>

On 4/20/07, Chris Clark wrote:
> > Anthony Tuininga wrote:
> I've been following this thread and it would appear that no real
> consensus has been reached as of yet. I've looked at the api used by
> sqlite and knowing its storage and type definition system it makes
> good sense.
I am considering adding the following to cx_Oracle, > following some of the examples given so far with modifications needed > for Oracle, and I'd appreciate any input you might have. > > cursor.setdefaulttype(databaseType, type) > connection.setdefaulttype(databaseType, type) > > What this method would do is specify that whenever an item that is > represented on the database by the given database type is to be > retrieved, the specified type should be used instead of the default. > This would allow for a global or local specification that numbers are > to be returned as strings or decimal.Decimal objects or that strings > are to be returned as unicode objects, for example. > > cursor.settype(position, type) > > This would allow specification of the type to use for a particular > column being fetched. > > registeradapter(type, databaseType, fromPythonMethod, toPythonMethod) > > This would specify that whenver an object of the given type is bound > to a cursor, that the fromPythonMethod method would be invoked with > the value and would expect a return value that can be directly bound > to the databaseType. The toPythonMethod method would be invoked when > columns are retrieved and would accept the databaseType value and > expect back a value of the given type. > > Some help on the names would be appreciated as well -- its the worst > part of programming. :-) I've tried to use the DB API style of naming > -- all lower case without any underscores even though it isn't my > personal favorite. > > Any comments? > > My initial reaction is that I like it! I like registeradapter() and that it > is easy to set at the connection or cursor level. Ok, that's good. :-) > I'm guessing that the cursor.settype() call is for a result set only and > that the adapter would be reset on a new cursor.execute() call? Mostly correct. 
Yes, it is for result sets only, although the possibility of doing the same for output bind variables should be considered as well -- but that would require a different method signature or a change to setinputsizes(). The setting would remain so long as the same statement was executed again. When a new statement is prepared, this setting would revert to the default values (defined by the setdefaulttype() calls).

> I'm also wondering if setdefaulttype() should have param 2 as an optional
> param (i.e. if there is only one registered adapter the driver can work out
> the 2nd param).

Well, if there was only one type, there wouldn't be much point in calling setdefaulttype() would there? The default would of course be the only one available and saying so wouldn't make it any more so... :-) Unless I'm missing something and you meant something else?

> It is probably worth defining a conflict resolution approach, even if the
> approach is in documentation and says, "the behavior of conflicting types in
> adapters is undefined"! E.g. sending to db conflict (note fairly
> artificial):
>
> registeradapter(str, DECIMAL, pyStr2dbDec, dbDec2pyStr)
> registeradapter(str, SPATIAL, pyStr2dbSpa, dbSpa2pyStr)
> cursor.setdefaulttype(DECIMAL, str)
> cursor.setdefaulttype(SPATIAL, str)
> cursor.execute('select x from mytable where mytable.col1 = ?', ('12.34',))
> ## is the input supposed to be decimal or a spatial type?
>
> The alternatives are:
> 1. for the database driver to do some sort of DESCRIBE INPUT and work out which
> adapter to use
> 2. to raise an error when registeradapter() is called with conflicting types
> Any comments? Should this be driver dependent?

Hmm, I was assuming that the adapters would be indexed by type and that setting one would override the previous one. In other words, to do what you were hoping to do would require a different Python type for decimal and spatial.
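The override behavior described here is just dictionary semantics; a sketch with invented names (registeradapter, SpatialStr, and the dummy converter lambdas are all illustrative):

```python
adapters = {}       # Python type -> (database type, to-db converter)

def registeradapter(pytype, dbtype, to_db):
    # Indexed by Python type: a second registration for the same
    # Python type silently replaces the first.
    adapters[pytype] = (dbtype, to_db)

registeradapter(str, "DECIMAL", lambda s: "DEC:" + s)
registeradapter(str, "SPATIAL", lambda s: "SPA:" + s)   # overrides DECIMAL

dbtype, to_db = adapters[str]
assert dbtype == "SPATIAL" and len(adapters) == 1

# Distinct Python types restore the distinction:
class SpatialStr(str):
    pass

registeradapter(SpatialStr, "SPATIAL", lambda s: "SPA:" + s)
registeradapter(str, "DECIMAL", lambda s: "DEC:" + s)
assert adapters[str][0] == "DECIMAL"
assert adapters[SpatialStr][0] == "SPATIAL"
```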
Having the database figure out which type to use would be difficult at best and impossible in some situations. In addition, the situation you give is a bind variable which is normally controlled by setinputsizes(). I wasn't intending to change the signature of this method but perhaps a new method should be added? cursor.setbindtype(nameOrPosition, databaseType, [type]) which would allow you to specify the database type and optionally the Python type (for output bind variables). At that point the same rules as described above would apply. This is overlap with setinputsizes() though so I'm not sure whether or not this is a good idea. > As for names, I've a few suggestions but I don't feel strongly about the > names: > > > setdefaulttype --> coercetype > settype --> coercecolumn Hmm, I think I prefer the setXXX() type methods as they are similar in content to the setinputsizes() and setoutputsizes() methods. But I can see your point about coerceXXX() as that is in fact what is happening. > registeradapter() is clear, I wondered about setadapter() instead, but > registeradapter() is probably the most clear. I think registeradapter() is clearer, too. 
> Chris > > From fumanchu at amor.org Sat Apr 21 02:28:04 2007 From: fumanchu at amor.org (Robert Brewer) Date: Fri, 20 Apr 2007 17:28:04 -0700 Subject: [DB-SIG] Controlling return types for DB APIs References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila><4624DEFA.30501@research.att.com><10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com><1176840738.3735.18.camel@mila><6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> Message-ID: <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> Anthony Tuininga wrote: > cursor.setdefaulttype(databaseType, type) > connection.setdefaulttype(databaseType, type) > > What this method would do is specify that whenever > an item that is represented on the database by the > given database type is to be retrieved, the specified > [Python] type should be used instead of the default. > This would allow for a global or local specification > that numbers are to be returned as strings or > decimal.Decimal objects or that strings are to be > returned as unicode objects, for example. > > cursor.settype(position, type) > > This would allow specification of the type to use > for a particular column being fetched. As soon as you provide cursor.settype(dbtype, pytype), you should expect someone to ask for "pytype = cursor.gettype(dbtype)", and once you've written that, you'll realize they're both spelled better as "cursor.types[dbtype] = pytype" (for set) and "pytype = cursor.types[dbtype]" (for get). Same thing goes for adapters: once you allow people to set them, you should expect people will want to inspect them. So a cursor.adapters object (copied from a similar connection.adapters object) should be used. Namespaces are one honking great idea. The container classes don't *have* to use slicing (you could go all Java-ish and use get() and set()), but slicing is the most natural choice in this case. 
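Robert's mapping-style spelling might be sketched as follows; Connection and Cursor are invented stand-ins, not any real driver:

```python
import decimal

class Cursor:
    def __init__(self, connection):
        # Each cursor starts with a shallow copy of the connection's
        # mapping, so per-cursor overrides don't leak back.
        self.types = dict(connection.types)

class Connection:
    def __init__(self):
        self.types = {}               # dbtype -> Python type

    def cursor(self):
        return Cursor(self)

con = Connection()
con.types["NUMBER"] = decimal.Decimal   # "set" is plain item assignment
cur = con.cursor()
assert cur.types["NUMBER"] is decimal.Decimal   # "get" is plain lookup
cur.types["NUMBER"] = float                     # local override...
assert con.types["NUMBER"] is decimal.Decimal   # ...leaves the connection alone
```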
You also should be very explicit about the direction of each operation when building type- or language-translation layers. That is, include the direction in the names of every method and object, because at some point people will want to do the reverse. A Python method named "coerce" is not as good as a method named "coerce_in" or "coerce_from_database"; the namespaced spelling, "adapter_in.coerce", is even better.

> registeradapter(type, databaseType, fromPythonMethod,
> toPythonMethod)
>
> This would specify that whenver an object of the given
> [Python] type is bound to a cursor, that the
> fromPythonMethod method would be invoked with the
> value and would expect a return value that can be
> directly bound to the databaseType. The toPythonMethod
> method would be invoked when columns are retrieved and
> would accept the databaseType value and expect back a
> value of the given type.

That sounds good if by "databaseType" you mean "class VARCHAR"; that is, a Python type which models a database type. Because there's no such thing as a "database value" you can pass to a toPythonMethod; it *must* have already been converted into some Python object of a Python type (unless you're writing your adapters in C). The best you can do is have a Python type (designed/selected for minimum information loss) to which the incoming value gets coerced before passing it to your toPythonMethod for further adaptation or casting.

Robert Brewer System Architect Amor Ministries fumanchu at amor.org

From carsten at uniqsys.com Sat Apr 21 03:34:02 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Fri, 20 Apr 2007 21:34:02 -0400
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
Message-ID: <1177119242.3305.52.camel@localhost.localdomain>

Now that it's the weekend, I'd like to chime in. I had been thinking for a while now about type conversions from the Informix angle. Since Informix allows user-defined types, I'd like to implement a type conversion scheme that is flexible enough to allow specifying type conversions for any data type including UDTs. Since I didn't consider myself creative enough to come up with a good design by myself, I looked over the fence at other database implementations, including PostgreSQL and sqlite, and to my utter surprise, the one scheme that seems most Pythonic to me is the way JDBC handles user-defined types. The idea is that the programmer sets up type mappings. In Python, a type mapping would just be a dictionary whose key is the database type (more on that later) and whose value is a class object that derives from an abstract SQLData class that would be defined by the API module. This mapping would be stored on the connection as a default mapping, and the connection's cursors will inherit shallow copies of this mapping. A natural choice for the key in this mapping is the type indicator that the API implementation already returns in cursor.description.
The only possible hangup would be if a DB-API implementation uses mutable objects for these, but in my opinion that would be insane. All implementations I'm aware of either use strings or integers for the SQL type indicator. When a value is returned from the database, the computer checks if its type is mapped. If yes, the constructor of the corresponding SQLData-derived class is called with the value's "canonical" Python representation as the only argument. The canonical representation is the value that the API would return if no type map were in effect, which would be the best, lossless Python equivalent of the data type in question. The SQLData-derived class may, of course, return an object of a different type of object from its __new__ method (which would be useful to map character data to unicode objects, for example), but in order to allow seamless round-trips of data from the database to the application and back to the database, the returned value should be directly usable as an input parameter for the type of column that it came from. For handling type conversions to the database, SQLData instances would implement a ToDB method that would perform the reverse operation of the constructor, i.e. to render the canonical Python representation of the instance's contents, which can then be bound to input parameters in the canonical way. This proposal does not address special per-column mappings, but I don't think it needs to. In my experience it's rare that I'd want two columns of the same type from the same query to be mapped to two different Python types. For handling exceptional circumstances, say e.g. you inherit a messed up database that stores timestamps as nanoseconds since the big bang that you automatically want to convert to a datetime object, I suggest standardizing the concept of row factory functions. In a nutshell, cursor objects would have an optional callable rowfactory attribute. 
If a rowfactory is specified, it will translate between what a fetch would normally return and what it should return instead. Let me know what you think. -Carsten From carsten at uniqsys.com Sat Apr 21 09:09:47 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 21 Apr 2007 03:09:47 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1177119242.3305.52.camel@localhost.localdomain> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> Message-ID: <1177139387.3208.6.camel@localhost.localdomain> Hi All, I have taken the time to write out my type mapping proposal in a slightly more structured form, with some hopefully enlightening examples of how this proposal might be useful. Please see http://www.uniqsys.com/~carsten/typemap.html Any comments are welcome, and I'll do my best to incorporate constructive criticism into future revisions of this proposal. 
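Carsten's proposal might be sketched roughly as follows; SQLData, the typemap keyed by the cursor.description type indicator, and the ToDB round-trip method come from his description, while the convert() helper and the example class are invented:

```python
import datetime

class SQLData:
    # Abstract base the API module would define.
    def ToDB(self):
        raise NotImplementedError

class BigBangStamp(SQLData):
    # Hypothetical UDT: the canonical representation the driver hands
    # to the constructor is an integer count of seconds past an epoch.
    EPOCH = datetime.datetime(2000, 1, 1)

    def __init__(self, seconds):
        self.when = self.EPOCH + datetime.timedelta(seconds=seconds)

    def ToDB(self):
        # Render back to the canonical form for binding as a parameter.
        return int((self.when - self.EPOCH).total_seconds())

# Key = the type indicator the driver reports in cursor.description.
typemap = {"INTEGER": BigBangStamp}

def convert(type_code, raw):
    cls = typemap.get(type_code)
    return cls(raw) if cls else raw

stamp = convert("INTEGER", 86400)
assert stamp.when == datetime.datetime(2000, 1, 2)
assert stamp.ToDB() == 86400       # lossless round trip
```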
-Carsten From mike_mp at zzzcomputing.com Sat Apr 21 16:52:31 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Sat, 21 Apr 2007 10:52:31 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1177139387.3208.6.camel@localhost.localdomain> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> Message-ID: On Apr 21, 2007, at 3:09 AM, Carsten Haese wrote: > Hi All, > > I have taken the time to write out my type mapping proposal in a > slightly more structured form, with some hopefully enlightening > examples > of how this proposal might be useful. > > Please see http://www.uniqsys.com/~carsten/typemap.html > > Any comments are welcome, and I'll do my best to incorporate > constructive criticism into future revisions of this proposal. > heres some thoughts: - using a class-level approach, i.e. SQLData, makes it inconvenient to establish custom types that are independent of a particular DBAPI, since the SQLData class itself like everything else in DBAPI only exists within implementations. its impossible to define the class until you've imported some DBAPI. SQLData's origins in JDBC dont have this issue since SQLData is part of the abstract JDBC api and classes can be built against it independently of any database driver being available. - because SQLData's state is the data itself, SQLData is not really a "type" at all, its a value object which includes its own database translation function. 
That greatly limits what kinds of types SQLData can be realistically used for, and in fact it can only be used for datatypes that are explicitly aware that they are being stored in a database - and only a specific DBAPI too. For example, its impossible to use SQLData to directly represent Decimal instances or datetime instances; neither of them subclasses SQLData. If the answer is that we'd just use typemaps for those, then what would we use SQLData for ? I can use a typemap for my SpatialData objects just as easily, without my SpatialData object being welded to a specific persistence scheme and specific DBAPI. Also because SQLData is not stateful with regards to its type, its not possible for a single SQLData class to represent variants of a particular type, such as strings that should be truncated to length 50 versus strings that are truncated to length 100; you'd have to use more subclassing. - per-column typemaps: here is a common use case. I am receiving a row which contains two BLOB columns. one BLOB is image data representing a JPEG image, one BLOB is a pickled instance of a Python class. I would like to register type converters so that the second column is run through the pickle.loads() function but not the first. If we are registering various type-handling callables at the cursor level, it should be easy enough to add an optional integer parameter which will bind that type converter to only a specific column position in a result set. the use case is more obvious in the bind parameter direction.
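The two-BLOB use case above can be sketched concretely. The register_converter helper and the module-level converters dict are hypothetical stand-ins for the cursor-level registration being described, not an existing API:

```python
# Hypothetical sketch: two BLOB columns, where only the second should be
# run through pickle.loads().  register_converter/convert_row stand in
# for the proposed cursor-level registration with a column position.
import pickle

converters = {}  # column position -> converter callable

def register_converter(converter, column):
    converters[column] = converter

def convert_row(raw_row):
    # apply a converter only where one was registered for that position
    return tuple(converters.get(i, lambda v: v)(v)
                 for i, v in enumerate(raw_row))

register_converter(pickle.loads, 1)  # column 0 (the JPEG) stays raw

jpeg_bytes = b"\xff\xd8"  # fake JPEG header bytes, left untouched
row = convert_row((jpeg_bytes, pickle.dumps({"width": 640})))
print(row)  # (b'\xff\xd8', {'width': 640})
```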
From carsten at uniqsys.com Sat Apr 21 18:18:01 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 21 Apr 2007 12:18:01 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> Message-ID: <1177172281.3191.79.camel@localhost.localdomain> On Sat, 2007-04-21 at 10:52 -0400, Michael Bayer wrote: > On Apr 21, 2007, at 3:09 AM, Carsten Haese wrote: > > > Hi All, > > > > I have taken the time to write out my type mapping proposal in a > > slightly more structured form, with some hopefully enlightening > > examples > > of how this proposal might be useful. > > > > Please see http://www.uniqsys.com/~carsten/typemap.html > > > > Any comments are welcome, and I'll do my best to incorporate > > constructive criticism into future revisions of this proposal. > > > > heres some thoughts: Thanks for taking the time to read the proposal. You are making some good points. If you don't want to read all my responses, skip to the summary at the bottom. > - using a class-level approach, i.e. SQLData, makes it inconvenient > to establish custom types that are independent of a particular DBAPI, > since the SQLData class itself like everything else in DBAPI only > exists within implementations. its impossible to define the class > until you've imported some DBAPI. SQLData's origins in JDBC dont > have this issue since SQLData is part of the abstract JDBC api and > classes can be built against it independently of any database driver > being available. It's not impossible. 
You could always mix the somedb.SQLData base class into a generic Python object after importing somedb. Also, since the point is for an application to define type mappings for a particular database, I don't see the limitation in making the SQLData class specific to the API module that the application will usually have imported already anyway. In my opinion, making the SQLData class specific to the API module is necessary since the ToDB method for translating a particular object to the database may differ from database to database. > - because SQLData's state is the data itself, SQLData is not really a > "type" at all, its a value object which includes its own database > translation function. That greatly limits what kinds of types > SQLData can be realistically used for, and in fact it can only be > used for datatypes that are explicitly aware that they are being > stored in a database - and only a specific DBAPI too. Isn't that the point of defining a bidirectional type mapping from/to the database? True, but on one hand, datetime instances and Decimal instances should be handled by the API's canonical mapping already, and on the other hand the somedb API could always choose to return datetimes as objects that derive from both datetime and somedb.SQLData. > If the answer is that we'd just use typemaps > for those, then what would we use SQLData for ? I can use a typemap > for my SpatialData objects just as easily, without my SpatialData > object being welded to a specific persistence scheme and specific DBAPI. Of course you could, but "welding" the object to a specific DB-API is what allows the object to be passed transparently as an input parameter into queries against that specific database.
A corollary of the principle of least surprise is that it should always be possible to take the result of a select query and insert that object into the column that it was read from. Inheriting from SQLData is what allows this seamless select-insert round-trip. Having said all that, I'm not married to the idea of requiring the application-side objects to derive from a particular SQLData class. For the purpose of input binding, it would be enough, and more in line with the idea of duck-typing, if the object provided an agreed-upon method, e.g. "ToDB", that the DB-API can call to translate between application type and canonical database type. Essentially, the proposed input translation could change to

    if hasattr(in_param, "ToDB"):
        in_param = in_param.ToDB()

or something like that. It may be beneficial to allow this call to pass more parameters in order to tell the object something about the context in which the conversion is occurring, including but not limited to the name of the API module, the active connection/cursor, and, if available, the descriptor of the database-side column type and name that the object is destined for. > Also because SQLData is not stateful with regards to its type, its > not possible for a single SQLData class to represent variants of a > particular type, such as strings that should be truncated to length > 50 versus strings that are truncated to length 100; you'd have to use > more subclassing. As proposed, SQLData *is* stateful with regards to the type. By default, it's not stateful with regards to subtype, length, and precision, but this can, and should, be added. If the constructor is given the complete cursor.description entry that goes along with the value, it has everything it needs to remember this information. > - per-column typemaps: here is a common use case. I am receiving a > row which contains two BLOB columns. one BLOB is image data > representing a JPEG image, one BLOB is a pickled instance of a Python > class.
I would like to register type converters so that the second > column is run through the pickle.loads() function but not the > first. If we are registering various type-handling callables at the > cursor level, it should be easy enough to add an optional integer > parameter which will bind that type converter to only a specific > column position in a result set. the use case is more obvious in the > bind parameter direction. Yes, I already suggested this (passing the column number to the outbound adapter) as a possible extension. However, the use case is convincing enough that we should probably allow for a more convenient per-column mapping that allows dispatching the conversion to a different adapter callable altogether, rather than having to define one adapter that returns one thing or another depending on which column it's converting. To handle this, the cursor could grow a coltypemap attribute, which is a mapping of typemaps, keyed on the column number or, maybe more conveniently, column name. In summary, I am open to making the following revisions: * The SQLData class would become optional or be eliminated. Inbound type conversions between Python objects and the database will be performed by a well-defined ToDB method that the object may implement regardless of its inheritance tree. If an inbound Python object doesn't define a ToDB method, it'll be mapped by the canonical mapping for the particular database. * The outbound conversion call will receive additional parameters, such as the cursor.description tuple, that will allow the adapter to make the resulting object stateful with respect to all of its database type properties. * Add an optional coltypemap attribute to the cursor for defining a column-specific typemap. Unless I'm missing something, these revisions should address all the points you have brought up.
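The first summary point, duck-typed input binding, can be sketched in a few lines. Point and bind_param are invented names used only to illustrate the ToDB protocol; no particular DB-API module is assumed:

```python
# Minimal sketch of duck-typed input binding: any object with a ToDB
# method is converted, everything else is left to the canonical mapping.
# Point and bind_param are illustrative, not part of any real driver.

class Point:
    """Application type that knows how to render itself for the DB."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def ToDB(self):
        return "%g %g" % (self.x, self.y)  # canonical string form

def bind_param(in_param):
    if hasattr(in_param, "ToDB"):
        in_param = in_param.ToDB()
    return in_param

print(bind_param(Point(1, 2)))  # 1 2
print(bind_param(42))           # 42, passed through untouched
```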
-Carsten From mike_mp at zzzcomputing.com Sat Apr 21 19:33:04 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Sat, 21 Apr 2007 13:33:04 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1177172281.3191.79.camel@localhost.localdomain> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> Message-ID: <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> On Apr 21, 2007, at 12:18 PM, Carsten Haese wrote: > > Thanks for taking the time to read the proposal. You are making some > good points. If you don't want to read all my responses, skip to the > summary at the bottom. > of course ill read fully ! :) > > In my opinion, making the SQLData class specific to the API module is > necessary since the ToDB method for translating a particular object to > the database may differ from database to database. thats true, but there are also cases where it doesnt. the "PickleType" example is one where all we want to do is shuttle a binary stream to/from the pickle API. I also have an interest in creating code that is database portable....the connection objects dealt with are produced by a factory, where the actual DBAPI module is private to one particular implementation of that factory. > >> - because SQLData's state is the data itself, SQLData is not really a >> "type" at all, its a value object which includes its own database >> translation function. 
That greatly limits what kinds of types >> SQLData can be realistically used for, and in fact it can only be >> used for datatypes that are explicitly aware that they are being >> stored in a database - and only a specific DBAPI too. > > Isn't that the point of defining a bidirectional type mapping from/to > the database? the point of bidirectional type mapping is to define database persistence conversions for a particular application data type. but decoupling the application data type from a particular persistence strategy allows the persistence mappings to vary independently of the data type and the rest of the application on which it depends. > >> If the answer is that we'd just use typemaps >> for those, then what would we use SQLData for ? I can use a typemap >> for my SpatialData objects just as easily, without my SpatialData >> object being welded to a specific persistence scheme and specific >> DBAPI. > > Of course you could, but "welding" the object to a specific DB-API is > what allows the object to be passed transparently as an input > parameter > into queries against that specific database. > > A corollary of the principle of least surprise is that it should > always > be possible to take the result of a select query and insert that > object > into the column that it was read from. Inheriting from SQLData is what > allows this seamless select-insert round-trip. Well, in this case, SpatialData is related to its persistence implementation through subclassing. but its just as easy for SpatialData to be related to its persistence implementation via association without the subclassing requirement. SQLAlchemy provides full round-trip capability of any type you want and uses the same mechanism for all types, including at the column level, without any different treatment of "Python" types and user-defined types.
On both sides of the trip, all thats required is a dictionary that maps TypeEngine subclasses (which purely define database translation strategies) to either bind param names/positions and/or result column names/positions. mapping to DBAPI/python types is just one more way of doing that (maybe I should look into adding that dimension to SA's implementation...)

> Essentially, the proposed input translation could change to
>
>     if hasattr(in_param, "ToDB"):
>         in_param = in_param.ToDB()

OK, duck typing is much better and more analogous to JDBC's usage of an interface. this solves the module-importing issue, but not necessarily the "different db's might require different ToDB() implementations" problem - it still binds my application-level value objects to an assumption about their storage...and if my application suddenly had to support two different databases, or even to persist the same collections of objects in both of those DBs (there are definitely apps that do this), now my program design has to create copies of values to handle the discrepancy. the same issue exists for an application value that is stored in multiple places within the same database, but in different ways; such as a Date type that is stored both in some legacy table with a particular string-format style of storage and some newer table with a decimal-based storage format (or a different string format). a behind-the-scenes registry of converters mapped to my application's types solves the multiple-databases problem, and bind/column-mapped converters solve the multiple-tables problem. the non-class-bound approach, using registered converters, looks like:

    converter = cursor.type_mappings.get(type(in_param), None)
    if converter is not None:
        in_param = converter.ToDB(in_param)

that removes all interface responsibilities from in_param's class. However, I can see the value in the presence of ToDB() (and FromDB() classmethods perhaps) being useful from strictly a convenience point of view.
that is, in the common use case that the persistence of a particular kind of object has no complex requirements. but im not sure if DBAPI itself should present both a generalized method as well as a "convenience/80% case" method (of which ToDB() is the latter). If I wanted a SQLData-like class in my own application, I could easily enough create a metaclass approach that automatically registers the object's type-conversion methods using the generic typing system. > >> - per-column typemaps: here is a common use case. I am receiving a >> row which contains two BLOB columns. one BLOB is image data >> representing a JPEG image, one BLOB is a pickled instance of a Python >> class. I would like to register type converters so that the second >> column is run through the pickle.loads() function but not the >> first. If we are registering various type-handling callables at the >> cursor level, it should be easy enough to add an optional integer >> parameter which will bind that type converter to only a specific >> column position in a result set. the use case is more obvious in the >> bind parameter direction. > > Yes, I already suggested this (passing the column number to the > outbound > adapter) as a possible extension. However, the use case is convincing > enough that we should probably allow for a more convenient per-column > mapping that allows dispatching the conversion to a different adapter > callable altogether, rather than having to define one adapter that > returns one thing or another depending on which column it's converting. > > To handle this, the cursor could grow a coltypemap attribute, which > is a > mapping of typemaps, keyed on the column number or, maybe more > conveniently, column name. probably both. > > In summary, I am open to making the following revisions: > * The SQLData class would become optional or be eliminated.
Inbound > type > conversions between Python objects and the database will be > performed by > a well-defined ToDB method that the object may implement regardless of > its inheritance tree. If an inbound Python object doesn't define a > ToDB > method, it'll be mapped by the canonical mapping for the particular > database. yeah thats more or less what i was saying above. > * The outbound conversion call will receive additional parameters, > such > as the cursor.description tuple, that will allow the adapter to > make the > resulting object stateful with respect to all of its database type > properties. its possible that cursor.description doesnt have all the information we need; such as, a string column that represents dates, and we need to decide what string format is represented in the column. > * Add an optional coltypemap attribute to the cursor for defining a > column-specific typemap. yeah, just having various maps of typing information to me seems to represent the one method that is of general use for all cases. 
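The cursor.description gap mentioned above, a plain string column that actually holds dates in a known format, can be illustrated briefly. The legacy_date converter and its "%Y%m%d" format are assumptions for the sake of the example, since only the programmer knows the real format:

```python
# Sketch of the cursor.description gap: the description tuple says
# "string", but only the programmer knows the column holds dates in a
# particular format.  A column-specific converter carries that knowledge.
import datetime

def legacy_date(text):
    # format knowledge lives in the converter, not in cursor.description
    return datetime.datetime.strptime(text, "%Y%m%d").date()

print(legacy_date("20070421"))  # 2007-04-21
```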
From carsten at uniqsys.com Sat Apr 21 22:10:00 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 21 Apr 2007 16:10:00 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> Message-ID: <1177186201.3367.43.camel@localhost.localdomain> On Sat, 2007-04-21 at 13:33 -0400, Michael Bayer wrote: > On Apr 21, 2007, at 12:18 PM, Carsten Haese wrote: > > Essentially, the proposed input translation could change to > > > > if hasattr(in_param, "ToDB"): > > in_param = in_param.ToDB() > > > > OK, duck typing is much better and more analgous to JDBC's usage of > an interface. this solves the module-importing issue, but not > necessarily the "different db's might require different ToDB() > implementations" problem - it still binds my application-level value > objects to an assumption about their storage...and if my application > suddenly had to support two different databases, or even to persist > the same collections of objects in both of those DBs (there are > definitely apps that do this), now my program design has to create > copies of values to handle the discrepancy. 
the same issue exists > for an application value that is stored in multiple places within the > same database, but in different ways; such as a Date type that is > stored both in some legacy table with a particular string-format > style of storage and some newer table with a decimal-based storage > format (or a different string format). > > a behind-the-scenes registry of converters mapped to my application's > types solves the multiple-databases problem, and bind/column-mapped > converters solve the multiple-tables problem. > > the non-class-bound approach, using registered converters, looks like:
>
>     converter = cursor.type_mappings.get(type(in_param), None)
>     if converter is not None:
>         in_param = converter.ToDB(in_param)
>
> that removes all interface responsibilities from in_param's class.

Okay, here we have reached the heart of the matter: Persisting an application object in a database requires cooperation between the object and the database. Either the object needs to know about the database, or the database needs to know about the object. The former can be done by the object having a ToDB method that is given information about the database it'll be stored in, and reacts appropriately. The latter can be done in the way you propose, using an inbound typemap. I'll concede that using an inbound typemap has a beautiful symmetry to using an outbound typemap, and it's way less kludgy than making the object aware of every single database that might want to store it. However, the adapter lookup needs to be done in a way that doesn't suddenly fail if the application object is subclassed! Doing this is just a bit more involved:

    for tp in type(in_param).__mro__:
        converter = cursor.input_typemap.get(tp, None)
        if converter is not None:
            break
    if converter is not None:
        in_param = converter(in_param)

Note that it's enough if the converter is simply any callable object that returns the converted object.
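The MRO-based lookup can be exercised end to end. Here input_typemap is a plain dict standing in for the proposed cursor attribute, and AuditDate is a made-up application subclass with no converter registered for it directly:

```python
# Runnable version of the MRO-based lookup, so that a converter
# registered for a base class also applies to subclasses.  The
# input_typemap dict stands in for the proposed cursor attribute.
import datetime

input_typemap = {datetime.date: lambda d: d.isoformat()}

class AuditDate(datetime.date):
    """Application subclass; no converter registered for it directly."""

def convert_in_param(in_param):
    # walk the MRO so subclasses fall back to their base class converter
    for tp in type(in_param).__mro__:
        converter = input_typemap.get(tp)
        if converter is not None:
            return converter(in_param)
    return in_param

print(convert_in_param(AuditDate(2007, 4, 21)))  # 2007-04-21
print(convert_in_param("plain string"))          # plain string, unmapped
```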
> > To handle this [column specific mapping], the cursor could grow a coltypemap attribute, which > > is a > > mapping of typemaps, keyed on the column number or, maybe more > > conveniently, column name. > > probably both. Yeah, I actually meant both :) > > > > In summary, I am open to making the following revisions: > > * The SQLData class would become optional or be eliminated. Inbound > > type > > conversions between Python objects and the database will be > > performed by > > a well-defined ToDB method that the object may implement regardless of > > its inheritance tree. If an inbound Python object doesn't define a > > ToDB > > method, it'll be mapped by the canonical mapping for the particular > > database. > > yeah thats more or less what i was saying above. In the meantime you've made me see the light that the SQLData base class and the ToDB interface can be completely eliminated if we use an inbound typemap for handling the translation from the application to the database. In light of this development, I propose the following changes to my proposal: * The SQLData class and the ToDB interface will be eliminated. * The typemap attribute will be renamed to output_typemap. * An analogous input_typemap will be added. > > * The outbound conversion call will receive additional parameters, > > such > > as the cursor.description tuple, that will allow the adapter to > > make the > > resulting object stateful with respect to all of its database type > > properties. > > its possible that cursor.description doesnt have all the information > we need; such as, a string column that represents dates, and we need > to decide what string format is represented in the column. And who or what, other than the programmer who can handle the situation with a column-specific typemap, *would* have all the information that's needed in that case? > > * Add an optional coltypemap attribute to the cursor for defining a > > column-specific typemap. 
> > yeah, just having various maps of typing information to me seems to > represent the one method that is of general use for all cases. I'm glad we're beginning to agree. Maybe down this road, consensus can be found. -Carsten From unixdude at gmail.com Sun Apr 22 21:03:34 2007 From: unixdude at gmail.com (Jim Patterson) Date: Sun, 22 Apr 2007 15:03:34 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> Message-ID: (Resend since I hit reply instead of reply-all) Wow, I'm really glad to see this topic has garnered such great response from everyone. This makes me really hopeful that we can solve this issue. On 4/20/07, Anthony Tuininga wrote: > > I've been following this thread and it would appear that no real > consensus has been reached as of yet. I've looked at the api used by > sqlite and knowing its storage and type definition system it makes > good sense. I am considering adding the following to cx_Oracle, > following some of the examples given so far with modifications needed > for Oracle, and I'd appreciate any input you might have. > > cursor.setdefaulttype(databaseType, type) > connection.setdefaulttype(databaseType, type) I like the sound of this. What are you thinking for the databaseType parameter? I can see using the standard database types that the modules expose (STRING, BINARY, NUMBER, DATETIME, and ROWID) along with extended types that are custom to the database API.
If the extended types were derived from the standard types, then a bit of code that was portable could use STRING to say that all strings map to Unicode or whatever, and code that used an advanced feature of a database could map NVARCHAR to Unicode and map VARCHAR to non-Unicode. > registeradapter(type, databaseType, fromPythonMethod, toPythonMethod) > > This would specify that whenever an object of the given type is bound > to a cursor, that the fromPythonMethod method would be invoked with > the value and would expect a return value that can be directly bound > to the databaseType. The toPythonMethod method would be invoked when > columns are retrieved and would accept the databaseType value and > expect back a value of the given type. Is this mapped as a tuple of type and databaseType? Or is this mapping saying that to get to/from type you used the database type on the db side and the correct function based on direction? > Some help on the names would be appreciated as well -- its the worst > part of programming. :-) I've tried to use the DB API style of naming > -- all lower case without any underscores even though it isn't my > personal favorite. I kind of like Robert's suggestions below about using an exposed mapping object and being explicit about the direction of the conversion. So I would suggest names like "defaulttypefromdb" and "defaulttypetodb" (again using the existing db api naming style) Jim Patterson -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/db-sig/attachments/20070422/c3c5ed96/attachment.html From mike_mp at zzzcomputing.com Sun Apr 22 22:11:55 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Sun, 22 Apr 2007 16:11:55 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> Message-ID: (assuming this is meant for on-list) On Apr 22, 2007, at 3:42 PM, Jim Patterson wrote: > For the most part I have been thinking about simple type mappings, > but some > of the examples raised discuss this kind of greater flexibility. I > had been > mostly thinking about the problem of dealing with the database > specific > problem of mapping the database type to the Python universe. A number > can come back as an int/long, or a float, or a decimal, or a > complex number. > How that is accomplished is very specific to the database. > > If I'm following the capability you are talking about with dates in > strings > and pickled classes and jpegs then that should seem to me to be a > layer on top of database specific part. I see the hierarchy as: > > advanced type conversion library > python dbapi module > database provided api > database > > It would seem that if we get enough power and flexibility in the > dbapi type specifies then the "advanced type conversion library" > can be common code that does not care what type of dbapi it > is sitting on. To handle jpegs or pickled classes it needs to be > able to tell the dbapi that it wants to use BINARY objects. 
To > handle the dates in strings it needs to be able to tell the dbapi > that it wants to use strings. currently, you cant exactly write the "advanced type conversion library" in a totally dbapi-neutral way, because you cant always be sure that a DBAPI supports the native types used by a particular "advanced conversion" type. in particular I mention dates because SQLite/pysqlite has no date type - you can *only* get a datetime object to/from a sqlite database using string formatting of some kind. so if datestring parsing is part of a "layer on top of DBAPI", in the case of sqlite you need to use this layer, in the case of most other databases you dont. another example would be an "advanced" type that relies upon array values. lots of folks seem to like using Postgres' array type, a type which is not available in other DBs. so such a type which depends on underlying arrays would also need to vary its implementation depending on DBAPI. Not that converting from binary->picklestream isnt something that should be performed externally to DBAPI...but because of the variance in available type support its hard to draw a crisp line between whats "on top" of DBAPI and whats not, which is why with dates in particular I put them in the "native" category, if for no other reason than sqlite's non-support of them (well, and also that dates are pretty darn important). SQLAlchemy also expresses the "native type"/"advanced type" dichotomy explicitly. For things like dates (which are non-standard to sqlite), binary objects (which return a specialized LOB object on oracle that is normalized to act like the other DBAPIs), numbers (which are returned as Decimal in postgres, floats in all others), SA implements whole modules of different TypeEngine implementations tailored to each supported DBAPI - these types form the "lower level" set of types.
The "translation on top of a type" operation is handled by subclasses
of TypeDecorator, which references a TypeEngine (the lower-level type
base class) compositionally - currently PickleType is the only standard
type within this second hierarchy. Other folks have also implemented
Enums in this layer (which, ironically, is a native type in MySQL).

So I guess the reason I conflate the "native"/"advanced" types is that
from DBAPI to DBAPI there's no clear line as to what category a
particular kind of type falls into.

From unixdude at gmail.com  Sun Apr 22 22:19:40 2007
From: unixdude at gmail.com (Jim Patterson)
Date: Sun, 22 Apr 2007 16:19:40 -0400
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <1177186201.3367.43.camel@localhost.localdomain>
References: <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com>
	<703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
	<1177119242.3305.52.camel@localhost.localdomain>
	<1177139387.3208.6.camel@localhost.localdomain>
	<1177172281.3191.79.camel@localhost.localdomain>
	<4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com>
	<1177186201.3367.43.camel@localhost.localdomain>
Message-ID:

On 4/21/07, Carsten Haese wrote:
>
> In light of this development, I propose the following changes to my
> proposal:
>  * The SQLData class and the ToDB interface will be eliminated.
>  * The typemap attribute will be renamed to output_typemap.
>  * An analogous input_typemap will be added.

I'm not a big fan of the terms input and output for a case like this;
they can be confusing. For example, does "in" mean into Python or into
the database? So I would prefer more explicit terms like "fromdb" or
"topython" (I would pick the to/from db pair since it is shorter and
just as clear).

Jim Patterson
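The typemap proposal being debated never became part of PEP 249, so the
attribute names below are purely illustrative. Still, the mechanics are
simple enough to sketch as a wrapper over any existing DBAPI cursor:
each map keys on a Python type and supplies a converter for values
crossing the boundary in that direction.

```python
import sqlite3
from decimal import Decimal

class TypemapCursor:
    """Hypothetical sketch of the fromdb/todb typemap idea; the names
    fromdb_typemap and todb_typemap come from the list discussion,
    not from any shipped DBAPI."""

    def __init__(self, cursor, fromdb_typemap=None, todb_typemap=None):
        self.cursor = cursor
        self.fromdb_typemap = fromdb_typemap or {}
        self.todb_typemap = todb_typemap or {}

    def _convert(self, typemap, value):
        conv = typemap.get(type(value))
        return conv(value) if conv else value

    def execute(self, sql, params=()):
        bound = tuple(self._convert(self.todb_typemap, p) for p in params)
        return self.cursor.execute(sql, bound)

    def fetchone(self):
        row = self.cursor.fetchone()
        if row is None:
            return None
        return tuple(self._convert(self.fromdb_typemap, v) for v in row)

# Demo against sqlite3: Decimals go to the database as strings, and
# floats come back as Decimals built from their string form.
cur = TypemapCursor(sqlite3.connect(":memory:").cursor(),
                    fromdb_typemap={float: lambda v: Decimal(str(v))},
                    todb_typemap={Decimal: str})
cur.execute("CREATE TABLE prices (amount REAL)")
cur.execute("INSERT INTO prices VALUES (?)", (Decimal("19.99"),))
cur.execute("SELECT amount FROM prices")
row = cur.fetchone()
```

Keying on the Python type (rather than the database type code) is what
makes this layer portable; the database-specific knowledge stays inside
the wrapped cursor.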
From carsten at uniqsys.com  Sun Apr 22 22:31:06 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Sun, 22 Apr 2007 16:31:06 -0400
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To:
References: <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com>
	<703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
	<1177119242.3305.52.camel@localhost.localdomain>
	<1177139387.3208.6.camel@localhost.localdomain>
	<1177172281.3191.79.camel@localhost.localdomain>
	<4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com>
	<1177186201.3367.43.camel@localhost.localdomain>
Message-ID: <1177273866.3236.5.camel@localhost.localdomain>

On Sun, 2007-04-22 at 16:19 -0400, Jim Patterson wrote:
> On 4/21/07, Carsten Haese wrote:
> > In light of this development, I propose the following changes to my
> > proposal:
> >  * The SQLData class and the ToDB interface will be eliminated.
> >  * The typemap attribute will be renamed to output_typemap.
> >  * An analogous input_typemap will be added.
>
> I'm not a big fan of the terms input and output for a case like this;
> they can be confusing. For example, does "in" mean into Python or
> into the database? So I would prefer more explicit terms like
> "fromdb" or "topython" (I would pick the to/from db pair since it is
> shorter and just as clear).

The wording was intended to be from the point of view of the database,
but I agree that it would eliminate confusion to make the direction
explicitly clear. I'm fine with fromdb_typemap and todb_typemap instead
of output_typemap and input_typemap, respectively.
-Carsten

From unixdude at gmail.com  Mon Apr 23 04:14:19 2007
From: unixdude at gmail.com (Jim Patterson)
Date: Sun, 22 Apr 2007 22:14:19 -0400
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To:
References: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
	<1177119242.3305.52.camel@localhost.localdomain>
	<1177139387.3208.6.camel@localhost.localdomain>
	<1177172281.3191.79.camel@localhost.localdomain>
	<4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com>
Message-ID:

On 4/22/07, Michael Bayer wrote:
>
> (assuming this is meant for on-list)

Yes, absolutely. My mistake; I'm used to lists that place themselves in
the reply-to, and I did not check the to and cc lists.

> currently, you can't exactly write the "advanced type conversion
> library" in a totally dbapi-neutral way, because you can't always be
> sure that a DBAPI supports the native types used by a particular
> "advanced conversion" type. in particular I mention dates because
> SQLite/pysqlite has no date type - you can *only* get a datetime
> object to/from a sqlite database using string formatting of some
> kind. so if datestring parsing is part of a "layer on top of DBAPI",
> in the case of sqlite you need to use this layer, in the case of most
> other databases you don't.

I've never used SQLite so I can only comment based on a quick read of
some of the docs, but my thinking is that maybe the SQLite dbapi should
be extended to provide a DATETIME type for use by the Python
programmer, and then map it into some canonical string in the SQLite
APIs. It looks like some of that support already exists with
PARSE_DECLTYPES and PARSE_COLNAMES, but it seems to be missing the
STRING, BINARY, NUMBER, DATETIME, and ROWID type objects listed in PEP
249. I'm not fully sure that it needs them, but I know I've written
code to the DB API that will not work with SQLite since it does not
have them.
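The missing type objects mentioned above follow a pattern PEP 249
spells out in its implementation hints: a marker object that compares
equal to every type code in its column class. A sketch of what a
pysqlite version might look like (the groupings of declared-type names
are invented for illustration; sqlite3 does not actually ship these
module-level objects):

```python
class DBAPITypeObject:
    """PEP 249's suggested comparison object: it compares equal to
    each of the type codes belonging to its column class."""

    def __init__(self, *values):
        self.values = values

    def __eq__(self, other):
        return other in self.values

    def __ne__(self, other):
        return other not in self.values

# Hypothetical groupings of SQLite declared-type names:
STRING = DBAPITypeObject("TEXT", "VARCHAR", "CHAR")
BINARY = DBAPITypeObject("BLOB")
NUMBER = DBAPITypeObject("INTEGER", "REAL", "NUMERIC")
DATETIME = DBAPITypeObject("DATE", "TIMESTAMP")
```

With objects like these, portable code can test
`cursor.description[i][1] == NUMBER` without knowing which concrete
type code a given driver reports.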
> another example would be an "advanced" type that relies upon array
> values. lots of folks seem to like using Postgres' array type, a
> type which is not available in other DBs. so such a type which
> depends on underlying arrays would also need to vary its
> implementation depending on DBAPI.

Given that arrays are not supported very well across databases, I'm not
sure that you can write portable code that uses them. Maybe we can
define a set of types that must be supported and a set of types that
are optional, and then, by checking at runtime whether the module
exposes a given type, this mythical "advanced" library could adjust
itself.

> Not that converting from binary->picklestream isn't something that
> should be performed externally to DBAPI... but because of the
> variance in available type support it's hard to draw a crisp line
> between what's "on top" of DBAPI and what's not, which is why with
> dates in particular I put them in the "native" category, if for no
> other reason than sqlite's non-support of them (well, and also that
> dates are pretty darn important).

I look at them that way as well, but at least initially because PEP 249
listed them as supported, and because all the databases I have used (a
small set of the total that exist) support them.

> SQLAlchemy also expresses the "native type"/"advanced type" dichotomy
> explicitly. For things like dates (which are non-standard to
> sqlite), binary objects (which return a specialized LOB object on
> oracle that is normalized to act like the other DBAPIs), numbers
> (which are returned as Decimal in postgres, floats in all others), SA
> implements whole modules of different TypeEngine implementations
> tailored to each supported DBAPI - these types form the "lower level"
> set of types.
> The "translation on top of a type" operation is handled by
> subclasses of TypeDecorator, which references a TypeEngine (the
> lower-level type base class) compositionally - currently PickleType
> is the only standard type within this second hierarchy. Other folks
> have also implemented Enums in this layer (which ironically is a
> native type in mysql).

I'm just hoping we can simplify some of this kind of stuff by putting
more of it at the DBAPI level. As you mentioned, the real question
becomes where to draw the line. It is a tough question.

I got started on this very topic because I wanted to draw the line in a
place other than where cx_Oracle had drawn it in the past. It seemed to
me that Unicode support belonged in the DBAPI, since it is somewhat
hard to get right with Oracle and the solution is VERY Oracle-specific:
setting the NLS_LANG environment variable wrong gets you no or
incorrect Unicode support. I also wanted Decimal support since I'm
doing work with money, and floating-point approximations of money are a
really scary thing. I could have used the interface that cx_Oracle
supplied to always get numbers as strings and then done the conversion
myself, but I was nervous that someone on my team would forget and it
would cause problems. Anthony was very willing to work with me to add
support for Unicode and Decimal, so for me it was an easy redraw of the
line (Anthony was already planning the Unicode work).

> So I guess the reason I conflate the "native"/"advanced" types is
> because from DBAPI to DBAPI there's no clear line as to what category
> a particular kind of type falls into.

There seems to be as much confusion within the databases themselves, so
the best we may be able to do is broad support for the common types and
a way to tell if the module supports the others.
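The runtime check suggested above - probe the imported DBAPI module for
optional type support before enabling a conversion - can be sketched
with nothing more than hasattr. The "ARRAY" attribute name here is
invented for illustration; no DBAPI in this thread is claimed to
expose it:

```python
import sqlite3

def supports(dbapi_module, name):
    """True if the imported DBAPI module exposes the named attribute
    (a type object, constructor, or extension type)."""
    return hasattr(dbapi_module, name)

# sqlite3 ships the PEP 249 Binary constructor but nothing array-like,
# so an "advanced" conversion layer would enable its pickle support
# and skip its array support for this backend.
features = {name: supports(sqlite3, name) for name in ("Binary", "ARRAY")}
```

The same probe run against a Postgres driver would presumably flip the
array entry, letting the conversion library adjust itself per backend
as Jim describes.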
In order to write these advanced converters in a way that is portable
across whichever set of DBAPIs supports the required type, we will
still have to be able to tell the DBAPI that we need the data in a
standard Python datatype so that it can be passed around. That might be
a good starting point for above/below the line. If we look at the
built-in types in the Python library reference, we can get a list of
the Python types that a developer might want to use. Some can be
narrowed down to a single option; you most likely do not need
iterators, for example.

Jim Patterson
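The floating-point money hazard raised earlier in the thread is worth
making concrete, since it is the whole motivation for wanting Decimal
at the DBAPI level. A tiny illustration of why a driver that hands back
floats is dangerous for currency, and why routing values through their
string form (as the fetch-numbers-as-strings workaround does) keeps
the arithmetic exact:

```python
from decimal import Decimal

# Ten dimes summed as binary floats drift away from one dollar,
# because 0.10 has no exact base-2 representation:
float_total = sum([0.10] * 10)

# Building Decimals from the string form of each value preserves
# the intended decimal quantities exactly:
exact_total = sum(Decimal("0.10") for _ in range(10))
```

The error per value is tiny, but it accumulates silently, which is
exactly the "someone on my team would forget" failure mode described
above.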