From carl at personnelware.com Sun Apr 1 22:52:13 2007 From: carl at personnelware.com (Carl Karsten) Date: Sun, 01 Apr 2007 15:52:13 -0500 Subject: [DB-SIG] dbf memo access Message-ID: <46101B7D.90000@personnelware.com> I need to read (not write) some dbf memo data (memo is dBase's datatype to store 'unlimited' text, much like a blob or text field.) to make things worse, various implementations of the dbf engine have different formats for storing memos - I forget what the dBaseIII file name was, but I currently need VFP's, which is filename.FPT (fox pro text) The few hits I got on google didn't make it clear if they supported any sort of dbf memo, let alone VFP's. I do most of my work in linux, so hoping for something other than the odbc way, but I can use that if it is my only choice. Carl K From mal at egenix.com Sun Apr 1 23:43:46 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 01 Apr 2007 23:43:46 +0200 Subject: [DB-SIG] dbf memo access In-Reply-To: <46101B7D.90000@personnelware.com> References: <46101B7D.90000@personnelware.com> Message-ID: <46102792.1050003@egenix.com> On 2007-04-01 22:52, Carl Karsten wrote: > I need to read (not write) some dbf memo data (memo is dBase's datatype to store > 'unlimited' text, much like a blob or text field.) to make things worse, > various implementations of the dbf engine have different formats for storing > memos - I forget what the dBaseIII file name was, but I currently need VFP's, > which is filename.FPT (fox pro text) > > The few hits I got on google didn't make it clear if they supported any sort of > dbf memo, let alone VFP's. > > I do most of my work in linux, so hoping for something other than the odbc way, > but I can use that if it is my only choice. If you only need to do this once, then using the MS FoxPro ODBC on Windows is the best and easiest way to extract the data. 
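For reference, the .FPT container format itself is simple enough to read directly. The sketch below assumes the commonly documented Visual FoxPro layout (big-endian block size at header offset 6, then per-block type and length words); it is untested against real FoxPro files, and the function name is made up here:

```python
import struct

def read_fpt_memo(fpt_bytes, block_index):
    """Return one memo's data from the raw bytes of a VFP .FPT file.

    Assumed layout (standard Visual FoxPro memo file):
      file header: bytes 6-7 hold the block size (big-endian)
      memo block:  bytes 0-3 record type (1 = text), bytes 4-7 data length
    The block index itself comes from the memo field in the .DBF record.
    """
    block_size = struct.unpack('>H', fpt_bytes[6:8])[0]
    offset = block_index * block_size
    rec_type, length = struct.unpack('>II', fpt_bytes[offset:offset + 8])
    if rec_type != 1:  # 0 = picture/object data, 1 = text
        raise ValueError('block %d is not a text memo' % block_index)
    return fpt_bytes[offset + 8:offset + 8 + length]
```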
Some other references that might help: Read dBase3 in Python: http://cwashington.netreach.net/depo/view.asp?Index=102&ScriptType=python Read dBase and xBase files: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/362715 Not sure whether those two help with memos. Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 01 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From carl at personnelware.com Mon Apr 2 01:01:50 2007 From: carl at personnelware.com (Carl Karsten) Date: Sun, 01 Apr 2007 18:01:50 -0500 Subject: [DB-SIG] dbf memo access In-Reply-To: <46102792.1050003@egenix.com> References: <46101B7D.90000@personnelware.com> <46102792.1050003@egenix.com> Message-ID: <461039DE.5090600@personnelware.com> M.-A. Lemburg wrote: > On 2007-04-01 22:52, Carl Karsten wrote: >> I need to read (not write) some dbf memo data (memo is dBase's datatype to store >> 'unlimited' text, much like a blob or text field.) to make things worse, >> various implementations of the dbf engine have different formats for storing >> memos - I forget what the dBaseIII file name was, but I currently need VFP's, >> which is filename.FPT (fox pro text) >> >> The few hits I got on google didn't make it clear if they supported any sort of >> dbf memo, let alone VFP's. >> >> I do most of my work in linux, so hoping for something other than the odbc way, >> but I can use that if it is my only choice. 
> > If you only need to do this once, then using the MS FoxPro ODBC on Windows > is the best and easiest way to extract the data. > > Some other references that might help: > > Read dBase3 in Python: > http://cwashington.netreach.net/depo/view.asp?Index=102&ScriptType=python > elif type == 'M': # We ignore the memo field pass > Read dBase and xBase files: > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/362715 > M for ascii character memo data (real memo fields not supported) Looks like I am going to have to figure out the odbc thing. Thanks for confirming that I wasn't missing something. Carl K From phd at phd.pp.ru Wed Apr 11 17:37:15 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 11 Apr 2007 19:37:15 +0400 Subject: [DB-SIG] SQLObject 0.7.5 Message-ID: <20070411153715.GC21003@phd.pp.ru> Hello! I'm pleased to announce the 0.7.5 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.7.5 News and changes: http://sqlobject.org/docs/News.html What's New ========== News since 0.7.4 ---------------- * Fixed a bug in DateValidator caused by datetime being a subclass of date. * Fixed test_deep_inheritance.py - setup classes in the correct order (required for Postgres 8.0+ which is strict about referential integrity). For a more complete list, please see the news: http://sqlobject.org/docs/News.html Oleg. 
-- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From phd at phd.pp.ru Wed Apr 11 17:53:19 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 11 Apr 2007 19:53:19 +0400 Subject: [DB-SIG] SQLObject 0.8.2 Message-ID: <20070411155319.GD21492@phd.pp.ru> Hello! I'm pleased to announce the 0.8.2 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.8.2 News and changes: http://sqlobject.org/News.html What's New ========== News since 0.8.1 ---------------- * Fixed ConnectionHub.doInTransaction() - if the original connection was processConnection - reset processConnection, not threadConnection. * Fixed a bug in DateValidator caused by datetime being a subclass of date. * Fixed test_deep_inheritance.py - setup classes in the correct order (required for Postgres 8.0+ which is strict about referential integrity). For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From phd at phd.pp.ru Wed Apr 11 18:27:18 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 11 Apr 2007 20:27:18 +0400 Subject: [DB-SIG] SQLObject 0.9.0b1 Message-ID: <20070411162718.GD22672@phd.pp.ru> Hello! 
I'm pleased to announce the 0.9.0b1 release of SQLObject, the first beta of the upcoming 0.9 release. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.9.0b1 News and changes: http://sqlobject.org/News.html What's New ========== News since 0.8 -------------- Features & Interface -------------------- * Support for Python 2.2 has been declared obsolete. * Removed actively deprecated attributes; lowered deprecation level for other attributes to be removed after 0.9. * SQLite connection got columnsFromSchema(). Now all connections fully support fromDatabase. There are two versions of columnsFromSchema() for SQLite - one parses the result of "SELECT sql FROM sqlite_master" and the other uses "PRAGMA table_info"; the user can choose one over the other by using "use_table_info" parameter in DB URI; default is False as the pragma is available only in the later versions of SQLite. * Changed connection.delColumn(): the first argument is sqlmeta, not tableName (required for SQLite). * SQLite connection got delColumn(). Now all connections fully support delColumn(). As SQLite backend doesn't implement "ALTER TABLE DROP COLUMN" delColumn() is implemented by creating a new table without the column, copying all data, dropping the original table and renaming the new table. * Versioning_. ..
_Versioning: Versioning.html * MySQLConnection got new keyword "conv" - a list of custom converters. * Use logging if it's available and is configured via DB URI. * New columns: TimestampCol to support MySQL TIMESTAMP type; SetCol to support MySQL SET type; TinyIntCol for TINYINT; SmallIntCol for SMALLINT; MediumIntCol for MEDIUMINT; BigIntCol for BIGINT. Small Features -------------- * Support for MySQL INT type attributes: UNSIGNED, ZEROFILL. * Support for DEFAULT SQL attribute via defaultSQL keyword argument. * Support for MySQL storage ENGINEs. * cls.tableExists() as a shortcut for conn.tableExists(cls.sqlmeta.table). * cls.deleteMany(), cls.deleteBy(). Bug Fixes --------- * idName can be inherited from the parent sqlmeta class. For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From unixdude at gmail.com Tue Apr 17 03:26:57 2007 From: unixdude at gmail.com (Jim Patterson) Date: Mon, 16 Apr 2007 21:26:57 -0400 Subject: [DB-SIG] dbf memo access Message-ID: Carl, In the past, I needed to read a FoxPro file and the following worked: http://www.garshol.priv.no/download/software/python/dbfreader.py Your mileage may vary, Jim Patterson -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/db-sig/attachments/20070416/240f3aa8/attachment.html From unixdude at gmail.com Tue Apr 17 06:05:16 2007 From: unixdude at gmail.com (Jim Patterson) Date: Tue, 17 Apr 2007 00:05:16 -0400 Subject: [DB-SIG] Controlling return types for DB APIs Message-ID: All, Over on the cx_Oracle list we have been discussing adding support for returning native Unicode strings and decimal objects. We have so far been talking about using a settable attribute on the connection and the cursor with the cursor inheriting the value from the connection by default.
This is very similar to the existing technique used by cx_Oracle for the "numbersAsString" and the technique used by mxODBC for the "stringFormat" and "datetimeFormat". Anyone have any thoughts/feelings/opinions about moving towards standardizing how we do this kind of thing across the different database modules? Thanks in advance, Jim Patterson -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/db-sig/attachments/20070417/a72b0878/attachment.html From fog at initd.org Tue Apr 17 09:44:19 2007 From: fog at initd.org (Federico Di Gregorio) Date: Tue, 17 Apr 2007 09:44:19 +0200 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: Message-ID: <1176795859.3630.6.camel@mila> Il giorno mar, 17/04/2007 alle 00.05 -0400, Jim Patterson ha scritto: > Over on the cx_Oracle list we have been discussing adding support > for returning native Unicode strings and decimal objects. We have > so far been talking about using a settable attribute on the connection > and the cursor with the cursor inheriting the value from the > connection > by default. This is very similar to the existing technique used > by cx_Oracle for the "numbersAsString" and the technique used > by mxODBC for the "stringFormat" and "datetimeFormat". > > Anyone have any thoughts/feelings/opinions about moving towards > standardizing how we do this kind of thing across the different > database modules? psycopg's type system is one of its best features (and one loved by users I was told). At any time you can create a new "type" as nt = psycopg2.new_type((oid1, oid2, ...), "name", typecast_func) and then register it using "psycopg2.register_type(nt)". This has 2 effects: 1. data described by listed oids (this is PostgreSQL-specific, I know) is converted using the function "typecast_func"; and 2. you can use "nt" as a type object in comparisons, just like other type objects in the dbapi (STRING, NUMERIC, etc...)
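In code, that registration pattern looks roughly like this. A sketch: the oid 790 (PostgreSQL's MONEY type) and the cast function are illustrative, and the registration is guarded so the converter itself can be read standalone:

```python
from decimal import Decimal

def cast_money(value, cursor):
    # typecast_func signature: (raw string from the backend or None, cursor)
    if value is None:
        return None
    return Decimal(value.replace('$', '').replace(',', ''))

try:
    import psycopg2
    import psycopg2.extensions
    # 790 is assumed here to be the oid of PostgreSQL's MONEY type
    MONEY = psycopg2.extensions.new_type((790,), 'MONEY', cast_money)
    psycopg2.extensions.register_type(MONEY)
except ImportError:
    pass  # psycopg2 not installed; cast_money still works on its own
```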
federico -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux Developer fog at debian.org INIT.D Developer fog at initd.org Se consideri l'uso del software libero una concessione tu stesso, come potrai proporla agli altri? -- Nick Name -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Questa =?ISO-8859-1?Q?=E8?= una parte del messaggio firmata digitalmente Url : http://mail.python.org/pipermail/db-sig/attachments/20070417/72dee7c3/attachment.pgp From mal at egenix.com Tue Apr 17 10:16:46 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 17 Apr 2007 10:16:46 +0200 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: Message-ID: <4624826E.5080703@egenix.com> On 2007-04-17 06:05, Jim Patterson wrote: > All, > > Over on the cx_Oracle list we have been discussing adding support > for returning native Unicode strings and decimal objects. We have > so far been talking about using a settable attribute on the connection > and the cursor with the cursor inheriting the value from the connection > by default. The is very similar to the existing technique used > by cx_Oracle for the "numbersAsString" and the technique used > by mxODBC for the "stringFormat" and "datetimeFormat". > > Anyone have any thoughts/feelings/opinions about moving towards > standardizing how we do this kind of thing across the different > database modules? While mxODBC does use this kind of approach, I don't think it's all that flexible, e.g. we have now added a new attribute .decimalformat for specifying whether you want floats or decimals for decimal database columns. Ideally, it should be possible to set converters for all kinds of output types as well as ones for input parameters. In some situations, it's also desirable to be able to do this based on the output variable or parameter position. 
This would, of course, only apply to cursors with already prepared statements. While this can be solved using a registry of type conversions, I see problems in standardizing the way to define the type mappings since different database backends tend to have or need different types. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 17 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From fog at initd.org Tue Apr 17 10:29:34 2007 From: fog at initd.org (Federico Di Gregorio) Date: Tue, 17 Apr 2007 10:29:34 +0200 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <4624826E.5080703@egenix.com> References: <4624826E.5080703@egenix.com> Message-ID: <1176798574.3630.26.camel@mila> Il giorno mar, 17/04/2007 alle 10.16 +0200, M.-A. Lemburg ha scritto: > > While this can be solved using a registry of type conversions, > I see problems in standardizing the way to define the type > mappings since different database backends tend to have > or need different types. I can see an API that leverages the introspection abilities of the drivers, to abstract the different type representations of the various backends. Let's suppose that a driver "knows" the type of a DB column, then we can ask it for an abstract "dbtype": dbtype = connection_object.getdbtype("SELECT 1 AS foo") where the query _must_ return a scalar from which the driver infers the type. Then the type can be used as a key in the registry.
Obviously the conversion function will be backend-specific but I suppose the signature could be the same for all functions. Given the fact that the conversion happens inside a cursor and that the connection is available from the cursor object itself, something like: py_data = conversion_function(backend_data, cursor_object) Then we can at least make a standard for the registry methods. federico -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux Developer fog at debian.org INIT.D Developer fog at initd.org All programmers are optimists. -- Frederick P. Brooks, Jr. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Questa =?ISO-8859-1?Q?=E8?= una parte del messaggio firmata digitalmente Url : http://mail.python.org/pipermail/db-sig/attachments/20070417/f1865f52/attachment.pgp From aprotin at research.att.com Tue Apr 17 16:51:38 2007 From: aprotin at research.att.com (Art Protin) Date: Tue, 17 Apr 2007 10:51:38 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1176798574.3630.26.camel@mila> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> Message-ID: <4624DEFA.30501@research.att.com> Folks, This conversation is excellent. I have also experienced the need to extend our driver in non-standard ways in support of datatypes (as well as in other dimensions.) I added an attribute to cursors .datetime_type to indicate whether to leave them as strings (as converted by our DBMS on output) or to convert them to datetime objects, indicated by the values 'string' and 'object' respectively. The cursor objects inherit their initial setting of this attribute from the attribute .default_datetime_type on the driver module. I also added another attribute on the cursor which, after any query, has a list of strings, one per column, with the type names as they were reported by the DBMS.
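A minimal sketch of that inheritance scheme, with hypothetical names following the description above (the driver-module default feeding each new cursor, and a per-cursor override):

```python
import datetime

# module-level default, as described: cursors inherit it on creation
default_datetime_type = 'string'

class Cursor:
    def __init__(self):
        # hypothetical: each cursor starts from the module default
        self.datetime_type = default_datetime_type

    def _convert_datetime(self, raw):
        # raw is the DBMS's string form, e.g. '2007-04-17'
        if self.datetime_type == 'string':
            return raw
        return datetime.datetime.strptime(raw, '%Y-%m-%d').date()

cur = Cursor()
cur.datetime_type = 'object'  # override on this cursor only
```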
Federico Di Gregorio wrote: >Il giorno mar, 17/04/2007 alle 10.16 +0200, M.-A. Lemburg ha scritto: > > >>While this can be solved using a registry of types conversions, >>I see problems in standardizing the way to define the type >>mappings since different database backends tend to have >>or need different types. >> >> > >I can see an API that leverages on the introspection abilities of the >drivers, to abstract to the different type representations of the >various backends. Let's suppose that a driver "knows" the type of a DB >column, then we can ask it for an abstract "dbtype": > >dbtype = connection_object.getdbtype("SELECT 1 AS foo") > >where the query _must_ return a scalar from which the driver infers the >type. > I do not see reasons for (1) why this is a connection level method, and (2) why the query would need to be limited to returning a scalar. This seems to be getting the same information that my second extension provides. What am I missing here? > Then the type can be used as a key in the registry. > Yes, something nice and simple, like a dict using the string name of the DBMS native datatype as the index. However, this might not work out after all. Our database system has a nearly unbounded set of types. The types have two components, say a major and minor, or a main type and subtype. The main type "STRING" alone has 65535 subtypes (one for each allowable size). Other main types may have a few subtypes or even none. Some of the subtypes make a major difference in the conversion function behavior (like those for DATE) and some make nearly none. My conversion routines are called based on the main type but need both the data value and the subtype as arguments. Do any of the other systems have such a multi-level type scheme as this? > Obviously the >conversion function will be backend-specific but I suppose the signature >could be the same for all functions.
Given the fact that the conversion > > >happens inside a cursor and than the connection is available from the >cursor object itself, something like: > >py_data = conversion_function(backend_data, cursor_object) > >Then we can at least make a standard for the registry methods. > >federico > > > However, another issue is discovering the available conversion functions and determining their arguments. Thank you all, Art Protin >------------------------------------------------------------------------ > >_______________________________________________ >DB-SIG maillist - DB-SIG at python.org >http://mail.python.org/mailman/listinfo/db-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/db-sig/attachments/20070417/fcfadaf3/attachment.html From fumanchu at amor.org Tue Apr 17 19:35:10 2007 From: fumanchu at amor.org (Robert Brewer) Date: Tue, 17 Apr 2007 10:35:10 -0700 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1176798574.3630.26.camel@mila> Message-ID: <435DF58A933BA74397B42CDEB8145A860B13BFD4@ex9.hostedexchange.local> Federico Di Gregorio wrote: > Il giorno mar, 17/04/2007 alle 10.16 +0200, M.-A. Lemburg ha scritto: > > > > While this can be solved using a registry of types conversions, > > I see problems in standardizing the way to define the type > > mappings since different database backends tend to have > > or need different types. > > I can see an API that leverages on the introspection abilities of the > drivers, to abstract to the different type representations of the > various backends. Perhaps, but "type" and "representation" are two different concepts. For example, I've got an SQL Server DB (whose schema I can't change) which stores most dates in proper DATETIME columns, but there are 3 or 4 which store dates in CHAR(8) columns in 'YYYYMMDD' format. 
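The scalar transformers for such a CHAR(8) date column are the easy half; a sketch of a hypothetical inbound/outbound pair:

```python
import datetime

def date_to_yyyymmdd(d):
    # outbound: datetime.date -> 'YYYYMMDD' string for the CHAR(8) column
    return d.strftime('%Y%m%d')

def yyyymmdd_to_date(s):
    # inbound: 'YYYYMMDD' string -> datetime.date
    return datetime.datetime.strptime(s, '%Y%m%d').date()
```

The hard part, as the rest of the message argues, is everything around these two functions: knowing which columns need them and what the SQL comparisons should look like.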
Although dbtype may *imply* pytype and vice-versa, there will always be cases where the adaptation layer between the two is context-dependent. Solving for arbitrary SQL (like "WHERE mytable.yyyymmdd_birthdate > now()") therefore means: 1. Knowing the desired Python type for each column (datetime.date), 2. Knowing the actual database type for each column or subexpression (SQL Server's CHAR type), 3. Having inbound and outbound scalar transformers (datetime.date-to-YYYYMMDD and datetime.date-from-YYYYMMDD), 4. Knowing which binary and comparison operations have implicit conversions, and 5. Special-casing binary and comparison operations between types which have no implicit conversions. For example, my YYYYMMDD adapter/converter has the method: def compare_op(self, op1, op, sqlop, op2): if isinstance(op2.dbtype, sqlserver.DATETIME): # Cast the YYYYMMDD string to a DATETIME. sql = ("((CASE WHEN ISDATE(%s)=1 " "THEN CAST(%s AS DATETIME) " "ELSE NULL END) %s %s)" % (op1.sql, op1.sql, sqlop, op2.sql)) return sql return "(%s %s %s)" % (op1.sql, sqlop, op2.sql) binary_op = compare_op This is where Geniusql is headed. I'm not for a moment saying the DBAPI should go that far, but there needs to be a clear understanding of exactly how far the DBAPI is going to go down this rabbit hole (because however far you go, your user base will forever pester you for the next level of flexibility ;). > Let's suppose that a driver "knows" the type of a DB > column, then we can ask it for an abstract "dbtype": > > dbtype = connection_object.getdbtype("SELECT 1 AS foo") > > where the query _must_ return a scalar from which the driver > infers the type. Then the type can be used as a key in the > registry. Given the extremely small number of datatypes that each commercial database exposes, this seems to be both more work and less accurate results than simply modeling each concrete dbtype directly. 
All SQL92-compliant types can be fully described with a handful of attributes (bytes, precision, scale, whether each of those is user-specifiable, and if so the maximum allowed value for each, whether a numeric type is signed or unsigned, and finally the CHAR vs VARCHAR distinction). [1] > Obviously the conversion function will be backend- > specific but I suppose the signature could be the same for > all functions. Given the fact that the conversion happens > inside a cursor and than the connection is available from the > cursor object itself, something like: > > py_data = conversion_function(backend_data, cursor_object) > > Then we can at least make a standard for the registry methods. There should be some provision for custom converters (which forces you to stick the converters on each column object instead of in a registry, since there can be several different converters for e.g. datetime.date-to-CHAR). But even if you decide not to go that far, the registry of default converters will need to be keyed by (pytype, backend-specific dbtype). For example, Postgres has a hard time comparing FLOAT4 and FLOAT8 [2], not to mention that the concrete precision of SQL92 REAL and DOUBLE are "implementation defined". It's not entirely hopeless; some base classes for converters can be constructed [3]. Robert Brewer System Architect Amor Ministries fumanchu at amor.org [1] See http://projects.amor.org/geniusql/browser/trunk/geniusql/dbtypes.py for my mostly-finished crack at this, plus any module in http://projects.amor.org/geniusql/browser/trunk/geniusql/providers for concrete DB types. Note I stick default_pytype directly on both abstract and concrete dbtype objects, but an external registry would be just as easy. [2] ...because the implicit conversion isn't always what you want; see http://archives.postgresql.org/pgsql-bugs/2004-02/msg00062.php for an example. 
[3] See http://projects.amor.org/geniusql/browser/trunk/geniusql/adapters.py From mike_mp at zzzcomputing.com Tue Apr 17 19:53:31 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Tue, 17 Apr 2007 13:53:31 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <4624DEFA.30501@research.att.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> Message-ID: <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> On Apr 17, 2007, at 10:51 AM, Art Protin wrote: > Yes, something nice and simple, like a dict using the string name > of the DBMS native > datatype as the index. However, this might not work out after > all. Our database > system has a nearly unbounded set of types. The types have two > components, say a > major and minor, or a main type and subtype. The main type > "STRING" alone > has 65535 subtypes (one for each allowable size). Other main types > may have a few > subtypes or even none. Some of the subtypes make a major > difference in > the conversion function behavior (like those for DATE) and some > make nearly none. > My conversion routines are called based on the main type but need > both the data > value and the subtype as arguments. Do any of the other systems have > such a multi-level type scheme as this? > I'm not following this thread so closely, but SQLAlchemy does have a configurable type system which can represent both the "major" type as you call it, plus any number of arguments for each type (which you'd call the "minor" type), for any given result set column. The "major" part is represented by the particular subclass of TypeEngine used, such as SLDateTime (a date-time type as represented in SQLite), and the "minor" part by the state of that particular TypeEngine instance (such as the length of a string column, or its encoding).
Of course SQLAlchemy is a significant layer on top of DBAPI, so as far as the "registry"-like functionality of what types map to what columns, its achieved via the presence of Table objects which are comprised of collections of Columns each with their own TypeEngine instance. if DBAPI contained its own type-registry like system (which would likely be per-connection, since thats the highest-level object DBAPI provides which is still stateful with regards to a particular database connection), SA could probably modify TypeEngine to move its type-conversion code into this layer, instead of having to piggyback the translation onto result set objects. However i might suggest that this whole thread, "controlling return types", perhaps be expanded to include "controlling *input* types and return types", since to me (as well as to SQLAlchemy) being able to send an arbitrarily-typed python object into a bind parameter is just the mirror image of receiving a result column as an arbitrarily-typed python object. I think it would be unfortunate if only one half of the coin were addressed. From Chris.Clark at ingres.com Tue Apr 17 20:43:30 2007 From: Chris.Clark at ingres.com (Chris Clark) Date: Tue, 17 Apr 2007 11:43:30 -0700 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> Message-ID: <46251552.1030709@ingres.com> Michael Bayer wrote: > ........However i might suggest that this whole thread, "controlling return > types", perhaps be expanded to include "controlling *input* types and > return types"........ The pysqlite register_adapter and register_converter do this (in that order). 
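pysqlite ships with Python 2.5 as the sqlite3 module, so the adapter/converter pair can be shown self-contained. A sketch of a date round-trip using the declared-type mapping (the converter is keyed by the column type name, matched case-insensitively):

```python
import datetime
import sqlite3

# adapter: Python object -> value stored in the database
sqlite3.register_adapter(datetime.date, lambda d: d.isoformat())

# converter: stored bytes -> Python object, keyed by the declared column type
def convert_date(b):
    return datetime.date(*map(int, b.split(b'-')))

sqlite3.register_converter('date', convert_date)

con = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute('CREATE TABLE t (d date)')
cur.execute('INSERT INTO t VALUES (?)', (datetime.date(2007, 4, 17),))
row = cur.execute('SELECT d FROM t').fetchone()
# row[0] comes back as a datetime.date, not a string
```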
I've not used it in anger but I like the simplicity of the API, see http://initd.org/pub/software/pysqlite/doc/usage-guide.html#converting-sqlite-values-to-custom-python-types for an example. pysqlite approach does have implications for the SQL that is then used but that isn't relevant to other databases so the api/approach could be appropriated :-) I'm not familiar with psycopg's approach to be able to compare with pysqlite without research. The big question is when should mapping take place? On the connection level, statement level, or column level. E.g. mapping varchar to a python varchar type that contains length and data:

connection level - all varchars in DBMS get returned as varchar python type
statement level - all varchars in specific .execute() get returned as varchar python type
column level - only specific varchars in specific .execute() get returned as varchar python type, i.e. some varchars could come back as regular python strings

The pysqlite approach is to use strings as type identifiers for the converter, I get the impression Art is not in favor of this approach but the dbms type param is likely to be DBMS specific so whether it is a string, a tuple of major/minor, etc. may not matter. If the driver offers constants for basic types that would be fine. Here is an example following the pysqlite approach, using connection level mapping.
It is usually easy to discuss what is good/bad about an idea if there is an example to critique:

import mydbapi

# Register the python2db adapter
mydbapi.register_adapter(mydbapi.varcharClass, mydbapi.adapt_varchar)
# Register the db2python converter
mydbapi.register_converter(mydbapi.varcharType, mydbapi.convert_varchar)

con = mydbapi.connect(".......", use_type_mapping=True)
cur = con.cursor()
cur.execute("select myvarchar from mytable")
cur.fetch()[0]  ## type would be Python type "mydbapi.varcharClass"

Chris

From fog at initd.org Tue Apr 17 22:08:47 2007
From: fog at initd.org (Federico Di Gregorio)
Date: Tue, 17 Apr 2007 22:08:47 +0200
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <435DF58A933BA74397B42CDEB8145A860B13BFD4@ex9.hostedexchange.local>
References: <435DF58A933BA74397B42CDEB8145A860B13BFD4@ex9.hostedexchange.local>
Message-ID: <1176840527.3735.14.camel@mila>

On Tue, 17/04/2007 at 10.35 -0700, Robert Brewer wrote:
[...]
> This is where Geniusql is headed. I'm not for a moment saying the DBAPI
> should go that far, but there needs to be a clear understanding of
> exactly how far the DBAPI is going to go down this rabbit hole (because
> however far you go, your user base will forever pester you for the next
> level of flexibility ;).

Rewriting the SQL is, imho, not in the scope of the DBAPI. When I talk about casting types from SQL to Python and back I mean exactly that, not some (probably very useful) way to automatically generate SQL to patch the shortcomings of the backend/previous programmer.

> > > Let's suppose that a driver "knows" the type of a DB
> > column, then we can ask it for an abstract "dbtype":
> >
> > dbtype = connection_object.getdbtype("SELECT 1 AS foo")
> >
> > where the query _must_ return a scalar from which the driver
> > infers the type. Then the type can be used as a key in the registry.
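The getdbtype() idea quoted above can be sketched with a toy driver. Connection, getdbtype, and the canned catalog of queries are all invented here for illustration; no real DB-API module defines them:

```python
import decimal

class Connection:
    # A toy "driver" that infers the abstract dbtype of a scalar query
    # from a canned catalog instead of asking a real backend.
    _catalog = {"SELECT 1 AS foo": "INTEGER",
                "SELECT price FROM t": "DECIMAL"}

    def __init__(self):
        self.converters = {}          # dbtype -> Python cast function

    def getdbtype(self, query):
        # The query must return a scalar; the driver reports its dbtype.
        return self._catalog[query]

    def register(self, dbtype, cast):
        # The abstract dbtype is the key in the registry.
        self.converters[dbtype] = cast

    def cast(self, query, raw):
        # Apply the registered converter for the query's dbtype, if any.
        return self.converters.get(self.getdbtype(query), lambda v: v)(raw)

con = Connection()
con.register("DECIMAL", decimal.Decimal)
value = con.cast("SELECT price FROM t", "12.34")
# value is decimal.Decimal('12.34')
```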
> > Given the extremely small number of datatypes that each commercial
> database exposes, this seems to be both more work and less accurate
> results than simply modeling each concrete dbtype directly. All
> SQL92-compliant types can be fully described with a handful of
> attributes (bytes, precision, scale, whether each of those is
> user-specifiable, and if so the maximum allowed value for each, whether
> a numeric type is signed or unsigned, and finally the CHAR vs VARCHAR
> distinction). [1]

Yes, but there should be a way to obtain the "type" from the backend. Especially for backends like PostgreSQL that allow for user defined types of any complexity.

> There should be some provision for custom converters (which forces you
> to stick the converters on each column object instead of in a
> registry,
> since there can be several different converters for e.g.
> datetime.date-to-CHAR).
>
> But even if you decide not to go that far, the registry of default
> converters will need to be keyed by (pytype, backend-specific dbtype).
> For example, Postgres has a hard time comparing FLOAT4 and FLOAT8 [2],
> not to mention that the concrete precision of SQL92 REAL and DOUBLE are
> "implementation defined". It's not entirely hopeless; some base classes
> for converters can be constructed [3].

You're talking about two different things here. I am not interested at all in solving at the Python level problems that inherently live at the backend level, like the FLOAT4/FLOAT8 problem. Imho all that the DBAPI needs is a way to specify type-casting functions (at global, connection and maybe column level) that allow the programmer to obtain exactly the Python type they need, given a backend type.

federico -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux Developer fog at debian.org INIT.D Developer fog at initd.org - What's wrong with your goldfish, orchitis? - Yes, it has only one eye, a hoarse voice, and it eats the other fish.
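Federico's three casting scopes (global, connection, column) might be sketched as a lookup chain, most specific scope first; every name below is hypothetical, not an actual driver API:

```python
# Module-wide default casts, keyed by backend type name.
GLOBAL_CASTS = {"NUMBER": int}

class Cursor:
    # A stand-in cursor: a cast for a result column is looked up
    # per-column first, then per-connection, then globally.
    def __init__(self, connection_casts):
        self.connection_casts = connection_casts
        self.column_casts = {}        # result-column position -> cast

    def cast(self, position, dbtype, raw):
        if position in self.column_casts:
            return self.column_casts[position](raw)
        if dbtype in self.connection_casts:
            return self.connection_casts[dbtype](raw)
        return GLOBAL_CASTS.get(dbtype, lambda v: v)(raw)

cur = Cursor(connection_casts={"VARCHAR": lambda b: b.decode("utf-8")})
cur.column_casts[0] = bytes.upper                 # column-level override

assert cur.cast(0, "VARCHAR", b"abc") == b"ABC"   # column level wins
assert cur.cast(1, "VARCHAR", b"abc") == "abc"    # connection level
assert cur.cast(2, "NUMBER", "42") == 42          # global fallback
```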
From fog at initd.org Tue Apr 17 22:12:18 2007
From: fog at initd.org (Federico Di Gregorio)
Date: Tue, 17 Apr 2007 22:12:18 +0200
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com>
References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com>
Message-ID: <1176840738.3735.18.camel@mila>

On Tue, 17/04/2007 at 13.53 -0400, Michael Bayer wrote:
> However i might suggest that this whole thread, "controlling return
> types", perhaps be expanded to include "controlling *input* types and
> return types", since to me (as well as to SQLAlchemy) being able to
> send an arbitrarily-typed python object into a bind parameter is just
> the mirror image of receiving a result column as an arbitrarily-typed
> python object. I think it would be unfortunate if only one half of
> the coin were addressed.

Sure. ;) The Python->SQL part is perfect for adaptation and, for example, psycopg has a micro-protocols implementation to help with the adaptation of any Python object into a valid ISQLQuote one. Why two different systems? Because translating Python to SQL and SQL to Python are two very different operations, IMHO.

federico -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux Developer fog at debian.org INIT.D Developer fog at initd.org Happiness is a cup of hot chocolate. Always. -- Me

From fumanchu at amor.org Tue Apr 17 22:24:07 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Tue, 17 Apr 2007 13:24:07 -0700
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <4624DEFA.30501@research.att.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A860B1BCE17@ex9.hostedexchange.local>

Art Protin wrote:
> Our database system has a nearly unbounded set of types.
> The types have two components, say a major and minor,
> or a main type and subtype. The main type "STRING" alone
> has 65535 subtypes (one for each allowable size).
> Other main types may have a few subtypes or even no subtypes.
> Some of the subtypes make a major difference in the
> conversion function behavior (like those for DATE)
> and some make nearly none. My conversion routines are
> called based on the main type but need both the data
> value and the subtype as arguments. Do any of the other
> systems have such a multi-level type scheme as this?

Geniusql does, but it's implemented by using classes to represent each "main type" and instances (with varying attributes) for each subtype. Each Column object gets its own DatabaseType instance (and so does each expression when generating SQL). For example, the firebird.py provider supplies a concrete VARCHAR class (a subclass of the abstract dbtypes.SQL92VARCHAR class), and instances of that class can have a "bytes" attribute anywhere from 1 to 32767 (the default is 63):

class VARCHAR(dbtypes.SQL92VARCHAR):
    synonyms = ['CHARACTER VARYING', 'CHAR VARYING', 'VARYING']
    max_bytes = 32767
    _bytes = 63

Given sufficient parameterization, there's usually no need to statically model all 65535 STRING types in systems like these.
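The class-per-main-type, instance-per-subtype idea can be sketched without Geniusql itself; the class and attribute names below loosely follow the firebird VARCHAR example, but this is an illustration, not Geniusql's actual code:

```python
class SQL92VARCHAR:
    # The "main type" is a class; a "subtype" is just an instance
    # with a particular byte length.
    max_bytes = 65535

    def __init__(self, bytes=63):
        if not 1 <= bytes <= self.max_bytes:
            raise ValueError("bytes out of range for this backend")
        self.bytes = bytes

    def ddl(self):
        return "VARCHAR(%d)" % self.bytes

class FirebirdVARCHAR(SQL92VARCHAR):
    max_bytes = 32767      # backend-specific limit overrides the default

# Art's 65535 "STRING subtypes" collapse into one class plus a parameter:
name_col = FirebirdVARCHAR(bytes=100)
assert name_col.ddl() == "VARCHAR(100)"
```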
Robert Brewer System Architect Amor Ministries fumanchu at amor.org From mike_mp at zzzcomputing.com Wed Apr 18 02:50:32 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Tue, 17 Apr 2007 20:50:32 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1176840738.3735.18.camel@mila> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> Message-ID: <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> On Apr 17, 2007, at 4:12 PM, Federico Di Gregorio wrote: > > The Python->SQL part is perfect for adaptation and, for example, > psycopg > has a micro-protocols implementation to help with the adaptation of > any > Python object into a valid ISQLQuote one. Why two different systems? > Because transating Python to SQL and SQL to Python are two very > different operations, IMHO. they are different operations, but their semantics are mirror images of one another. it follows very closely if you are binding a datetime object to a bind param which is in a SQL expression being compared against a particular column, that the rules which convert the datetime to a SQL value are the same rules in reverse that would convert the selection of that column in the result. this is why i think the "type" of a column, both its bind param adaptation as well as its result row adaptation, can be expressed by the same rule object in most cases. and I know it works since this is how sqlalchemy has been doing it for quite a while now. From anthony.tuininga at gmail.com Wed Apr 18 19:35:36 2007 From: anthony.tuininga at gmail.com (Anthony Tuininga) Date: Wed, 18 Apr 2007 11:35:36 -0600 Subject: [DB-SIG] cx_Oracle 4.3.1 Message-ID: <703ae56b0704181035j7d165c45pa1bd591af9ba2a20@mail.gmail.com> What is cx_Oracle? 
cx_Oracle is a Python extension module that allows access to Oracle and conforms to the Python database API 2.0 specifications with a few exceptions. Where do I get it? http://starship.python.net/crew/atuining What's new? 1) Ensure that if the client buffer size exceeds 4000 bytes that the server buffer size does not as strings may only contain 4000 bytes; this allows handling of multibyte character sets on the server as well as the client. 2) Added support for using buffer objects to populate binary data and made the Binary() constructor the buffer type as requested by Ken Mason. 3) Fix potential crash when using full optimization with some compilers. Thanks to Aris Motas for noticing this and providing the initial patch and to Amaury Forgeot d'Arc for providing an even simpler solution. 4) Pass the correct charset form in to the write call in order to support writing to national character set LOB values properly. Thanks to Ian Kelly for noticing this discrepancy. Anthony Tuininga From anthony.tuininga at gmail.com Fri Apr 20 18:31:52 2007 From: anthony.tuininga at gmail.com (Anthony Tuininga) Date: Fri, 20 Apr 2007 10:31:52 -0600 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> Message-ID: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> I've been following this thread and it would appear that no real consensus has been reached as of yet. I've looked at the api used by sqlite and knowing its storage and type definition system it makes good sense. I am considering adding the following to cx_Oracle, following some of the examples given so far with modifications needed for Oracle, and I'd appreciate any input you might have. 
cursor.setdefaulttype(databaseType, type)
connection.setdefaulttype(databaseType, type)

What this method would do is specify that whenever an item that is represented on the database by the given database type is to be retrieved, the specified type should be used instead of the default. This would allow for a global or local specification that numbers are to be returned as strings or decimal.Decimal objects, or that strings are to be returned as unicode objects, for example.

cursor.settype(position, type)

This would allow specification of the type to use for a particular column being fetched.

registeradapter(type, databaseType, fromPythonMethod, toPythonMethod)

This would specify that whenever an object of the given type is bound to a cursor, the fromPythonMethod method would be invoked with the value and would expect a return value that can be directly bound to the databaseType. The toPythonMethod method would be invoked when columns are retrieved and would accept the databaseType value and expect back a value of the given type. Some help on the names would be appreciated as well -- it's the worst part of programming. :-) I've tried to use the DB API style of naming -- all lower case without any underscores even though it isn't my personal favorite. Any comments?

On 4/17/07, Michael Bayer wrote:
> > On Apr 17, 2007, at 4:12 PM, Federico Di Gregorio wrote:
> >
> > > The Python->SQL part is perfect for adaptation and, for example,
> > psycopg
> > has a micro-protocols implementation to help with the adaptation of
> > any
> > Python object into a valid ISQLQuote one. Why two different systems?
> > Because transating Python to SQL and SQL to Python are two very
> > different operations, IMHO.
>
> they are different operations, but their semantics are mirror images
> of one another.
it follows very closely if you are binding a > datetime object to a bind param which is in a SQL expression being > compared against a particular column, that the rules which convert > the datetime to a SQL value are the same rules in reverse that would > convert the selection of that column in the result. this is why i > think the "type" of a column, both its bind param adaptation as well > as its result row adaptation, can be expressed by the same rule > object in most cases. and I know it works since this is how > sqlalchemy has been doing it for quite a while now. > > > _______________________________________________ > DB-SIG maillist - DB-SIG at python.org > http://mail.python.org/mailman/listinfo/db-sig > From Chris.Clark at ingres.com Fri Apr 20 19:28:03 2007 From: Chris.Clark at ingres.com (Chris Clark) Date: Fri, 20 Apr 2007 10:28:03 -0700 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> Message-ID: <4628F823.2020009@ingres.com> Anthony Tuininga wrote: > I've been following this thread and it would appear that no real > consensus has been reached as of yet. I've looked at the api used by > sqlite and knowing its storage and type definition system it makes > good sense. I am considering adding the following to cx_Oracle, > following some of the examples given so far with modifications needed > for Oracle, and I'd appreciate any input you might have. 
> > cursor.setdefaulttype(databaseType, type) > connection.setdefaulttype(databaseType, type) > > What this method would do is specify that whenever an item that is > represented on the database by the given database type is to be > retrieved, the specified type should be used instead of the default. > This would allow for a global or local specification that numbers are > to be returned as strings or decimal.Decimal objects or that strings > are to be returned as unicode objects, for example. > > cursor.settype(position, type) > > This would allow specification of the type to use for a particular > column being fetched. > > registeradapter(type, databaseType, fromPythonMethod, toPythonMethod) > > This would specify that whenver an object of the given type is bound > to a cursor, that the fromPythonMethod method would be invoked with > the value and would expect a return value that can be directly bound > to the databaseType. The toPythonMethod method would be invoked when > columns are retrieved and would accept the databaseType value and > expect back a value of the given type. > > Some help on the names would be appreciated as well -- its the worst > part of programming. :-) I've tried to use the DB API style of naming > -- all lower case without any underscores even though it isn't my > personal favorite. > > Any comments? My initial reaction is that I like it! I like registeradapter() and that it is easy to set at the connection or cursor level. I'm guessing that the cursor.settype() call is for a result set only and that the adapter would be reset on a new cursor.execute() call? I'm also wondering if setdefaulttype() should have param 2 as an optional param (i.e. if there is only one registered adapter the driver can work out the 2nd param). It is probably worth defining a conflict resolution approach, even if the approach is in documentation and says, "the behavior of conflicting types in adapters is undefined"! E.g. 
sending to db conflict (note fairly artificial):

registeradapter(str, DECIMAL, pyStr2dbDec, dbDec2pyStr)
registeradapter(str, SPATIAL, pyStr2dbSpa, dbSpa2pyStr)
cursor.setdefaulttype(DECIMAL, str)
cursor.setdefaulttype(SPATIAL, str)
cursor.execute('select x from mytable where mytable.col1 = ?', ('12.34',))
## is the input supposed to be decimal or a spatial type?

The alternatives are:
1. for the database driver to do some sort of DESCRIBE INPUT and work out which adapter to use
2. to raise an error when registeradapter() is called with conflicting types

Any comments? Should this be driver dependent? As for names, I've a few suggestions but I don't feel strongly about the names:

setdefaulttype --> coercetype
settype --> coercecolumn

registeradapter() is clear; I wondered about setadapter() instead, but registeradapter() is probably the most clear.

Chris

From anthony.tuininga at gmail.com Fri Apr 20 21:57:03 2007
From: anthony.tuininga at gmail.com (Anthony Tuininga)
Date: Fri, 20 Apr 2007 13:57:03 -0600
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <4628F823.2020009@ingres.com>
References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <4628F823.2020009@ingres.com>
Message-ID: <703ae56b0704201257x23a0eecfta09365a241078f5b@mail.gmail.com>

On 4/20/07, Chris Clark wrote:
> > Anthony Tuininga wrote:
> I've been following this thread and it would appear that no real
> consensus has been reached as of yet. I've looked at the api used by
> sqlite and knowing its storage and type definition system it makes
> good sense.
I am considering adding the following to cx_Oracle, > following some of the examples given so far with modifications needed > for Oracle, and I'd appreciate any input you might have. > > cursor.setdefaulttype(databaseType, type) > connection.setdefaulttype(databaseType, type) > > What this method would do is specify that whenever an item that is > represented on the database by the given database type is to be > retrieved, the specified type should be used instead of the default. > This would allow for a global or local specification that numbers are > to be returned as strings or decimal.Decimal objects or that strings > are to be returned as unicode objects, for example. > > cursor.settype(position, type) > > This would allow specification of the type to use for a particular > column being fetched. > > registeradapter(type, databaseType, fromPythonMethod, toPythonMethod) > > This would specify that whenver an object of the given type is bound > to a cursor, that the fromPythonMethod method would be invoked with > the value and would expect a return value that can be directly bound > to the databaseType. The toPythonMethod method would be invoked when > columns are retrieved and would accept the databaseType value and > expect back a value of the given type. > > Some help on the names would be appreciated as well -- its the worst > part of programming. :-) I've tried to use the DB API style of naming > -- all lower case without any underscores even though it isn't my > personal favorite. > > Any comments? > > My initial reaction is that I like it! I like registeradapter() and that it > is easy to set at the connection or cursor level. Ok, that's good. :-) > I'm guessing that the cursor.settype() call is for a result set only and > that the adapter would be reset on a new cursor.execute() call? Mostly correct. 
Yes, it is for result sets only, although the possibility of doing the same for output bind variables should be considered as well -- but that would require a different method signature or a change to setinputsizes(). The setting would remain so long as the same statement was executed again. When a new statement is prepared, this setting would revert to the default values (defined by the setdefaulttype() calls).

> I'm also wondering if setdefaulttype() should have param 2 as an optional
> param (i.e. if there is only one registered adapter the driver can work out
> the 2nd param).

Well, if there was only one type, there wouldn't be much point in calling setdefaulttype() would there? The default would of course be the only one available and saying so wouldn't make it any more so... :-) Unless I'm missing something and you meant something else?

> It is probably worth defining a conflict resolution approach, even if the
> approach is in documentation and says, "the behavior of conflicting types in
> adapters is undefined"! E.g. sending to db conflict (note fairly
> artificial):
>
> registeradapter(str, DECIMAL, pyStr2dbDec, dbDec2pyStr)
> registeradapter(str, SPATIAL, pyStr2dbSpa, dbSpa2pyStr)
> cursor.setdefaulttype(DECIMAL, str)
> cursor.setdefaulttype(SPATIAL, str)
> cursor.execute('select x from mytable where mytable.col1 = ?', ('12.34',))
> ## is the input supposed to be decimal or a spatial type?
>
> The alternatives are:
> 1. for the database driver to do some sort of DESCRIBE INPUT and work out which
> adapter to use
> 2. to raise an error when registeradapter() is called with conflicting types
> Any comments? Should this be driver dependent?

Hmm, I was assuming that the adapters would be indexed by type and that setting one would override the previous one. In other words, to do what you were hoping to do would require a different Python type for decimal and spatial.
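The override behavior described here is just dictionary semantics; a sketch with invented names (registeradapter, SpatialStr, and the dummy converter lambdas are all illustrative):

```python
adapters = {}       # Python type -> (database type, to-db converter)

def registeradapter(pytype, dbtype, to_db):
    # Indexed by Python type: a second registration for the same
    # Python type silently replaces the first.
    adapters[pytype] = (dbtype, to_db)

registeradapter(str, "DECIMAL", lambda s: "DEC:" + s)
registeradapter(str, "SPATIAL", lambda s: "SPA:" + s)   # overrides DECIMAL

dbtype, to_db = adapters[str]
assert dbtype == "SPATIAL" and len(adapters) == 1

# Distinct Python types restore the distinction:
class SpatialStr(str):
    pass

registeradapter(SpatialStr, "SPATIAL", lambda s: "SPA:" + s)
registeradapter(str, "DECIMAL", lambda s: "DEC:" + s)
assert adapters[str][0] == "DECIMAL"
assert adapters[SpatialStr][0] == "SPATIAL"
```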
Having the database figure out which type to use would be difficult at best and impossible in some situations. In addition, the situation you give is a bind variable which is normally controlled by setinputsizes(). I wasn't intending to change the signature of this method but perhaps a new method should be added? cursor.setbindtype(nameOrPosition, databaseType, [type]) which would allow you to specify the database type and optionally the Python type (for output bind variables). At that point the same rules as described above would apply. This is overlap with setinputsizes() though so I'm not sure whether or not this is a good idea. > As for names, I've a few suggestions but I don't feel strongly about the > names: > > > setdefaulttype --> coercetype > settype --> coercecolumn Hmm, I think I prefer the setXXX() type methods as they are similar in content to the setinputsizes() and setoutputsizes() methods. But I can see your point about coerceXXX() as that is in fact what is happening. > registeradapter() is clear, I wondered about setadapter() instead, but > registeradapter() is probably the most clear. I think registeradapter() is clearer, too. 
> Chris > > From fumanchu at amor.org Sat Apr 21 02:28:04 2007 From: fumanchu at amor.org (Robert Brewer) Date: Fri, 20 Apr 2007 17:28:04 -0700 Subject: [DB-SIG] Controlling return types for DB APIs References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila><4624DEFA.30501@research.att.com><10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com><1176840738.3735.18.camel@mila><6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> Message-ID: <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> Anthony Tuininga wrote: > cursor.setdefaulttype(databaseType, type) > connection.setdefaulttype(databaseType, type) > > What this method would do is specify that whenever > an item that is represented on the database by the > given database type is to be retrieved, the specified > [Python] type should be used instead of the default. > This would allow for a global or local specification > that numbers are to be returned as strings or > decimal.Decimal objects or that strings are to be > returned as unicode objects, for example. > > cursor.settype(position, type) > > This would allow specification of the type to use > for a particular column being fetched. As soon as you provide cursor.settype(dbtype, pytype), you should expect someone to ask for "pytype = cursor.gettype(dbtype)", and once you've written that, you'll realize they're both spelled better as "cursor.types[dbtype] = pytype" (for set) and "pytype = cursor.types[dbtype]" (for get). Same thing goes for adapters: once you allow people to set them, you should expect people will want to inspect them. So a cursor.adapters object (copied from a similar connection.adapters object) should be used. Namespaces are one honking great idea. The container classes don't *have* to use slicing (you could go all Java-ish and use get() and set()), but slicing is the most natural choice in this case. 
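Robert's mapping-style spelling might be sketched as follows; Connection and Cursor are invented stand-ins, not any real driver:

```python
import decimal

class Cursor:
    def __init__(self, connection):
        # Each cursor starts with a shallow copy of the connection's
        # mapping, so per-cursor overrides don't leak back.
        self.types = dict(connection.types)

class Connection:
    def __init__(self):
        self.types = {}               # dbtype -> Python type

    def cursor(self):
        return Cursor(self)

con = Connection()
con.types["NUMBER"] = decimal.Decimal   # "set" is plain item assignment
cur = con.cursor()
assert cur.types["NUMBER"] is decimal.Decimal   # "get" is plain lookup
cur.types["NUMBER"] = float                     # local override...
assert con.types["NUMBER"] is decimal.Decimal   # ...leaves the connection alone
```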
You also should be very explicit about the direction of each operation when building type- or language-translation layers. That is, include the direction in the names of every method and object, because at some point people will want to do the reverse. A Python method named "coerce" is not as good as a method named "coerce_in" or "coerce_from_database"; the namespaced spelling, "adapter_in.coerce", is even better.

> registeradapter(type, databaseType, fromPythonMethod,
> toPythonMethod)
>
> This would specify that whenver an object of the given
> [Python] type is bound to a cursor, that the
> fromPythonMethod method would be invoked with the
> value and would expect a return value that can be
> directly bound to the databaseType. The toPythonMethod
> method would be invoked when columns are retrieved and
> would accept the databaseType value and expect back a
> value of the given type.

That sounds good if by "databaseType" you mean "class VARCHAR"; that is, a Python type which models a database type. Because there's no such thing as a "database value" you can pass to a toPythonMethod; it *must* have already been converted into some Python object of a Python type (unless you're writing your adapters in C). The best you can do is have a Python type (designed/selected for minimum information loss) to which the incoming value gets coerced before passing it to your toPythonMethod for further adaptation or casting.

Robert Brewer System Architect Amor Ministries fumanchu at amor.org

From carsten at uniqsys.com Sat Apr 21 03:34:02 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Fri, 20 Apr 2007 21:34:02 -0400
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
Message-ID: <1177119242.3305.52.camel@localhost.localdomain>

Now that it's the weekend, I'd like to chime in. I had been thinking for a while now about type conversions from the Informix angle. Since Informix allows user-defined types, I'd like to implement a type conversion scheme that is flexible enough to allow specifying type conversions for any data type including UDTs. Since I didn't consider myself creative enough to come up with a good design by myself, I looked over the fence at other database implementations, including PostgreSQL and sqlite, and to my utter surprise, the one scheme that seems most Pythonic to me is the way JDBC handles user-defined types. The idea is that the programmer sets up type mappings. In Python, a type mapping would just be a dictionary whose key is the database type (more on that later) and whose value is a class object that derives from an abstract SQLData class that would be defined by the API module. This mapping would be stored on the connection as a default mapping, and the connection's cursors will inherit shallow copies of this mapping. A natural choice for the key in this mapping is the type indicator that the API implementation already returns in cursor.description.
The only possible hangup would be if a DB-API implementation uses mutable objects for these, but in my opinion that would be insane. All implementations I'm aware of either use strings or integers for the SQL type indicator. When a value is returned from the database, the computer checks if its type is mapped. If yes, the constructor of the corresponding SQLData-derived class is called with the value's "canonical" Python representation as the only argument. The canonical representation is the value that the API would return if no type map were in effect, which would be the best, lossless Python equivalent of the data type in question. The SQLData-derived class may, of course, return an object of a different type of object from its __new__ method (which would be useful to map character data to unicode objects, for example), but in order to allow seamless round-trips of data from the database to the application and back to the database, the returned value should be directly usable as an input parameter for the type of column that it came from. For handling type conversions to the database, SQLData instances would implement a ToDB method that would perform the reverse operation of the constructor, i.e. to render the canonical Python representation of the instance's contents, which can then be bound to input parameters in the canonical way. This proposal does not address special per-column mappings, but I don't think it needs to. In my experience it's rare that I'd want two columns of the same type from the same query to be mapped to two different Python types. For handling exceptional circumstances, say e.g. you inherit a messed up database that stores timestamps as nanoseconds since the big bang that you automatically want to convert to a datetime object, I suggest standardizing the concept of row factory functions. In a nutshell, cursor objects would have an optional callable rowfactory attribute. 
If a rowfactory is specified, it will translate between what a fetch would normally return and what it should return instead. Let me know what you think. -Carsten From carsten at uniqsys.com Sat Apr 21 09:09:47 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 21 Apr 2007 03:09:47 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1177119242.3305.52.camel@localhost.localdomain> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> Message-ID: <1177139387.3208.6.camel@localhost.localdomain> Hi All, I have taken the time to write out my type mapping proposal in a slightly more structured form, with some hopefully enlightening examples of how this proposal might be useful. Please see http://www.uniqsys.com/~carsten/typemap.html Any comments are welcome, and I'll do my best to incorporate constructive criticism into future revisions of this proposal. 
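Carsten's proposal might be sketched roughly as follows; SQLData, the typemap keyed by the cursor.description type indicator, and the ToDB round-trip method come from his description, while the convert() helper and the example class are invented:

```python
import datetime

class SQLData:
    # Abstract base the API module would define.
    def ToDB(self):
        raise NotImplementedError

class BigBangStamp(SQLData):
    # Hypothetical UDT: the canonical representation the driver hands
    # to the constructor is an integer count of seconds past an epoch.
    EPOCH = datetime.datetime(2000, 1, 1)

    def __init__(self, seconds):
        self.when = self.EPOCH + datetime.timedelta(seconds=seconds)

    def ToDB(self):
        # Render back to the canonical form for binding as a parameter.
        return int((self.when - self.EPOCH).total_seconds())

# Key = the type indicator the driver reports in cursor.description.
typemap = {"INTEGER": BigBangStamp}

def convert(type_code, raw):
    cls = typemap.get(type_code)
    return cls(raw) if cls else raw

stamp = convert("INTEGER", 86400)
assert stamp.when == datetime.datetime(2000, 1, 2)
assert stamp.ToDB() == 86400       # lossless round trip
```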
-Carsten From mike_mp at zzzcomputing.com Sat Apr 21 16:52:31 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Sat, 21 Apr 2007 10:52:31 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1177139387.3208.6.camel@localhost.localdomain> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> Message-ID: On Apr 21, 2007, at 3:09 AM, Carsten Haese wrote: > Hi All, > > I have taken the time to write out my type mapping proposal in a > slightly more structured form, with some hopefully enlightening > examples > of how this proposal might be useful. > > Please see http://www.uniqsys.com/~carsten/typemap.html > > Any comments are welcome, and I'll do my best to incorporate > constructive criticism into future revisions of this proposal. > heres some thoughts: - using a class-level approach, i.e. SQLData, makes it inconvenient to establish custom types that are independent of a particular DBAPI, since the SQLData class itself like everything else in DBAPI only exists within implementations. its impossible to define the class until you've imported some DBAPI. SQLData's origins in JDBC dont have this issue since SQLData is part of the abstract JDBC api and classes can be built against it independently of any database driver being available. - because SQLData's state is the data itself, SQLData is not really a "type" at all, its a value object which includes its own database translation function. 
That greatly limits what kinds of types SQLData can be realistically used for, and in fact it can only be used for datatypes that are explicitly aware that they are being stored in a database - and only a specific DBAPI too. For example, its impossible to use SQLData to directly represent Decimal instances or datetime instances; neither of them subclasses SQLData. If the answer is that we'd just use typemaps for those, then what would we use SQLData for ? I can use a typemap for my SpatialData objects just as easily, without my SpatialData object being welded to a specific persistence scheme and specific DBAPI. Also because SQLData is not stateful with regards to its type, its not possible for a single SQLData class to represent variants of a particular type, such as strings that should be truncated to length 50 versus strings that are truncated to length 100; you'd have to use more subclassing. - per-column typemaps: here is a common use case. I am receiving a row which contains two BLOB columns. one BLOB is image data representing a JPEG image, one BLOB is a pickled instance of a Python class. I would like to register type converters so that the second column is run through the pickle.loads() function but not the first. If we are registering various type-handling callables at the cursor level, it should be easy enough to add an optional integer parameter which will bind that type converter to only a specific column position in a result set. the use case is more obvious in the bind parameter direction.
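The two-BLOB use case above can be sketched concretely. The register_converter helper and the module-level converters dict are hypothetical stand-ins for the cursor-level registration being described, not an existing API:

```python
# Hypothetical sketch: two BLOB columns, where only the second should be
# run through pickle.loads().  register_converter/convert_row stand in
# for the proposed cursor-level registration with a column position.
import pickle

converters = {}  # column position -> converter callable

def register_converter(converter, column):
    converters[column] = converter

def convert_row(raw_row):
    # apply a converter only where one was registered for that position
    return tuple(converters.get(i, lambda v: v)(v)
                 for i, v in enumerate(raw_row))

register_converter(pickle.loads, 1)  # column 0 (the JPEG) stays raw

jpeg_bytes = b"\xff\xd8"  # fake JPEG header bytes, left untouched
row = convert_row((jpeg_bytes, pickle.dumps({"width": 640})))
print(row)  # (b'\xff\xd8', {'width': 640})
```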
From carsten at uniqsys.com Sat Apr 21 18:18:01 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 21 Apr 2007 12:18:01 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> Message-ID: <1177172281.3191.79.camel@localhost.localdomain> On Sat, 2007-04-21 at 10:52 -0400, Michael Bayer wrote: > On Apr 21, 2007, at 3:09 AM, Carsten Haese wrote: > > > Hi All, > > > > I have taken the time to write out my type mapping proposal in a > > slightly more structured form, with some hopefully enlightening > > examples > > of how this proposal might be useful. > > > > Please see http://www.uniqsys.com/~carsten/typemap.html > > > > Any comments are welcome, and I'll do my best to incorporate > > constructive criticism into future revisions of this proposal. > > > > heres some thoughts: Thanks for taking the time to read the proposal. You are making some good points. If you don't want to read all my responses, skip to the summary at the bottom. > - using a class-level approach, i.e. SQLData, makes it inconvenient > to establish custom types that are independent of a particular DBAPI, > since the SQLData class itself like everything else in DBAPI only > exists within implementations. its impossible to define the class > until you've imported some DBAPI. SQLData's origins in JDBC dont > have this issue since SQLData is part of the abstract JDBC api and > classes can be built against it independently of any database driver > being available. It's not impossible. 
You could always mix the somedb.SQLData base class into a generic Python object after importing somedb. Also, since the point is for an application to define type mappings for a particular database, I don't see the limitation in making the SQLData class specific to the API module that the application will usually have imported already anyway. In my opinion, making the SQLData class specific to the API module is necessary since the ToDB method for translating a particular object to the database may differ from database to database. > - because SQLData's state is the data itself, SQLData is not really a > "type" at all, its a value object which includes its own database > translation function. That greatly limits what kinds of types > SQLData can be realistically used for, and in fact it can only be > used for datatypes that are explicitly aware that they are being > stored in a database - and only a specific DBAPI too. Isn't that the point of defining a bidirectional type mapping from/to the database? True, but on one hand, datetime instances and Decimal instances should be handled by the API's canonical mapping already, and on the other hand the somedb API could always choose to return datetimes as objects that derive from both datetime and somedb.SQLData. > If the answer is that we'd just use typemaps > for those, then what would we use SQLData for ? I can use a typemap > for my SpatialData objects just as easily, without my SpatialData > object being welded to a specific persistence scheme and specific DBAPI. Of course you could, but "welding" the object to a specific DB-API is what allows the object to be passed transparently as an input parameter into queries against that specific database.
A corollary of the principle of least surprise is that it should always be possible to take the result of a select query and insert that object into the column that it was read from. Inheriting from SQLData is what allows this seamless select-insert round-trip. Having said all that, I'm not married to the idea of requiring the application-side objects to derive from a particular SQLData class. For the purpose of input binding, it would be enough, and more in line with the idea of duck-typing, if the object provided an agreed-upon method, e.g. "ToDB", that the DB-API can call to translate between application type and canonical database type. Essentially, the proposed input translation could change to

    if hasattr(in_param, "ToDB"):
        in_param = in_param.ToDB()

or something like that. It may be beneficial to allow this call to pass more parameters in order to tell the object something about the context in which the conversion is occurring, including but not limited to the name of the API module, the active connection/cursor, and, if available, the descriptor of the database-side column type and name that the object is destined for. > Also because SQLData is not stateful with regards to its type, its > not possible for a single SQLData class to represent variants of a > particular type, such as strings that should be truncated to length > 50 versus strings that are truncated to length 100; you'd have to use > more subclassing. As proposed, SQLData *is* stateful with regards to the type. By default, it's not stateful with regards to subtype, length, and precision, but this can, and should, be added. If the constructor is given the complete cursor.description entry that goes along with the value, it has everything it needs to remember this information. > - per-column typemaps: here is a common use case. I am receiving a > row which contains two BLOB columns. one BLOB is image data > representing a JPEG image, one BLOB is a pickled instance of a Python > class.
I would like to register type converters so that the second > column is run through the pickle.loads() function but not the > first. If we are registering various type-handling callables at the > cursor level, it should be easy enough to add an optional integer > parameter which will bind that type converter to only a specific > column position in a result set. the use case is more obvious in the > bind parameter direction. Yes, I already suggested this (passing the column number to the outbound adapter) as a possible extension. However, the use case is convincing enough that we should probably allow for a more convenient per-column mapping that allows dispatching the conversion to a different adapter callable altogether, rather than having to define one adapter that returns one thing or another depending on which column it's converting. To handle this, the cursor could grow a coltypemap attribute, which is a mapping of typemaps, keyed on the column number or, maybe more conveniently, column name. In summary, I am open to making the following revisions: * The SQLData class would become optional or be eliminated. Inbound type conversions between Python objects and the database will be performed by a well-defined ToDB method that the object may implement regardless of its inheritance tree. If an inbound Python object doesn't define a ToDB method, it'll be mapped by the canonical mapping for the particular database. * The outbound conversion call will receive additional parameters, such as the cursor.description tuple, that will allow the adapter to make the resulting object stateful with respect to all of its database type properties. * Add an optional coltypemap attribute to the cursor for defining a column-specific typemap. Unless I'm missing something, these revisions should address all the points you have brought up.
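The first summary point, duck-typed input binding, can be sketched in a few lines. Point and bind_param are invented names used only to illustrate the ToDB protocol; no particular DB-API module is assumed:

```python
# Minimal sketch of duck-typed input binding: any object with a ToDB
# method is converted, everything else is left to the canonical mapping.
# Point and bind_param are illustrative, not part of any real driver.

class Point:
    """Application type that knows how to render itself for the DB."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def ToDB(self):
        return "%g %g" % (self.x, self.y)  # canonical string form

def bind_param(in_param):
    if hasattr(in_param, "ToDB"):
        in_param = in_param.ToDB()
    return in_param

print(bind_param(Point(1, 2)))  # 1 2
print(bind_param(42))           # 42, passed through untouched
```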
-Carsten From mike_mp at zzzcomputing.com Sat Apr 21 19:33:04 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Sat, 21 Apr 2007 13:33:04 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1177172281.3191.79.camel@localhost.localdomain> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> Message-ID: <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> On Apr 21, 2007, at 12:18 PM, Carsten Haese wrote: > > Thanks for taking the time to read the proposal. You are making some > good points. If you don't want to read all my responses, skip to the > summary at the bottom. > of course ill read fully ! :) > > In my opinion, making the SQLData class specific to the API module is > necessary since the ToDB method for translating a particular object to > the database may differ from database to database. thats true, but there are also cases where it doesnt. the "PickleType" example is one where all we want to do is shuttle a binary stream to/from the pickle API. I also have an interest in creating code that is database portable....the connection objects dealt with are produced by a factory, where the actual DBAPI module is private to one particular implementation of that factory. > >> - because SQLData's state is the data itself, SQLData is not really a >> "type" at all, its a value object which includes its own database >> translation function. 
That greatly limits what kinds of types >> SQLData can be realistically used for, and in fact it can only be >> used for datatypes that are explicitly aware that they are being >> stored in a database - and only a specific DBAPI too. > > Isn't that the point of defining a bidirectional type mapping from/to > the database? the point of bidirectional type mapping is to define database persistence conversions for a particular application data type. but decoupling the application data type from a particular persistence strategy allows the persistence mappings to vary independently of the data type and the rest of the application on which it depends. > >> If the answer is that we'd just use typemaps >> for those, then what would we use SQLData for ? I can use a typemap >> for my SpatialData objects just as easily, without my SpatialData >> object being welded to a specific persistence scheme and specific >> DBAPI. > > Of course you could, but "welding" the object to a specific DB-API is > what allows the object to be passed transparently as an input > parameter > into queries against that specific database. > > A corollary of the principle of least surprise is that it should > always > be possible to take the result of a select query and insert that > object > into the column that it was read from. Inheriting from SQLData is what > allows this seamless select-insert round-trip. Well, in this case, SpatialData is related to its persistence implementation through subclassing. but its just as easy for SpatialData to be related to its persistence implementation via association without the subclassing requirement. SQLAlchemy provides full round-trip capability of any type you want and uses the same mechanism for all types, including at the column level, without any different treatment of "Python" types and user-defined types.
On both sides of the trip, all thats required is a dictionary that maps TypeEngine subclasses (which purely define database translation strategies) to either bind param names/positions and/or result column names/positions. mapping to DBAPI/python types is just one more way of doing that (maybe I should look into adding that dimension to SA's implementation...)

> Essentially, the proposed input translation could change to
>
>     if hasattr(in_param, "ToDB"):
>         in_param = in_param.ToDB()

OK, duck typing is much better and more analogous to JDBC's usage of an interface. this solves the module-importing issue, but not necessarily the "different db's might require different ToDB() implementations" problem - it still binds my application-level value objects to an assumption about their storage...and if my application suddenly had to support two different databases, or even to persist the same collections of objects in both of those DBs (there are definitely apps that do this), now my program design has to create copies of values to handle the discrepancy. the same issue exists for an application value that is stored in multiple places within the same database, but in different ways; such as a Date type that is stored both in some legacy table with a particular string-format style of storage and some newer table with a decimal-based storage format (or a different string format). a behind-the-scenes registry of converters mapped to my application's types solves the multiple-databases problem, and bind/column-mapped converters solve the multiple-tables problem. the non-class-bound approach, using registered converters, looks like:

    converter = cursor.type_mappings.get(type(in_param), None)
    if converter is not None:
        in_param = converter.ToDB(in_param)

that removes all interface responsibilities from in_param's class. However, I can see the value in the presence of ToDB() (and FromDB() classmethods perhaps) being useful from strictly a convenience point of view.
that is, in the common use case that the persistence of a particular kind of object has no complex requirements. but im not sure if DBAPI itself should present both a generalized method as well as a "convenience/80% case" method (of which ToDB() is the latter). If I wanted a SQLData-like class in my own application, I could easily enough create a metaclass approach that automatically registers the object's type-conversion methods using the generic typing system. > >> - per-column typemaps: here is a common use case. I am receiving a >> row which contains two BLOB columns. one BLOB is image data >> representing a JPEG image, one BLOB is a pickled instance of a Python >> class. I would like to register type converters so that the second >> column is run through the pickle.loads() function but not the >> first. If we are registering various type-handling callables at the >> cursor level, it should be easy enough to add an optional integer >> parameter which will bind that type converter to only a specific >> column position in a result set. the use case is more obvious in the >> bind parameter direction. > > Yes, I already suggested this (passing the column number to the > outbound > adapter) as a possible extension. However, the use case is convincing > enough that we should probably allow for a more convenient per-column > mapping that allows dispatching the conversion to a different adapter > callable altogether, rather than having to define one adapter that > returns one thing or another depending on which column it's converting. > > To handle this, the cursor could grow a coltypemap attribute, which > is a > mapping of typemaps, keyed on the column number or, maybe more > conveniently, column name. probably both. > > In summary, I am open to making the following revisions: > * The SQLData class would become optional or be eliminated.
Inbound > type > conversions between Python objects and the database will be > performed by > a well-defined ToDB method that the object may implement regardless of > its inheritance tree. If an inbound Python object doesn't define a > ToDB > method, it'll be mapped by the canonical mapping for the particular > database. yeah thats more or less what i was saying above. > * The outbound conversion call will receive additional parameters, > such > as the cursor.description tuple, that will allow the adapter to > make the > resulting object stateful with respect to all of its database type > properties. its possible that cursor.description doesnt have all the information we need; such as, a string column that represents dates, and we need to decide what string format is represented in the column. > * Add an optional coltypemap attribute to the cursor for defining a > column-specific typemap. yeah, just having various maps of typing information to me seems to represent the one method that is of general use for all cases. 
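The cursor.description gap mentioned above, a plain string column that actually holds dates in a known format, can be illustrated briefly. The legacy_date converter and its "%Y%m%d" format are assumptions for the sake of the example, since only the programmer knows the real format:

```python
# Sketch of the cursor.description gap: the description tuple says
# "string", but only the programmer knows the column holds dates in a
# particular format.  A column-specific converter carries that knowledge.
import datetime

def legacy_date(text):
    # format knowledge lives in the converter, not in cursor.description
    return datetime.datetime.strptime(text, "%Y%m%d").date()

print(legacy_date("20070421"))  # 2007-04-21
```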
From carsten at uniqsys.com Sat Apr 21 22:10:00 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 21 Apr 2007 16:10:00 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> Message-ID: <1177186201.3367.43.camel@localhost.localdomain> On Sat, 2007-04-21 at 13:33 -0400, Michael Bayer wrote: > On Apr 21, 2007, at 12:18 PM, Carsten Haese wrote: > > Essentially, the proposed input translation could change to > > > > if hasattr(in_param, "ToDB"): > > in_param = in_param.ToDB() > > > > OK, duck typing is much better and more analgous to JDBC's usage of > an interface. this solves the module-importing issue, but not > necessarily the "different db's might require different ToDB() > implementations" problem - it still binds my application-level value > objects to an assumption about their storage...and if my application > suddenly had to support two different databases, or even to persist > the same collections of objects in both of those DBs (there are > definitely apps that do this), now my program design has to create > copies of values to handle the discrepancy. 
the same issue exists > for an application value that is stored in multiple places within the > same database, but in different ways; such as a Date type that is > stored both in some legacy table with a particular string-format > style of storage and some newer table with a decimal-based storage > format (or a different string format). > > a behind-the-scenes registry of converters mapped to my application's > types solves the multiple-databases problem, and bind/column-mapped > converters solve the multiple-tables problem. > > the non-class-bound approach, using registered converters, looks like:
>
>     converter = cursor.type_mappings.get(type(in_param), None)
>     if converter is not None:
>         in_param = converter.ToDB(in_param)
>
> that removes all interface responsibilities from in_param's class.

Okay, here we have reached the heart of the matter: Persisting an application object in a database requires cooperation between the object and the database. Either the object needs to know about the database, or the database needs to know about the object. The former can be done by the object having a ToDB method that is given information about the database it'll be stored in, and reacts appropriately. The latter can be done in the way you propose, using an inbound typemap. I'll concede that using an inbound typemap has a beautiful symmetry to using an outbound typemap, and it's way less kludgy than making the object aware of every single database that might want to store it. However, the adapter lookup needs to be done in a way that doesn't suddenly fail if the application object is subclassed! Doing this is just a bit more involved:

    for tp in type(in_param).__mro__:
        converter = cursor.input_typemap.get(tp, None)
        if converter is not None:
            break
    if converter is not None:
        in_param = converter(in_param)

Note that it's enough if the converter is simply any callable object that returns the converted object.
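The MRO-based lookup can be exercised end to end. Here input_typemap is a plain dict standing in for the proposed cursor attribute, and AuditDate is a made-up application subclass with no converter registered for it directly:

```python
# Runnable version of the MRO-based lookup, so that a converter
# registered for a base class also applies to subclasses.  The
# input_typemap dict stands in for the proposed cursor attribute.
import datetime

input_typemap = {datetime.date: lambda d: d.isoformat()}

class AuditDate(datetime.date):
    """Application subclass; no converter registered for it directly."""

def convert_in_param(in_param):
    # walk the MRO so subclasses fall back to their base class converter
    for tp in type(in_param).__mro__:
        converter = input_typemap.get(tp)
        if converter is not None:
            return converter(in_param)
    return in_param

print(convert_in_param(AuditDate(2007, 4, 21)))  # 2007-04-21
print(convert_in_param("plain string"))          # plain string, unmapped
```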
> > To handle this [column specific mapping], the cursor could grow a coltypemap attribute, which > > is a > > mapping of typemaps, keyed on the column number or, maybe more > > conveniently, column name. > > probably both. Yeah, I actually meant both :) > > > > In summary, I am open to making the following revisions: > > * The SQLData class would become optional or be eliminated. Inbound > > type > > conversions between Python objects and the database will be > > performed by > > a well-defined ToDB method that the object may implement regardless of > > its inheritance tree. If an inbound Python object doesn't define a > > ToDB > > method, it'll be mapped by the canonical mapping for the particular > > database. > > yeah thats more or less what i was saying above. In the meantime you've made me see the light that the SQLData base class and the ToDB interface can be completely eliminated if we use an inbound typemap for handling the translation from the application to the database. In light of this development, I propose the following changes to my proposal: * The SQLData class and the ToDB interface will be eliminated. * The typemap attribute will be renamed to output_typemap. * An analogous input_typemap will be added. > > * The outbound conversion call will receive additional parameters, > > such > > as the cursor.description tuple, that will allow the adapter to > > make the > > resulting object stateful with respect to all of its database type > > properties. > > its possible that cursor.description doesnt have all the information > we need; such as, a string column that represents dates, and we need > to decide what string format is represented in the column. And who or what, other than the programmer who can handle the situation with a column-specific typemap, *would* have all the information that's needed in that case? > > * Add an optional coltypemap attribute to the cursor for defining a > > column-specific typemap. 
> > yeah, just having various maps of typing information to me seems to > represent the one method that is of general use for all cases. I'm glad we're beginning to agree. Maybe down this road, consensus can be found. -Carsten From unixdude at gmail.com Sun Apr 22 21:03:34 2007 From: unixdude at gmail.com (Jim Patterson) Date: Sun, 22 Apr 2007 15:03:34 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> References: <4624826E.5080703@egenix.com> <1176798574.3630.26.camel@mila> <4624DEFA.30501@research.att.com> <10951CF0-543F-46B5-95EE-FBCE494BDCE3@zzzcomputing.com> <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> Message-ID: (Resend since I hit reply instead of reply-all) Wow, I'm really glad to see this topic has garnered such great response from everyone. This makes me really hopeful that we can solve this issue. On 4/20/07, Anthony Tuininga wrote: > > I've been following this thread and it would appear that no real > consensus has been reached as of yet. I've looked at the api used by > sqlite and knowing its storage and type definition system it makes > good sense. I am considering adding the following to cx_Oracle, > following some of the examples given so far with modifications needed > for Oracle, and I'd appreciate any input you might have. > > cursor.setdefaulttype(databaseType, type) > connection.setdefaulttype(databaseType, type) I like the sound of this. What are you thinking for the databaseType parameter? I can see using the standard database types that the modules expose (STRING, BINARY, NUMBER, DATETIME, and ROWID) along with extended types that are custom to the database API.
If the extended types were derived from the standard types, then a bit of code that was portable could use STRING to say that all strings map to Unicode or whatever, and code that used an advanced feature of a database could map NVARCHAR to Unicode and map VARCHAR to non-Unicode. > registeradapter(type, databaseType, fromPythonMethod, toPythonMethod) > > This would specify that whenever an object of the given type is bound > to a cursor, that the fromPythonMethod method would be invoked with > the value and would expect a return value that can be directly bound > to the databaseType. The toPythonMethod method would be invoked when > columns are retrieved and would accept the databaseType value and > expect back a value of the given type. Is this mapped as a tuple of type and databaseType? Or is this mapping saying that to get to/from type you used the database type on the db side and the correct function based on direction? > Some help on the names would be appreciated as well -- its the worst > part of programming. :-) I've tried to use the DB API style of naming > -- all lower case without any underscores even though it isn't my > personal favorite. I kind of like Robert's suggestions below about using an exposed mapping object and being explicit about the direction of the conversion. So I would suggest names like "defaulttypefromdb" and "defaulttypetodb" (again using the existing db api naming style) Jim Patterson -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/db-sig/attachments/20070422/c3c5ed96/attachment.html From mike_mp at zzzcomputing.com Sun Apr 22 22:11:55 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Sun, 22 Apr 2007 16:11:55 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: <1176840738.3735.18.camel@mila> <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com> <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> Message-ID: (assuming this is meant for on-list) On Apr 22, 2007, at 3:42 PM, Jim Patterson wrote: > For the most part I have been thinking about simple type mappings, > but some > of the examples raised discuss this kind of greater flexibility. I > had been > mostly thinking about the problem of dealing with the database > specific > problem of mapping the database type to the Python universe. A number > can come back as an int/long, or a float, or a decimal, or a > complex number. > How that is accomplished is very specific to the database. > > If I'm following the capability you are talking about with dates in > strings > and pickled classes and jpegs then that should seem to me to be a > layer on top of database specific part. I see the hierarchy as: > > advanced type conversion library > python dbapi module > database provided api > database > > It would seem that if we get enough power and flexibility in the > dbapi type specifies then the "advanced type conversion library" > can be common code that does not care what type of dbapi it > is sitting on. To handle jpegs or pickled classes it needs to be > able to tell the dbapi that it wants to use BINARY objects. 
To > handle the dates in strings it needs to be able to tell the dbapi > that it wants to use strings. currently, you cant exactly write the "advanced type conversion library" in a totally dbapi-neutral way, because you cant always be sure that a DBAPI supports the native types used by a particular "advanced conversion" type. in particular I mention dates because SQLite/pysqlite has no date type - you can *only* get a datetime object to/from a sqlite database using string formatting of some kind. so if datestring parsing is part of a "layer on top of DBAPI", in the case of sqlite you need to use this layer, in the case of most other databases you dont. another example would be an "advanced" type that relies upon array values. lots of folks seem to like using Postgres' array type, a type which is not available in other DBs. so such a type which depends on underlying arrays would also need to vary its implementation depending on DBAPI. Not that converting from binary->picklestream isnt something that should be performed externally to DBAPI...but because of the variance in available type support its hard to draw a crisp line between whats "on top" of DBAPI and whats not, which is why with dates in particular I put them in the "native" category, if for no other reason than sqlite's non-support of them (well, and also that dates are pretty darn important). SQLAlchemy also expresses the "native type"/"advanced type" dichotomy explicitly. For things like dates (which are non-standard to sqlite), binary objects (which return a specialized LOB object on oracle that is normalized to act like the other DBAPIs), numbers (which are returned as Decimal in postgres, floats in all others), SA implements whole modules of different TypeEngine implementations tailored to each supported DBAPI - these types form the "lower level" set of types.
The "translation on top of a type" operation is handled by subclasses
of TypeDecorator, which references a TypeEngine (the lower-level type
base class) compositionally - currently PickleType is the only standard
type within this second hierarchy. Other folks have also implemented
Enums in this layer (which, ironically, is a native type in MySQL).

So I guess the reason I conflate the "native"/"advanced" types is that
from DBAPI to DBAPI there's no clear line as to what category a
particular kind of type falls into.

From unixdude at gmail.com  Sun Apr 22 22:19:40 2007
From: unixdude at gmail.com (Jim Patterson)
Date: Sun, 22 Apr 2007 16:19:40 -0400
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To: <1177186201.3367.43.camel@localhost.localdomain>
References: <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com>
	<703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
	<1177119242.3305.52.camel@localhost.localdomain>
	<1177139387.3208.6.camel@localhost.localdomain>
	<1177172281.3191.79.camel@localhost.localdomain>
	<4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com>
	<1177186201.3367.43.camel@localhost.localdomain>
Message-ID:

On 4/21/07, Carsten Haese wrote:
>
> In light of this development, I propose the following changes to my
> proposal:
>  * The SQLData class and the ToDB interface will be eliminated.
>  * The typemap attribute will be renamed to output_typemap.
>  * An analogous input_typemap will be added.

I'm not a big fan of the terms input and output for a case like this;
they can be confusing. For example, does "in" mean into Python or into
the database? So I would prefer more explicit terms like "fromdb" or
"topython" (I would pick the to/from db pair since it is shorter and
just as clear).

Jim Patterson
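The typemap proposal being debated never became part of PEP 249, so the
attribute names below are purely illustrative. Still, the mechanics are
simple enough to sketch as a wrapper over any existing DBAPI cursor:
each map keys on a Python type and supplies a converter for values
crossing the boundary in that direction.

```python
import sqlite3
from decimal import Decimal

class TypemapCursor:
    """Hypothetical sketch of the fromdb/todb typemap idea; the names
    fromdb_typemap and todb_typemap come from the list discussion,
    not from any shipped DBAPI."""

    def __init__(self, cursor, fromdb_typemap=None, todb_typemap=None):
        self.cursor = cursor
        self.fromdb_typemap = fromdb_typemap or {}
        self.todb_typemap = todb_typemap or {}

    def _convert(self, typemap, value):
        conv = typemap.get(type(value))
        return conv(value) if conv else value

    def execute(self, sql, params=()):
        bound = tuple(self._convert(self.todb_typemap, p) for p in params)
        return self.cursor.execute(sql, bound)

    def fetchone(self):
        row = self.cursor.fetchone()
        if row is None:
            return None
        return tuple(self._convert(self.fromdb_typemap, v) for v in row)

# Demo against sqlite3: Decimals go to the database as strings, and
# floats come back as Decimals built from their string form.
cur = TypemapCursor(sqlite3.connect(":memory:").cursor(),
                    fromdb_typemap={float: lambda v: Decimal(str(v))},
                    todb_typemap={Decimal: str})
cur.execute("CREATE TABLE prices (amount REAL)")
cur.execute("INSERT INTO prices VALUES (?)", (Decimal("19.99"),))
cur.execute("SELECT amount FROM prices")
row = cur.fetchone()
```

Keying on the Python type (rather than the database type code) is what
makes this layer portable; the database-specific knowledge stays inside
the wrapped cursor.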
From carsten at uniqsys.com  Sun Apr 22 22:31:06 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Sun, 22 Apr 2007 16:31:06 -0400
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To:
References: <6B72C802-A92C-4B2C-9682-1F581FDD6A08@zzzcomputing.com>
	<703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
	<1177119242.3305.52.camel@localhost.localdomain>
	<1177139387.3208.6.camel@localhost.localdomain>
	<1177172281.3191.79.camel@localhost.localdomain>
	<4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com>
	<1177186201.3367.43.camel@localhost.localdomain>
Message-ID: <1177273866.3236.5.camel@localhost.localdomain>

On Sun, 2007-04-22 at 16:19 -0400, Jim Patterson wrote:
> On 4/21/07, Carsten Haese wrote:
> > In light of this development, I propose the following changes to my
> > proposal:
> >  * The SQLData class and the ToDB interface will be eliminated.
> >  * The typemap attribute will be renamed to output_typemap.
> >  * An analogous input_typemap will be added.
>
> I'm not a big fan of the terms input and output for a case like this;
> they can be confusing. For example, does "in" mean into Python or
> into the database? So I would prefer more explicit terms like
> "fromdb" or "topython" (I would pick the to/from db pair since it is
> shorter and just as clear).

The wording was intended to be from the point of view of the database,
but I agree that it would eliminate confusion to make the direction
explicitly clear. I'm fine with fromdb_typemap and todb_typemap instead
of output_typemap and input_typemap, respectively.
-Carsten

From unixdude at gmail.com  Mon Apr 23 04:14:19 2007
From: unixdude at gmail.com (Jim Patterson)
Date: Sun, 22 Apr 2007 22:14:19 -0400
Subject: [DB-SIG] Controlling return types for DB APIs
In-Reply-To:
References: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local>
	<1177119242.3305.52.camel@localhost.localdomain>
	<1177139387.3208.6.camel@localhost.localdomain>
	<1177172281.3191.79.camel@localhost.localdomain>
	<4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com>
Message-ID:

On 4/22/07, Michael Bayer wrote:
>
> (assuming this is meant for on-list)

Yes, absolutely. My mistake; I'm used to lists that place themselves in
the reply-to, and I did not check the to and cc lists.

> currently, you can't exactly write the "advanced type conversion
> library" in a totally dbapi-neutral way, because you can't always be
> sure that a DBAPI supports the native types used by a particular
> "advanced conversion" type. in particular I mention dates because
> SQLite/pysqlite has no date type - you can *only* get a datetime
> object to/from a sqlite database using string formatting of some
> kind. so if datestring parsing is part of a "layer on top of DBAPI",
> in the case of sqlite you need to use this layer, in the case of most
> other databases you don't.

I've never used SQLite so I can only comment based on a quick read of
some of the docs, but my thinking is that maybe the SQLite dbapi should
be extended to provide a DATETIME type for use by the Python
programmer, and then map it into some canonical string in the SQLite
APIs. It looks like some of that support already exists with
PARSE_DECLTYPES and PARSE_COLNAMES, but it seems to be missing the
STRING, BINARY, NUMBER, DATETIME, and ROWID type objects listed in PEP
249. I'm not fully sure that it needs them, but I know I've written
code to the DB API that will not work with SQLite since it does not
have them.
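The missing type objects mentioned above follow a pattern PEP 249
spells out in its implementation hints: a marker object that compares
equal to every type code in its column class. A sketch of what a
pysqlite version might look like (the groupings of declared-type names
are invented for illustration; sqlite3 does not actually ship these
module-level objects):

```python
class DBAPITypeObject:
    """PEP 249's suggested comparison object: it compares equal to
    each of the type codes belonging to its column class."""

    def __init__(self, *values):
        self.values = values

    def __eq__(self, other):
        return other in self.values

    def __ne__(self, other):
        return other not in self.values

# Hypothetical groupings of SQLite declared-type names:
STRING = DBAPITypeObject("TEXT", "VARCHAR", "CHAR")
BINARY = DBAPITypeObject("BLOB")
NUMBER = DBAPITypeObject("INTEGER", "REAL", "NUMERIC")
DATETIME = DBAPITypeObject("DATE", "TIMESTAMP")
```

With objects like these, portable code can test
`cursor.description[i][1] == NUMBER` without knowing which concrete
type code a given driver reports.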
> another example would be an "advanced" type that relies upon array
> values. lots of folks seem to like using Postgres' array type, a
> type which is not available in other DBs. so such a type which
> depends on underlying arrays would also need to vary its
> implementation depending on DBAPI.

Given that arrays are not supported very well across databases, I'm not
sure that you can write portable code that uses them. Maybe we can
define a set of types that must be supported and a set of types that
are optional, and then, by checking at runtime whether the module
exposes a given type, this mythical "advanced" library could adjust
itself.

> Not that converting from binary->picklestream isn't something that
> should be performed externally to DBAPI... but because of the
> variance in available type support it's hard to draw a crisp line
> between what's "on top" of DBAPI and what's not, which is why with
> dates in particular I put them in the "native" category, if for no
> other reason than sqlite's non-support of them (well, and also that
> dates are pretty darn important).

I look at them that way as well, but at least initially because PEP 249
listed them as supported, and because all the databases I have used (a
small set of the total that exist) support them.

> SQLAlchemy also expresses the "native type"/"advanced type" dichotomy
> explicitly. For things like dates (which are non-standard to
> sqlite), binary objects (which return a specialized LOB object on
> oracle that is normalized to act like the other DBAPIs), numbers
> (which are returned as Decimal in postgres, floats in all others), SA
> implements whole modules of different TypeEngine implementations
> tailored to each supported DBAPI - these types form the "lower level"
> set of types.
> The "translation on top of a type" operation is handled by
> subclasses of TypeDecorator, which references a TypeEngine (the
> lower-level type base class) compositionally - currently PickleType
> is the only standard type within this second hierarchy. Other folks
> have also implemented Enums in this layer (which ironically is a
> native type in mysql).

I'm just hoping we can simplify some of this kind of stuff by putting
more of it at the DBAPI level. As you mentioned, the real question
becomes where to draw the line. It is a tough question.

I got started on this very topic because I wanted to draw the line in a
place other than where cx_Oracle had drawn it in the past. It seemed to
me that Unicode support belonged in the DBAPI, since it is somewhat
hard to get right with Oracle and the solution is VERY Oracle-specific:
setting the NLS_LANG environment variable wrong gets you no or
incorrect Unicode support. I also wanted Decimal support since I'm
doing work with money, and floating-point approximations of money are a
really scary thing. I could have used the interface that cx_Oracle
supplied to always get numbers as strings and then done the conversion
myself, but I was nervous that someone on my team would forget and it
would cause problems. Anthony was very willing to work with me to add
support for Unicode and Decimal, so for me it was an easy redraw of the
line (Anthony was already planning the Unicode work).

> So I guess the reason I conflate the "native"/"advanced" types is
> because from DBAPI to DBAPI there's no clear line as to what category
> a particular kind of type falls into.

There seems to be as much confusion within the databases themselves, so
the best we may be able to do is broad support for the common types and
a way to tell if the module supports the others.
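The runtime check suggested above - probe the imported DBAPI module for
optional type support before enabling a conversion - can be sketched
with nothing more than hasattr. The "ARRAY" attribute name here is
invented for illustration; no DBAPI in this thread is claimed to
expose it:

```python
import sqlite3

def supports(dbapi_module, name):
    """True if the imported DBAPI module exposes the named attribute
    (a type object, constructor, or extension type)."""
    return hasattr(dbapi_module, name)

# sqlite3 ships the PEP 249 Binary constructor but nothing array-like,
# so an "advanced" conversion layer would enable its pickle support
# and skip its array support for this backend.
features = {name: supports(sqlite3, name) for name in ("Binary", "ARRAY")}
```

The same probe run against a Postgres driver would presumably flip the
array entry, letting the conversion library adjust itself per backend
as Jim describes.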
In order to write these advanced converters in a way that is portable
across whichever set of DBAPIs supports the required type, we will
still have to be able to tell the DBAPI that we need the data in a
standard Python datatype so that it can be passed around. That might be
a good starting point for above/below the line. If we look at the
built-in types in the Python library reference, we can get a list of
the Python types that a developer might want to use. Some can be
narrowed down to a single option; you most likely do not need
iterators, for example.

Jim Patterson
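The floating-point money hazard raised earlier in the thread is worth
making concrete, since it is the whole motivation for wanting Decimal
at the DBAPI level. A tiny illustration of why a driver that hands back
floats is dangerous for currency, and why routing values through their
string form (as the fetch-numbers-as-strings workaround does) keeps
the arithmetic exact:

```python
from decimal import Decimal

# Ten dimes summed as binary floats drift away from one dollar,
# because 0.10 has no exact base-2 representation:
float_total = sum([0.10] * 10)

# Building Decimals from the string form of each value preserves
# the intended decimal quantities exactly:
exact_total = sum(Decimal("0.10") for _ in range(10))
```

The error per value is tiny, but it accumulates silently, which is
exactly the "someone on my team would forget" failure mode described
above.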