From phd at phd.pp.ru  Thu Jan 10 13:27:14 2008
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Thu, 10 Jan 2008 15:27:14 +0300
Subject: [DB-SIG] SQLObject 0.7.10
Message-ID: <20080110122714.GC3070@phd.pp.ru>

Hello!

I'm pleased to announce the 0.7.10 release of SQLObject.

What is SQLObject
=================

SQLObject is an object-relational mapper.  Your database tables are described
as classes, and rows are instances of those classes.  SQLObject is meant to be
easy to use and quick to get started with.

SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and
Firebird.  It also has newly added support for Sybase, MSSQL and MaxDB (also
known as SAPDB).


Where is SQLObject
==================

Site:
http://sqlobject.org

Mailing list:
https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss

Archives:
http://news.gmane.org/gmane.comp.python.sqlobject

Download:
http://cheeseshop.python.org/pypi/SQLObject/0.7.10

News and changes:
http://sqlobject.org/docs/News.html


What's New
==========

News since 0.7.9
----------------

* With PySQLite2 do not use encode()/decode() from PySQLite1 - always use
  base64 for BLOBs.

* MySQLConnection doesn't convert query strings to unicode (but allows
  passing unicode query strings if the user builds them). DB URI parameter
  sqlobject_encoding is no longer used.

For a more complete list, please see the news:
http://sqlobject.org/docs/News.html


Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From phd at phd.pp.ru  Thu Jan 10 13:32:54 2008
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Thu, 10 Jan 2008 15:32:54 +0300
Subject: [DB-SIG] SQLObject 0.8.7
Message-ID: <20080110123254.GG3070@phd.pp.ru>

Hello!

I'm pleased to announce the 0.8.7 release of SQLObject.


What is SQLObject
=================

SQLObject is an object-relational mapper.  Your database tables are described
as classes, and rows are instances of those classes.  SQLObject is meant to be
easy to use and quick to get started with.

SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and
Firebird.  It also has newly added support for Sybase, MSSQL and MaxDB (also
known as SAPDB).


Where is SQLObject
==================

Site:
http://sqlobject.org

Development:
http://sqlobject.org/devel/

Mailing list:
https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss

Archives:
http://news.gmane.org/gmane.comp.python.sqlobject

Download:
http://cheeseshop.python.org/pypi/SQLObject/0.8.7

News and changes:
http://sqlobject.org/News.html


What's New
==========

News since 0.8.6
----------------

* With PySQLite2 do not use encode()/decode() from PySQLite1 - always use
  base64 for BLOBs.

* MySQLConnection doesn't convert query strings to unicode (but allows
  passing unicode query strings if the user builds them). DB URI parameter
  sqlobject_encoding is no longer used.

For a more complete list, please see the news:
http://sqlobject.org/News.html

Oleg.

From phd at phd.pp.ru  Thu Jan 10 13:38:25 2008
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Thu, 10 Jan 2008 15:38:25 +0300
Subject: [DB-SIG] SQLObject 0.9.3
Message-ID: <20080110123825.GK3070@phd.pp.ru>

Hello!

I'm pleased to announce the 0.9.3 release of SQLObject.


What is SQLObject
=================

SQLObject is an object-relational mapper.  Your database tables are described
as classes, and rows are instances of those classes.  SQLObject is meant to be
easy to use and quick to get started with.

SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and
Firebird.  It also has newly added support for Sybase, MSSQL and MaxDB (also
known as SAPDB).


Where is SQLObject
==================

Site:
http://sqlobject.org

Development:
http://sqlobject.org/devel/

Mailing list:
https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss

Archives:
http://news.gmane.org/gmane.comp.python.sqlobject

Download:
http://cheeseshop.python.org/pypi/SQLObject/0.9.3

News and changes:
http://sqlobject.org/News.html


What's New
==========

Bug Fixes
~~~~~~~~~

* With PySQLite2 do not use encode()/decode() from PySQLite1 - always use
  base64 for BLOBs.

* MySQLConnection doesn't convert query strings to unicode (but allows
  passing unicode query strings if the user builds them). DB URI parameter
  sqlobject_encoding is no longer used.

For a more complete list, please see the news:
http://sqlobject.org/News.html

Oleg.

From omranju at gmail.com  Fri Jan 11 14:51:50 2008
From: omranju at gmail.com (OM Ranju)
Date: Fri, 11 Jan 2008 20:51:50 +0700
Subject: [DB-SIG] Requests
Message-ID: <f19157d0801110551g51c94f6btfd027f815478cbdf@mail.gmail.com>

Kindly clear me about dictionary deepily

From carsten at uniqsys.com  Fri Jan 11 15:13:44 2008
From: carsten at uniqsys.com (Carsten Haese)
Date: Fri, 11 Jan 2008 09:13:44 -0500
Subject: [DB-SIG] Requests
In-Reply-To: <f19157d0801110551g51c94f6btfd027f815478cbdf@mail.gmail.com>
References: <f19157d0801110551g51c94f6btfd027f815478cbdf@mail.gmail.com>
Message-ID: <1200060824.3433.10.camel@dot.uniqsys.com>

On Fri, 2008-01-11 at 20:51 +0700, OM Ranju wrote:
> Kindly clear me about dictionary deepily

This is not understandable as an English sentence. I am guessing that
you have a question about a dictionary, but it's not obvious what it is
you wish to know or even whether you are asking about a dictionary as a
Python data structure or about a dictionary as a list of word
definitions/translations.

Please rephrase your question and provide more detail about what you
need to know, and maybe then we'll be able to help you.

-- 
Carsten Haese
http://informixdb.sourceforge.net


From phd at phd.pp.ru  Fri Jan 11 16:24:15 2008
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Fri, 11 Jan 2008 18:24:15 +0300
Subject: [DB-SIG] SQLObject 0.10.0b1
Message-ID: <20080111152415.GG26551@phd.pp.ru>

Hello!

I'm pleased to announce 0.10.0b1, the first beta release of a new
SQLObject branch, 0.10.


What is SQLObject
=================

SQLObject is an object-relational mapper.  Your database tables are described
as classes, and rows are instances of those classes.  SQLObject is meant to be
easy to use and quick to get started with.

SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and
Firebird.  It also has newly added support for Sybase, MSSQL and MaxDB (also
known as SAPDB).


Where is SQLObject
==================

Site:
http://sqlobject.org

Development:
http://sqlobject.org/devel/

Mailing list:
https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss

Archives:
http://news.gmane.org/gmane.comp.python.sqlobject

Download:
http://cheeseshop.python.org/pypi/SQLObject/0.10.0b1

News and changes:
http://sqlobject.org/News.html


What's New
==========

Features & Interface
--------------------

* Dropped support for Python 2.2. The minimal version of Python for
  SQLObject is 2.3 now.

* Removed actively deprecated attributes;
  lowered deprecation level for other attributes to be removed after 0.10.

* SQLBuilder Select supports the rest of SelectResults options (reversed,
  distinct, joins, etc.)

* SQLObject.select() (i.e., SelectResults) and DBConnection.queryForSelect()
  use SQLBuilder Select queries; this means all SELECTs are implemented
  internally via a single mechanism.

* SQLBuilder Joins handle SQLExpression tables (not just str/SQLObject/Alias)
  and properly sqlrepr.

* SQLBuilder tablesUsedDict handles sqlrepr'able objects.

* Added SQLBuilder ImportProxy. It allows one to ignore the circular import
  issues with referring to SQLObject classes in other files - it uses the
  classregistry as the string class names for FK/Joins do, but is
  specifically intended for SQLBuilder expressions. See
  tests/test_sqlbuilder_importproxy.py.

* Added SelectResults.throughTo. It allows one to traverse relationships
  (FK/Join) via SQL, avoiding the intermediate objects. Additionally, it's
  a simple mechanism for pre-caching/eager-loading of later FK
  relationships (e.g., if you are going to loop over a select of somePeople
  and ask for aPerson.group, first call list(somePeople.throughTo.group) to
  preload those related groups and use 2 db queries instead of N+1). See
  tests/test_select_through.py.

* Added ViewSQLObject.

* Added sqlmeta.getColumns() to get all the columns for a class (including
  parent classes), excluding the column 'childName' and including the column
  'id'. sqlmeta.asDict() now uses getColumns(), so there is no need to
  override it in the inheritable sqlmeta class; this makes asDict() work
  properly on inheritable sqlobjects.

* Changed the implementation type in BoolCol under SQLite from TINYINT to
  BOOLEAN and made fromDatabase machinery to recognize it.

* Added rich comparison methods; SQLObjects of the same class are
  considered equal if they have the same id; other methods return
  NotImplemented.

* MySQLConnection (and DB URI) accept a number of SSL-related parameters:
  ssl_key, ssl_cert, ssl_ca, ssl_capath.
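
To illustrate the "2 db queries instead of N+1" point in the
SelectResults.throughTo item above, here is a minimal sketch. It does not
use SQLObject at all: it is written against the standard-library sqlite3
module, and the table names, the CountingDB helper and both functions are
made up for the illustration. It only shows why preloading the related
groups needs two queries where a per-person lookup needs N+1.

```python
import sqlite3


class CountingDB:
    """Hypothetical helper: wraps a connection and counts queries issued."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.queries = 0
        self.conn.executescript("""
            CREATE TABLE grp (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT,
                                 group_id INTEGER REFERENCES grp(id));
            INSERT INTO grp VALUES (1, 'admins');
            INSERT INTO grp VALUES (2, 'users');
            INSERT INTO person VALUES (1, 'ann', 1);
            INSERT INTO person VALUES (2, 'bob', 2);
            INSERT INTO person VALUES (3, 'cy', 2);
        """)

    def query(self, sql, args=()):
        self.queries += 1
        return self.conn.execute(sql, args).fetchall()


def groups_one_by_one(db):
    """N+1 pattern: one query for the people, then one more per person."""
    people = db.query("SELECT name, group_id FROM person ORDER BY id")
    return [(name, db.query("SELECT name FROM grp WHERE id = ?", (gid,))[0][0])
            for name, gid in people]


def groups_preloaded(db):
    """Two queries: the people, then all of their groups at once."""
    people = db.query("SELECT name, group_id FROM person ORDER BY id")
    groups = dict(db.query(
        "SELECT id, name FROM grp "
        "WHERE id IN (SELECT group_id FROM person)"))
    return [(name, groups[gid]) for name, gid in people]


slow_db, fast_db = CountingDB(), CountingDB()
slow = groups_one_by_one(slow_db)
fast = groups_preloaded(fast_db)
print(slow == fast, slow_db.queries, fast_db.queries)
```

Both functions return the same (person, group) pairs; only the number of
round trips to the database differs.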

For a more complete list, please see the news:
http://sqlobject.org/News.html

Oleg.

From mwm at mired.org  Fri Jan 11 17:35:33 2008
From: mwm at mired.org (Mike Meyer)
Date: Fri, 11 Jan 2008 11:35:33 -0500
Subject: [DB-SIG] Handling an open database connection after a fork?
Message-ID: <20080111113533.7ba8fc47@mbook.mired.org>

I have an application that's using oracle (via cx_Oracle) to log
events (among other things). It runs in multiple processes, forking
new processes as it needs them.

I.e.

db = cx_Oracle.connect(.....)
cu = db.cursor()

[do various things, including sql inserts and commits]

if fork():
   # Parent wants to keep the existing database connection.
else:
   # Child wants a database connection.

So the question is - what should the child do to get a database
connection? Can it just keep using the existing db & cu variables? If
not, does it need to do anything special, or avoid doing anything, in
order to not disrupt the parent process's use of those variables?

      thanx,
      <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

From mal at egenix.com  Sat Jan 12 14:14:03 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 12 Jan 2008 14:14:03 +0100
Subject: [DB-SIG] Handling an open database connection after a fork?
In-Reply-To: <20080111113533.7ba8fc47@mbook.mired.org>
References: <20080111113533.7ba8fc47@mbook.mired.org>
Message-ID: <4788BD1B.4050103@egenix.com>

On 2008-01-11 17:35, Mike Meyer wrote:
> I have an application that's using oracle (via cx_Oracle) to log
> events (among other things). It runs in multiple processes, forking
> new processes as it needs them.
> 
> I.e.
> 
> db = cx_Oracle.connect(.....)
> cu = db.cursor()
> 
> [do various things, including sql inserts and commits]
> 
> if fork():
>    # Parent wants to keep the existing database connection.
> else:
>    # Child wants a database connection.
> 
> So the question is - what should the child do to get a database
> connection? Can it just keep using the existing db & cu variables? If
> not, does it need to do anything special, or avoid doing anything, in
> order to not disrupt the parent process's use of those variables?

That depends on the database module you're using.

It may be enough to close all connections and reopen them
in the fork. In other cases, you need to reload the database
module as well (e.g. if the module sets up a work environment
that holds caches, etc.).

In general, it's better to avoid all this and only load the module
for the first time after the fork (both in the parent and child
processes).

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 12 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From mwm-keyword-dbsig.588a7d at mired.org  Sat Jan 12 21:27:45 2008
From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer)
Date: Sat, 12 Jan 2008 15:27:45 -0500
Subject: [DB-SIG] Handling an open database connection after a fork?
In-Reply-To: <4788BD1B.4050103@egenix.com>
References: <20080111113533.7ba8fc47@mbook.mired.org>
	<4788BD1B.4050103@egenix.com>
Message-ID: <20080112152745.3c80edda@bhuda.mired.org>

On Sat, 12 Jan 2008 14:14:03 +0100 "M.-A. Lemburg" <mal at egenix.com> wrote:

> On 2008-01-11 17:35, Mike Meyer wrote:
> > I have an application that's using oracle (via cx_Oracle) to log
> > events (among other things). It runs in multiple processes, forking
> > new processes as it needs them.
> > 
> > I.e.
> > 
> > db = cx_Oracle.connect(.....)
> > cu = db.cursor()
> > 
> > [do various things, including sql inserts and commits]
> > 
> > if fork():
> >    # Parent wants to keep the existing database connection.
> > else:
> >    # Child wants a database connection.
> > 
> > So the question is - what should the child do to get a database
> > connection? Can it just keep using the existing db & cu variables? If
> > not, does it need to do anything special, or avoid doing anything, in
> > order to not disrupt the parent process's use of those variables?
> 
> That depends on the database module you're using.

As stated, cx_Oracle.

> In general, it's better to avoid all this and only load the module
> for the first time after the fork (both in the parent and child
> processes).

Not possible. Which is why I need to find out what to do to make
oracle (via cx_Oracle) happy.

       <mike

From Dieter.Maurer at Haufe.de  Sun Jan 13 13:10:48 2008
From: Dieter.Maurer at Haufe.de (Dieter.Maurer at Haufe.de)
Date: Sun, 13 Jan 2008 13:10:48 +0100
Subject: [DB-SIG] Handling an open database connection after a fork?
In-Reply-To: <20080111113533.7ba8fc47@mbook.mired.org>
References: <20080111113533.7ba8fc47@mbook.mired.org>
Message-ID: <18313.65480.265082.886057@gargle.gargle.HOWL>

Mike Meyer wrote at 2008-1-11 11:35 -0500:
> ... existing connection in forked children ...
>So the question is - what should the child do to get a database
>connection? Can it just keep using the existing db & cu variables?

This is very unlikely.

I have had severe problems with different systems (ZODB connections,
LDAP connections). Not with Oracle connections, but probably only
because I do not use Oracle.

When the child is forked, it inherits the connections from the
parent -- but the protocols usually do not expect that several
processes (parent and child) are using them asynchronously.

In a single process, locks are often used to synchronize
access to a shared connection from different threads -- but
normal locks do not work across different processes -- and shared
memory semaphores are not that often used.

>If
>not, does it need to do anything special, or avoid doing anything, in
>order to not disrupt the parent process's use of those variables?

Open a new connection in your forked child.

It is not guaranteed that this is sufficient.
For the ZODB, I have to take additional precautions.

I finally abandoned this approach completely (because LDAP
was used deeply in my system and I had no control over the creation
of new connections) and am now using "fork+exec".
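
A minimal sketch of the "open a new connection in your forked child"
advice follows. It is only an illustration under stated assumptions: it
uses the standard-library sqlite3 module as a stand-in for cx_Oracle (whose
behaviour across fork() is exactly what is unknown in this thread), the
function name and the events table are made up, and it needs a platform
with os.fork(). The child never touches or closes the connection it
inherited, opens its own, and leaves via os._exit() so that no inherited
cleanup code runs.

```python
import os
import sqlite3
import tempfile


def log_event_in_child(db_path, message):
    """Fork; the child opens its OWN connection, writes, and exits."""
    pid = os.fork()
    if pid == 0:
        # Child: leave any connection inherited from the parent alone.
        status = 1
        try:
            child_db = sqlite3.connect(db_path)
            child_db.execute("INSERT INTO events (msg) VALUES (?)", (message,))
            child_db.commit()
            child_db.close()
            status = 0
        finally:
            # _exit() skips atexit handlers and destructors copied from
            # the parent, so the inherited connection is never finalized.
            os._exit(status)
    # Parent: wait for the child and report its exit code.
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)


db_path = os.path.join(tempfile.mkdtemp(), "events.db")
parent_db = sqlite3.connect(db_path)          # opened before the fork
parent_db.execute("CREATE TABLE events (msg TEXT)")
parent_db.commit()

exit_code = log_event_in_child(db_path, "hello from child")
rows = parent_db.execute("SELECT msg FROM events").fetchall()
print(exit_code, rows)
```

Whether leaving the inherited connection untouched is enough for a given
driver is not something this sketch can show; as noted above, for some
systems it is not.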


-- 
Viele Grüße
Dieter

Tel: 06881-7327 (Festnetz) oder 06881-5590036 (Internet)

From mwm-keyword-dbsig.588a7d at mired.org  Sun Jan 13 20:07:09 2008
From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer)
Date: Sun, 13 Jan 2008 14:07:09 -0500
Subject: [DB-SIG] Handling an open database connection after a fork?
In-Reply-To: <18313.65480.265082.886057@gargle.gargle.HOWL>
References: <20080111113533.7ba8fc47@mbook.mired.org>
	<18313.65480.265082.886057@gargle.gargle.HOWL>
Message-ID: <20080113140709.72759d28@bhuda.mired.org>

On Sun, 13 Jan 2008 13:10:48 +0100 Dieter.Maurer at Haufe.de wrote:

> Mike Meyer wrote at 2008-1-11 11:35 -0500:
> > ... existing connection in forked children ...
> >So the question is - what should the child do to get a database
> >connection? Can it just keep using the existing db & cu variables?
> This is very unlikely.

That's what I expected.

> When the child is forked, it inherits the connections from the
> parent -- but the protocols usually do not expect that several
> processes (parent and child) are using them asynchronously.

Right. The question is, what's the right way to handle the connection
on the child side of things?

> >If
> >not, does it need to do anything special, or avoid doing anything, in
> >order to not disrupt the parent process's use of those variables?
> Open a new connection in your forked child.

Obvious. What do I do with the old one? I started out by explicitly
closing it, but that seems to make oracle unhappy (internal errors of
various kinds).

> I finally abandoned this approach completely (because LDAP
> was used deeply in my system and I had no control over the creation
> of new connections) and am now using "fork+exec".

Oddly enough, fork+exec doesn't make the problem go away, just
provides another possible solution. Open fd's can either be closed on
exec, or not. Hopefully, it's closed because the python objects that
referred to it are lost across the exec. I'm willing to believe that
should work. So how do I simulate what happens on exec without
actually doing the exec?

     Thanks,
     <mike

From omranju at gmail.com  Mon Jan 14 14:57:51 2008
From: omranju at gmail.com (OM Ranju)
Date: Mon, 14 Jan 2008 20:57:51 +0700
Subject: [DB-SIG] Requests
Message-ID: <f19157d0801140557t5404cf1fjcf1b0d9a2458eb03@mail.gmail.com>

Respected Sir,
                     How can I give printer settings to the customer

From mal at egenix.com  Mon Jan 14 16:38:30 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 Jan 2008 16:38:30 +0100
Subject: [DB-SIG] Requests
In-Reply-To: <f19157d0801140557t5404cf1fjcf1b0d9a2458eb03@mail.gmail.com>
References: <f19157d0801140557t5404cf1fjcf1b0d9a2458eb03@mail.gmail.com>
Message-ID: <478B81F6.6080805@egenix.com>

On 2008-01-14 14:57, OM Ranju wrote:
> Respected Sir,
>                      How can I give printer settings to the customer

This list is about Python & databases, not printers.

-- 
Marc-Andre Lemburg
eGenix.com


From dieter at handshake.de  Mon Jan 14 19:34:44 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Mon, 14 Jan 2008 19:34:44 +0100
Subject: [DB-SIG] Handling an open database connection after a fork?
In-Reply-To: <20080113140709.72759d28@bhuda.mired.org>
References: <20080111113533.7ba8fc47@mbook.mired.org>
	<18313.65480.265082.886057@gargle.gargle.HOWL>
	<20080113140709.72759d28@bhuda.mired.org>
Message-ID: <18315.43844.657101.933850@gargle.gargle.HOWL>

Mike Meyer wrote at 2008-1-13 14:07 -0500:
> ...
>On Sun, 13 Jan 2008 13:10:48 +0100 Dieter.Maurer at Haufe.de wrote:
> ...
>> >If
>> >not, does it need to do anything special, or avoid doing anything, in
>> >order to not disrupt the parent process's use of those variables?
>> Open a new connection in your forked child.
>
>Obvious. What do I do with the old one? I started out by explicitly
>closing it, but that seems to make oracle unhappy (internal errors of
>various kinds).

Your best bet is to leave it alone.

If you are lucky (!) then this will be sufficient.
As I mentioned, for a ZODB it was not sufficient --
because the child intercepted
messages destined for the parent and ate them.

If you face similar problems, give up and "exec" in the forked
process.

>> I finally abandoned this approach completely (because LDAP
>> was used deeply in my system and I had no control over the creation
>> of new connections) and am now using "fork+exec".
>
>Oddly enough, fork+exec doesn't make the problem go away, just
>provides another possible solution.

Maybe you could state precisely what problem you have.

Usually, it is not a problem that the execed child has some
open fds that it does not need.

When it is, you can explicitly close all connections, e.g.
before you exec.

>Open fd's can either be closed on
>exec, or not. Hopefully, it's closed because the python objects that
>referred to it are lost across the exec.

No, they remain open (as the complete memory state is replaced --
there is no way, Python can intercept the "exec").

>I'm willing to believe that
>should work. So how do I simulate what happens on exec without
>actually doing the exec?

You do the "exec" -- you cannot simulate it (unless you are
using deep and very low level system magic, not directly supported by
Python).
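
The point about descriptors surviving an exec can be checked with a short
sketch. Two caveats, both later than this thread: since Python 3.4
(PEP 446) descriptors created by Python are non-inheritable by default, so
the sketch sets the flag explicitly with os.set_inheritable(); and the
helper name and the small checking program are made up for the
illustration. It assumes a POSIX system. A fresh interpreter is started
via subprocess (fork+exec under the hood) and asked whether the
descriptor number it was given is still open.

```python
import os
import subprocess
import sys

# Program run in the exec'ed child: report whether the fd number is usable.
CHECK = """
import os, sys
try:
    os.fstat(int(sys.argv[1]))
    print("open")
except OSError:
    print("closed")
"""


def fd_state_after_exec(inheritable):
    """Spawn a fresh interpreter and ask it whether our pipe fd survived."""
    read_fd, write_fd = os.pipe()
    try:
        os.set_inheritable(read_fd, inheritable)
        result = subprocess.run(
            [sys.executable, "-c", CHECK, str(read_fd)],
            close_fds=False,          # do not let subprocess close extra fds
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
        return result.stdout.strip()
    finally:
        os.close(read_fd)
        os.close(write_fd)


kept = fd_state_after_exec(True)       # inheritable: stays open across exec
dropped = fd_state_after_exec(False)   # close-on-exec: gone in the new program
print(kept, dropped)
```

An inheritable descriptor stays open in the new program even though every
Python object that referred to it is gone, which is the situation described
above; only the close-on-exec flag (or an explicit close before the exec)
changes that.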


-- 
Dieter

From james at jamesh.id.au  Fri Jan 18 09:37:46 2008
From: james at jamesh.id.au (James Henstridge)
Date: Fri, 18 Jan 2008 17:37:46 +0900
Subject: [DB-SIG] Any standard for two phase commit APIs?
Message-ID: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>

Hello,

I've been looking at implementing two phase commit for the psycopg2
driver for PostgreSQL.  It was suggested that I bring up the issue on
this list to see if there were any suggestions about what form the API
should take.

The API from the initial patch I produced stuck pretty close to the
PostgreSQL API, adding three methods to the connection object:

prepare_transaction(xid) - prepare the transaction, using the given
ID.  This closes off the transaction, allowing a new one to be started
(if needed).

commit_prepared(xid) - commit a previously prepared transaction.  Must
be called outside of a transaction (i.e. no execute() calls since the
last commit/rollback).

rollback_prepared(xid) - rollback a previously prepared transaction.

The idea being that this should be enough to plug psycopg2 into a
transaction manager such as Zope's transaction module or similar.
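
As a rough sketch of how a transaction manager might drive these three
methods, consider the following. Everything in it is an assumption made
for illustration: FakeConnection is an in-memory stand-in, not psycopg2,
and two_phase_commit() is a hypothetical coordinator, not Zope's
transaction module. Only the three method names come from the proposal
above.

```python
class FakeConnection:
    """In-memory stand-in exposing the three proposed method names."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.pending = []      # work done in the current transaction
        self.prepared = {}     # xid -> work waiting for phase two
        self.committed = []    # work that has been made durable

    def execute(self, change):
        self.pending.append(change)

    def rollback(self):
        self.pending = []

    def prepare_transaction(self, xid):
        if self.fail_on_prepare:
            raise RuntimeError("%s: cannot prepare %s" % (self.name, xid))
        # Closes off the current transaction; a new one may start now.
        self.prepared[xid] = self.pending
        self.pending = []

    def commit_prepared(self, xid):
        self.committed.extend(self.prepared.pop(xid))

    def rollback_prepared(self, xid):
        self.prepared.pop(xid)


def two_phase_commit(connections, xid):
    """Hypothetical coordinator: prepare everywhere, then commit everywhere."""
    prepared = []
    try:
        for conn in connections:          # phase one
            conn.prepare_transaction(xid)
            prepared.append(conn)
    except Exception:
        for conn in prepared:             # undo the ones already prepared
            conn.rollback_prepared(xid)
        for conn in connections:          # plain rollback for the rest
            if conn not in prepared:
                conn.rollback()
        return False
    for conn in connections:              # phase two
        conn.commit_prepared(xid)
    return True


a, b = FakeConnection("a"), FakeConnection("b")
a.execute("x")
b.execute("y")
ok = two_phase_commit([a, b], "tx1")

c, d = FakeConnection("c"), FakeConnection("d", fail_on_prepare=True)
c.execute("x")
d.execute("y")
failed = two_phase_commit([c, d], "tx2")
print(ok, a.committed, b.committed, failed, c.committed, d.committed)
```

The sketch ignores what a real coordinator must also handle, such as a
crash between the two phases and recovery of transactions left in the
prepared state.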

I understand that this API might not be implementable by other
database adapters, which brings up the question: what would be a good
API?

From a quick search, I found two other adapters implementing 2pc, both
with incompatible APIs:

kinterbasdb implements a Connection.prepare() method, which performs
the first phase and causes a subsequent commit() or rollback() to
complete that transaction.  Transaction identifiers are not exposed by
the API.

pymqi provides a patch to the DCOracle2 adapter.  It doesn't seem to
add any explicit API to the connection object, but DCOracle2 does have
an incompatible prepare() method used for prepared statements.


So are there any recommendations for what a two-phase commit API should
look like?

James.

From mal at egenix.com  Fri Jan 18 10:31:17 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 18 Jan 2008 10:31:17 +0100
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
Message-ID: <479071E5.7000000@egenix.com>

On 2008-01-18 09:37, James Henstridge wrote:
> Hello,
> 
> I've been looking at implementing two phase commit for the psycopg2
> driver for PostgreSQL.  It was suggested that I bring up the issue on
> this list to see if there were any suggestions about what form the API
> should take.
> 
> The API from the initial patch I produced stuck pretty close to the
> PostgreSQL API, adding three methods to the connection object:
> 
> prepare_transaction(xid) - prepare the transaction, using the given
> ID.  This closes off the transaction, allowing a new one to be started
> (if needed).
> 
> commit_prepared(xid) - commit a previously prepared transaction.  Must
> be called outside of a transaction (i.e. no execute() calls since the
> last commit/rollback).
> 
> rollback_prepared(xid) - rollback a previously prepared transaction.
> 
> The idea being that this should be enough to plug psycopg2 into a
> transaction manager such as Zope's transaction module or similar.

Zope doesn't require any specific additional APIs to hook
the database module into its transaction mechanism. While
you do need a wrapper (the Zope DA), the three methods used
by the Zope TM easily map onto the standard .commit() and
.rollback() methods of the database interface.

> I understand that this API might not be implementable by other
> database adapters, which brings up the question: what would be a good
> API?
> 
> From a quick search, I found two other adapters implementing 2pc, both
> with incompatible APIs:
> 
> kinterbasdb implements a Connection.prepare() method, which performs
> the first phase and causes a subsequent commit() or rollback() to
> complete that transaction.  Transaction identifiers are not exposed by
> the API.
> 
> pymqi provides a patch to the DCOracle2 adapter.  It doesn't seem to
> add any explicit API to the connection object, but DCOracle2 does have
> an incompatible prepare() method used for prepared statements.

pymqi is a wrapper for IBM MQSeries which can act as XA-compliant
two-phase commit transaction manager (TM). For this to work, the underlying
database interface has to be compatible with the XA specification,
which is essentially a C interface used directly by the TM.

Note that XA implements transactions completely outside the
normal scope of the Python database module, i.e. you may not
call .commit() or .rollback() on the connection objects, but
instead have to register with the XA TM any action that
you plan to have as part of a two-phase commit transaction.

BTW, I'm not sure whether you are interpreting the .prepare() correctly:
this only prepares a statement for later execution, it doesn't
do the first part of a two-phase commit which would be to save
the current transaction log and check whether it could potentially
be committed without problems.

> So are there any recommendations for what a two-phase commit API should
> look like?

It depends a lot on what you're trying to solve.

In general, you usually have to adjust to an existing
transaction manager and that then defines the interface
to use. There are two industry standards for this: XA (X/Open)
and DTC (MS).

-- 
Marc-Andre Lemburg
eGenix.com


From jeff at taupro.com  Fri Jan 18 11:40:55 2008
From: jeff at taupro.com (Jeff Rush)
Date: Fri, 18 Jan 2008 04:40:55 -0600
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <479071E5.7000000@egenix.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
	<479071E5.7000000@egenix.com>
Message-ID: <47908237.6010001@taupro.com>

M.-A. Lemburg wrote:
> On 2008-01-18 09:37, James Henstridge wrote:
>>
>> I've been looking at implementing two phase commit for the psycopg2
>> driver for PostgreSQL.  It was suggested that I bring up the issue on
>> this list to see if there were any suggestions about what form the API
>> should take.

Thank you!  I've been wanting a two-phase commit API for a long time, to use
with Zope myself.


>> The API from the initial patch I produced stuck pretty close to the
>> PostgreSQL API, adding three methods to the connection object:
>>
>> prepare_transaction(xid) - prepare the transaction, using the given
>> ID.  This closes off the transaction, allowing a new one to be started
>> (if needed).
>>
>> commit_prepared(xid) - commit a previously prepared transaction.  Must
>> be called outside of a transaction (i.e. no execute() calls since the
>> last commit/rollback).
>>
>> rollback_prepared(xid) - rollback a previously prepared transaction.
>>
>> The idea being that this should be enough to plug psycopg2 into a
>> transaction manager such as Zope's transaction module or similar.
> 
> Zope doesn't require any specific additional APIs to hook
> the database module into its transaction mechanism. While
> you do need a wrapper (the Zope DA), the three methods used
> by the Zope TM easily map onto the standard .commit() and
> .rollback() methods of the database interface.

To meet the atomicity requirement of ACID, Zope does need additional APIs, to
expose hooks into its two-phase mechanism.  If you only have access to the
conventional .commit() and .rollback() methods of the database interface, you
cannot handle this case:

1. You have made a change to the ZODB and to a record in the PostgreSQL
   database, which are part of a single transaction.

2. The Zope TM invokes the .commit() method of the PostgreSQL interface.

3. Then Zope TM invokes the .commit() method of the ZODB interface, which
   fails for some reason (say a WriteConflict) -- now it is too late to
   rollback the PostgreSQL commit and you're hosed.
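
The three steps above can be sketched with two in-memory stand-ins. The
Resource class and commit_one_phase() are hypothetical and talk to neither
PostgreSQL nor the ZODB; they only show that, with plain .commit() calls
made one after the other, a failure in the second commit arrives after the
first has already become permanent.

```python
class Resource:
    """Stand-in for a data store that offers only commit() and rollback()."""

    def __init__(self, name, fail_on_commit=False):
        self.name = name
        self.fail_on_commit = fail_on_commit
        self.committed = False

    def commit(self):
        if self.fail_on_commit:
            raise RuntimeError("%s: commit failed" % self.name)
        self.committed = True

    def rollback(self):
        if self.committed:
            raise RuntimeError("%s: too late, already committed" % self.name)


def commit_one_phase(resources):
    """Commit each resource in turn; return the names that cannot be undone."""
    done = []
    for resource in resources:
        try:
            resource.commit()
        except Exception:
            # Everything in `done` is already permanent at this point.
            return done
        done.append(resource.name)
    return done


postgres = Resource("postgresql")
zodb = Resource("zodb", fail_on_commit=True)
stuck = commit_one_phase([postgres, zodb])
print(stuck, postgres.committed, zodb.committed)
```

A prepare step that each store must pass before anyone commits is what
closes this window.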


>> kinterbasdb implements a Connection.prepare() method, which performs
>> the first phase and causes a subsequent commit() or rollback() to
>> complete that transaction.  Transaction identifiers are not exposed by
>> the API.
>>
>> pymqi provides a patch to the DCOracle2 adapter.  It doesn't seem to
>> add any explicit API to the connection object, but DCOracle2 does have
>> an incompatible prepare() method used for prepared statements.
> 
> pymqi is a wrapper for IBM MQSeries which can act as XA-compliant
> two-phase commit transaction manager (TM). For this to work, the underlying
> database interface has to be compatible with the XA specification,
> which is essentially a C interface used directly by the TM.
> 
> Note that XA implements transactions completely outside the
> normal scope of the Python database module, i.e. you may not
> call .commit() or .rollback() on the connection objects, but
> instead have to register with the XA TM any action that
> you plan to have as part of a two-phase commit transaction.
> 
> BTW, I'm not sure whether you are interpreting the .prepare() correctly:
> this only prepares a statement for later execution, it doesn't
> do the first part of a two-phase commit which would be to save
> the current transaction log and check whether it could potentially
> be committed without problems.

Which .prepare() are you referring to as possibly misinterpreted - that for
his notes about kinterbasdb, pymqi or PostgreSQL?

-Jeff

From nand_rathi at yahoo.com  Fri Jan 18 12:11:54 2008
From: nand_rathi at yahoo.com (Nand Rathi)
Date: Fri, 18 Jan 2008 03:11:54 -0800 (PST)
Subject: [DB-SIG] Need help regarding XA Compliant 2PC protocol
Message-ID: <682465.98796.qm@web57007.mail.re3.yahoo.com>

Hello All

Greetings

I see a current thread regarding the 2PC protocol, but my
requirement is a little different.

I need to write some python programs which will access
2 databases simultaneously (Oracle & Postgresql). I
need to use 2PC to maintain the transaction integrity.

Is there a python module available which can provide
me the 2PC facility? The application doesn't have a
need to use Zope though ;-(

Can you please guide me appropriately?

regards

N



From mal at egenix.com  Fri Jan 18 12:29:27 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 18 Jan 2008 12:29:27 +0100
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <47908237.6010001@taupro.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>	<479071E5.7000000@egenix.com>
	<47908237.6010001@taupro.com>
Message-ID: <47908D97.1080304@egenix.com>

On 2008-01-18 11:40, Jeff Rush wrote:
> M.-A. Lemburg wrote:
>> On 2008-01-18 09:37, James Henstridge wrote:
>>> I've been looking at implementing two phase commit for the psycopg2
>>> driver for PostgreSQL.  It was suggested that I bring up the issue on
>>> this list to see if there were any suggestions about what form the API
>>> should take.
> 
> Thank you!  I've been wanting a two-phase commit API for a long time, to use
> with Zope myself.
> 
> 
>>> The API from the initial patch I produced stuck pretty close to the
>>> PostgreSQL API, adding three methods to the connection object:
>>>
>>> prepare_transaction(xid) - prepare the transaction, using the given
>>> ID.  This closes off the transaction, allowing a new one to be started
>>> (if needed).
>>>
>>> commit_prepared(xid) - commit a previously prepared transaction . Must
>>> be called outside of a transaction (i.e. no execute() calls since the
>>> last commit/rollback).
>>>
>>> rollback_prepared(xid) - rollback a previously prepared transaction.
>>>
>>> The idea being that this should be enough to plug psycopg2 into a
>>> transaction manager such as Zope's transaction module or similar.
>> Zope doesn't require any specific additional APIs to hook
>> the database module into its transaction mechanism. While
>> you do need a wrapper (the Zope DA), the three methods used
>> by the Zope TM easily map onto the standard .commit() and
>> .rollback() methods of the database interface.
> 
> To meet the atomicity requirement of ACID, Zope does need additional APIs, to
> expose hooks into its two-phase mechanism.  If you only have access to the
> conventional .commit() and .rollback() methods of the database interface, you
> cannot handle this case:
> 
> 1. You have made a change to the ZODB and to a record in the PostgreSQL
>    database, which are part of a single transaction.
> 
> 2. The Zope TM invokes the .commit() method of the PostgreSQL interface.
> 
> 3. Then Zope TM invokes the .commit() method of the ZODB interface, which
>    fails for some reason (say a WriteConflict) -- now it is too late to
>    rollback the PostgreSQL commit and you're hosed.

While this would seem desirable, it is not how the Zope TM
works.

Phase 1 is implemented by doing a vote on the success
of the transaction. Phase 2 then finishes or aborts the transaction
depending on the vote.

If something fails in phase 2, there's no guarantee that partial
commits can be undone.

The .commit()/.rollback() calls on the database interface would
be implemented in the phase 2 part.

To avoid your scenario, the ZODB would have to detect the conflict
during phase 1 (ie. the voting phase).
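
As a rough sketch of that vote-then-finish flow (Participant and
run_transaction are hypothetical names, not the Zope TM API), a conflict
reported during the vote lets every participant abort before anything has
been committed:

```python
class Participant:
    """Hypothetical participant in a vote-then-finish transaction."""

    def __init__(self, name, conflict=False):
        self.name = name
        self.conflict = conflict
        self.state = "pending"

    def vote(self):
        # Phase 1: report problems while everything can still be undone.
        if self.conflict:
            raise RuntimeError("%s: conflict detected" % self.name)

    def finish(self):
        # Phase 2: this is where a plain .commit() would be called.
        self.state = "committed"

    def abort(self):
        # Phase 2 alternative: this is where .rollback() would be called.
        self.state = "aborted"


def run_transaction(participants):
    """Collect all votes first, then finish or abort every participant."""
    try:
        for p in participants:
            p.vote()
    except RuntimeError:
        for p in participants:
            p.abort()
        return False
    for p in participants:
        p.finish()
    return True


database = Participant("postgresql")
zodb = Participant("zodb", conflict=True)
committed = run_transaction([database, zodb])
print(committed, database.state, zodb.state)
```

Note that the sketch gives no protection if finish() itself fails, which is
the phase 2 limitation mentioned above.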

>>> kinterbasdb implements a Connection.prepare() method, which performs
>>> the first phase and causes a subsequent commit() or rollback() to
>>> complete that transaction.  Transaction identifiers are not exposed by
>>> the API.
>>>
>>> pymqi provides a patch to the DCOracle2 adapter.  It doesn't seem to
>>> add any explicit API to the connection object, but DCOracle2 does have
>>> an incompatible prepare() method used for prepared statements.
>> pymqi is a wrapper for IBM MQSeries which can act as XA-compliant
>> two-phase commit transaction manager (TM). For this to work, the underlying
>> database interface has to be compatible to the XA specification,
>> which is essentially a C interface used directly by the TM.
>>
>> Note that XA implements transactions completely outside the
>> normal scope of the Python database module, ie. you may not
>> call .commit() or .rollback() on the connection objects, but
>> instead have to register with the XA TM any action that
>> you plan to have as part of a two-phase commit transaction.
>>
>> BTW, I'm not sure whether you are interpreting the .prepare() correctly:
>> this only prepares a statement for later execution, it doesn't
>> do the first part of a two-phase commit which would be to save
>> the current transaction log and check whether it could potentially
>> be committed without problems.
> 
> Which .prepare() are you referring to as possibly misinterpreted - that for
> his notes about kinterbasdb, pymqi or PostgreSQL?

That of DCOracle2.

The cursor.prepare() method is a DB-API extension that we've discussed
a couple of times.

Its intent is to prepare the execution of
an SQL command on the cursor, ie. parse it, prepare the access
path on the server and possibly fetch the parameter type information
from the server as well.

Using the .prepare() method you can detect errors in the SQL
before actually running the statement with data. It also allows
setting up a pool of cursor objects that are intended to each
only execute one type of SQL command, e.g. to enhance performance
for recurring SQL commands.

I'm not aware of any discussion on a connection.prepare()
method.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 18 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From james at jamesh.id.au  Fri Jan 18 12:29:28 2008
From: james at jamesh.id.au (James Henstridge)
Date: Fri, 18 Jan 2008 20:29:28 +0900
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <479071E5.7000000@egenix.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
	<479071E5.7000000@egenix.com>
Message-ID: <a7e835d40801180329k10a41dd2x31ce83f6716fc6d@mail.gmail.com>

On 18/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> On 2008-01-18 09:37, James Henstridge wrote:
> > Hello,
> >
> > I've been looking at implementing two phase commit for the psycopg2
> > driver for PostgreSQL.  It was suggested that I bring up the issue on
> > this list to see if there were any suggestions about what form the API
> > should take.
> >
> > The API from the initial patch I produced stuck pretty close to the
> > PostgreSQL API, adding three methods to the connection object:
> >
> > prepare_transaction(xid) - prepare the transaction, using the given
> > ID.  This closes off the transaction, allowing a new one to be started
> > (if needed).
> >
> > commit_prepared(xid) - commit a previously prepared transaction . Must
> > be called outside of a transaction (i.e. no execute() calls since the
> > last commit/rollback).
> >
> > rollback_prepared(xid) - rollback a previously prepared transaction.
> >
> > The idea being that this should be enough to plug psycopg2 into a
> > transaction manager such as Zope's transaction module or similar.
>
> Zope doesn't require any specific additional APIs to hook
> the database module into its transaction mechanism. While
> you do need a wrapper (the Zope DA), the three methods used
> by the Zope TM easily map onto the standard .commit() and
> rollback() methods of the database interface.

I already have an understanding of how the Zope transaction manager
works.  The point is:

1. The database adapter needs to provide some API for use by a Zope
DataManager.  This API needs to co-exist with the standard DB-API
transaction handling.

2. If the database adapter is going to provide an API used to
implement two-phase commit, does it make sense to standardise such an
API across different database adapters?  This leads on to the
question I asked in my previous email:

> > I understand that this API might not be implementable by other
> > database adapters, which brings up the question: what would be a good
> > API?


> > From a quick search, I found two other adapters implementing 2pc both
> > with incompatible APIs:
> >
> > kinterbasdb implements a Connection.prepare() method, which performs
> > the first phase and causes a subsequent commit() or rollback() to
> > complete that transaction.  Transaction identifiers are not exposed by
> > the API.
> >
> > pymqi provides a patch to the DCOracle2 adapter.  It doesn't seem to
> > add any explicit API to the connection object, but DCOracle2 does have
> > an incompatible prepare() method used for prepared statements.
>
> pymqi is a wrapper for IBM MQSeries which can act as XA-compliant
> two-phase commit transaction manager (TM). For this to work, the underlying
> database interface has to be compatible to the XA specification,
> which is essentially a C interface used directly by the TM.
>
> Note that XA implements transactions completely outside the
> normal scope of the Python database module, ie. you may not
> call .commit() or .rollback() on the connection objects, but
> instead have to register with the XA TM any action that
> you plan to have as part of a two-phase commit transaction.

Yep.  I am not sure how easy it would be to expose a Python level
two-phase commit API for DCOracle2 -- I just brought it up as an
example of a database adapter that people are using with a transaction
manager (albeit at the C level).


> BTW, I'm not sure whether you are interpreting the .prepare() correctly:
> this only prepares a statement for later execution, it doesn't
> do the first part of a two-phase commit which would be to save
> the current transaction log and check whether it could potentially
> be committed without problems.

I guess I was a bit unclear.  When I said that DCOracle2 had an
incompatible prepare() method, I meant that it is
incompatible with respect to kinterbasdb's Connection.prepare().

Therefore, standardising a prepare() method for use in two-phase
commit would be problematic.


> > So is there any recommendations for what a two-phase commit API should
> > look like?
>
> It depends a lot on what you're trying to solve.
>
> In general, you usually have to adjust to an existing
> transaction manager and that then defines the interface
> to use. There are two industry standards for this: XA (X/Open)
> and DTC (MS).

I realise that tying a database adapter into a transaction manager
will often involve some level of database-specific code.

I just wonder if there is enough commonality to justify some level of
standardisation.  It seems silly for everyone to do things differently
for no good reason.

James.

From james at jamesh.id.au  Fri Jan 18 12:36:44 2008
From: james at jamesh.id.au (James Henstridge)
Date: Fri, 18 Jan 2008 20:36:44 +0900
Subject: [DB-SIG] Need help regarding XA Compliant 2PC protocol
In-Reply-To: <682465.98796.qm@web57007.mail.re3.yahoo.com>
References: <682465.98796.qm@web57007.mail.re3.yahoo.com>
Message-ID: <a7e835d40801180336x1af84aa8k637624c23d75d6f6@mail.gmail.com>

On 18/01/2008, Nand Rathi <nand_rathi at yahoo.com> wrote:
> Hello All
>
> Greetings
>
> I see a current thread regarding 2PC protocol, but my
> requirement is little different.
>
> I need to write some python programs which will access
> 2 databases simultaneously (Oracle & Postgresql). I
> need to use 2PC to maintain the transaction integrity.

If you are only accessing two databases, you only need 2PC support on
one of them.  The protocol would be something like this:

1. Prepare the transaction for 2PC on the first connection.
2. If the transaction could not be prepared, roll back both connections.
3. If the transaction could be prepared, commit the second connection.
4. If the second connection committed successfully, complete the
transaction on the first connection.
5. If the second connection failed to commit, roll back the prepared
transaction on the first connection.
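
A minimal sketch of those steps, assuming a first connection that offers the
three methods from the psycopg2 patch and a second connection with only the
standard DB-API calls.  Both classes here are hypothetical recording stubs,
not real adapters:

```python
class TwoPhaseConnection:
    """Hypothetical stub with the three methods from the psycopg2 patch."""

    def __init__(self, can_prepare=True):
        self.can_prepare = can_prepare
        self.log = []

    def prepare_transaction(self, xid):
        if not self.can_prepare:
            raise RuntimeError("prepare failed")
        self.log.append("prepare_transaction")

    def commit_prepared(self, xid):
        self.log.append("commit_prepared")

    def rollback_prepared(self, xid):
        self.log.append("rollback_prepared")

    def rollback(self):
        self.log.append("rollback")


class PlainConnection:
    """Hypothetical stub with only the standard commit()/rollback()."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.log = []

    def commit(self):
        if not self.can_commit:
            raise RuntimeError("commit failed")
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


def commit_both(first, second, xid):
    """Commit two connections when only `first` supports 2PC."""
    try:
        first.prepare_transaction(xid)      # step 1
    except Exception:
        first.rollback()                    # step 2
        second.rollback()
        return False
    try:
        second.commit()                     # step 3
    except Exception:
        # Step 5: the prepared transaction lives on the 2PC-capable connection.
        first.rollback_prepared(xid)
        return False
    first.commit_prepared(xid)              # step 4
    return True
```

The one remaining window is a failure of commit_prepared() itself after the
second connection has committed; the prepared transaction then has to be
completed later by hand or by a recovery job.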

The patch I did for psycopg2 should let you perform 2PC, so could be
used as above whether or not the Oracle adapter you are using supports
it.

James.

From james at jamesh.id.au  Fri Jan 18 13:05:39 2008
From: james at jamesh.id.au (James Henstridge)
Date: Fri, 18 Jan 2008 21:05:39 +0900
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <47908D97.1080304@egenix.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
	<479071E5.7000000@egenix.com> <47908237.6010001@taupro.com>
	<47908D97.1080304@egenix.com>
Message-ID: <a7e835d40801180405p7bab327dp2a152759fdd8bbf8@mail.gmail.com>

On 18/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> While this would seem desirable, it is not how the Zope TM
> works.
>
> Phase 1 is implemented by doing a vote on the success
> of the transaction. Phase 2 then finishes or aborts the transaction
> depending on the vote.
>
> If something fails in phase 2, there's no guarantee that partial
> commits can be undone.
>
> The .commit()/.rollback() calls on the database interface would
> be implemented in the phase 2 part.
>
> To avoid your scenario, the ZODB would have to detect the conflict
> during phase 1 (ie. the voting phase).

Looking at the IDataManager API, it looks like the
correct way to implement two phase commit would be:

1. if tpc_begin() is called, note that two-phase commit is being used.
2. in commit(), simply prepare the transaction if the two-phase commit
flag is set, rather than actually committing.  If this fails, the
transaction obviously fails.
3. make tpc_vote() a no-op.
4. tpc_finish() commits the prepared transaction
5. abort() and tpc_abort() roll back the prepared transaction (if one
was prepared).
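
Roughly, and only as a sketch: the class below is hypothetical, the method
signatures are simplified guesses at the IDataManager shape, and the
connection is a recording stub that assumes the three methods from my
psycopg2 patch.

```python
class RecordingConnection:
    """Hypothetical stub that records which calls were made."""

    def __init__(self):
        self.calls = []

    def prepare_transaction(self, xid):
        self.calls.append(("prepare_transaction", xid))

    def commit_prepared(self, xid):
        self.calls.append(("commit_prepared", xid))

    def rollback_prepared(self, xid):
        self.calls.append(("rollback_prepared", xid))

    def commit(self):
        self.calls.append(("commit", None))

    def rollback(self):
        self.calls.append(("rollback", None))


class TwoPhaseDataManager:
    """Hypothetical data manager following the five steps listed above."""

    def __init__(self, conn, xid):
        self.conn = conn
        self.xid = xid
        self.two_phase = False
        self.prepared = False

    def tpc_begin(self, transaction):
        self.two_phase = True                  # step 1

    def commit(self, transaction):
        if self.two_phase:                     # step 2: prepare, don't commit
            self.conn.prepare_transaction(self.xid)
            self.prepared = True
        else:
            self.conn.commit()

    def tpc_vote(self, transaction):
        pass                                   # step 3: nothing left to check

    def tpc_finish(self, transaction):
        if self.prepared:                      # step 4
            self.conn.commit_prepared(self.xid)
            self.prepared = False

    def tpc_abort(self, transaction):
        if self.prepared:                      # step 5
            self.conn.rollback_prepared(self.xid)
            self.prepared = False
        else:
            self.conn.rollback()

    abort = tpc_abort
```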

James.

From mal at egenix.com  Fri Jan 18 13:20:32 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 18 Jan 2008 13:20:32 +0100
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <a7e835d40801180329k10a41dd2x31ce83f6716fc6d@mail.gmail.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>	<479071E5.7000000@egenix.com>
	<a7e835d40801180329k10a41dd2x31ce83f6716fc6d@mail.gmail.com>
Message-ID: <47909990.3080804@egenix.com>

On 2008-01-18 12:29, James Henstridge wrote:
> On 18/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
>> On 2008-01-18 09:37, James Henstridge wrote:
>>> Hello,
>>>
>>> I've been looking at implementing two phase commit for the psycopg2
>>> driver for PostgreSQL.  It was suggested that I bring up the issue on
>>> this list to see if there were any suggestions about what form the API
>>> should take.
>>>
>>> The API from the initial patch I produced stuck pretty close to the
>>> PostgreSQL API, adding three methods to the connection object:
>>>
>>> prepare_transaction(xid) - prepare the transaction, using the given
>>> ID.  This closes off the transaction, allowing a new one to be started
>>> (if needed).
>>>
>>> commit_prepared(xid) - commit a previously prepared transaction . Must
>>> be called outside of a transaction (i.e. no execute() calls since the
>>> last commit/rollback).
>>>
>>> rollback_prepared(xid) - rollback a previously prepared transaction.
>>>
>>> The idea being that this should be enough to plug psycopg2 into a
>>> transaction manager such as Zope's transaction module or similar.
>> Zope doesn't require any specific additional APIs to hook
>> the database module into its transaction mechanism. While
>> you do need a wrapper (the Zope DA), the three methods used
>> by the Zope TM easily map onto the standard .commit() and
>> rollback() methods of the database interface.
> 
> I already have an understanding of how the Zope transaction manager
> works.  The point is:
> 
> 1. The database adapter needs to provide some API for use by a Zope
> DataManager.  This API needs to co-exist with the standard DB-API
> transaction handling.
> 
> 2. If the database adapter is going to provide an API used to
> implement two-phase commit, does it make sense to standardise such an
> API across different database adapters?  This leads on to the
> question I asked in my previous email:
>
>>> I understand that this API might not be implementable by other
>>> database adapters, which brings up the question: what would be a good
>>> API?
> 
> 
>>> From a quick search, I found two other adapters implementing 2pc both
>>> with incompatible APIs:
>>>
>>> kinterbasdb implements a Connection.prepare() method, which performs
>>> the first phase and causes a subsequent commit() or rollback() to
>>> complete that transaction.  Transaction identifiers are not exposed by
>>> the API.
>>>
>>> pymqi provides a patch to the DCOracle2 adapter.  It doesn't seem to
>>> add any explicit API to the connection object, but DCOracle2 does have
>>> an incompatible prepare() method used for prepared statements.
>> pymqi is a wrapper for IBM MQSeries which can act as XA-compliant
>> two-phase commit transaction manager (TM). For this to work, the underlying
>> database interface has to be compatible to the XA specification,
>> which is essentially a C interface used directly by the TM.
>>
>> Note that XA implements transactions completely outside the
>> normal scope of the Python database module, ie. you may not
>> call .commit() or .rollback() on the connection objects, but
>> instead have to register with the XA TM any action that
>> you plan to have as part of a two-phase commit transaction.
> 
> Yep.  I am not sure how easy it would be to expose a Python level
> two-phase commit API for DCOracle2 -- I just brought it up as an
> example of a database adapter that people are using with a transaction
> manager (albeit at the C level).
>
>> BTW, I'm not sure whether you are interpreting the .prepare() correctly:
>> this only prepares a statement for later execution, it doesn't
>> do the first part of a two-phase commit which would be to save
>> the current transaction log and check whether it could potentially
>> be committed without problems.
> 
> I guess I was a bit unclear.  When I said that DCOracle had an
> incompatible Connection.prepare() method, I meant that it is
> incompatible with respect to kinterbasdb's Connection.prepare().
> 
> Therefore, standardising a prepare() method for use in two-phase
> commit would be problematic.

Thanks for the clarification. I was thinking of the cursor.prepare()
method.

>>> So is there any recommendations for what a two-phase commit API should
>>> look like?
>> It depends a lot on what you're trying to solve.
>>
>> In general, you usually have to adjust to an existing
>> transaction manager and that then defines the interface
>> to use. There are two industry standards for this: XA (X/Open)
>> and DTC (MS).
> 
> I realise that tying a database adapter into a transaction manager
> will often involve some level of database-specific code.
> 
> I just wonder if there is enough commonality to justify some level of
> standardisation.  It seems silly for everyone to do things differently
> for no good reason.

Agreed, but the need for such interfaces only comes up if you
plan to implement the transaction manager (TM) itself in Python and
need to use the database module as resource manager (RM).

I don't know how this would work with DTC (have never used it,
but it appears to be similar to XA). With XA, the RM has to
provide a C struct defining a set of C function hooks (the XA
switch). This is then used by the TM to implement two-phase
commits.

Now, if a database provides such an XA interface, this could
also be made available to a Python TM. You'd then have to
open the connection via this XA interface rather than the
standard connection constructor (or pass in a parameter
to this constructor to make it use the XA open instead of
the RM open).

Perhaps we could piggy-back the XA-style interface onto
the connection interface and its constructor and turn it
into an XA DB-API extension ?!

XA Spec:
http://www.opengroup.org/products/publications/catalog/c193.htm

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Fri Jan 18 13:28:17 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 18 Jan 2008 13:28:17 +0100
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <a7e835d40801180405p7bab327dp2a152759fdd8bbf8@mail.gmail.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>	<479071E5.7000000@egenix.com>
	<47908237.6010001@taupro.com>	<47908D97.1080304@egenix.com>
	<a7e835d40801180405p7bab327dp2a152759fdd8bbf8@mail.gmail.com>
Message-ID: <47909B61.8080501@egenix.com>

On 2008-01-18 13:05, James Henstridge wrote:
> On 18/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
>> While this would seem desirable, it is not how the Zope TM
>> works.
>>
>> Phase 1 is implemented by doing a vote on the success
>> of the transaction. Phase 2 then finishes or aborts the transaction
>> depending on the vote.
>>
>> If something fails in phase 2, there's no guarantee that partial
>> commits can be undone.
>>
>> The .commit()/.rollback() calls on the database interface would
>> be implemented in the phase 2 part.
>>
>> To avoid your scenario, the ZODB would have to detect the conflict
>> during phase 1 (ie. the voting phase).
> 
> Looking at the IDataManager API, it looks like the
> correct way to implement two phase commit would be:
> 
> 1. if tpc_begin() is called, note that two-phase commit is being used.
> 2. in commit(), simply prepare the transaction if the two-phase commit
> flag is set, rather than actually committing.  If this fails, the
> transaction obviously fails.
> 3. make tpc_vote() a no-op.
> 4. tpc_finish() commits the prepared transaction
> 5. abort() and tpc_abort() roll back the prepared transaction (if one
> was prepared).

Agreed, but at least for Zope database adapters, that's not what's
implemented (have a look at ZRDB/TM.py).

-- 
Marc-Andre Lemburg
eGenix.com


From stuart at stuartbishop.net  Fri Jan 18 14:07:24 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Fri, 18 Jan 2008 20:07:24 +0700
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
Message-ID: <4790A48C.2030204@stuartbishop.net>

James Henstridge wrote:

> So is there any recommendations for what a two-phase commit API should
> look like?

Here are the three obvious possibilities. The first is what you already
have. The other two also allow access to all of PostgreSQL's two phase
commit API and are functionally identical. All would work fine for
integrating with the transaction managers I'm familiar with (Z2, Z3, Storm).
The difference is just spelling. Any opinions?


conn = connect([...])
[... do work ...]
try:
    xid = conn.prepare_transaction('xid%f' % random())
    [... prepare other participants ...]
except:
    conn.rollback_prepared(xid)
else:
    conn.commit_prepared(xid)


conn = connect([...])
[... do work ...]
try:
    trans = conn.prepare_transaction('xid%f' % random())
    [... prepare other participants ...]
except:
    trans.rollback()
else:
    trans.commit()


conn = connect([...])
[... do work ...]
try:
    trans = PreparedTransaction(conn, 'xid%f' % random())
    [... prepare other participants ...]
except:
    trans.rollback()
else:
    trans.commit()
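
To show that the spellings really are interchangeable, here is a minimal
sketch of the third one layered on top of the first.  StubConnection is a
hypothetical recording stub, PreparedTransaction is not an existing class in
any driver, and uuid is used for the transaction ID only as one possible way
to get a unique string:

```python
import uuid


class StubConnection:
    """Hypothetical connection offering the first spelling's three methods."""

    def __init__(self):
        self.calls = []

    def prepare_transaction(self, xid):
        self.calls.append(("prepare_transaction", xid))

    def commit_prepared(self, xid):
        self.calls.append(("commit_prepared", xid))

    def rollback_prepared(self, xid):
        self.calls.append(("rollback_prepared", xid))


class PreparedTransaction:
    """Hypothetical wrapper: prepares on creation, then commits or rolls back."""

    def __init__(self, conn, xid):
        self.conn = conn
        self.xid = xid
        conn.prepare_transaction(xid)

    def commit(self):
        self.conn.commit_prepared(self.xid)

    def rollback(self):
        self.conn.rollback_prepared(self.xid)


conn = StubConnection()
trans = PreparedTransaction(conn, "xid-" + uuid.uuid4().hex)
trans.commit()
```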

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/


From stuart at stuartbishop.net  Fri Jan 18 14:20:37 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Fri, 18 Jan 2008 20:20:37 +0700
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <47909990.3080804@egenix.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>	<479071E5.7000000@egenix.com>	<a7e835d40801180329k10a41dd2x31ce83f6716fc6d@mail.gmail.com>
	<47909990.3080804@egenix.com>
Message-ID: <4790A7A5.5060104@stuartbishop.net>

M.-A. Lemburg wrote:

> Now, if a database provides such an XA interface, this could
> also be made available to a Python TM. You'd then have to
> open the connection via this XA interface rather than the
> standard connection constructor (or pass in a parameter
> to this constructor to make it use the XA open instead of
> the RM open).
> 
> Perhaps we could piggy-back the XA-style interface onto
> the connection interface and its constructor and turn it
> into an XA DB-API extension ?!

If that has more than just 'prepare_transaction', 'commit_transaction' and
'rollback_transaction' it has no place in the DB-API IMO. These three
actions are the entirety of what PostgreSQL provides and are the building
blocks you need to build anything more complex (including XA). We don't need
driver authors to build a transaction manager. We just need driver authors
to provide the building blocks for DB-API connections to be integrated with
transaction managers.


-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/


From stuart at stuartbishop.net  Fri Jan 18 14:33:19 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Fri, 18 Jan 2008 20:33:19 +0700
Subject: [DB-SIG] Need help regarding XA Compliant 2PC protocol
In-Reply-To: <a7e835d40801180336x1af84aa8k637624c23d75d6f6@mail.gmail.com>
References: <682465.98796.qm@web57007.mail.re3.yahoo.com>
	<a7e835d40801180336x1af84aa8k637624c23d75d6f6@mail.gmail.com>
Message-ID: <4790AA9F.5000004@stuartbishop.net>

James Henstridge wrote:

> The patch I did for psycopg2 should let you perform 2PC, so could be
> used as above whether or not the Oracle adapter you are using supports
> it.

You can also do this right now if you don't mind it being ugly:

con = psycopg.connect('')
[... do stuff ...]
xid = 'xid%f' % random()

cur = con.cursor()
cur.execute('PREPARE TRANSACTION %s', [xid])
try:
    [... commit oracle ...]
except:
    cur.execute('ROLLBACK PREPARED %s', [xid])
else:
    cur.execute('COMMIT PREPARED %s', [xid])


You might be able to do the same trick with Oracle, allowing you to handle
more than 2 Oracle connections safely.

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/


From mal at egenix.com  Fri Jan 18 14:44:02 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 18 Jan 2008 14:44:02 +0100
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <4790A7A5.5060104@stuartbishop.net>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>	<479071E5.7000000@egenix.com>	<a7e835d40801180329k10a41dd2x31ce83f6716fc6d@mail.gmail.com>	<47909990.3080804@egenix.com>
	<4790A7A5.5060104@stuartbishop.net>
Message-ID: <4790AD22.9060105@egenix.com>

On 2008-01-18 14:20, Stuart Bishop wrote:
> M.-A. Lemburg wrote:
> 
>> Now, if a database provides such an XA interface, this could
>> also be made available to a Python TM. You'd then have to
>> open the connection via this XA interface rather than the
>> standard connection constructor (or pass in a parameter
>> to this constructor to make it use the XA open instead of
>> the RM open).
>>
>> Perhaps we could piggy-back the XA-style interface onto
>> the connection interface and its constructor and turn it
>> into an XA DB-API extension ?!
> 
> If that has more than just 'prepare_transaction', 'commit_transaction' and
> 'rollback_transaction' it has no place in the DB-API IMO. These three
> actions are the entirety of what PostgreSQL provides and are the building
> blocks you need to build anything more complex (including XA). We don't need
> driver authors  to build a transaction manager. We just need driver authors
> to provide the building blocks for DB-API connections to be integrated with
> transaction managers.

The XA API is a bit more complex than just the three APIs you
mention (with "prepare_transaction" meaning "prepare to commit
a transaction"):

http://www.opengroup.org/onlinepubs/009680699/toc.pdf

I'm not suggesting that we need to have all those APIs, but the
essential APIs need to be present, ie. you need to be able to:

 * put a connection under TM control or associate with a TM
   transaction (xa_open/xa_start/ax_reg)
 * prepare to commit a TM transaction (xa_prepare)
 * finally commit a TM transaction (xa_commit)
 * finally rollback a TM transaction (xa_rollback)
 * release the connection from TM control (xa_close/xa_end/ax_unreg)
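
To make the list above concrete, here is a minimal sketch of how those five
operations might be grouped into one Python interface. The class name
TwoPhaseResource and the tm_* method names are hypothetical, chosen only for
illustration; they are not part of the DB-API or of any driver. The XA call
each one loosely corresponds to is noted in the docstrings.

```python
from abc import ABC, abstractmethod


class TwoPhaseResource(ABC):
    """Hypothetical interface; the names are illustrative, not DB-API."""

    @abstractmethod
    def tm_attach(self, xid):
        """Put the connection under TM control (xa_open/xa_start/ax_reg)."""

    @abstractmethod
    def tm_prepare(self, xid):
        """Prepare to commit the TM transaction (xa_prepare)."""

    @abstractmethod
    def tm_commit(self, xid):
        """Finally commit the TM transaction (xa_commit)."""

    @abstractmethod
    def tm_rollback(self, xid):
        """Finally roll back the TM transaction (xa_rollback)."""

    @abstractmethod
    def tm_detach(self, xid):
        """Release the connection from TM control (xa_close/xa_end/ax_unreg)."""


class RecordingResource(TwoPhaseResource):
    """Toy implementation that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def tm_attach(self, xid):
        self.calls.append(("attach", xid))

    def tm_prepare(self, xid):
        self.calls.append(("prepare", xid))

    def tm_commit(self, xid):
        self.calls.append(("commit", xid))

    def tm_rollback(self, xid):
        self.calls.append(("rollback", xid))

    def tm_detach(self, xid):
        self.calls.append(("detach", xid))


def run_global_transaction(resource, xid):
    """Drive one resource through attach, prepare, commit and detach."""
    resource.tm_attach(xid)
    try:
        resource.tm_prepare(xid)
        resource.tm_commit(xid)
    except Exception:
        resource.tm_rollback(xid)
        raise
    finally:
        resource.tm_detach(xid)
```

A real TM would of course prepare every participating resource before
committing any of them; the single-resource driver here only shows the order
of the calls.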

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 18 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From james at jamesh.id.au  Fri Jan 18 15:11:19 2008
From: james at jamesh.id.au (James Henstridge)
Date: Fri, 18 Jan 2008 23:11:19 +0900
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <47909B61.8080501@egenix.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
	<479071E5.7000000@egenix.com> <47908237.6010001@taupro.com>
	<47908D97.1080304@egenix.com>
	<a7e835d40801180405p7bab327dp2a152759fdd8bbf8@mail.gmail.com>
	<47909B61.8080501@egenix.com>
Message-ID: <a7e835d40801180611j74d06a0dxa6ea3358ea58c1ae@mail.gmail.com>

On 18/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> > Looking at the IDataManager API, it looks like the
> > correct way to implement two phase commit would be:
> >
> > 1. if tpc_begin() is called, note that two-phase commit is being used.
> > 2. in commit(), simply prepare the transaction if the two-phase commit
> > flag is set, rather than actually committing.  If this fails, the
> > transaction obviously fails.
> > 3. make tpc_vote() a no-op.
> > 4. tpc_finish() commits the prepared transaction
> > 5. abort() and tpc_abort() roll back the prepared transaction (if one
> > was prepared).
>
> Agreed, but at least for Zope database adapters, that's not what's
> implemented (have a look at ZRDB/TM.py).

This looks pretty much the same as the Zope 3 zope.app.rdb case: the
default DataManager implementation provided by Zope doesn't support
two-phase commit, but it is possible for an adapter to provide its own
DataManager implementation.

This isn't too surprising when you consider that there is no standard
two-phase commit API for database adapters.
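
For illustration, the five steps quoted above might be sketched as a data
manager like the one below. It assumes a driver connection that offers a
prepare() method next to commit() and rollback(), which is exactly the part
that is not standardised; FakeConnection and TwoPhaseDataManager are
hypothetical names, and the transaction argument is accepted but unused.

```python
class FakeConnection:
    """Stand-in for a driver connection with a hypothetical prepare()."""

    def __init__(self):
        self.log = []

    def prepare(self):
        self.log.append("prepare")

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


class TwoPhaseDataManager:
    """Sketch of a data manager following steps 1 to 5."""

    def __init__(self, connection):
        self.connection = connection
        self.two_phase = False

    def tpc_begin(self, transaction):
        # Step 1: note that two-phase commit is being used.
        self.two_phase = True

    def commit(self, transaction):
        # Step 2: prepare instead of committing when in two-phase mode.
        if self.two_phase:
            self.connection.prepare()
        else:
            self.connection.commit()

    def tpc_vote(self, transaction):
        # Step 3: a no-op, the decision was already made in commit().
        pass

    def tpc_finish(self, transaction):
        # Step 4: commit the prepared transaction.
        self.connection.commit()
        self.two_phase = False

    def abort(self, transaction):
        # Step 5: roll back, whether or not anything was prepared.
        self.connection.rollback()
        self.two_phase = False

    tpc_abort = abort
```

Driving it in the order tpc_begin, commit, tpc_vote, tpc_finish leaves
["prepare", "commit"] in the fake connection's log.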

James.

From james at jamesh.id.au  Fri Jan 18 15:36:51 2008
From: james at jamesh.id.au (James Henstridge)
Date: Fri, 18 Jan 2008 23:36:51 +0900
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <4790AD22.9060105@egenix.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
	<479071E5.7000000@egenix.com>
	<a7e835d40801180329k10a41dd2x31ce83f6716fc6d@mail.gmail.com>
	<47909990.3080804@egenix.com> <4790A7A5.5060104@stuartbishop.net>
	<4790AD22.9060105@egenix.com>
Message-ID: <a7e835d40801180636n6d777b44u3cdcf12d36cb9cf3@mail.gmail.com>

On 18/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> > If that has more than just 'prepare_transaction', 'commit_transaction' and
> > 'rollback_transaction' it has no place in the DB-API IMO. These three
> > actions are the entirety of what PostgreSQL provides and are the building
> > blocks you need to build anything more complex (including XA). We don't need
> > driver authors  to build a transaction manager. We just need driver authors
> > to provide the building blocks for DB-API connections to be integrated with
> > transaction managers.
>
> The XA API is a bit more complex than just the three APIs you
> mention (with "prepare_transaction" meaning "prepare to commit
> a transaction"):
[snip]

It is worth noting that the JDBC driver implements the Java variant of
XA on top of the three primitives Stuart mentioned.  The remaining
parts are mainly policy of when to use those primitives.


As another data point on 2PC APIs, I found that the cx_Oracle driver
provides such an API:
    http://cx-oracle.sourceforge.net/html/connobj.html

It is similar to kinterbasdb's one in that it uses
prepare()/commit()/rollback() methods on the connection, but it also
sounds like it requires you to call a begin() method to start a
transaction.

James.

From dieter at handshake.de  Fri Jan 18 20:32:37 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Fri, 18 Jan 2008 20:32:37 +0100
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <47908D97.1080304@egenix.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
	<479071E5.7000000@egenix.com> <47908237.6010001@taupro.com>
	<47908D97.1080304@egenix.com>
Message-ID: <18320.65237.930087.237768@gargle.gargle.HOWL>

M.-A. Lemburg wrote at 2008-1-18 12:29 +0100:
> ...
>While this would seem desirable, it is not how the Zope TM
>works.
>
>Phase 1 is implemented by doing a vote on the success
>of the transaction. Phase 2 then finishes or aborts the transaction
>depending on the vote.
>
>If something fails in phase 2, there's no guarantee that partial
>commits can be undone.
>
>The .commit()/.rollback() calls on the database interface would
>be implemented in the phase 2 part.
>
>To avoid your scenario, the ZODB would have to detect the conflict
>during phase 1 (ie. the voting phase).

It does this indeed.

And it assumes that a resource manager accepts a vote only
when it can guarantee that the subsequent "commit" will succeed (and
does not fail).

A resource manager needs to expose both a "vote" (with the above guarantee)
and a "commit" in order to be a first class participant of
Zope's transaction system.

Relational database interfaces often lack the equivalent of a "vote".


-- 
Dieter

From james at jamesh.id.au  Mon Jan 21 06:00:42 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 14:00:42 +0900
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <18320.65237.930087.237768@gargle.gargle.HOWL>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
	<479071E5.7000000@egenix.com> <47908237.6010001@taupro.com>
	<47908D97.1080304@egenix.com>
	<18320.65237.930087.237768@gargle.gargle.HOWL>
Message-ID: <a7e835d40801202100h22920ac9wb33b348d50cc7a30@mail.gmail.com>

On 19/01/2008, Dieter Maurer <dieter at handshake.de> wrote:
> It does this indeed.
>
> And it assumes that a resource manager accepts a vote only
> when it can guarantee that the subsequent "commit" will succeed (and
> does not fail).
>
> A resource manager needs to expose both a "vote" (with the above guarantee)
> and a "commit" in order to be a first class participant of
> Zope's transaction system.
>
> Relational database interfaces often lack the equivalent of a "vote".

I'd disagree with this description.  From the Zope transaction
documentation, the order of methods is:

    tpc_begin commit tpc_vote (tpc_finish | tpc_abort)

From the descriptions of the various methods, a database adapter
supporting 2PC would prepare the transaction at commit(), and commit
or rollback that transaction in tpc_finish or tpc_abort respectively.

After preparing the transaction, the transaction should be committable
under normal circumstances, so it would have no reason to vote no as
part of tpc_vote().

I disagree that the lack of a tpc_vote() method makes the database
adapter a second class citizen: it simply reflects the fact that the
adapter makes up its mind at the commit() stage independent of what
other data managers do.

James.

From james at jamesh.id.au  Mon Jan 21 11:08:28 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 19:08:28 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for
	two phase commit APIs?)
Message-ID: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>

On 18/01/2008, James Henstridge <james at jamesh.id.au> wrote:
> So are there any recommendations for what a two-phase commit API should
> look like?

I did a bit of investigation into a few databases, and came up with a
proposal for an extension to the DB-API.

I know that there are a few incomplete portions of the proposal, so
I'd appreciate feedback.  If you have knowledge of a database not
covered here, please comment on whether the proposed API would be
workable in that context.

Re: the confusion between "prepared transactions" vs. "prepared
statements" support, this probably won't conflict since the prepared
statement extensions I saw used Cursor.prepare() rather than
Connection.prepare().  If it is a problem though, the proposal could
be modified to use prepare_transaction() or similar.

James.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: two-phase-commit.txt
Url: http://mail.python.org/pipermail/db-sig/attachments/20080121/2ffb5544/attachment.txt 

From fog at initd.org  Mon Jan 21 11:28:55 2008
From: fog at initd.org (Federico Di Gregorio)
Date: Mon, 21 Jan 2008 11:28:55 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any
	standard	for two phase commit APIs?)
In-Reply-To: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
Message-ID: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio>

I agree with your analysis; I'll add some comments about the proposal
below.

Il giorno lun, 21/01/2008 alle 19.08 +0900, James Henstridge ha scritto:
>  1. Add a Connection.begin(...) method that explicitly starts a
>     transaction.  Some argument (possibly the transaction ID) causes
>     the transaction to use two-phase commit.  May raise
>     NotSupportedError if two-phase commit is not supported.

DBAPI has always had an implicit transaction begin (for backends supporting
transactions), and adding an explicit begin() method would just confuse
users: "Should I always call begin()? Or just when I want to start a
two-phase commit?" I'd prefer the two-phase begin method to have a
different name. Let's call it xa_begin() in this discussion.

>  2. Add a Connection.prepare() method that performs the first stage of
>     two-phase commit.  May raise NotSupportedError if two-phase commit
>     is not supported, or the transaction was not started in two-phase
>     mode.
> 
Ok. (Should be named consistently with the begin method.)

>  3. Calling commit() or rollback() on the connection after prepare()
>     performs the second stage of the commit.
> 
Ok.

>  4. Calling commit() or rollback() on the connection prior to
>     prepare() performs a one-phase commit or rollback.
> 
IMHO, it should raise an error if the transaction was started for
two-phase. Otherwise I don't see any reason for (1). 

>  5. Executing statements after prepare() but before commit() or
>     rollback() results in an error (ProgrammingError?)
> 
Ok.

>  6. Closing a connection with a prepared but uncommitted transaction
>     rolls back that transaction.
> 
Stuart's comment on psycopg ML made me think about this one. Maybe we
want an option added to xa_begin() to keep the prepared transaction open
even if the connection drops.

federico

-- 
Federico Di Gregorio                         http://people.initd.org/fog
DISCLAIMER. If I receive a message from you, you are agreeing that:
 1. I am by definition, "the intended recipient".
 2. All information in the email is mine to do with as I see fit and
 make such financial profit, political mileage, or good joke as it lends
 itself to. In particular, I may quote it on USENET or the WWW.
 3. I may take the contents as representing the views of your company.
 4. This overrides any disclaimer or statement of confidentiality that
 may be included on your message.

From mal at egenix.com  Mon Jan 21 11:58:49 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 21 Jan 2008 11:58:49 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any	standard
 for two phase commit APIs?)
In-Reply-To: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
Message-ID: <47947AE9.7010101@egenix.com>

On 2008-01-21 11:28, Federico Di Gregorio wrote:
> I agree with your analysis, I'll add some comments about the proposal
> below.
> 
> Il giorno lun, 21/01/2008 alle 19.08 +0900, James Henstridge ha scritto:
>>  1. Add a Connection.begin(...) method that explicitly starts a
>>     transaction.  Some argument (possibly the transaction ID) causes
>>     the transaction to use two-phase commit.  May raise
>>     NotSupportedError if two-phase commit is not supported.
> 
> DBAPI always had implicit transaction begin (for backends supporting
> transactions) and adding an explicit begin() method would just add
> confusion onto the user. "Should I always call begin()? Or just when I
> want to start a two-phase?". I'd better like the two-phase begin method
> named otherwise. Let's call it xa_begin() in this discussion.

Agreed.

I also think that we should prefix all of these methods with
"xa_" or something similar: database backends may need to be able to
differentiate whether the user wants to e.g. commit in the context
of a two-phase commit transaction or a regular one, and the two-phase
commit is also likely going to require an argument (the transaction id).

Using a different set of methods would also make it clear to
the reader of the code that a two-phase commit transaction is
happening (which works quite differently from a one-phase one).

>>  2. Add a Connection.prepare() method that performs the first stage of
>>     two-phase commit.  May raise NotSupportedError if two-phase commit
>>     is not supported, or the transaction was not started in two-phase
>>     mode.
>>
> Ok. (Should be named accordingly with the begin method.)

.xa_prepare(xid)

>>  3. Calling commit() or rollback() on the connection after prepare()
>>     performs the second stage of the commit.
>>
> Ok.

.xa_commit(xid) and .xa_rollback(xid)

>>  4. Calling commit() or rollback() on the connection prior to
>>     prepare() performs a one-phase commit or rollback.
>>
> IMHO, it should raise an error if the transaction was started for
> two-phase. Otherwise I don't see any reason for (1). 

Agreed. They should raise an error.

In fact, when operating in two-phase commit mode, I think
using the one-phase methods .commit() and .rollback() should
raise an error. Mixing the two is normally not a good idea and
may very well result in an undefined state.

>>  5. Executing statements after prepare() but before commit() or
>>     rollback() results in an error (ProgrammingError?)
>>
> Ok.

Agreed.

>>  6. Closing a connection with a prepared but uncommitted transaction
>>     rolls back that transaction.
>>
> Stuart's comment on psycopg ML made me think about this one. Maybe we
> want an option added to xa_begin() to keep the prepared transaction open
> even if the connection drops.

A connection drop should always trigger an implicit rollback on the
server side, so I'm not sure how and where you'd keep the required
state to continue processing the transaction in case the connection
is reestablished.
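
To summarise the rules as they stand after the comments above, here is a toy
state machine. It is a sketch only: SketchConnection is a hypothetical class
that tracks state and talks to no database, ProgrammingError is a local
stand-in for the DB-API exception of the same name, and allowing
xa_rollback() before xa_prepare() is an assumption the comments above do not
settle.

```python
class ProgrammingError(Exception):
    """Local stand-in for the DB-API exception of the same name."""


class SketchConnection:
    """Toy connection that only tracks transaction state (hypothetical)."""

    def __init__(self):
        self.xid = None        # set while a two-phase transaction is open
        self.prepared = False
        self.statements = []

    def xa_begin(self, xid):
        if self.xid is not None:
            raise ProgrammingError("a two-phase transaction is already open")
        self.xid = xid

    def execute(self, sql):
        # Rule 5: no statements between prepare and commit/rollback.
        if self.prepared:
            raise ProgrammingError("no statements allowed after xa_prepare()")
        self.statements.append(sql)

    def xa_prepare(self, xid):
        self._require(xid)
        if self.prepared:
            raise ProgrammingError("transaction is already prepared")
        self.prepared = True

    def xa_commit(self, xid):
        self._require(xid)
        if not self.prepared:
            raise ProgrammingError("xa_commit() requires xa_prepare() first")
        self._reset()

    def xa_rollback(self, xid):
        # Assumption: rollback is allowed before or after prepare.
        self._require(xid)
        self._reset()

    def commit(self):
        self._reject_one_phase("commit")

    def rollback(self):
        self._reject_one_phase("rollback")

    def _reject_one_phase(self, name):
        # One-phase methods raise while in two-phase mode.
        if self.xid is not None:
            raise ProgrammingError(name + "() not allowed in two-phase mode")

    def _require(self, xid):
        if self.xid is None or xid != self.xid:
            raise ProgrammingError("unknown transaction id: %r" % (xid,))

    def _reset(self):
        self.xid = None
        self.prepared = False
```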

-- 
Marc-Andre Lemburg
eGenix.com


From james at jamesh.id.au  Mon Jan 21 12:09:12 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 20:09:12 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
Message-ID: <a7e835d40801210309p5923242bvd5acd5aaf1482c27@mail.gmail.com>

On 21/01/2008, Federico Di Gregorio <fog at initd.org> wrote:
> I agree with your analysis, I'll add some comments about the proposal
> below.
>
> Il giorno lun, 21/01/2008 alle 19.08 +0900, James Henstridge ha scritto:
> >  1. Add a Connection.begin(...) method that explicitly starts a
> >     transaction.  Some argument (possibly the transaction ID) causes
> >     the transaction to use two-phase commit.  May raise
> >     NotSupportedError if two-phase commit is not supported.
>
> DBAPI always had implicit transaction begin (for backends supporting
> transactions) and adding an explicit begin() method would just add
> confusion onto the user. "Should I always call begin()? Or just when I
> want to start a two-phase?". I'd better like the two-phase begin method
> named otherwise. Let's call it xa_begin() in this discussion.

I don't have a strong opinion here.  I used begin() in the proposal
because the method is currently available in many adapters to
explicitly start a transaction (even though they'll implicitly start a
transaction otherwise).

Extending the method seemed easier, and is what cx_Oracle did.


> >  2. Add a Connection.prepare() method that performs the first stage of
> >     two-phase commit.  May raise NotSupportedError if two-phase commit
> >     is not supported, or the transaction was not started in two-phase
> >     mode.
> >
> Ok. (Should be named accordingly with the begin method.)

I used prepare() in the proposal because that's what cx_Oracle and
kinterbasdb are already doing.


> >  3. Calling commit() or rollback() on the connection after prepare()
> >     performs the second stage of the commit.
> >
> Ok.
>
> >  4. Calling commit() or rollback() on the connection prior to
> >     prepare() performs a one-phase commit or rollback.
> >
> IMHO, it should raise an error if the transaction was started for
> two-phase. Otherwise I don't see any reason for (1).

I disagree here.  If a problem is detected early in the transaction,
calling prepare() before rollback() on the other members of the global
transaction is a waste of effort.

As for commit(), the transaction manager can use one-phase commit for
the last resource without integrity problems.  I don't see much value
in preventing this optimisation.
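
As a sketch of what I mean, a transaction manager could prepare every
resource except the last and let the last resource's one-phase commit act as
the decision point. FakeResource and commit_all are hypothetical names; the
prepare()/commit()/rollback() methods follow the proposal, and the failure
path relies on rollback() being allowed without a prior prepare().

```python
class FakeResource:
    """Stand-in for a connection offering prepare/commit/rollback."""

    def __init__(self, name, fail_prepare=False):
        self.name = name
        self.fail_prepare = fail_prepare
        self.log = []

    def prepare(self):
        if self.fail_prepare:
            raise RuntimeError(f"{self.name}: prepare failed")
        self.log.append("prepare")

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


def commit_all(resources):
    """Commit all resources, using one-phase commit for the last one."""
    if not resources:
        return
    *first, last = resources
    try:
        for resource in first:
            resource.prepare()
        # The last resource never prepares: its commit is the decision.
        last.commit()
    except Exception:
        # Roll back everyone, including resources that never prepared.
        for resource in resources:
            resource.rollback()
        raise
    # Second phase for the resources that were prepared.
    for resource in first:
        resource.commit()
```

If any prepare() fails, the remaining resources are rolled back directly
rather than being prepared first, which is the wasted effort mentioned
above.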

> >  5. Executing statements after prepare() but before commit() or
> >     rollback() results in an error (ProgrammingError?)
> >
> Ok.
>
> >  6. Closing a connection with a prepared but uncommitted transaction
> >     rolls back that transaction.
> >
> Stuart's comment on psycopg ML made me think about this one. Maybe we
> want an option added to xa_begin() to keep the prepared transaction open
> even if the connection drops.

Perhaps.  I haven't really thought much about the recovery side of the API.

James.

From fog at initd.org  Mon Jan 21 12:16:17 2008
From: fog at initd.org (Federico Di Gregorio)
Date: Mon, 21 Jan 2008 12:16:17 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any
	standard	for two phase commit APIs?)
In-Reply-To: <a7e835d40801210309p5923242bvd5acd5aaf1482c27@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
	<a7e835d40801210309p5923242bvd5acd5aaf1482c27@mail.gmail.com>
Message-ID: <1200914177.4685.43.camel@mila.office.dinunzioedigregorio>


Il giorno lun, 21/01/2008 alle 20.09 +0900, James Henstridge ha scritto:
> > IMHO, it should raise an error if the transaction was started for
> > two-phase. Otherwise I don't see any reason for (1).
>
> I disagree here.  If a problem is detected early in the transaction,
> calling prepare() before rollback() on the other members of the global
> transaction is a waste of effort.
>
> As for commit(), the transaction manager can use one-phase commit for
> the last resource without integrity problems.  I don't see much value
> in preventing this optimisation.

I agree on rollback(), not on commit(). If the transaction manager wants
to use one-phase it should do that explicitly. Allowing commit() to be
called on a two-phase transaction without first preparing it is error-prone
and can lead to subtle bugs, such as code depending on it creating a
"standard" transaction on some backends and not on others.

federico

-- 
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog at debian.org
INIT.D Developer                                           fog at initd.org
 When people say things are a lot more complicated than that, they
  means they're getting worried that they won't like the truth.
                                                    -- Granny Weatherwax

From james at jamesh.id.au  Mon Jan 21 12:31:38 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 20:31:38 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <47947AE9.7010101@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
	<47947AE9.7010101@egenix.com>
Message-ID: <a7e835d40801210331o44c1effawa2b093b41fe94eff@mail.gmail.com>

On 21/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> On 2008-01-21 11:28, Federico Di Gregorio wrote:
> > I agree with your analysis, I'll add some comments about the proposal
> > below.
> >
> > Il giorno lun, 21/01/2008 alle 19.08 +0900, James Henstridge ha scritto:
> >>  1. Add a Connection.begin(...) method that explicitly starts a
> >>     transaction.  Some argument (possibly the transaction ID) causes
> >>     the transaction to use two-phase commit.  May raise
> >>     NotSupportedError if two-phase commit is not supported.
> >
> > DBAPI always had implicit transaction begin (for backends supporting
> > transactions) and adding an explicit begin() method would just add
> > confusion onto the user. "Should I always call begin()? Or just when I
> > want to start a two-phase?". I'd better like the two-phase begin method
> > named otherwise. Let's call it xa_begin() in this discussion.
>
> Agreed.
>
> I also think that we should prefix all of these methods with
> "xa_" or something similar: database backends may need to be able to
> differentiate whether the user wants to e.g. commit in the context
> of a two-phase commit transaction or a regular one, and the two-phase
> commit is also likely going to require an argument (the transaction id).
>
> Using a different set of methods would also make it clear to
> the reader of the code that a two-phase commit transaction is
> happening (which works quite differently from a one-phase one).

I'm indifferent about this.  I don't think using the same
commit/rollback methods presents much confusion.


> >>  2. Add a Connection.prepare() method that performs the first stage of
> >>     two-phase commit.  May raise NotSupportedError if two-phase commit
> >>     is not supported, or the transaction was not started in two-phase
> >>     mode.
> >>
> > Ok. (Should be named accordingly with the begin method.)
>
> xa_prepare(xid)

In what cases would you pass a different xid to xa_prepare() vs. what
was passed to xa_begin()?

If there are none, then I'd leave the argument out: I've already told the
connection what the transaction ID is.

> >>  3. Calling commit() or rollback() on the connection after prepare()
> >>     performs the second stage of the commit.
> >>
> > Ok.
>
> xa_commit(xid) and .xa_rollback(xid)

Having these arguments would be quite useful for the recovery use-case.

I think it'd be useful to be able to use the methods without an
argument to operate on the current transaction too though.

> >>  4. Calling commit() or rollback() on the connection prior to
> >>     prepare() performs a one-phase commit or rollback.
> >>
> > IMHO, it should raise an error if the transaction was started for
> > two-phase. Otherwise I don't see any reason for (1).
>
> Agreed. They should raise an error.
>
> In fact, when operating in two-phase commit mode, I think
> using the one-phase methods .commit() and .rollback() should
> raise an error. Mixing the two is normally not a good idea and
> may very well result in an undefined state.

If we have separate rollback vs. xa_rollback, then sure.  But some
rollback method should be allowed before preparing the transaction.
The same goes for committing.


> >>  5. Executing statements after prepare() but before commit() or
> >>     rollback() results in an error (ProgrammingError?)
> >>
> > Ok.
>
> Agreed.
>
> >>  6. Closing a connection with a prepared but uncommitted transaction
> >>     rolls back that transaction.
> >>
> > Stuart's comment on psycopg ML made me think about this one. Maybe we
> > want an option added to xa_begin() to keep the prepared transaction open
> > even if the connection drops.
>
> A connection drop should always trigger an implicit rollback on the
> server side, so I'm not sure how and where you'd keep the required
> state to continue processing the transaction in case the connection
> is reestablished.

Uncommitted prepared transactions survive the connection in PostgreSQL
and can be committed from another connection.

Many 2PC-supporting databases provide some way of listing existing
transactions (e.g. MySQL's "XA RECOVER" statement), so I doubt
PostgreSQL is unique here.

At a minimum it'd be helpful to emit a warning in this case.
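
For reference, here is a small sketch of the SQL a driver might issue for
these operations on the two backends mentioned. The statements themselves are
the ones PostgreSQL and MySQL document; the STATEMENTS table, quote_xid() and
statement() are hypothetical helpers, and the quoting is deliberately naive.
Note that MySQL also requires XA START and XA END around the work before
XA PREPARE, which this table leaves out.

```python
# Statement templates per backend; "{xid}" is replaced by a quoted literal.
STATEMENTS = {
    "postgresql": {
        "prepare": "PREPARE TRANSACTION {xid}",
        "commit": "COMMIT PREPARED {xid}",
        "rollback": "ROLLBACK PREPARED {xid}",
        "list": "SELECT gid FROM pg_prepared_xacts",
    },
    "mysql": {
        "prepare": "XA PREPARE {xid}",
        "commit": "XA COMMIT {xid}",
        "rollback": "XA ROLLBACK {xid}",
        "list": "XA RECOVER",
    },
}


def quote_xid(xid):
    """Naive SQL string literal quoting, for illustration only."""
    return "'" + xid.replace("'", "''") + "'"


def statement(backend, action, xid=None):
    """Return the SQL text for one two-phase commit action."""
    template = STATEMENTS[backend][action]
    if "{xid}" not in template:
        return template
    if xid is None:
        raise ValueError("action %r needs a transaction id" % (action,))
    return template.format(xid=quote_xid(xid))
```

The "list" entries are what a recovery tool would run to find prepared
transactions left behind by a dropped connection.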

James.

From stuart at stuartbishop.net  Mon Jan 21 12:31:48 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Mon, 21 Jan 2008 18:31:48 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any	standard
 for two phase commit APIs?)
In-Reply-To: <47947AE9.7010101@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
	<47947AE9.7010101@egenix.com>
Message-ID: <479482A4.7000504@stuartbishop.net>

M.-A. Lemburg wrote:

> A connection drop should always trigger an implicit rollback on the
> server side, so I'm not sure how and where you'd keep the required
> state to continue processing the transaction in case the connection
> is reestablished.

With PostgreSQL, when you PREPARE TRANSACTION all state is flushed to disk.
If your network drops before you can commit or even if your server catches
fire you can still reconnect later and commit the transaction (provided your
disks survive).

As an example, let's say you are dealing with three data stores and an
exception is raised in the second phase whilst committing the 2nd data store.

If the transaction on the 3rd data store is rolled back then you can only
recover by somehow rolling back the transaction on the 1st and maybe 2nd
data store. Given this is probably a multi user environment this may well
involve data loss.

If the transaction on the 3rd data store is not rolled back, then you can
recover if the problem was transient by simply retrying the outstanding
commits once the network glitch or whatever has been fixed. All you need are
the transaction ids you used (which is why meaningful transaction ids can make
your life easier at 2am).
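
A minimal sketch of that retry, assuming the prepared transactions are not
rolled back when the connection drops. FakeStore is an in-memory stand-in
for a data store and commit_prepared() is a hypothetical method name; a
transient fault is modelled as a ConnectionError.

```python
class FakeStore:
    """In-memory stand-in for a store holding prepared transactions."""

    def __init__(self, prepared, failures=0):
        self.prepared = set(prepared)
        self.failures = failures   # calls that fail before the fault clears
        self.committed = []

    def commit_prepared(self, xid):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("transient failure")
        self.prepared.remove(xid)
        self.committed.append(xid)


def retry_outstanding(pending, max_attempts=3):
    """Retry second-phase commits for a list of (store, xid) pairs.

    Returns the pairs that are still outstanding after max_attempts.
    """
    for _ in range(max_attempts):
        still_pending = []
        for store, xid in pending:
            try:
                store.commit_prepared(xid)
            except ConnectionError:
                still_pending.append((store, xid))
        pending = still_pending
        if not pending:
            break
    return pending
```

All the retry loop needs is the list of stores and the transaction ids that
were used, which is the point made above.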

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/


From james at jamesh.id.au  Mon Jan 21 12:35:43 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 20:35:43 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <1200914177.4685.43.camel@mila.office.dinunzioedigregorio>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
	<a7e835d40801210309p5923242bvd5acd5aaf1482c27@mail.gmail.com>
	<1200914177.4685.43.camel@mila.office.dinunzioedigregorio>
Message-ID: <a7e835d40801210335v5eef1dc9v9eccf74e4780b87a@mail.gmail.com>

On 21/01/2008, Federico Di Gregorio <fog at initd.org> wrote:
>
> Il giorno lun, 21/01/2008 alle 20.09 +0900, James Henstridge ha scritto:
> > > IMHO, it should raise an error if the transaction was started for
> > > two-phase. Otherwise I don't see any reason for (1).
> >
> > I disagree here.  If a problem is detected early in the transaction,
> > calling prepare() before rollback() on the other members of the global
> > transaction is a waste of effort.
> >
> > As for commit(), the transaction manager can use one-phase commit for
> > the last resource without integrity problems.  I don't see much value
> > in preventing this optimisation.
>
> I agree on rollback(), not on commit(). If the transaction manager wants
> to use one-phase it should do that explicitly. Allowing to call commit
> on a two-phase transaction without first preparing it is prone to errors
> and can lead to subtle errors like depending on it creating a "standard"
> transaction on some backends and not on others.

MySQL appears to have a special API for performing a one-phase commit
of an XA transaction:

    XA COMMIT xid ONE PHASE

Perhaps an argument to xa_commit() would be appropriate here?

    connection.xa_commit(onephase=True)

Without the argument, the commit would be considered to be a
ProgrammingError.  That would reduce the chance of programmer error
leading to data corruption.

James.

From fog at initd.org  Mon Jan 21 12:53:05 2008
From: fog at initd.org (Federico Di Gregorio)
Date: Mon, 21 Jan 2008 12:53:05 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any
	standard	for two phase commit APIs?)
In-Reply-To: <a7e835d40801210335v5eef1dc9v9eccf74e4780b87a@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
	<a7e835d40801210309p5923242bvd5acd5aaf1482c27@mail.gmail.com>
	<1200914177.4685.43.camel@mila.office.dinunzioedigregorio>
	<a7e835d40801210335v5eef1dc9v9eccf74e4780b87a@mail.gmail.com>
Message-ID: <1200916385.4685.48.camel@mila.office.dinunzioedigregorio>


Il giorno lun, 21/01/2008 alle 20.35 +0900, James Henstridge ha scritto:
> MySQL appears to have a special API for performing a one-phase commit
> of an XA transaction:
> 
>     XA COMMIT xid ONE PHASE
> 
> Perhaps an argument to xa_commit() would be appropriate here?
> 
>     connection.xa_commit(onephase=True)
> 
> Without the argument, the commit would be considered to be a
> ProgrammingError.  That would reduce the chance of programmer error
> leading to data corruption.

Let's not design an API around features that are useful on only a single
backend. I suppose the need for a one-phase commit in a two-phase
transaction is rare. A simple API means early adoption by most of the
adapters.
federico

-- 
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog at debian.org
INIT.D Developer                                           fog at initd.org
  We should forget about small efficiencies, say about 97% of the
   time: premature optimization is the root of all evil.    -- D.E.Knuth

From mal at egenix.com  Mon Jan 21 12:57:22 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 21 Jan 2008 12:57:22 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any	standard
 for two phase commit APIs?)
In-Reply-To: <479482A4.7000504@stuartbishop.net>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>	<47947AE9.7010101@egenix.com>
	<479482A4.7000504@stuartbishop.net>
Message-ID: <479488A2.5000603@egenix.com>

On 2008-01-21 12:31, Stuart Bishop wrote:
> M.-A. Lemburg wrote:
> 
>> A connection drop should always trigger an implicit rollback on the
>> server side, so I'm not sure how and where you'd keep the required
>> state to continue processing the transaction in case the connection
>> is reestablished.
> 
> With PostgreSQL, when you PREPARE TRANSACTION all state is flushed to disk.
> If your network drops before you can commit or even if your server catches
> fire you can still reconnect later and commit the transaction (provided your
> disks survive).
> 
> As an example, lets say you are dealing with three data stores and an
> exception is raised in the second phase whilst committing the 2nd data store.
> 
> If the transaction on the 3rd data store is rolled back then you can only
> recover by somehow rolling back the transaction on the 1st and maybe 2nd
> data store. Given this is probably a multi user environment this may well
> involve data loss.
> 
> If the transaction on the 3rd data store is not rolled back, then you can
> recover if the problem was transient by simply retrying the outstanding
> commits once the network glitch or whatever has been fixed. All you need are
> the transaction ids you used (and why meaningful transaction ids can make
> your life easier at 2am).

Thanks for the explanations. I was actually thinking of the
connection between the TM and the RM (the database backend).
The typical behavior of a TM is to cancel the ongoing
two-phase commit transaction if an RM becomes unavailable.

However, I can see your point. If the data stays on the
database server and can be addressed via the XID, then a
dropped connection wouldn't hurt all that much.
Then again: how do you tell the database to forget about
the data stored for an XID ?

XA has an xa_forget() API for this, but I'm not sure whether
this is expected to also work across TM-RM reconnects or
whether the TM is actually expected to retry the reconnect
at all.

In MQ Series apps, the typical behavior would be to put
the data back on the queue and retry the whole transaction
at some later point.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 21 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From james at jamesh.id.au  Mon Jan 21 13:10:07 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 21:10:07 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <479488A2.5000603@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
	<47947AE9.7010101@egenix.com> <479482A4.7000504@stuartbishop.net>
	<479488A2.5000603@egenix.com>
Message-ID: <a7e835d40801210410i3a80b959ib859d9d7abec4e7d@mail.gmail.com>

On 21/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> Thanks for the explanations. I was actually thinking of the
> connection between the TM and the RM (the database backend).
> The typical behavior of a TM is to cancel the ongoing
> two-phase commit transaction if an RM becomes unavailable.

Stuart's use case is if the RM dies during the second phase of the
commit.  It is an edge case, but then 2PC is all about edge cases :)

If it happens before that point, then rolling back is appropriate.


> However, I can see your point. If the data stays on the
> database server and can be addressed via the XID, then a
> dropped connection wouldn't hurt all that much.
> Then again: how do you tell the database to forget about
> the data stored for an XID ?

You ask for the transaction to be rolled back (e.g. "ROLLBACK PREPARED
xid" in PostgreSQL, and "XA ROLLBACK xid" for MySQL).
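
As a rough illustration, the statements for finishing an already prepared transaction could be built as follows. The helper name finish_prepared_sql and the backend keys are made up for this sketch. Restricting the ID to a conservative character set is my own assumption: the ID ends up inside a quoted literal, and the two servers do not share escaping rules.

```python
import re

# Conservative whitelist so the ID can be embedded in a quoted literal safely.
_SAFE_XID = re.compile(r"[A-Za-z0-9_.-]+")

# Statement templates for finishing a prepared transaction on each server.
_TEMPLATES = {
    ("postgresql", "commit"): "COMMIT PREPARED '%s'",
    ("postgresql", "rollback"): "ROLLBACK PREPARED '%s'",
    ("mysql", "commit"): "XA COMMIT '%s'",
    ("mysql", "rollback"): "XA ROLLBACK '%s'",
}


def finish_prepared_sql(backend, action, xid):
    """Return the SQL that commits or rolls back a prepared transaction."""
    if not _SAFE_XID.fullmatch(xid):
        raise ValueError("unsupported characters in transaction ID: %r" % (xid,))
    try:
        template = _TEMPLATES[(backend, action)]
    except KeyError:
        raise ValueError("unknown backend/action: %s/%s" % (backend, action))
    return template % xid
```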

James.

From james at jamesh.id.au  Mon Jan 21 13:12:17 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 21:12:17 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <1200916385.4685.48.camel@mila.office.dinunzioedigregorio>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
	<a7e835d40801210309p5923242bvd5acd5aaf1482c27@mail.gmail.com>
	<1200914177.4685.43.camel@mila.office.dinunzioedigregorio>
	<a7e835d40801210335v5eef1dc9v9eccf74e4780b87a@mail.gmail.com>
	<1200916385.4685.48.camel@mila.office.dinunzioedigregorio>
Message-ID: <a7e835d40801210412h625d923doa9bcfb3bbe6a74a@mail.gmail.com>

On 21/01/2008, Federico Di Gregorio <fog at initd.org> wrote:
> > Perhaps an argument to xa_commit() would be appropriate here?
> >
> >     connection.xa_commit(onephase=True)
> >
> > Without the argument, the commit would be considered to be a
> > ProgrammingError.  That would reduce the chance of programmer error
> > leading to data corruption.
>
> Let's not make an API that has features useful on a single backend. I
> suppose the necessity for a one-phase commit in a two-phase transaction
> is rare. A simple API means early adoption by most of the adapters.

Well, Postgres lets you commit a 2PC transaction before preparing it
too (after all, it doesn't know you are using 2PC until you prepare).

Judging by the kinterbasdb and cx_Oracle code, they can do so as well.
 This isn't just a "single backend" feature.

James.

From mal at egenix.com  Mon Jan 21 13:19:25 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 21 Jan 2008 13:19:25 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for	two phase commit APIs?)
In-Reply-To: <a7e835d40801210410i3a80b959ib859d9d7abec4e7d@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>	<47947AE9.7010101@egenix.com>
	<479482A4.7000504@stuartbishop.net>	<479488A2.5000603@egenix.com>
	<a7e835d40801210410i3a80b959ib859d9d7abec4e7d@mail.gmail.com>
Message-ID: <47948DCD.6000900@egenix.com>

On 2008-01-21 13:10, James Henstridge wrote:
> On 21/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
>> Thanks for the explanations. I was actually thinking of the
>> connection between the TM and the RM (the database backend).
>> The typical behavior of a TM is to cancel the ongoing
>> two-phase commit transaction if an RM becomes unavailable.
> 
> Stuart's use case is if the RM dies during the second phase of the
> commit.  It is an edge case, but then 2PC is all about edge cases :)
> 
> If it happens before that point, then rolling back is appropriate.
> 
> 
>> However, I can see your point. If the data stays on the
>> database server and can be addressed via the XID, then a
>> dropped connection wouldn't hurt all that much.
>> Then again: how do you tell the database to forget about
>> the data stored for an XID ?
> 
> You ask for the transaction to be rolled back (e.g. "ROLLBACK PREPARED
> xid" in PostgreSQL, and "XA ROLLBACK xid" for MySQL).

Sorry, I wasn't clear enough:

If a connection fails and the transaction XID persists, how do you:

 * identify which XIDs are still pending (xa_recover)

 * tell the RM to drop all resources associated with an XID
   (xa_forget)

once the TM has reconnected. These APIs appear to be needed
in order for the TM to be able to clean up the RM after e.g.
a lost connection.

OTOH, perhaps just doing a rollback with the known XID and
ignoring any errors would do the same without the need for
extra APIs.
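
A minimal sketch of that fallback, assuming a hypothetical connection whose xa_rollback(xid) raises DatabaseError for an ID the server does not know. FakeConnection and the local DatabaseError are stand-ins for illustration only, not any real adapter.

```python
class DatabaseError(Exception):
    """Stand-in for the DB-API DatabaseError exception."""


class FakeConnection:
    """Hypothetical connection that remembers a set of prepared XIDs."""

    def __init__(self, pending):
        self.pending = set(pending)

    def xa_rollback(self, xid):
        if xid not in self.pending:
            raise DatabaseError("unknown transaction ID: %r" % (xid,))
        self.pending.remove(xid)


def rollback_known_xids(connection, xids):
    """Roll back every XID the server still knows; ignore the rest."""
    cleaned = []
    for xid in xids:
        try:
            connection.xa_rollback(xid)
        except DatabaseError:
            # Already finished, or never prepared: nothing left to clean up.
            continue
        cleaned.append(xid)
    return cleaned
```

The obvious drawback is that swallowing every DatabaseError also hides genuine failures, which is one argument for having a way to list pending XIDs instead.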

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Mon Jan 21 13:23:17 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 21 Jan 2008 13:23:17 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <a7e835d40801210412h625d923doa9bcfb3bbe6a74a@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>	<a7e835d40801210309p5923242bvd5acd5aaf1482c27@mail.gmail.com>	<1200914177.4685.43.camel@mila.office.dinunzioedigregorio>	<a7e835d40801210335v5eef1dc9v9eccf74e4780b87a@mail.gmail.com>	<1200916385.4685.48.camel@mila.office.dinunzioedigregorio>
	<a7e835d40801210412h625d923doa9bcfb3bbe6a74a@mail.gmail.com>
Message-ID: <47948EB5.108@egenix.com>

On 2008-01-21 13:12, James Henstridge wrote:
> On 21/01/2008, Federico Di Gregorio <fog at initd.org> wrote:
>>> Perhaps an argument to xa_commit() would be appropriate here?
>>>
>>>     connection.xa_commit(onephase=True)
>>>
>>> Without the argument, the commit would be considered to be a
>>> ProgrammingError.  That would reduce the chance of programmer error
>>> leading to data corruption.
>> Let's not make an API that has features useful on a single backend. I
>> suppose the necessity for a one-phase commit in a two-phase transaction
>> is rare. A simple API means early adoption by most of the adapters.
> 
> Well, Postgres lets you commit a 2PC transaction before preparing it
> too (after all, it doesn't know you are using 2PC until you prepare).
> 
> Judging by the kinterbasdb and cx_Oracle code, they can do so as well.
>  This isn't just a "single backend" feature.

Mixing one-phase and two-phase commits sounds like mixing two
concepts that don't belong together, IMHO.

It would be too easy for an application to issue a .commit()
somewhere and thereby break the whole two-phase commit
idea.

I'd rather like to see the two concepts well separated and
exceptions raised if you try to mix them.

After all, you could still open a second connection if you
need one phase transactions for some other purpose.

-- 
Marc-Andre Lemburg
eGenix.com


From james at jamesh.id.au  Mon Jan 21 13:36:25 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 21:36:25 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <47948DCD.6000900@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
	<47947AE9.7010101@egenix.com> <479482A4.7000504@stuartbishop.net>
	<479488A2.5000603@egenix.com>
	<a7e835d40801210410i3a80b959ib859d9d7abec4e7d@mail.gmail.com>
	<47948DCD.6000900@egenix.com>
Message-ID: <a7e835d40801210436p464c0d5ds80692dccf63d570d@mail.gmail.com>

On 21/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> Sorry, I wasn't clear enough:
>
> If a connection fails and the transaction XID persists, how do you:
>
>  * identify which XIDs are still pending (xa_recover)
>
>  * tell the RM to drop all resources associated with an XID
>    (xa_forget)
>
> once the TM has reconnected. These APIs appear to be needed
> in order for the TM to be able to cleanup the RM after e.g.
> a lost connection.
>
> OTOH, perhaps just doing a rollback with the known XID and
> ignoring any errors would do the same without the need for
> extra APIs.

There is nothing in the proposal I sent about recovery as I considered
it out of scope for the initial API.  Given the interest, it is
probably worth adding.

Finding out about outstanding transactions could be done with a
Connection.xa_recover() method that returns a list of transaction IDs.
 In PostgreSQL this can be implemented with "SELECT gid from
pg_prepared_xacts".  For MySQL it can be implemented with "XA
RECOVER".  I don't know about others.
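
A sketch of how such a method might dispatch on the backend. The function signature, the backend keys and FakeCursor are assumptions made for the example. For MySQL, XA RECOVER returns the columns formatID, gtrid_length, bqual_length and data; the sketch returns the data column as-is, whereas a real adapter would probably split it into its gtrid and bqual parts using the two length columns.

```python
class NotSupportedError(Exception):
    """Stand-in for the DB-API NotSupportedError exception."""


# backend name -> (recovery query, index of the column holding the ID)
_RECOVERY = {
    "postgresql": ("SELECT gid FROM pg_prepared_xacts", 0),
    # XA RECOVER returns formatID, gtrid_length, bqual_length, data.
    "mysql": ("XA RECOVER", 3),
}


def xa_recover(cursor, backend):
    """Return the IDs of transactions that are prepared but not finished."""
    try:
        query, column = _RECOVERY[backend]
    except KeyError:
        raise NotSupportedError("no recovery query known for %r" % (backend,))
    cursor.execute(query)
    return [row[column] for row in cursor.fetchall()]


class FakeCursor:
    """Hypothetical cursor that replays canned rows, for demonstration only."""

    def __init__(self, rows):
        self.rows = rows
        self.executed = []

    def execute(self, query):
        self.executed.append(query)

    def fetchall(self):
        return list(self.rows)
```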

For the xa_forget() call, does it differ from rolling back a prepared
transaction?

James.

From stuart at stuartbishop.net  Mon Jan 21 14:00:06 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Mon, 21 Jan 2008 20:00:06 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <47948EB5.108@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<1200911335.4685.24.camel@mila.office.dinunzioedigregorio>	<a7e835d40801210309p5923242bvd5acd5aaf1482c27@mail.gmail.com>	<1200914177.4685.43.camel@mila.office.dinunzioedigregorio>	<a7e835d40801210335v5eef1dc9v9eccf74e4780b87a@mail.gmail.com>	<1200916385.4685.48.camel@mila.office.dinunzioedigregorio>	<a7e835d40801210412h625d923doa9bcfb3bbe6a74a@mail.gmail.com>
	<47948EB5.108@egenix.com>
Message-ID: <47949756.8010400@stuartbishop.net>

M.-A. Lemburg wrote:

> Mixing one-phase and two-phase commits sounds like mixing two
> concepts that don't belong together, IMHO.
> 
> It would be too easy for an application to issue a .commit()
> somewhere and thereby break the whole two-phase commit
> idea.

I'm not sure this is worth worrying about - applications can screw things up
right now by issuing COMMITs or ROLLBACKs when they shouldn't.

> I'd rather like to see the two concepts well separated and
> exceptions raised if you try to mix them.
> 
> After all, you could still open a second connection if you
> need one phase transactions for some other purpose.

At the start of a transaction, you might not know that only one of your data
stores is going to be modified. Two phase commit imposes an overhead which
can be avoided if only one of your data stores turns out to need changes. I
believe this is why in PostgreSQL you declare you are using 2PC at the end
of your transaction and why MySQL offers you the XA COMMIT xid ONE PHASE option.
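
The optimisation can be sketched from the coordinator's side. RecordingConnection and its dirty flag are hypothetical, and the xa_* method names are borrowed from the API being discussed in this thread rather than from any existing adapter.

```python
class RecordingConnection:
    """Hypothetical connection that records which 2PC calls were made."""

    def __init__(self, name, dirty):
        self.name = name
        self.dirty = dirty      # True if this data store was modified
        self.calls = []

    def xa_prepare(self):
        self.calls.append("prepare")

    def xa_commit(self, onephase=False):
        self.calls.append("commit-onephase" if onephase else "commit")

    def xa_rollback(self):
        self.calls.append("rollback")


def finish_global_transaction(connections):
    """Commit all modified stores, using 2PC only when more than one changed."""
    changed = [c for c in connections if c.dirty]
    for conn in connections:
        if not conn.dirty:
            conn.xa_rollback()          # nothing to keep on this store
    if not changed:
        return "nothing to commit"
    if len(changed) == 1:
        changed[0].xa_commit(onephase=True)
        return "one-phase"
    try:
        for conn in changed:
            conn.xa_prepare()           # phase one
    except Exception:
        for conn in changed:
            conn.xa_rollback()          # a failed prepare aborts everything
        raise
    for conn in changed:
        conn.xa_commit()                # phase two
    return "two-phase"
```

The decision is only possible at the end of the transaction, once it is known how many stores were actually modified.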

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/


From james at jamesh.id.au  Mon Jan 21 15:40:23 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 23:40:23 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
Message-ID: <a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>

On 21/01/2008, James Henstridge <james at jamesh.id.au> wrote:
> On 18/01/2008, James Henstridge <james at jamesh.id.au> wrote:
> > So are there any recommendations for what a two-phase commit API should
> > look like?
>
> I did a bit of investigation into a few databases, and came up with a
> proposal for an extension to the DB-API.

Here is an updated version of the proposal.  It removes the analysis
of the different databases, and updates the proposed API to match what
we've been discussing here.

I've added a section about what the "xid" arguments to the various
methods should look like.  That could probably do with some more
discussion as I am not too sure about it.

I've also included support for transaction recovery in the form of an
xa_recover() method and calling the xa_commit()/xa_rollback() methods
with a transaction ID as an argument.

James.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: two-phase-commit-v2.txt
Url: http://mail.python.org/pipermail/db-sig/attachments/20080121/82a9fb5f/attachment-0001.txt 

From dieter at handshake.de  Mon Jan 21 18:46:54 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Mon, 21 Jan 2008 18:46:54 +0100
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <a7e835d40801202100h22920ac9wb33b348d50cc7a30@mail.gmail.com>
References: <a7e835d40801180037t5d89845cl932cd5795fd9eacd@mail.gmail.com>
	<479071E5.7000000@egenix.com> <47908237.6010001@taupro.com>
	<47908D97.1080304@egenix.com>
	<18320.65237.930087.237768@gargle.gargle.HOWL>
	<a7e835d40801202100h22920ac9wb33b348d50cc7a30@mail.gmail.com>
Message-ID: <18324.55950.269176.896724@gargle.gargle.HOWL>

James Henstridge wrote at 2008-1-21 14:00 +0900:
>On 19/01/2008, Dieter Maurer <dieter at handshake.de> wrote:
>> It does this indeed.
>>
>> And it assumes that a resource manager accepts a vote only
>> when it can guarantee that the subsequent "commit" will succeed (and
>> does not fail).
>>
>> A resource manager needs to expose both a "vote" (with the above guarantee)
>> and a "commit" in order to be a first class participant of
>> Zope's transaction system.
>>
>> Relational database interfaces often lack the equivalent of a "vote".
>
>I'd disagree with this description.  From the Zope transaction
>documentation, the order of methods is:
>
>    tpc_begin commit tpc_vote (tpc_finish | tpc_abort)
>
>From the descriptions of the various methods, a database adapter
>supporting 2PC would prepare the transaction at commit(), and commit
>or rollback that transaction in tpc_finish or tpc_abort respectively.
>
>After preparing the transaction, the transaction should be committable
>under normal circumstances, so it would have no reason to vote no as
>part of tpc_vote().
>
>I disagree that the lack of a tpc_vote() method makes the database
>adapter a second class citizen: it simply reflects the fact that the
>adapter makes up its mind at the commit() stage independent of what
>other data managers do.

I agree with you.

The distinction between "commit" and "vote" is probably only
for historical reasons:

  Formerly, "objects" registered with the transaction, not
  resource managers.

  In the "commit", the registered objects were individually
  committed, then the "resource managers" were asked for their
  vote.

Nowadays, resource managers register with the transaction
and we can freely move functions between "commit" and "vote".

What should be clear: despite its name "commit" is not a
true commit, neither is "commit" followed by "vote". Both
together need to prepare the commit which must eventually
succeed when "finish" is called.
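
As an illustration only, a data manager along these lines might map the Zope-style method sequence onto a 2PC-capable connection as below. SketchDataManager and RecordingConnection are hypothetical; the xa_* calls follow the API proposed in this thread, and the sketch assumes the 2PC transaction was already begun on the connection when the manager joined the transaction.

```python
class RecordingConnection:
    """Hypothetical connection that records the 2PC calls it receives."""

    def __init__(self):
        self.calls = []

    def xa_prepare(self):
        self.calls.append("prepare")

    def xa_commit(self):
        self.calls.append("commit")

    def xa_rollback(self):
        self.calls.append("rollback")


class SketchDataManager:
    """Hypothetical Zope-style data manager wrapping such a connection."""

    def __init__(self, connection):
        self.connection = connection
        self.prepared = False

    def tpc_begin(self, transaction):
        self.prepared = False

    def commit(self, transaction):
        # Not a true commit: only the first phase runs here.
        self.connection.xa_prepare()
        self.prepared = True

    def tpc_vote(self, transaction):
        # The decision was already made in commit(); returning normally
        # means "yes", raising an exception means "no".
        if not self.prepared:
            raise RuntimeError("commit() was not called before tpc_vote()")

    def tpc_finish(self, transaction):
        self.connection.xa_commit()     # second phase; expected to succeed
        self.prepared = False

    def tpc_abort(self, transaction):
        self.connection.xa_rollback()
        self.prepared = False
```

Whether the prepare step lives in commit() or tpc_vote() makes no difference to the outcome here, which matches the point that the split is mostly historical.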


-- 
Dieter

From dieter at handshake.de  Mon Jan 21 19:36:29 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Mon, 21 Jan 2008 19:36:29 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any
	standard	for two phase commit APIs?)
In-Reply-To: <a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
Message-ID: <18324.58925.102352.666954@gargle.gargle.HOWL>

James Henstridge wrote at 2008-1-21 23:40 +0900:
> ...
>= DB-API Two-Phase Commit =
>
>Many databases have support for two-phase commit.  Adapters for some
>of these databases expose this support, but often through mutually
>incompatible extensions to the DB-API standard.
>
>Standardising the API for two-phase commit would make it easier for
>applications and libraries to support two-phase commit with multiple
>databases.
>
>
>== Connection Methods ==
>
>A database adapter that supports two phase commit (2PC) shall provide
>the following additional methods on its connection object:
>
>    .xa_begin(xid)
>
>        Begins a 2PC transaction with the given ID.  This method
>        should be called outside of a transaction (i.e. nothing may
>        have executed since the last .commit() or .rollback()).
>
>        Furthermore, it is an error to call .commit() or .rollback()
>        within the 2PC transaction (what error?).
>
>        If the database does not support 2PC, a NotSupportedError will
>        be raised.
>
>    .xa_prepare()
>
>        Performs the first phase of a transaction started with
>        xa_begin().  It is an error to call this method outside of a
>        2PC transaction.
>
>        After calling xa_prepare(), no statements can be executed
>        until xa_commit() or xa_rollback() have been called.
>
>    .xa_commit(xid=None, onephase=False)
>
>        When called with no arguments, xa_commit() commits a 2PC
>        transaction previously prepared with xa_prepare().
>
>        When called as xa_commit(onephase=True), it may be used to
>        commit the transaction prior to calling xa_prepare().  This
>        may occur if only a single resource ends up participating in
>        the global transaction.
>
>        When called as xa_commit(xid), it commits the given
>        transaction.  If an invalid transaction ID is provided, a
>        DatabaseError will be raised.  This form should be called
>        outside of a transaction, and is intended for use in recovery.
>
>        On return, the 2PC transaction is ended.
>
>    .xa_rollback(xid=None)
>
>        When called with no arguments, xa_rollback() rolls back a 2PC
>        transaction.  It may be called before or after xa_prepare().
>
>        When called as xa_rollback(xid), it rolls back the given
>        transaction.  If an invalid transaction ID is provided, a
>        DatabaseError will be raised.  This form should be called
>        outside of a transaction, and is intended for use in recovery.
>
>        On return, the 2PC transaction is ended.
>
>    .xa_recover()
>
>        Returns a list of pending transaction IDs suitable for use
>        with xa_commit(xid) or xa_rollback(xid).
>
>        If the database does not support transaction recovery, it may
>        return an empty list or raise NotSupportedError.

I would prefer it if

  *  "xa_begin" were optional

     the current DB API performs an automatic "begin" when there is a
     need for it.

  *  the transaction id were chosen automatically, optionally guided
     by "Connection" configuration (to obtain "readable" transaction ids)

  *  the use of "prepare_transaction" triggered a two-phase commit --
     otherwise a one-phase commit would be used.


-- 
Dieter

From james at jamesh.id.au  Tue Jan 22 00:17:18 2008
From: james at jamesh.id.au (James Henstridge)
Date: Tue, 22 Jan 2008 08:17:18 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <18324.58925.102352.666954@gargle.gargle.HOWL>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<18324.58925.102352.666954@gargle.gargle.HOWL>
Message-ID: <a7e835d40801211517y64ff56fdk1e1db263315f88a0@mail.gmail.com>

On 22/01/2008, Dieter Maurer <dieter at handshake.de> wrote:
> James Henstridge wrote at 2008-1-21 23:40 +0900:
> > ...
> >= DB-API Two-Phase Commit =
> >
> >Many databases have support for two-phase commit.  Adapters for some
> >of these databases expose this support, but often through mutually
> >incompatible extensions to the DB-API standard.
> >
> >Standardising the API for two-phase commit would make it easier for
> >applications and libraries to support two-phase commit with multiple
> >databases.
> >
> >
> >== Connection Methods ==
> >
> >A database adapter that supports two phase commit (2PC) shall provide
> >the following additional methods on its connection object:
> >
> >    .xa_begin(xid)
> >
> >        Begins a 2PC transaction with the given ID.  This method
> >        should be called outside of a transaction (i.e. nothing may
> >        have executed since the last .commit() or .rollback()).
> >
> >        Furthermore, it is an error to call .commit() or .rollback()
> >        within the 2PC transaction (what error?).
> >
> >        If the database does not support 2PC, a NotSupportedError will
> >        be raised.
> >
> >    .xa_prepare()
> >
> >        Performs the first phase of a transaction started with
> >        xa_begin().  It is an error to call this method outside of a
> >        2PC transaction.
> >
> >        After calling xa_prepare(), no statements can be executed
> >        until xa_commit() or xa_rollback() have been called.
> >
> >    .xa_commit(xid=None, onephase=False)
> >
> >        When called with no arguments, xa_commit() commits a 2PC
> >        transaction previously prepared with xa_prepare().
> >
> >        When called as xa_commit(onephase=True), it may be used to
> >        commit the transaction prior to calling xa_prepare().  This
> >        may occur if only a single resource ends up participating in
> >        the global transaction.
> >
> >        When called as xa_commit(xid), it commits the given
> >        transaction.  If an invalid transaction ID is provided, a
> >        DatabaseError will be raised.  This form should be called
> >        outside of a transaction, and is intended for use in recovery.
> >
> >        On return, the 2PC transaction is ended.
> >
> >    .xa_rollback(xid=None)
> >
> >        When called with no arguments, xa_rollback() rolls back a 2PC
> >        transaction.  It may be called before or after xa_prepare().
> >
> >        When called as xa_rollback(xid), it rolls back the given
> >        transaction.  If an invalid transaction ID is provided, a
> >        DatabaseError will be raised.  This form should be called
> >        outside of a transaction, and is intended for use in recovery.
> >
> >        On return, the 2PC transaction is ended.
> >
> >    .xa_recover()
> >
> >        Returns a list of pending transaction IDs suitable for use
> >        with xa_commit(xid) or xa_rollback(xid).
> >
> >        If the database does not support transaction recovery, it may
> >        return an empty list or raise NotSupportedError.
>
> I would prefer, if
>
>   *  "xa_begin" would be optional
>
>      the current DB API performs automatic "begin" when there is a
>      need for it.

Please see the notes I wrote about the requirements for 2PC in various
databases.  For some of them, there is a different set of commands to
start a normal transaction and a 2PC transaction.  Going with implicit
begin would lead to ambiguity about what sort of transaction to start.


>   *  the transaction id be chosen automatically, optionally guided
>      by "Connection" configuration (to obtain "readable" transaction ids)

Note that transaction IDs are per-transaction rather than
per-connection, and usually assigned by the transaction manager (so
that there is a common portion for the IDs of all participating
resources).  I don't think a connection-time setting will cut it.
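
For example, a transaction manager might derive the per-resource IDs like this. The function names and the "global.branch" string format are purely illustrative assumptions, not part of the proposal.

```python
import uuid


def new_global_id():
    """Return a fresh identifier for one global transaction."""
    return uuid.uuid4().hex


def branch_xids(global_id, resource_names):
    """Map each resource name to an ID sharing the same global portion."""
    return {name: "%s.%s" % (global_id, name) for name in resource_names}
```

Calling branch_xids(new_global_id(), ["orders", "billing"]) once per global transaction gives every participating connection a distinct ID with a common prefix, which a connection-level setting could not provide.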


>   *  the use of "prepare_transaction" triggers a two phase commit --
>      otherwise a one phase commit is used.

As mentioned earlier, some of the databases want to know that 2PC is a
possibility at the start of the transaction.  As setting up a 2PC
transaction is more expensive, you probably don't want to enable them
in cases where they won't be used.

Deferring the decision until the prepare() stage essentially forces
the application to pay the price with some databases.

James.

From stuart at stuartbishop.net  Tue Jan 22 09:48:50 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Tue, 22 Jan 2008 15:48:50 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
Message-ID: <4795ADF2.1040109@stuartbishop.net>

James Henstridge wrote:

> Here is an updated version of the proposal.  It removes the analysis
> of the different databases, and updates the proposed API to match what
> we've been discussing here.

I'm happy with the design.

I personally don't think we should use the xa prefix, as this will make
people think that this is an XA interface when it isn't - an XA-like API
would be something built on top of this.

I would think the following would be better names:
    con.begin_prepared(xid=None)
    con.prepare_transaction()
    con.rollback_prepared(xid=None)
    con.commit_prepared(xid=None)
    con.list_prepared()

> I've added a section about what the "xid" arguments to the various
> methods should look like.  That could probably do with some more
> discussion as I am not too sure about it.
> 
> I've also included support for transaction recovery in the form of an
> xa_recover() method and calling the xa_commit()/xa_rollback() methods
> with a transaction ID as an argument.

It seems that the formatID is unnecessary and just a requirement of the XA C
interface. Also, the xid() method you propose should be camelcase to match
the other type constructors, so Xid(gtrid, bqual=None) or
TransactionId(gtrid, bqual=None). If the xa_recover/list_prepared method
returns TransactionId objects they can contain platform specific information
too which is great (username, prepared timestamp & database for PostgreSQL
for instance).
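
A rough sketch of what such a type could look like. The class is hypothetical; the optional owner, prepared and database fields are modelled on the PostgreSQL details mentioned above, and the string form is an arbitrary choice for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TransactionId:
    """Hypothetical transaction ID type along the lines suggested above."""

    gtrid: str                            # global transaction identifier
    bqual: Optional[str] = None           # branch qualifier, if any
    # Optional, backend-specific details that a recovery call might fill in.
    owner: Optional[str] = None
    prepared: Optional[datetime] = None
    database: Optional[str] = None

    def __str__(self):
        if self.bqual is None:
            return self.gtrid
        return "%s.%s" % (self.gtrid, self.bqual)
```

An application would construct TransactionId("g1", "orders") itself, while a recovery method could return instances with the extra fields populated.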


-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/


From fog at initd.org  Tue Jan 22 10:42:51 2008
From: fog at initd.org (Federico Di Gregorio)
Date: Tue, 22 Jan 2008 10:42:51 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any
	standard	for two phase commit APIs?)
In-Reply-To: <4795ADF2.1040109@stuartbishop.net>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795ADF2.1040109@stuartbishop.net>
Message-ID: <1200994971.4091.11.camel@mila.office.dinunzioedigregorio>


On Tue, 22/01/2008 at 15.48 +0700, Stuart Bishop wrote:
> 
> I'm happy with the design.
> 
I am very happy too, but...
> 
> I would think the following would be better names:
>     con.begin_prepared(xid=None)
>     con.prepare_transaction()
>     con.rollback_prepared(xid=None)
>     con.commit_prepared(xid=None)
>     con.list_prepared()

I don't like "prepared" because it can be interpreted either as "prepared
transaction" or as "prepared statement". I'd like something that won't
confuse users. I agree that xa is too specific. That leaves us with the
long prepared_transaction or something generic like twophase (or 2pc or
a tpc_ prefix, or?)

begin_prepared_transaction()
prepare_transaction()
rollback_prepared_transaction()
...

begin_2pc_transaction()
prepare_2pc_transaction()
rollback_2pc_transaction()
...

begin_twophase()
prepare_twophase()
rollback_twophase()
...

tpc_begin()
tpc_prepare()
tpc_rollback()
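
To see how the shortest variant reads in use, here is a minimal sketch of
the tpc_ prefix. The FakeConnection class, its state names and the
tpc_commit() method are assumptions made purely for illustration (only
tpc_begin, tpc_prepare and tpc_rollback are spelled out above); no real
driver is implied.

```python
class FakeConnection:
    """Toy stand-in for a DB-API connection; it only tracks 2PC state."""

    def __init__(self):
        self.state = "idle"
        self.xid = None

    def _require(self, expected):
        if self.state != expected:
            raise RuntimeError(
                "expected state %r, got %r" % (expected, self.state))

    def tpc_begin(self, xid):
        # Start a two-phase transaction under the given transaction id.
        self._require("idle")
        self.xid = xid
        self.state = "active"

    def tpc_prepare(self):
        # First phase: the work is made ready to commit.
        self._require("active")
        self.state = "prepared"

    def tpc_commit(self):
        # Hypothetical name, formed by analogy with the other three.
        self._require("prepared")
        self.xid = None
        self.state = "idle"

    def tpc_rollback(self):
        # Allowed both before and after the prepare step.
        if self.state not in ("active", "prepared"):
            raise RuntimeError("nothing to roll back")
        self.xid = None
        self.state = "idle"


con = FakeConnection()
con.tpc_begin("example-xid")
con.tpc_prepare()
con.tpc_commit()
```

With a common prefix the four calls sort together and cannot be mistaken
for prepared-statement methods.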

federico

-- 
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog at debian.org
INIT.D Developer                                           fog at initd.org
              All programmers are optimists. -- Frederick P. Brooks, Jr.

From mal at egenix.com  Tue Jan 22 11:34:54 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Jan 2008 11:34:54 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
Message-ID: <4795C6CE.8060301@egenix.com>

On 2008-01-21 15:40, James Henstridge wrote:
> On 21/01/2008, James Henstridge <james at jamesh.id.au> wrote:
>> On 18/01/2008, James Henstridge <james at jamesh.id.au> wrote:
>>> So is there any recommendations for what a two-phase commit API should
>>> look like?
>> I did a bit of investigation into a few databases, and came up with a
>> proposal for an extension to the DB-API.
> 
> Here is an updated version of the proposal.  It removes the analysis
> of the different databases, and updates the proposed API to match what
> we've been discussing here.
> 
> I've added a section about what the "xid" arguments to the various
> methods should look like.  That could probably do with some more
> discussion as I am not too sure about it.
> 
> I've also included support for transaction recovery in the form of an
> xa_recover() method and calling the xa_commit()/xa_rollback() methods
> with a transaction ID as an argument.

Thanks. I like it a lot, except for making the XID an object - this
always appears to be a string in all the backends you've checked and
also in the XA standard, so I'd go for a simple string instead of
an object (those are always lots of work to do at C level).

Regarding the "xa_" prefix, I'm not much attached to it, but since
the interface does indeed look a lot like the XA interface, why not
make that reference ?

It also makes it clear that the interface
sits on top of the standard DB-API connection API and that those
methods form a unit.

Plus they are currently not in use by any DB-API module, so don't
interfere with existing APIs.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 22 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From james at jamesh.id.au  Tue Jan 22 12:33:39 2008
From: james at jamesh.id.au (James Henstridge)
Date: Tue, 22 Jan 2008 20:33:39 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <4795C6CE.8060301@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
Message-ID: <a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>

On 22/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> Thanks. I like it a lot, except for making the XID an object - this
> always appears to be a string in all the backends you've checked and
> also in the XA standard, so I'd go for a simple string instead of
> an object (those are always lots of work to do at C level).

In at least MySQL and Oracle, the transaction ID appears to be more
than just a string: it is structured into three parts:
 * a format ID
 * a global transaction ID
 * a branch qualifier

Stuart has made the argument that the format ID is not important for
Python, and I tend to agree (or at least I don't know what situations
you'd use it).

I do see a use for the branch qualifier though.  In a distributed
transaction, each resource should have a different transaction ID; these
IDs share a common global transaction ID but have separate branch qualifiers.

As transaction IDs are global within database clusters for some
backends (PostgreSQL, MySQL and probably others), the branch qualifier
is necessary if two databases from the cluster are used in the global
transaction.

I think it is worth making the API such that it is easy to program to
best practices.
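
As a rough sketch of that practice: one global transaction ID is generated
per distributed transaction and each participating resource gets its own
branch qualifier. The make_branch_xids() helper, the (gtrid, bqual) tuple
shape and the use of uuid are assumptions for illustration, not part of
the proposal.

```python
import uuid


def make_branch_xids(resource_names):
    """Return {resource name: (gtrid, bqual)} for one global transaction.

    All entries share the same global transaction ID; the branch
    qualifier is derived from the resource's position and name, so two
    databases in the same cluster never end up with the same pair.
    """
    gtrid = uuid.uuid4().hex
    return {
        name: (gtrid, "%d-%s" % (index, name))
        for index, name in enumerate(resource_names)
    }


xids = make_branch_xids(["accounts_db", "orders_db"])
```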


> Regarding the "xa_" prefix, I'm not much attached to it, but since
> the interface does indeed look a lot like the XA interface, why not
> make that reference ?

Stuart's argument is that if the API differs from XA then using the
xa_* prefix could be problematic for adapters that want to expose the
XA API.

As I don't have any experience with using XA, I can't comment one way
or the other about this.


> It also makes it clear, that the interface
> sits on top of the standard DB-API connection API and that those
> methods form a unit.

Having a common prefix seems sensible.  If we don't use xa_*,
Federico's suggestion of tpc_* might make sense.


> Plus they are currently not in use by any DB-API module, so don't
> interfere with existing APIs.

So I guess it comes down to the following questions:
1. Are database adapters likely to want to expose more than what is
covered by this proposal?
2. Would this proposed API conflict with those extensions?

It isn't clear to me that people want to provide a larger API, since
the few adapters that have added 2PC support have done so with APIs
that are effectively a subset/simplification of this one.

James.

From james at jamesh.id.au  Tue Jan 22 12:36:17 2008
From: james at jamesh.id.au (James Henstridge)
Date: Tue, 22 Jan 2008 20:36:17 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <4795ADF2.1040109@stuartbishop.net>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795ADF2.1040109@stuartbishop.net>
Message-ID: <a7e835d40801220336h6c3767adj65eeebf7cd3cc761@mail.gmail.com>

On 22/01/2008, Stuart Bishop <stuart at stuartbishop.net> wrote:
> It seems that the formatID is unnecessary and just a requirement of the XA C
> interface. Also, the xid() method you propose should be camelcase to match
> the other type constructors, so Xid(gtrid, bqual=None) or
> TransactionId(gtrid, bqual=None). If the xa_recover/list_prepared method
> returns TransactionId objects they can contain platform specific information
> too which is great (username, prepared timestamp & database for PostgreSQL
> for instance).

Well, the DB-API does not actually expose any classes other than the
exceptions.  The primary objects you work with are all created by
factory functions/methods:

 * Connections from module.connect()
 * Cursors from connection.cursor()

I was suggesting that transaction ID objects be created by either a
module.xid() or connection.xid() factory function and not make the
class object part of the API.
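
A minimal sketch of that factory approach, assuming a module-level xid()
function with the (gtrid, bqual=None) signature discussed earlier in the
thread. The private _Xid class and its attribute names are hypothetical;
the point is only that callers never name the class themselves.

```python
class _Xid:
    """Private implementation detail; not part of the public API."""

    __slots__ = ("gtrid", "bqual")

    def __init__(self, gtrid, bqual):
        self.gtrid = gtrid
        self.bqual = bqual

    def __repr__(self):
        return "<xid gtrid=%r bqual=%r>" % (self.gtrid, self.bqual)


def xid(gtrid, bqual=None):
    """Factory function: the only documented way to obtain an xid."""
    return _Xid(gtrid, bqual)


example = xid("global-1", "branch-a")
```

This mirrors how connections come from module.connect() and cursors from
connection.cursor().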

James.

From mal at egenix.com  Tue Jan 22 12:56:20 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Jan 2008 12:56:20 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for	two phase commit APIs?)
In-Reply-To: <a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
Message-ID: <4795D9E4.2030509@egenix.com>

On 2008-01-22 12:33, James Henstridge wrote:
> On 22/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
>> Thanks. I like it a lot, except for making the XID an object - this
>> always appears to be a string in all the backends you've checked and
>> also in the XA standard, so I'd go for a simple string instead of
>> an object (those are always lots of work to do at C level).
> 
> In at least MySQL and Oracle, the transaction ID appears to be more
> than just a string: it is structured into three parts:
>  * a format ID
>  * a global transaction ID
>  * a branch qualifier
> 
> Stuart has made the argument that the format ID is not important for
> Python, and I tend to agree (or at least I don't know what situations
> you'd use it).

The format id is only used to specify the format of the data
structure in the XA xid_struct_t:

From http://www.opengroup.org/onlinepubs/009680699/toc.pdf:

"""
Although "xa.h" constrains the length and byte alignment of the data element within an
XID, it does not specify the data's contents. The only requirement is that both gtrid and
bqual, taken together, must be globally unique. The recommended way of achieving
global uniqueness is to use the naming rules specified for OSI CCR atomic action
identifiers (see the referenced OSI CCR specification). If OSI CCR naming is used, then
the XID's formatID element should be set to 0; if some other format is used, then the
formatID element should be greater than 0. A value of -1 in formatID means that the
XID is null.
The RM must be able to map the XID to the recoverable work it did for the
corresponding branch. RMs may perform bitwise comparisons on the data
components of an XID for the lengths specified in the XID structure. Most XA routines
pass a pointer to the XID. These pointers are valid only for the duration of the call. If
the RM needs to refer to the XID after it returns from the call, it must make a local copy
before returning.
/*
 * Transaction branch identification: XID and NULLXID:
 */
#define XIDDATASIZE 128  /* size in bytes */
#define MAXGTRIDSIZE 64  /* maximum size in bytes of gtrid */
#define MAXBQUALSIZE 64  /* maximum size in bytes of bqual */
struct xid_t {
    long formatID;       /* format identifier */
    long gtrid_length;   /* value 1-64 */
    long bqual_length;   /* value 1-64 */
    char data[XIDDATASIZE];
};
typedef struct xid_t XID;
"""

So, essentially, only the global transaction id and the branch id
are relevant and both are represented in the data string.
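
To make the layout concrete, here is a sketch that packs a gtrid and a
bqual into the data field of a structure shaped like xid_t and reads them
back. It assumes 4-byte little-endian longs and an arbitrary formatID of
1; the real layout depends on the platform's C ABI, and pack_xid() and
unpack_xid() are hypothetical helper names.

```python
import struct

XIDDATASIZE = 128
MAXGTRIDSIZE = 64
MAXBQUALSIZE = 64

# Assumption: three 4-byte little-endian longs followed by the data bytes.
_XID_LAYOUT = "<lll%ds" % XIDDATASIZE


def pack_xid(gtrid, bqual, format_id=1):
    """Pack two byte strings the way xid_t stores them: gtrid then bqual."""
    if not 1 <= len(gtrid) <= MAXGTRIDSIZE:
        raise ValueError("gtrid must be 1-64 bytes")
    if not 1 <= len(bqual) <= MAXBQUALSIZE:
        raise ValueError("bqual must be 1-64 bytes")
    # struct pads the data field with NUL bytes up to XIDDATASIZE.
    return struct.pack(_XID_LAYOUT, format_id, len(gtrid), len(bqual),
                       gtrid + bqual)


def unpack_xid(raw):
    """Return (format_id, gtrid, bqual) using the two stored lengths."""
    format_id, glen, blen, data = struct.unpack(_XID_LAYOUT, raw)
    return format_id, data[:glen], data[glen:glen + blen]


raw = pack_xid(b"gtrid-1", b"branch-a")
```

The two length fields are what let a reader split the single data string
back into its global and branch parts.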

BTW, there's a nice extension module that lets you hook Python
between the TM and RM using XA:

    http://www.hare.demon.co.uk/pyxasw/

> I do see a use for the branch qualifier though.  In a distributed
> transaction, each resource should have a different transaction ID that
> share a common global transaction ID but separate branch qualifiers.
> 
> As transaction IDs are global within database clusters for some
> backends (PostgreSQL, MySQL and probably others), the branch qualifier
> is necessary if two databases from the cluster are used in the global
> transaction.
> 
> I think it is worth making the API such that it is easy to program to
> best practices.

The DB-API has always tried to not get in the way of how
a particular backend needs its configuration data, so
I think we can still have a single string using a database
backend specific format. This could then include one or more
of the above id parts.

The implementation can then decode the string representation
of the transaction id components into whatever format is
needed by the backend.
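
One way such a string could work, purely as a sketch: the two parts are
percent-escaped and joined with a colon, and the module splits them again
before talking to the backend. The "gtrid:bqual" format and the
encode_xid()/decode_xid() names are invented for this example; a real
module would pick whatever format suits its backend.

```python
from urllib.parse import quote, unquote


def encode_xid(gtrid, bqual):
    """Join both parts into one string; ':' inside a part is escaped."""
    return quote(gtrid, safe="") + ":" + quote(bqual, safe="")


def decode_xid(text):
    """Split the string form back into (gtrid, bqual)."""
    gtrid, bqual = text.split(":", 1)
    return unquote(gtrid), unquote(bqual)


text = encode_xid("order:42", "db/a")
parts = decode_xid(text)
```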

>> Regarding the "xa_" prefix, I'm not much attached to it, but since
>> the interface does indeed look a lot like the XA interface, why not
>> make that reference ?
> 
> Stuart's argument is that if the API differs from XA then using the
> xa_* prefix could be problematic for adapters that want to expose the
> XA API.
> 
> As I don't have any experience with using XA, I can't comment one way
> or the other about this.

Fair enough. The API does resemble XA a lot, but you're right:
if there are differences, it's better not to make that link.

>> It also makes it clear, that the interface
>> sits on top of the standard DB-API connection API and that those
>> methods form a unit.
> 
> Having a common prefix seems sensible.  If we don't use xa_*,
> Federico's suggestion of tpc_* might make sense.

Fine, let's use "tpc_".

>> Plus they are currently not in use by any DB-API module, so don't
>> interfere with existing APIs.
> 
> So I guess it comes down to the following questions:
> 1. Are database adapters likely to want to expose more than what is
> covered by this proposal?
> 2. Would this proposed API conflict with those extensions?
> 
> It isn't clear to me that people want to provide a larger API, since
> the few adapters that have added 2PC support have done so with APIs
> that are effectively a subset/simplification of this one.

If there's more to expose than what's in the API spec, then
module authors are free to do so.

In general, the DB-API only
defines a fully functional common subset of what has to be
there to use a database backend. Extensions are possible and
welcome.

Every now and then, we consider adding those extensions as
"standard extensions" to the DB-API. This has proven to work
well in the past.

The two-phase commit methods would be another set of those
extensions.

-- 
Marc-Andre Lemburg
eGenix.com

From stuart at stuartbishop.net  Tue Jan 22 13:34:44 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Tue, 22 Jan 2008 19:34:44 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <a7e835d40801220336h6c3767adj65eeebf7cd3cc761@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	
	<4795ADF2.1040109@stuartbishop.net>
	<a7e835d40801220336h6c3767adj65eeebf7cd3cc761@mail.gmail.com>
Message-ID: <4795E2E4.4010504@stuartbishop.net>

James Henstridge wrote:
> On 22/01/2008, Stuart Bishop <stuart at stuartbishop.net> wrote:
>> It seems that the formatID is unnecessary and just a requirement of the XA C
>> interface. Also, the xid() method you propose should be camelcase to match
>> the other type constructors, so Xid(gtrid, bqual=None) or
>> TransactionId(gtrid, bqual=None). If the xa_recover/list_prepared method
>> returns TransactionId objects they can contain platform specific information
>> too which is great (username, prepared timestamp & database for PostgreSQL
>> for instance).
> 
> Well, the DB-API does not actually expose any classes other than the
> exceptions.  The primary objects you work with are all created by
> factory functions/methods:

The camelcase suggestion was to match the other type constructors as
documented under "Type Objects & Constructors", such as Date, Time,
Timestamp, Binary.

>  * Connections from module.connect()
>  * Cursors from connection.cursor()
> 
> I was suggesting that transaction ID objects be created by either a
> module.xid() or connection.xid() factory function and not make the
> class object part of the API.

Sure - the class object doesn't need to be part of the API, but xa_recover
needs to return a list of something and the behaviour of those somethings
needs to be defined. I imagined that would be an object providing
.transaction_id & .branch_qualifier at a minimum, and the driver can add in
whatever platform specific attributes or behaviour it wants. The xid objects
can't be opaque as a transaction manager needs to be able to filter out the
relevant from irrelevant.
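
A sketch of what I have in mind, with hypothetical names throughout
(RecoveredXid, driver_info, owned_by) and assuming the transaction manager
marks its own transactions with a known prefix:

```python
class RecoveredXid:
    """Non-opaque xid as it might be returned by the recover method."""

    def __init__(self, transaction_id, branch_qualifier, **driver_info):
        self.transaction_id = transaction_id
        self.branch_qualifier = branch_qualifier
        # Platform-specific extras, e.g. username or prepared timestamp.
        self.driver_info = driver_info


def owned_by(xids, prefix):
    """Keep only the xids whose transaction id starts with our prefix."""
    return [x for x in xids if x.transaction_id.startswith(prefix)]


recovered = [
    RecoveredXid("tm1-0001", "db-a", username="alice"),
    RecoveredXid("other-9", "db-a"),
]
mine = owned_by(recovered, "tm1-")
```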

(From the other threads, I'm happy with tpc_ naming).

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/

From mal at egenix.com  Tue Jan 22 13:42:06 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Jan 2008 13:42:06 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <4795E2E4.4010504@stuartbishop.net>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>		<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>		<4795ADF2.1040109@stuartbishop.net>	<a7e835d40801220336h6c3767adj65eeebf7cd3cc761@mail.gmail.com>
	<4795E2E4.4010504@stuartbishop.net>
Message-ID: <4795E49E.9030900@egenix.com>

On 2008-01-22 13:34, Stuart Bishop wrote:
> James Henstridge wrote:
>> On 22/01/2008, Stuart Bishop <stuart at stuartbishop.net> wrote:
>>> It seems that the formatID is unnecessary and just a requirement of the XA C
>>> interface. Also, the xid() method you propose should be camelcase to match
>>> the other type constructors, so Xid(gtrid, bqual=None) or
>>> TransactionId(gtrid, bqual=None). If the xa_recover/list_prepared method
>>> returns TransactionId objects they can contain platform specific information
>>> too which is great (username, prepared timestamp & database for PostgreSQL
>>> for instance).
>> Well, the DB-API does not actually expose any classes other than the
>> exceptions.  The primary objects you work with are all created by
>> factory functions/methods:
> 
> The camelcase suggestion was to match the other type constructors as
> documented under "Type Objects & Constructors", such as Date, Time,
> Timestamp, Binary.
> 
>>  * Connections from module.connect()
>>  * Cursors from connection.cursor()
>>
>> I was suggesting that transaction ID objects be created by either a
>> module.xid() or connection.xid() factory function and not make the
>> class object part of the API.
> 
> Sure - the class object doesn't need to be part of the API, but xa_recover
> needs to return a list of something and the behaviour of those somethings
> needs to be defined. 

It only needs to be defined in the context of the module exposing
that recover API, since you'd only pass it back to the methods of
that same API.

We could just describe the transaction id as an object in the spec and
then have the modules decide what type this maps to, e.g. one module
might want to use a tuple (or even namedtuple) for this, another
might not want to bother at all and use the internal representation
mapped to a string or bytes object.

> I imagined that would be an object providing
> .transaction_id & .branch_qualifier at a minimum, and the driver can add in
> whatever platform specific attributes or behaviour it wants. The xid objects
> can't be opaque as a transaction manager needs to be able to filter out the
> relevant from irrelevant.
> 
> (From the other threads, I'm happy with tpc_ naming).

-- 
Marc-Andre Lemburg
eGenix.com

From stuart at stuartbishop.net  Tue Jan 22 14:09:58 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Tue, 22 Jan 2008 20:09:58 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <4795E49E.9030900@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>		<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>		<4795ADF2.1040109@stuartbishop.net>	<a7e835d40801220336h6c3767adj65eeebf7cd3cc761@mail.gmail.com>
	<4795E2E4.4010504@stuartbishop.net> <4795E49E.9030900@egenix.com>
Message-ID: <4795EB26.7090300@stuartbishop.net>

M.-A. Lemburg wrote:

> It only needs to be defined in the context of the module exposing
> that recover API, since you'd only pass it back to the methods of
> that same API.
> 
> We could just describe the transaction id as object in the spec and
> then have the modules decide what type this maps to, e.g. one module
> might want to use a tuple (or even namedtuple) for this, another
> might not want to bother at all and use the internal representation
> mapped to a string or bytes object.


From the XA pdf you linked to earlier on xa_recover:

    "A transaction manager calls xa_recover() during recovery to obtain a
list of transaction branches that are currently in a prepared or
heuristically completed state.

    [...]

    "It is the transaction manager's responsibility to ignore XIDs that do
not belong to it.

So if you were to implement an XA-like interface around this, how can a
transaction manager filter out the irrelevant XIDs if it cannot interrogate
them?

If the behaviour of the xids returned by tpc_recover is not defined, we need
another method to decompose an xid into its global transaction id and its
branch id.

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/

From mal at egenix.com  Tue Jan 22 14:23:12 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Jan 2008 14:23:12 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <4795EB26.7090300@stuartbishop.net>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>		<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>		<4795ADF2.1040109@stuartbishop.net>	<a7e835d40801220336h6c3767adj65eeebf7cd3cc761@mail.gmail.com>	<4795E2E4.4010504@stuartbishop.net>
	<4795E49E.9030900@egenix.com> <4795EB26.7090300@stuartbishop.net>
Message-ID: <4795EE40.4040704@egenix.com>

On 2008-01-22 14:09, Stuart Bishop wrote:
> M.-A. Lemburg wrote:
> 
>> It only needs to be defined in the context of the module exposing
>> that recover API, since you'd only pass it back to the methods of
>> that same API.
>>
>> We could just describe the transaction id as object in the spec and
>> then have the modules decide what type this maps to, e.g. one module
>> might want to use a tuple (or even namedtuple) for this, another
>> might not want to bother at all and use the internal representation
>> mapped to a string or bytes object.
> 
> 
> From the XA pdf you linked to earlier on xa_recover:
> 
>     "A transaction manager calls xa_recover() during recovery to obtain a
> list of transaction branches that are currently in a prepared or
> heuristically completed state.
> 
>     [...]
> 
>     "It is the transaction manager's responsibility to ignore XIDs that do
> not belong to it.
> 
> So if you were to implement an XA-like interface around this, how can a
> transaction manager filter out the irrelevant XIDs if it cannot interrogate
> them?

Good point, but I actually think that this refers to the TM storing
the XIDs it knows about and ignoring any other XIDs returned by
the recover method.
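
Under that reading the filtering needs no knowledge of the xid format at
all, only equality and hashing. A sketch, assuming the TM keeps a log of
the xids it issued; partition_recovered() is a hypothetical helper, not
part of the proposed API:

```python
def partition_recovered(recovered, known):
    """Split recovered xids into (ours, foreign) by simple membership.

    The xids are treated as opaque values: they only need to be
    hashable and comparable, e.g. strings or tuples.
    """
    known = set(known)
    ours = [xid for xid in recovered if xid in known]
    foreign = [xid for xid in recovered if xid not in known]
    return ours, foreign


issued = ["tx-1", "tx-2"]                 # what the TM wrote to its log
from_database = ["tx-2", "someone-else"]  # what the recover call returned
ours, foreign = partition_recovered(from_database, issued)
```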

I don't think that the TM is required to understand the format of
the XID since the resource managers fill in that data and only
they have to be able to recognize it.

Then again, it may be useful for other purposes.

Since there are only two id components that appear to be relevant,
how about using a 2-tuple for the transaction id ?

> If behaviour of the xids returned by tpc_recover is not defined, we need
> another method to decompose an xid into its global transaction id and its
> branch id.

-- 
Marc-Andre Lemburg
eGenix.com

From fog at initd.org  Tue Jan 22 14:27:20 2008
From: fog at initd.org (Federico Di Gregorio)
Date: Tue, 22 Jan 2008 14:27:20 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any
	standard	for two phase commit APIs?)
In-Reply-To: <4795EE40.4040704@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795ADF2.1040109@stuartbishop.net>
	<a7e835d40801220336h6c3767adj65eeebf7cd3cc761@mail.gmail.com>
	<4795E2E4.4010504@stuartbishop.net> <4795E49E.9030900@egenix.com>
	<4795EB26.7090300@stuartbishop.net>  <4795EE40.4040704@egenix.com>
Message-ID: <1201008440.4091.20.camel@mila.office.dinunzioedigregorio>


On Tue, 22/01/2008 at 14.23 +0100, M.-A. Lemburg wrote:
> Since there are only two id components that appear to be relevant,
> how about using a 2-tuple for the transaction id ?

...and modules that want to use a custom object can always implement the
tuple interface and stay compatible with the API.
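
For example, a tuple subclass could carry extra attributes while still
unpacking as a plain 2-tuple. This is only a sketch: ExtendedXid and its
owner attribute are hypothetical names, loosely modelled on the
PostgreSQL username mentioned earlier in the thread.

```python
class ExtendedXid(tuple):
    """A (gtrid, bqual) pair that also carries driver-specific data."""

    def __new__(cls, gtrid, bqual, owner=None):
        self = super().__new__(cls, (gtrid, bqual))
        self.owner = owner  # extra attribute, invisible to tuple users
        return self


xid = ExtendedXid("global-1", "branch-a", owner="alice")
gtrid, bqual = xid  # generic code sees an ordinary 2-tuple
```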

federico

-- 
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog at debian.org
INIT.D Developer                                           fog at initd.org
 The panda has the digestive system of a carnivore (e.g., of a bear).
  The panda has chosen to feed exclusively on bamboo shoots.
  Therefore, the panda is the only vegan animal on the planet. The panda
  deserves to go extinct.                      -- Maria, Alice, Federico

From dieter at handshake.de  Tue Jan 22 19:52:10 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Tue, 22 Jan 2008 19:52:10 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any
	standard	for two phase commit APIs?)
In-Reply-To: <4795E49E.9030900@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795ADF2.1040109@stuartbishop.net>
	<a7e835d40801220336h6c3767adj65eeebf7cd3cc761@mail.gmail.com>
	<4795E2E4.4010504@stuartbishop.net> <4795E49E.9030900@egenix.com>
Message-ID: <18326.15194.365181.823524@gargle.gargle.HOWL>

M.-A. Lemburg wrote at 2008-1-22 13:42 +0100:
> ...
>We could just describe the transaction id as object in the spec and
>then have the modules decide what type this maps to, e.g. one module
>might want to use a tuple (or even namedtuple) for this, another
>might not want to bother at all and use the internal representation
>mapped to a string or bytes object.

I learned (from James's remark) that transaction ids belong to the
transaction manager and not the resource.

Thus, at least the individual "drivers" should not use different
implementations for transaction ids.


-- 
Dieter

From mal at egenix.com  Tue Jan 22 20:26:00 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Jan 2008 20:26:00 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any	standard
 for two phase commit APIs?)
In-Reply-To: <18326.15194.365181.823524@gargle.gargle.HOWL>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795ADF2.1040109@stuartbishop.net>	<a7e835d40801220336h6c3767adj65eeebf7cd3cc761@mail.gmail.com>	<4795E2E4.4010504@stuartbishop.net>
	<4795E49E.9030900@egenix.com>
	<18326.15194.365181.823524@gargle.gargle.HOWL>
Message-ID: <47964348.1050601@egenix.com>

On 2008-01-22 19:52, Dieter Maurer wrote:
> M.-A. Lemburg wrote at 2008-1-22 13:42 +0100:
>> ...
>> We could just describe the transaction id as object in the spec and
>> then have the modules decide what type this maps to, e.g. one module
>> might want to use a tuple (or even namedtuple) for this, another
>> might not want to bother at all and use the internal representation
>> mapped to a string or bytes object.
> 
> I learned (from James remark) that transaction ids belong to the
> transaction manager and not the resource.
> 
> Thus, at least the individual "drivers" should not use different
> implementations for transaction ids.

You're right. I misunderstood which component manages the transaction
id (xid). It's the transaction manager, not the resource manager.
And it's the database modules that must accept whatever the TM
passes them, not the other way around.

Would a tuple (global transaction id, branch id) do the trick or
should we have two parameters on each API instead ?

-- 
Marc-Andre Lemburg
eGenix.com

From mal at egenix.com  Tue Jan 22 20:31:14 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Jan 2008 20:31:14 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <18326.14996.49402.907419@gargle.gargle.HOWL>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<18326.14996.49402.907419@gargle.gargle.HOWL>
Message-ID: <47964482.5030309@egenix.com>

On 2008-01-22 19:48, Dieter Maurer wrote:
> James Henstridge wrote at 2008-1-22 20:33 +0900:
>> ...
>> I do see a use for the branch qualifier though.  In a distributed
>> transaction, each resource should have a different transaction ID
> 
> Why?
> Why is it not equally good to use a common transaction id for
> all resource managers?
>
>> that
>> share a common global transaction ID but separate branch qualifiers.
>>
>> As transaction IDs are global within database clusters for some
>> backends (PostgreSQL, MySQL and probably others), the branch qualifier
>> is necessary if two databases from the cluster are used in the global
>> transaction.
> 
> They refer to the same transaction -- even when several databases
> in a cluster are affected.
> 
> The transaction as a whole will want to get prepared, committed, rolledback...

Sections 2.2.5 and 2.2.6 explain why you need a global transaction
id and a branch id as well:

http://www.opengroup.org/onlinepubs/009680699/toc.pdf

Branch ids are used for e.g. multiple connections of the same RM
engaging in a global transaction. Each of those connections gets
its own branch id.
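
As a rough illustration of that arrangement, here is a minimal sketch.
The helper name make_branch_xids, the "branch-N" labels and the
(format_id, gtrid, bqual) tuple layout are assumptions made for this
example only; they are not taken from the XA document or from any
database module.

```python
import uuid


def make_branch_xids(n_connections, format_id=1):
    """Return one (format_id, gtrid, bqual) tuple per connection.

    Every tuple shares the same global transaction id; only the branch
    qualifier differs.  The tuple layout is a hypothetical choice.
    """
    # 32 hex characters, which fits within XA's 64-byte gtrid limit.
    gtrid = uuid.uuid4().hex
    return [(format_id, gtrid, "branch-%d" % i) for i in range(n_connections)]


# Three connections to the same RM taking part in one global transaction.
xids = make_branch_xids(3)
for xid in xids:
    print(xid)
```

Taken together, gtrid and bqual stay unique per connection, which is the
only requirement the XA text places on them.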

-- 
Marc-Andre Lemburg
eGenix.com


From dieter at handshake.de  Tue Jan 22 20:54:22 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Tue, 22 Jan 2008 20:54:22 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <47964482.5030309@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<18326.14996.49402.907419@gargle.gargle.HOWL>
	<47964482.5030309@egenix.com>
Message-ID: <18326.18926.708382.939543@gargle.gargle.HOWL>

M.-A. Lemburg wrote at 2008-1-22 20:31 +0100:
> ...
>Branch ids are used for e.g. multiple connections of the same RM
>engaging in a global transaction. Each of those connections gets
>its own branch id.

But using multiple connections to the same RM seems to
be an error in the first place.

  Assume that a resource "R" is locked via connection "C1".
  Assume then that "R" is requested via connection "C2".

  If "C1 == C2", then the RM can see that the resource is already
  assigned to the connection and there is no blocking.

  Otherwise, the RM has no chance to recognize this and
  the request will be blocked until the transaction is committed
  or rolled back. There is quite a high chance that, since the
  "R" request is blocked, there will be no commit/rollback....


-- 
Dieter

From mal at egenix.com  Tue Jan 22 22:46:24 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Jan 2008 22:46:24 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <18326.18926.708382.939543@gargle.gargle.HOWL>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<18326.14996.49402.907419@gargle.gargle.HOWL>	<47964482.5030309@egenix.com>
	<18326.18926.708382.939543@gargle.gargle.HOWL>
Message-ID: <47966430.7010005@egenix.com>

On 2008-01-22 20:54, Dieter Maurer wrote:
> M.-A. Lemburg wrote at 2008-1-22 20:31 +0100:
>> ...
>> Branch ids are used for e.g. multiple connections of the same RM
>> engaging in a global transaction. Each of those connections gets
>> its own branch id.
> 
> But using multiple connections to the same RM seems to
> be an error in the first place.
> 
>   Assume that a resource "R" is locked via connection "C1".
>   Assume than that "R" is requested via connection "C2".
> 
>   If "C1 == C2", then the RM can see that the resource is already
>   assigned to the connection and there is no blocking.
> 
>   Otherwise, the RM has not chance to recognize this and
>   the request will be blocked until the transaction is commited
>   or rolled back. There is quite a high chance, that since the
>   "R" request is blocked, there will be no commit/roll back....

This situation is quite possible, but it's still a rather common
case: if an application uses multiple threads, then each of
the threads will have its own connection and branch id.
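
The blocking Dieter describes can be reproduced on a small scale. The
following is only a sketch: it uses SQLite through the standard sqlite3
module as a stand-in resource manager, with no TM or two-phase commit
involved, and the function name second_connection_blocks is made up for
this example. It merely shows that a second connection does not inherit
a lock held by the first.

```python
import os
import sqlite3
import tempfile


def second_connection_blocks():
    """Return True if connection C2 is refused a lock that C1 holds."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None lets us issue BEGIN/ROLLBACK ourselves.
    c1 = sqlite3.connect(path, isolation_level=None)
    # A short timeout so C2 gives up instead of waiting for seconds.
    c2 = sqlite3.connect(path, isolation_level=None, timeout=0.1)
    c1.execute("CREATE TABLE r (v INTEGER)")
    c1.execute("BEGIN IMMEDIATE")          # C1 takes the write lock on "R"
    c1.execute("INSERT INTO r VALUES (1)")
    try:
        c2.execute("BEGIN IMMEDIATE")      # C2 asks for the same lock
        blocked = False
    except sqlite3.OperationalError:       # raised as "database is locked"
        blocked = True
    finally:
        c1.execute("ROLLBACK")
        c1.close()
        c2.close()
    return blocked
```

Calling second_connection_blocks() shows whether C2 was turned away
while C1 still held its transaction open.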

It's less common in the Python world (well, maybe for Zope),
but very common in Java and C++ applications.

Note that it's also possible that even though a connection
is registered with the TM, the current global transaction
doesn't affect it (e.g. because it's not executing anything
at the time). It can then optimize the .tpc_commit()/
.tpc_rollback() method call (by ignoring them).

-- 
Marc-Andre Lemburg
eGenix.com


From james at jamesh.id.au  Wed Jan 23 02:18:48 2008
From: james at jamesh.id.au (James Henstridge)
Date: Wed, 23 Jan 2008 10:18:48 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <4795D9E4.2030509@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<4795D9E4.2030509@egenix.com>
Message-ID: <a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>

On 22/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> On 2008-01-22 12:33, James Henstridge wrote:
> > On 22/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> >> Thanks. I like it a lot, except for making the XID an object - this
> >> always appears to be a string in all the backends you've checked and
> >> also in the XA standard, so I'd go for a simple string instead of
> >> an object (those are always lots of work to do at C level).
> >
> > In at least MySQL and Oracle, the transaction ID appears to be more
> > than just a string: it is structured into three parts:
> >  * a format ID
> >  * a global transaction ID
> >  * a branch qualifier
> >
> > Stuart has made the argument that the format ID is not important for
> > Python, and I tend to agree (or at least I don't know what situations
> > you'd use it).
>
> The format id is only used to specify the format of the data
> structure in the XA xid_struct_t:
>
> From http://www.opengroup.org/onlinepubs/009680699/toc.pdf:
>
> """
> Although "xa.h" constrains the length and byte alignment of the data element within an
> XID, it does not specify the data's contents. The only requirement is that both gtrid and
> bqual, taken together, must be globally unique. The recommended way of achieving
> global uniqueness is to use the naming rules specified for OSI CCR atomic action
> identifiers (see the referenced OSI CCR specification). If OSI CCR naming is used, then
> the XID's formatID element should be set to 0; if some other format is used, then the
> formatID element should be greater than 0. A value of -1 in formatID means that the
> XID is null.
> The RM must be able to map the XID to the recoverable work it did for the
> corresponding branch. RMs may perform bitwise comparisons on the data
> components of an XID for the lengths specified in the XID structure. Most XA routines
> pass a pointer to the XID. These pointers are valid only for the duration of the call. If
> the RM needs to refer to the XID after it returns from the call, it must make a local copy
> before returning.
> /*
> * Transaction branch identification: XID and NULLXID:
> */
> #define XIDDATASIZE 128 /* size in bytes */
> #define MAXGTRIDSIZE 64 /* maximum size in bytes of gtrid */
> #define MAXBQUALSIZE 64 /* maximum size in bytes of bqual */
> struct xid_t {
> long formatID; /* format identifier */
> long gtrid_length; /* value 1-64 */
> long bqual_length; /* value 1-64 */
> char data[XIDDATASIZE];
> };
> typedef struct xid_t XID;
> """
>
> So, essentially, only the global transaction id and the branch id
> are relevant and both are represented in the data string.

One interesting part of that is the "If OSI CCR naming is used, then
the XID's formatID element should be set to 0; if some other format is
used, then the formatID element should be greater than 0."

I took a quick look at a few J2EE servers (which use XA), to see what
they do for transaction managers.  Neither JBoss nor Geronimo seems to
use formatID=0; both instead use magic numbers that I presume are
intended to determine if they created the transaction ID.

That said, the selection of format identifiers seems a bit ad-hoc:
Geronimo uses 0x4765526f, which has a byte representation of "GeRo".
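
The byte representation is easy to check from Python. This sketch
assumes the number is read as a 4-byte big-endian integer; the constant
name is made up for the example.

```python
import struct

# The format identifier quoted above, as a plain integer.
GERONIMO_FORMAT_ID = 0x4765526F

# Pack it as an unsigned 32-bit big-endian value to see its four bytes.
magic = struct.pack(">I", GERONIMO_FORMAT_ID)
print(magic)  # b'GeRo'

# And back again: the four bytes give the same number.
(round_tripped,) = struct.unpack(">I", b"GeRo")
```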

It seems that you could do pretty much the same thing by getting TMs
to check the global ID itself ...


> BTW, there's a nice extension module that let's you hook Python
> between the TM and RM using XA:
>
>     http://www.hare.demon.co.uk/pyxasw/


>
> > I do see a use for the branch qualifier though.  In a distributed
> > transaction, each resource should have a different transaction ID that
> > share a common global transaction ID but separate branch qualifiers.
> >
> > As transaction IDs are global within database clusters for some
> > backends (PostgreSQL, MySQL and probably others), the branch qualifier
> > is necessary if two databases from the cluster are used in the global
> > transaction.
> >
> > I think it is worth making the API such that it is easy to program to
> > best practices.
>
> The DB-API has always tried to not get in the way of how
> a particular backends needs its configuration data, so
> I think we can still have a single string using a database
> backend specific format. This could then include one or more
> of the above id parts.
>
> The implementation can then decode the string representation
> of the transaction id components into whatever format is
> needed by the backend.

The two reasons I see for using an object to represent transactions
that contains a global part and branch part are:

1. round tripping a transaction ID from xa_recover() to
xa_commit()/xa_rollback().
2. Reduced restrictions on the contents of the transaction ID.

For (1), using a database adapter defined object means that it can
represent transactions that originated elsewhere, or expose more
information about those transactions.

For (2), if a database is using specially formatted transaction IDs at
the Python level that get decoded into the various components, does
that mean that the application or transaction manager glue needs to
know how to format the IDs?

In contrast, it is pretty easy for e.g. a Postgres adapter to
serialise/deserialise a multi-part ID (and this is what the JDBC
driver does).
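
A minimal sketch of such a serialisation, for a backend that only
accepts a single string as transaction ID. The
"formatid_base64(gtrid)_base64(bqual)" layout and the two function
names are assumptions for illustration; I have not checked them against
what any particular driver actually emits.

```python
import base64


def xid_to_string(format_id, gtrid, bqual):
    """Flatten a three-part ID into one string.

    '_' is a safe separator because it is not part of the standard
    base64 alphabet.
    """
    return "_".join([
        str(format_id),
        base64.b64encode(gtrid).decode("ascii"),
        base64.b64encode(bqual).decode("ascii"),
    ])


def string_to_xid(text):
    """Inverse of xid_to_string(); returns (format_id, gtrid, bqual)."""
    format_id, gtrid, bqual = text.split("_")
    return int(format_id), base64.b64decode(gtrid), base64.b64decode(bqual)


flat = xid_to_string(1, b"global-id", b"branch-1")
parts = string_to_xid(flat)
```

With something like this inside the adapter, the application and the
transaction manager glue never need to know the flattened format.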


> >> Regarding the "xa_" prefix, I'm not much attached to it, but since
> >> the interface does indeed look a lot like the XA interface, why not
> >> make that reference ?
> >
> > Stuart's argument is that if the API differs from XA then using the
> > xa_* prefix could be problematic for adapters that want to expose the
> > XA API.
> >
> > As I don't have any experience with using XA, I can't comment one way
> > or the other about this.
>
> Fair enough. The API does resemble XA a lot, but you're right:
> if there are differences, it's better not to make that link.
>
> >> It also makes it clear, that the interface
> >> sits on top of the standard DB-API connection API and that those
> >> methods form a unit.
> >
> > Having a common prefix seems sensible.  If we don't use xa_*,
> > Federico's suggestion of tpc_* might make sense.
>
> Fine, let's use "tpc_".
>
> >> Plus they are currently not in use by any DB-API module, so don't
> >> interfere with existing APIs.
> >
> > So I guess it comes down to the following questions:
> > 1. Are database adapters likely to want to expose more than what is
> > covered by this proposal?
> > 2. Would this proposed API conflict with those extensions?
> >
> > It isn't clear to me that people want to provide a larger API, since
> > the few adapters that have added 2PC support have done so with APIs
> > that are effectively a subset/simplification of this one.
>
> If there's more to expose than what's in the API spec, then
> module authors are free to do so.
>
> In general, the DB-API only
> defines a fully functional common subset of what has to be
> there to use a database backend. Extensions are possible and
> welcome.

I agree with this, and think it is worth keeping extensibility in mind
when designing the API.  My suggestion of using an object to represent
a transaction ID was to make it easier for an adapter to expose more
complex IDs in a fairly localised fashion.


> Every now and then, we consider adding those extensions as
> "standard extensions" to the DB-API. This has proven to work
> well in the past.
>
> The two-phase commit methods would be another set of those
> extensions.

Okay.

James.

From konjkov.vv at gmail.com  Wed Jan 23 04:12:20 2008
From: konjkov.vv at gmail.com (Konjkov Vladimir)
Date: Wed, 23 Jan 2008 10:12:20 +0700
Subject: [DB-SIG] PEP 249
Message-ID: <f6ee47ad0801221912x452da680rfbee1123c214fa89@mail.gmail.com>

In the definition of
.execute(operation[,parameters])
.....
A reference to the operation will be retained by the
cursor. If the same operation object is passed in again,
then the cursor can optimize its behavior.

What does "the same operation object is passed in again" mean?
There's no definition of a class Operation.

Maybe it means

SameOperation = "something that just a constant!"
C = cnxn.cursor()
C.execute("select * from table where a=? and b=?",(1,2))
C.fetchall()
C.execute(SameOperation,(3,4))
C.fetchall()

or not?

From carsten at uniqsys.com  Wed Jan 23 05:09:05 2008
From: carsten at uniqsys.com (Carsten Haese)
Date: Tue, 22 Jan 2008 23:09:05 -0500
Subject: [DB-SIG] PEP 249
In-Reply-To: <f6ee47ad0801221912x452da680rfbee1123c214fa89@mail.gmail.com>
References: <f6ee47ad0801221912x452da680rfbee1123c214fa89@mail.gmail.com>
Message-ID: <1201061345.3323.4.camel@localhost.localdomain>

On Wed, 2008-01-23 at 10:12 +0700, Konjkov Vladimir wrote:
> in definition of
> .execute(operation[,parameters]) 
> .....
> A reference to the operation will be retained by the
> cursor. If the same operation object is passed in again,
> then the cursor can optimize its behavior.
> 
> 
> What meens "the same operation object is passed in again"?
> There's no definition for Class Operation.
> 
> May by it meens
> 
> SameOperation = "something that just a constant!"
> C = 
> cnxn.cursor()
> C.execute("select * from table where a=? and b=?",(1,2))
> C.fetchall()
> C.execute(SameOperation,(3,4))
> C.fetchall()

No, it means this:

sql = "select * from customers where cust_code = ?"
C.execute(sql, (1,))
# ...
C.execute(sql, (2,))
# ...

Assuming that "sql" doesn't get rebound, it'll reference the same string
object as before, hence the cursor object may optimize its behavior by
reusing a previously prepared statement for that query instead of
re-preparing the statement.
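
To make "the same string object" concrete, here is a small sketch of the
difference between an identical object and a merely equal one. It uses
only plain Python; whether a given module compares by identity or by
equality is up to that module.

```python
sql = "select * from customers where cust_code = ?"

# Another name bound to the very same string object.
same = sql

# An equal string built at run time; this is a different object.
rebuilt = "".join(["select * from customers ", "where cust_code = ?"])

print(same is sql)       # identity: the cheap check a cursor can do
print(rebuilt == sql)    # equality holds ...
print(rebuilt is sql)    # ... but identity does not
```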

HTH,

-- 
Carsten Haese
http://informixdb.sourceforge.net


From james at jamesh.id.au  Wed Jan 23 05:18:52 2008
From: james at jamesh.id.au (James Henstridge)
Date: Wed, 23 Jan 2008 13:18:52 +0900
Subject: [DB-SIG] PEP 249
In-Reply-To: <f6ee47ad0801221912x452da680rfbee1123c214fa89@mail.gmail.com>
References: <f6ee47ad0801221912x452da680rfbee1123c214fa89@mail.gmail.com>
Message-ID: <a7e835d40801222018m1111e945h5d74be0ad0a9ac66@mail.gmail.com>

On 23/01/2008, Konjkov Vladimir <konjkov.vv at gmail.com> wrote:
> in definition of
> .execute(operation[,parameters])
> .....
> A reference to the operation will be retained by the
> cursor. If the same operation object is passed in again,
> then the cursor can optimize its behavior.

The operation object is the one passed as the first argument to .execute().


> What meens "the same operation object is passed in again"?
> There's no definition for Class Operation.

It means that if you pass the same object to multiple execute() calls,
the database adapter may optimise things (then again, it might not).

The following is an example based on yours:

   query = "select * from table where a=? and b=?"
   C.execute(query, (1, 2))
   C.execute(query, (3, 4))

So if the adapter uses prepared statements, it can see that the second
execute() call uses the same query and reuse the previously prepared
statement.

As an application developer, the thing to take away from this is that
if you are going to execute the same query over and over, consider
using the same string object.

James.

From konjkov.vv at gmail.com  Wed Jan 23 07:14:41 2008
From: konjkov.vv at gmail.com (Konjkov Vladimir)
Date: Wed, 23 Jan 2008 13:14:41 +0700
Subject: [DB-SIG] PEP 249
Message-ID: <f6ee47ad0801222214l581429c5p10d3e4f228d68998@mail.gmail.com>

While implementing my Python module in C to access an ODBC 2.0
database, I couldn't find a description in PEP 249 of the case where
one .executeXXX call follows another on the same cursor object.

I think that after .executeXXX a cursor can only be fetched from
(.fetchXXX) or closed; re-execution is not permitted and raises an
exception. That's because the .executeXXX method calls SQLPrepare, and
the next SQLPrepare is possible only once SQLCloseCursor() or
SQLFreeStmt() with the SQL_CLOSE option has been called.

But at the C level re-execution is possible:

"Once the application has processed the results from the SQLExecute() call,
it can execute the statement again with new (or the same) parameter values."

The problem is that there is no .Prepare(Statement) method on the Cursor
object.

I think it would be better if the connection's cursor method did the
SQLPrepare, so that the statement is prepared only when a new Python
cursor object is created:
C = cnxn.cursor(STATEMENT),
and C.execute([parameters]) would then only execute or re-execute the
statement with an optional parameter list.

From james at jamesh.id.au  Wed Jan 23 08:31:39 2008
From: james at jamesh.id.au (James Henstridge)
Date: Wed, 23 Jan 2008 16:31:39 +0900
Subject: [DB-SIG] PEP 249
In-Reply-To: <f6ee47ad0801222214l581429c5p10d3e4f228d68998@mail.gmail.com>
References: <f6ee47ad0801222214l581429c5p10d3e4f228d68998@mail.gmail.com>
Message-ID: <a7e835d40801222331g5efa38f5s29afd2bf9ebaa760@mail.gmail.com>

On 23/01/2008, Konjkov Vladimir <konjkov.vv at gmail.com> wrote:
> When I'm implementin on C my Python module that are used to access
>  ODBC 2.0 database, I can't found description in PEP-0249 about the
>  case when one .executeXXX follows another on the same cursor object.
>
>  I think that after .executeXXX cursor can only be
>  fetchedXXX or closed. Reexecution permited and raised exception.
>  That's because .executeXXX method calling SQLPrepare and
>  and next SQLPrepare posible only when SQLCloseCursor() or
>  SQLFreeStmt() with the SQL_CLOSE option called.

The idea is that on .execute(), the database adapter could prepare the
statement and execute it.  The cursor would keep the prepared
statement around afterwards.

On a subsequent .execute() call, if the statement is identical it can
use the previously prepared statement.  If not, then it discards the
prepared statement and creates a new one.
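
In Python terms, the logic could look roughly like the sketch below.
CachingCursor and its _prepare() method are hypothetical stand-ins (the
latter for whatever the driver does, e.g. SQLPrepare); this is not the
code of any existing module.

```python
class CachingCursor:
    """Hypothetical cursor that keeps the last prepared statement."""

    def __init__(self):
        self._operation = None    # reference to the last operation object
        self._prepared = None     # whatever "prepared" means for the driver
        self.prepare_count = 0    # only here to make the effect visible

    def _prepare(self, operation):
        # Stand-in for the expensive driver call.
        self.prepare_count += 1
        return ("prepared", operation)

    def execute(self, operation, parameters=()):
        if operation is not self._operation:
            # Different statement: discard the old one and prepare anew.
            self._prepared = self._prepare(operation)
            self._operation = operation
        # Same object as last time: reuse the prepared statement.
        return (self._prepared, tuple(parameters))


cur = CachingCursor()
query = "select * from t where a=? and b=?"
cur.execute(query, (1, 2))
cur.execute(query, (3, 4))    # reuses the prepared statement
cur.execute("select 1")       # new statement, prepared again
print(cur.prepare_count)
```

The identity test ("is not") is the cheapest possible check; an adapter
could just as well compare the strings for equality.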


>  But on C-level reexecution is posible.
>
>  "Once the application has processed the results from the SQLExecute() call,
>  it can execute the statement again with new (or the same) parameter
> values."
>
>  Problem is that no .Prepare(Statement) method is not present in Cursor
> oblect.

Use of prepared statements is implicit, if the database adapter uses
them at all.


>  I think it will be better if connection method of cursor have to do the
>  SQLPrepare and only prepare the statemnet when creatin new python cursor
> object
>  C = cnxn.cursor(STATEMENT),
>  and C.execute([parameters]) will only execute or reexecute the statemnet
>  with optional parameters list.

What benefits do you see from this design over the existing one?

James.

From mal at egenix.com  Wed Jan 23 10:12:14 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 23 Jan 2008 10:12:14 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for	two phase commit APIs?)
In-Reply-To: <a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<4795D9E4.2030509@egenix.com>
	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>
Message-ID: <479704EE.8030404@egenix.com>

On 2008-01-23 02:18, James Henstridge wrote:
>> [XID format used in XA]
>> So, essentially, only the global transaction id and the branch id
>> are relevant and both are represented in the data string.
> 
> One interesting part of that is the "If OSI CCR naming is used, then
> the XID's formatID element should be set to 0; if some other format is
> used, then the formatID element should be greater than 0."
> 
> I took a quick look at a few J2EE servers (which use XA), to see what
> they do for transaction managers.  Neither JBoss or Geronimo seem to
> use formatID=0, but instead use magic numbers that I presume are
> intended to determine if they created the transaction ID.
> 
> That said, the selection of format identifiers seems a bit ad-hoc:
> Geronimo uses 0x4765526f, which has a byte representation of "GeRo".
> 
> It seems that you could do pretty much the same thing by getting TMs
> to check the global ID itself ...

So we do need to store the "formatID" as well ?

>> BTW, there's a nice extension module that let's you hook Python
>> between the TM and RM using XA:
>>
>>     http://www.hare.demon.co.uk/pyxasw/
> 
> 
> 
>>> I do see a use for the branch qualifier though.  In a distributed
>>> transaction, each resource should have a different transaction ID that
>>> share a common global transaction ID but separate branch qualifiers.
>>>
>>> As transaction IDs are global within database clusters for some
>>> backends (PostgreSQL, MySQL and probably others), the branch qualifier
>>> is necessary if two databases from the cluster are used in the global
>>> transaction.
>>>
>>> I think it is worth making the API such that it is easy to program to
>>> best practices.
>> The DB-API has always tried to not get in the way of how
>> a particular backends needs its configuration data, so
>> I think we can still have a single string using a database
>> backend specific format. This could then include one or more
>> of the above id parts.
>>
>> The implementation can then decode the string representation
>> of the transaction id components into whatever format is
>> needed by the backend.
> 
> The two reasons I see for using an object to represent transactions
> that contains a global part and branch part are:
> 
> 1. round tripping a transaction ID from xa_recover() to
> xa_commit()/xa_rollback().
> 2. Reduced restrictions on the contents of the transaction ID.
> 
> For (1), using a database adapter defined object means that it can
> represent transactions that originated elsewhere, or expose more
> information about those transactions.
> 
> For (2), if a database is using specially formatted transaction IDs at
> the Python level that get decoded into the various components, does
> that mean that the application or transaction manager glue needs to
> know how to format the IDs.
> 
> In contrast, it is pretty easy for e.g. a Postgres adapter to
> serialise/deserialise a multi-part ID (and this is what the JDBC
> driver does).

I have no objections against using an object for this anymore,
but let's please use an already existing object such as a
tuple instead of having each database module implement its own
new type.

Given that the formatID is used for some purpose as well (probably
just as identification of the TM itself), I guess we'd have
to use a 3-tuple (format id, global transaction id, branch id).

Modules should only expect to find an object that behaves like
a 3-sequence, they should accept whatever object is passed to
them and return it for the recover method.

This leaves the door open for extensions used by the TM for XID
objects.

>>>> Regarding the "xa_" prefix, I'm not much attached to it, but since
>>>> the interface does indeed look a lot like the XA interface, why not
>>>> make that reference ?
>>> Stuart's argument is that if the API differs from XA then using the
>>> xa_* prefix could be problematic for adapters that want to expose the
>>> XA API.
>>>
>>> As I don't have any experience with using XA, I can't comment one way
>>> or the other about this.
>> Fair enough. The API does resemble XA a lot, but you're right:
>> if there are differences, it's better not to make that link.
>>
>>>> It also makes it clear, that the interface
>>>> sits on top of the standard DB-API connection API and that those
>>>> methods form a unit.
>>> Having a common prefix seems sensible.  If we don't use xa_*,
>>> Federico's suggestion of tpc_* might make sense.
>> Fine, let's use "tpc_".
>>
>>>> Plus they are currently not in use by any DB-API module, so don't
>>>> interfere with existing APIs.
>>> So I guess it comes down to the following questions:
>>> 1. Are database adapters likely to want to expose more than what is
>>> covered by this proposal?
>>> 2. Would this proposed API conflict with those extensions?
>>>
>>> It isn't clear to me that people want to provide a larger API, since
>>> the few adapters that have added 2PC support have done so with APIs
>>> that are effectively a subset/simplification of this one.
>> If there's more to expose than what's in the API spec, then
>> module authors are free to do so.
>>
>> In general, the DB-API only
>> defines a fully functional common subset of what has to be
>> there to use a database backend. Extensions are possible and
>> welcome.
> 
> I agree with this, and think it is worth keeping extensibility in mind
> when designing the API.  My suggestion of using an object to represent
> a transaction ID was to make it easier for an adapter to expose more
> complex IDs in a fairly localised fashion.
> 
> 
>> Every now and then, we consider adding those extensions as
>> "standard extensions" to the DB-API. This has proven to work
>> well in the past.
>>
>> The two-phase commit methods would be another set of those
>> extensions.
> 
> Okay.

-- 
Marc-Andre Lemburg
eGenix.com


From stuart at stuartbishop.net  Wed Jan 23 14:11:53 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Wed, 23 Jan 2008 20:11:53 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for	two phase commit APIs?)
In-Reply-To: <479704EE.8030404@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<4795D9E4.2030509@egenix.com>	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>
	<479704EE.8030404@egenix.com>
Message-ID: <47973D19.2090904@stuartbishop.net>

M.-A. Lemburg wrote:
> On 2008-01-23 02:18, James Henstridge wrote:
>>> [XID format used in XA]
>>> So, essentially, only the global transaction id and the branch id
>>> are relevant and both are represented in the data string.
>> One interesting part of that is the "If OSI CCR naming is used, then
>> the XID's formatID element should be set to 0; if some other format is
>> used, then the formatID element should be greater than 0."
>>
>> I took a quick look at a few J2EE servers (which use XA), to see what
>> they do for transaction managers.  Neither JBoss or Geronimo seem to
>> use formatID=0, but instead use magic numbers that I presume are
>> intended to determine if they created the transaction ID.
>>
>> That said, the selection of format identifiers seems a bit ad-hoc:
>> Geronimo uses 0x4765526f, which has a byte representation of "GeRo".
>>
>> It seems that you could do pretty much the same thing by getting TMs
>> to check the global ID itself ...
> 
> So we do need to store the "formatID" as well ?

It looks like we do. MySQL's syntax for xids allows an optional formatid
and this is returned by XA RECOVER. In MySQL, it is a number rather than a
string. Assuming that any system that uses more than a simple string for the
xid is doing so to map onto the XA specification, we could safely represent
xids as a 3-tuple of (unicode, unicode, integer).

How to deal with Nones and empty strings needs to be thought out, though,
to avoid round-trip edge cases:

>>> con = connect('')
>>> xid = ('g', '', None)
>>> con.tpc_begin(xid)
>>> con.tpc_prepare()
>>> con.tpc_recover()
[('g', None, 1)]
>>> con.tpc_recover()[0] == xid
False

'' and None for the gtid and brid would be equivalent, and 1 and None would
be equivalent for the format_id (1 is the default format id in MySQL). To
avoid round trip issues with tuples, only one of these values should be allowed.

If we use an object, these issues go away:

>>> con = connect('')
>>> xid = Xid('g', '')
>>> tuple(xid)
('g', None, 1)
>>> con.tpc_begin(xid)
>>> con.tpc_prepare()
>>> con.tpc_recover()
[<Xid 'g', None, 1>]
>>> con.tpc_recover()[0] == xid
True
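
A minimal sketch of such an object, for illustration only. The class below
is hypothetical (no driver defines it) and assumes the MySQL defaults
described above: '' is treated as None and a missing format id as 1.
Normalising in the constructor is what lets the round trip compare equal.

```python
class Xid:
    """Hypothetical transaction ID object; not part of any existing driver."""

    DEFAULT_FORMAT_ID = 1  # assumed default, taken from the MySQL behaviour above

    def __init__(self, gtrid, bqual=None, format_id=None):
        # Collapse '' and None into a single canonical value (None).
        self.gtrid = gtrid or None
        self.bqual = bqual or None
        # A missing format id becomes the driver default.
        if format_id is None:
            format_id = self.DEFAULT_FORMAT_ID
        self.format_id = format_id

    def _as_tuple(self):
        return (self.gtrid, self.bqual, self.format_id)

    def __iter__(self):
        # Allows tuple(xid).
        return iter(self._as_tuple())

    def __eq__(self, other):
        if not isinstance(other, Xid):
            return NotImplemented
        return self._as_tuple() == other._as_tuple()

    def __hash__(self):
        return hash(self._as_tuple())

    def __repr__(self):
        return "<Xid %r, %r, %r>" % self._as_tuple()
```

With this sketch, Xid('g', '') and Xid('g', None, 1) compare equal, which is
the property a bare tuple lacks.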

> Given that the formatID is used for some purpose as well (probably
> just as identification of the TM itself), I guess we'd have
> to use a 3-tuple (format id, global transaction id, branch id).
> 
> Modules should only expect to find an object that behaves like
> a 3-sequence, they should accept whatever object is passed to
> them and return it for the recover method.
> 
> This leaves the door open for extensions used by the TM for XID
> objects.

I don't see a technical problem with the tuple apart from the round tripping
issue above and someone might have a nice solution to that. Subjectively, I
think an object reads better though, particularly as in many cases you will
only want to bother specifying one or maybe two of the three parts.
Xid('foo') vs. ('foo', None, None).

Is CamelCase of xid 'Xid' or 'XID' or 'XId' ?

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/


From mal at egenix.com  Wed Jan 23 15:24:35 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 23 Jan 2008 15:24:35 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for	two phase commit APIs?)
In-Reply-To: <47973D19.2090904@stuartbishop.net>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<4795D9E4.2030509@egenix.com>	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>	<479704EE.8030404@egenix.com>
	<47973D19.2090904@stuartbishop.net>
Message-ID: <47974E23.8000009@egenix.com>

On 2008-01-23 14:11, Stuart Bishop wrote:
> M.-A. Lemburg wrote:
>> So we do need to store the "formatID" as well ?
> 
> It looks like yes we do. MySQL's syntax for xids allows an optional formatid
> and this is returned by XA RECOVER. In MySQL, it is a number rather than a
> string. Assuming that any system that uses more than a simple string for the
> xid is doing so to map onto the XA specification, we could safely represent
> xids as a 3-tuple of (unicode, unicode, integer).
> 
> How to deal with None's and empty strings needs to be thought out though to
> avoid round trip edge cases:
> 
>>>> con = connect('')
>>>> xid = ('g', '', None)
>>>> con.tpc_begin(xid)
>>>> con.tpc_prepare()
>>>> con.tpc_recover()
> [('g', None, 1)]
>>>> con.tpc_recover()[0] == xid
> False
> 
> '' and None for the gtid and brid would be equivalent, and 1 and None would
> be equivalent for the format_id (1 is the default format id in MySQL). To
> avoid round trip issues with tuples, only one of these values should be allowed.
> 
> If we use an object, these issues go away:

I'm not sure I understand... a tuple *is* an object after all :-)

Why does '' get converted to None on output ?

The database module
should not try to change the object in any way (regardless of whether
it's a string, tuple, custom sequence like object, etc.). At least
that's the theory.

Or is this a side-effect of MySQL doing some internal mapping of
the tuple contents to some internal table ?

>>>> con = connect('')
>>>> xid = Xid('g', '')
>>>> tuple(xid)
> ('g', None, 1)
>>>> con.tpc_begin(xid)
>>>> con.tpc_prepare()
>>>> con.tpc_recover()
> [<Xid 'g', None, 1>]
>>>> con.tpc_recover()[0] == xid
> True
> 
>> Given that the formatID is used for some purpose as well (probably
>> just as identification of the TM itself), I guess we'd have
>> to use a 3-tuple (format id, global transaction id, branch id).
>>
>> Modules should only expect to find an object that behaves like
>> a 3-sequence, they should accept whatever object is passed to
>> them and return it for the recover method.
>>
>> This leaves the door open for extensions used by the TM for XID
>> objects.
> 
> I don't see a technical problem with the tuple apart from the round tripping
> issue above and someone might have a nice solution to that. Subjectively, I
> think an object reads better though, particularly as in many cases you will
> only want to bother specifying one or maybe two of the three parts.
> Xid('foo') vs. ('foo', None, None).

I think we shouldn't restrict the TM by specifying a particular
object. After all, the DB-API is about the RM, not the TM.

However, it may be worthwhile to have the RM at least peek
into the XID object and that's why I think we should require
the XID object to implement the __getitem__ protocol and
have the first three positions defined as (format id,
global transaction id, branch id).

This should leave enough room for the TM.
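
A rough sketch of what that requirement could look like, under the
assumption that the RM only ever indexes positions 0 to 2. Both xid_parts()
and ManagerXid are hypothetical names invented for this illustration.

```python
def xid_parts(xid):
    """RM side: read (format_id, gtrid, bqual) from any object supporting xid[0..2]."""
    return xid[0], xid[1], xid[2]


class ManagerXid:
    """TM side: hypothetical XID object that carries extra, TM-private state."""

    def __init__(self, format_id, gtrid, bqual, owner):
        self._parts = (format_id, gtrid, bqual)
        self.owner = owner  # extension used only by the TM; the RM never looks at it

    def __getitem__(self, index):
        return self._parts[index]

    def __len__(self):
        return 3
```

A plain tuple and a ManagerXid are interchangeable from the RM's point of
view, so the TM stays free to attach whatever else it needs.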

> Is CamelCase of xid 'Xid' or 'XID' or 'XId' ?

Good question. XID itself is an abbreviation. I tend to
leave those alone and use all-capital-letters for classes.

Note that since the TM will create the XIDs, we don't need
to worry about a method or API to generate them.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 23 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From james at jamesh.id.au  Thu Jan 24 02:44:27 2008
From: james at jamesh.id.au (James Henstridge)
Date: Thu, 24 Jan 2008 10:44:27 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <479704EE.8030404@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<4795D9E4.2030509@egenix.com>
	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>
	<479704EE.8030404@egenix.com>
Message-ID: <a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>

On 23/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> On 2008-01-23 02:18, James Henstridge wrote:
> >> [XID format used in XA]
> >> So, essentially, only the global transaction id and the branch id
> >> are relevant and both are represented in the data string.
> >
> > One interesting part of that is the "If OSI CCR naming is used, then
> > the XID's formatID element should be set to 0; if some other format is
> > used, then the formatID element should be greater than 0."
> >
> > I took a quick look at a few J2EE servers (which use XA), to see what
> > they do for transaction managers.  Neither JBoss or Geronimo seem to
> > use formatID=0, but instead use magic numbers that I presume are
> > intended to determine if they created the transaction ID.
> >
> > That said, the selection of format identifiers seems a bit ad-hoc:
> > Geronimo uses 0x4765526f, which has a byte representation of "GeRo".
> >
> > It seems that you could do pretty much the same thing by getting TMs
> > to check the global ID itself ...
>
> So we do need to store the "formatID" as well ?
>
> >> BTW, there's a nice extension module that lets you hook Python
> >> between the TM and RM using XA:
> >>
> >>     http://www.hare.demon.co.uk/pyxasw/
> >
> >
> >
> >>> I do see a use for the branch qualifier though.  In a distributed
> >>> transaction, each resource should have a different transaction ID that
> >>> share a common global transaction ID but separate branch qualifiers.
> >>>
> >>> As transaction IDs are global within database clusters for some
> >>> backends (PostgreSQL, MySQL and probably others), the branch qualifier
> >>> is necessary if two databases from the cluster are used in the global
> >>> transaction.
> >>>
> >>> I think it is worth making the API such that it is easy to program to
> >>> best practices.
> >> The DB-API has always tried to not get in the way of how
> >> a particular backend needs its configuration data, so
> >> I think we can still have a single string using a database
> >> backend specific format. This could then include one or more
> >> of the above id parts.
> >>
> >> The implementation can then decode the string representation
> >> of the transaction id components into whatever format is
> >> needed by the backend.
> >
> > The two reasons I see for using an object to represent transactions
> > that contains a global part and branch part are:
> >
> > 1. round tripping a transaction ID from xa_recover() to
> > xa_commit()/xa_rollback().
> > 2. Reduced restrictions on the contents of the transaction ID.
> >
> > For (1), using a database adapter defined object means that it can
> > represent transactions that originated elsewhere, or expose more
> > information about those transactions.
> >
> > For (2), if a database is using specially formatted transaction IDs at
> > the Python level that get decoded into the various components, does
> > that mean that the application or transaction manager glue needs to
> > know how to format the IDs.
> >
> > In contrast, it is pretty easy for e.g. a Postgres adapter to
> > serialise/deserialise a multi-part ID (and this is what the JDBC
> > driver does).
>
> I have no objections against using an object for this anymore,
> but let's please use an already existing object such as a
> tuple instead of having each database module implement its own
> new type.
>
> Given that the formatID is used for some purpose as well (probably
> just as identification of the TM itself), I guess we'd have
> to use a 3-tuple (format id, global transaction id, branch id).
>
> Modules should only expect to find an object that behaves like
> a 3-sequence, they should accept whatever object is passed to
> them and return it for the recover method.
>
> This leaves the door open for extensions used by the TM for XID
> objects.

I've had a bit more time to think about this, and have two proposals
on how to handle transaction IDs.  I think they offer equivalent
functionality, so the choice comes down to what we want the API to
look like.

Proposal 1:
* Plain string IDs should work fine as transaction identifiers for
  applications built from scratch with that assumption: they would
  need to identify the global and branch parts in their own way.

* A plain string can be stuffed inside an XA style transaction
  identifier, even if it isn't making use of all the different
  components.

* Therefore, all methods accepting transaction IDs should accept
  strings.

* As some transaction IDs in the database might not match this simple
  form, there are two options for the recover() method:
    1. return a special object that represents the transaction, which
       will be accepted by commit()/rollback().  How string-like must
       these objects be?
    2. omit such transaction IDs from the result.

* For databases that support more structured transaction IDs (such as
  those used by XA), the 2PC methods may accept objects other than
  strings.


Proposal 2:

* Many databases follow the XA specification, so it makes sense to use
  transaction identifiers structured in the same way.

* For databases that do not use XA-style transaction IDs, it is
  usually possible to serialise such an ID into a form that it can
  work with.

* Therefore, all methods accepting transaction IDs should accept
  3-sequences of the form (formatID, gtrid, bqual).

* For databases using non-XA transaction IDs, it is possible that some
  transaction IDs might exist that do not match the serialised form.
  The recover() method has two options:
    1. return a special object representing the ID that will be
       accepted by commit()/rollback().  Such an object should act
       like a 3-sequence.
    2. omit such transaction IDs from the result.

* For databases not using XA-style transactions, the 2PC methods may
  accept objects other than 3-sequences as transaction IDs.
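
To make the serialisation point in proposal 2 concrete, here is a minimal
sketch for a backend that only accepts a single string as its transaction
ID. The wire format (decimal format id, then base64 of each part, joined by
':') is an assumption made up for this example, not what any real adapter
uses.

```python
import base64


def serialise_xid(xid):
    """Pack (format_id, gtrid, bqual) into one string (hypothetical format)."""
    format_id, gtrid, bqual = xid
    parts = [str(int(format_id))]
    for text in (gtrid, bqual):
        encoded = base64.urlsafe_b64encode(text.encode("utf-8"))
        parts.append(encoded.decode("ascii"))
    # ':' never occurs in the base64 alphabet, so it is a safe separator.
    return ":".join(parts)


def deserialise_xid(raw):
    """Return the 3-tuple, or None if raw does not match the serialised form."""
    pieces = raw.split(":")
    if len(pieces) != 3:
        return None
    try:
        format_id = int(pieces[0])
        gtrid = base64.urlsafe_b64decode(pieces[1].encode("ascii")).decode("utf-8")
        bqual = base64.urlsafe_b64decode(pieces[2].encode("ascii")).decode("utf-8")
    except (ValueError, UnicodeError):
        # Bad integer, bad base64 padding, or bytes that are not UTF-8.
        return None
    return (format_id, gtrid, bqual)
```

The None return is where the two recover() options above come in: the
adapter either wraps the raw ID in a special object or omits it.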


Both of these proposals seem to get rid of the main points of contention.
They:
* remove the xid() constructor from the spec.
* allow use of simple objects (strings or tuples) as transaction IDs.
* provide an obvious way to expose database-specific transaction IDs.

James.

From stuart at stuartbishop.net  Thu Jan 24 07:05:28 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Thu, 24 Jan 2008 13:05:28 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for	two phase commit APIs?)
In-Reply-To: <47974E23.8000009@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<4795D9E4.2030509@egenix.com>	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>	<479704EE.8030404@egenix.com>
	<47973D19.2090904@stuartbishop.net> <47974E23.8000009@egenix.com>
Message-ID: <47982AA8.5090502@stuartbishop.net>

M.-A. Lemburg wrote:

>> If we use an object, these issues go away:
> 
> I'm not sure I understand... a tuple *is* an object after all :-)

An object we can't define the constructor of.

> Why does '' get converted to None on output ?

Because if, say, on MySQL you do """XA PREPARE 'foo'""", MySQL will fill in
the branchid and formatid with defaults - in MySQL's case '' and 1
respectively.

> The database module
> should not try to change the object in any way (regardless of whether
> it's a string, tuple, custom sequence like object, etc.). At least
> that's the theory.
> 
> Or is this a side-effect of MySQL doing some internal mapping of
> the tuple contents to some internal table ?

The databases that support XA style xids have to be able to round trip with
the defined C data structure. This structure is the formatid, the length of
the global transaction id, the length of the branch id, and an array of
bytes containing the concatenated ids. In this structure there is no way to
differentiate a NULL from an empty string or a NULL formatid from whatever
integer you map NULL to.
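
The effect can be reproduced with a small sketch of that structure. The
layout follows the description above (format id, two lengths, one data
array); the 128-byte data size matches the XA header's XIDDATASIZE, while
the fixed 32-bit little-endian integers are a simplification chosen for
this example (the C struct uses platform-sized longs).

```python
import struct

XIDDATASIZE = 128
# Assumed layout for illustration: three 32-bit ints plus the data array.
XID_LAYOUT = "<iii%ds" % XIDDATASIZE


def pack_xid(format_id, gtrid, bqual):
    """Pack the parts the way the C structure stores them."""
    # The structure only records lengths, so None and b'' both become length 0.
    gtrid = gtrid or b""
    bqual = bqual or b""
    return struct.pack(XID_LAYOUT, format_id, len(gtrid), len(bqual), gtrid + bqual)


def unpack_xid(raw):
    """Recover (format_id, gtrid, bqual); empty parts always come back as b''."""
    format_id, gtrid_length, bqual_length, data = struct.unpack(XID_LAYOUT, raw)
    gtrid = data[:gtrid_length]
    bqual = data[gtrid_length:gtrid_length + bqual_length]
    return format_id, gtrid, bqual
```

Packing (1, b'g', None) and (1, b'g', b'') yields identical bytes, so
whatever comes back from the database cannot tell the two apart.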

I guess validation of the xid could be done by the driver in tpc_begin(),
tpc_commit(), tpc_rollback() and an exception raised if the driver detects
that round tripping via the database is not possible.

> I think we shouldn't restrict the TM by specifying a particular
> object. After all, the DB-API is about the RM, not the TM.

I don't follow this. We have to specify what object can be passed to
tpc_begin and is returned from tpc_recover. The only issue is whether we
force this to be a 3-tuple or allow whatever the driver decides to return
from a module level Xid() method.

> However, it may be worthwhile to have the RM at least peek
> into the XID object and that's why I think we should require
> the XID object to implement the __getitem__ protocol and
> have the first three positions defined as (format id,
> global transaction id, branch id).

I wouldn't say 'may be worthwhile'. I'd go for 'is essential'. If you can't
inspect the results from tpc_recover(), the method is pointless.

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/


From stuart at stuartbishop.net  Thu Jan 24 08:21:29 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Thu, 24 Jan 2008 14:21:29 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<4795D9E4.2030509@egenix.com>	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>	<479704EE.8030404@egenix.com>
	<a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>
Message-ID: <47983C79.4080700@stuartbishop.net>

James Henstridge wrote:

> Proposal 1:
> * Plain string IDs should work fine as transaction identifiers for
>   applications built from scratch with that assumption: they would
>   need to identify the global and branch parts in their own way.
> 
> * A plain string can be stuffed inside an XA style transaction
>   identifier, even if it isn't making use of all the different
>   components.
> 
> * Therefore, all methods accepting transaction IDs should accept
>   strings.
> 
> * As some transaction IDs in the database might not match this simple
>   form, there are two options for the recover() method:
>     1. return a special object that represents the transaction, which
>        will be accepted by commit()/rollback().  How string-like must
>        these objects be?
>     2. omit such transaction IDs from the result.
> 
> * For databases that support more structured transaction IDs (such as
>   those used by XA), the 2PC methods may accept objects other than
>   strings.
> 
> Proposal 2:
> 
> * Many databases follow the XA specification, so it makes sense to use
>   transaction identifiers structured in the same way.
> 
> * For databases that do not use XA-style transaction IDs, it is
>   usually possible to serialise such an ID into a form that it can
>   work with.
> 
> * Therefore, all methods accepting transaction IDs should accept
>   3-sequences of the form (formatID, gtrid, bqual).
> 
> * For databases using non-XA transaction IDs, it is possible that some
>   transaction IDs might exist that do not match the serialised form.
>   The recover() method has two options:
>     1. return a special object representing the ID that will be
>        accepted by commit()/rollback().  Such an object should act
>        like a 3-sequence.
>     2. omit such transaction IDs from the result.
> 
> * For databases not using XA-style transactions, the 2PC methods may
>   accept objects other than 3-sequences as transaction IDs.
> 
> 
> Both of these proposals seem to get rid of the main points of contention:
> * removes the xid() constructor from the spec.
> * allow use of simple objects (strings or tuples) as transaction IDs
> * provides an obvious way to expose database-specific transaction IDs.

I wouldn't call any of these a point of contention. They were points of
discussion. Attempting to remove the xid() constructor from the spec is
premature when people were just considering if tuples can be used instead.

I don't think omitting transaction ids from tpc_recover() is acceptable.
Doing so means you can't write a transaction manager that plays nicely in a
more complex environment where components may not be under our direct
control, let alone written in Python and using this API. My use case here is
a reaper script that detects and handles or reports lost transactions.
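
A minimal sketch of such a reaper, assuming that tpc_recover() returns
every pending ID and that tpc_rollback() accepts one of them. Those
signatures are still under discussion in this thread, and FakeConnection,
is_ours and report are hypothetical stand-ins used only to exercise the
idea.

```python
def reap_lost_transactions(con, is_ours, report):
    """Roll back prepared transactions we own and report all the others."""
    rolled_back = []
    for xid in con.tpc_recover():
        if is_ours(xid):
            con.tpc_rollback(xid)
            rolled_back.append(xid)
        else:
            # Created by some other component: leave it alone, but make it visible.
            report(xid)
    return rolled_back


class FakeConnection:
    """Stand-in for a driver connection; holds a list of pending IDs."""

    def __init__(self, pending):
        self.pending = list(pending)

    def tpc_recover(self):
        return list(self.pending)

    def tpc_rollback(self, xid):
        self.pending.remove(xid)
```

If recover() silently dropped the IDs it could not parse, the report
branch would never see them, which is exactly the problem described above.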

Here is an edge case with proposal 1. Here, con happens to be a connection
to a MySQL database. Which Xid represents the prepared transaction?

>>> con.tpc_begin('foo')
>>> con.tpc_prepare()
>>> con.tpc_recover()
[<Xid 'foo', '', 1>, <Xid 'foo', '', 0>, <Xid 'foo', 'None', 1>]

You could try fixing this by returning a heterogeneous list, but I think
this is just making the hole deeper:

>>> con.tpc_begin('foo')
>>> con.tpc_prepare()
>>> con.tpc_recover()
['foo', <Xid 'foo', '', 0>, <Xid 'foo', 'None', 1>]


Proposal 2 seems the better option. I think we need to specify that the
3-tuple cannot contain None values.

I personally feel that an Xid() constructor makes things more readable. It
also means we can have driver specific defaults for the format id rather
than no default.

tpc_begin(Xid('foo', 'bar', 1))		vs.	tpc_begin(('foo', 'bar', 1))
tpc_begin(Xid('foo', 'bar'))		vs.	tpc_begin(('foo', 'bar', 1))
tpc_begin(Xid('foo')) 			vs.	tpc_begin(('foo', '', 1))


-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/


From james at jamesh.id.au  Thu Jan 24 09:50:32 2008
From: james at jamesh.id.au (James Henstridge)
Date: Thu, 24 Jan 2008 17:50:32 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <47983C79.4080700@stuartbishop.net>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<4795D9E4.2030509@egenix.com>
	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>
	<479704EE.8030404@egenix.com>
	<a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>
	<47983C79.4080700@stuartbishop.net>
Message-ID: <a7e835d40801240050s36a2256y5a8a04e904a6f7e9@mail.gmail.com>

On 24/01/2008, Stuart Bishop <stuart at stuartbishop.net> wrote:
> James Henstridge wrote:
>
> > Proposal 1:
> > * Plain string IDs should work fine as transaction identifiers for
> >   applications built from scratch with that assumption: they would
> >   need to identify the global and branch parts in their own way.
> >
> > * A plain string can be stuffed inside an XA style transaction
> >   identifier, even if it isn't making use of all the different
> >   components.
> >
> > * Therefore, all methods accepting transaction IDs should accept
> >   strings.
> >
> > * As some transaction IDs in the database might not match this simple
> >   form, there are two options for the recover() method:
> >     1. return a special object that represents the transaction, which
> >        will be accepted by commit()/rollback().  How string-like must
> >        these objects be?
> >     2. omit such transaction IDs from the result.
> >
> > * For databases that support more structured transaction IDs (such as
> >   those used by XA), the 2PC methods may accept objects other than
> >   strings.
> >
> > Proposal 2:
> >
> > * Many databases follow the XA specification, so it makes sense to use
> >   transaction identifiers structured in the same way.
> >
> > * For databases that do not use XA-style transaction IDs, it is
> >   usually possible to serialise such an ID into a form that it can
> >   work with.
> >
> > * Therefore, all methods accepting transaction IDs should accept
> >   3-sequences of the form (formatID, gtrid, bqual).
> >
> > * For databases using non-XA transaction IDs, it is possible that some
> >   transaction IDs might exist that do not match the serialised form.
> >   The recover() method has two options:
> >     1. return a special object representing the ID that will be
> >        accepted by commit()/rollback().  Such an object should act
> >        like a 3-sequence.
> >     2. omit such transaction IDs from the result.
> >
> > * For databases not using XA-style transactions, the 2PC methods may
> >   accept objects other than 3-sequences as transaction IDs.
> >
> >
> > Both of these proposals seem to get rid of the main points of contention:
> > * removes the xid() constructor from the spec.
> > * allow use of simple objects (strings or tuples) as transaction IDs
> > * provides an obvious way to expose database-specific transaction IDs.
>
> I wouldn't call any of these a point of contention. They were points of
> discussion. Attempting to remove the xid() constructor from the spec is
> premature when people were just considering if tuples can be used instead.
>
> I don't think omitting transaction ids from tpc_recover() is acceptable.
> Doing so means you can't write a transaction manager that plays nicely in a
> more complex environment where components may not be under our direct
> control, let alone written in Python and using this API. My use case here is
> a reaper script that detects and handles or reports lost transactions.
>
> Here is an edge case with proposal 1. Here, con happens to be a connection
> to a MySQL database. Which Xid represents the prepared transaction?
>
> >>> con.tpc_begin('foo')
> >>> con.tpc_prepare()
> >>> con.tpc_recover()
> [<Xid 'foo', '', 1>, <Xid 'foo', '', 0>, <Xid 'foo', 'None', 1>]

If we were going with proposal 1 (defaulting to strings as transaction
IDs), it would be the one that compares equal to "foo".  The exact
answer would depend on how the database adapter was implemented.


> You could try fixing this by returning a heterogeneous list, but I think
> this is just making the hole deeper:
>
> >>> con.tpc_begin('foo')
> >>> con.tpc_prepare()
> >>> con.tpc_recover()
> ['foo', <Xid 'foo', '', 0>, <Xid 'foo', 'None', 1>]

In this case, the answer is still "the one that compares equal to 'foo'".


> Proposal 2 seems the better option. I think we need to specify that the
> 3-tuple cannot contain None values.

I suppose working with transaction IDs that couldn't be deserialised
might be easier with proposal 2.  For example, it could provide the
raw ID in one part and leave the other two None.

For proposal 2, I think we should stick to XA-compatible IDs.  That
is, formatID a number >= 0, and the global ID and branch qualifier as
strings no longer than 64 characters each.
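
As a sketch of what sticking to XA-compatible IDs could mean in a driver,
the check below enforces exactly the limits named above. The function name
and the use of ValueError are assumptions for this example. Note also that
the XA structure counts bytes rather than characters, so a real driver
would probably check the encoded length instead.

```python
MAX_PART_LENGTH = 64  # limit quoted above for gtrid and bqual


def validate_xid(xid):
    """Return xid as a plain 3-tuple, or raise ValueError if a limit is broken."""
    format_id, gtrid, bqual = xid
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(format_id, bool) or not isinstance(format_id, int):
        raise ValueError("formatID must be an integer, got %r" % (format_id,))
    if format_id < 0:
        raise ValueError("formatID must be >= 0, got %r" % (format_id,))
    for name, part in (("gtrid", gtrid), ("bqual", bqual)):
        if not isinstance(part, str):
            raise ValueError("%s must be a string, got %r" % (name, part))
        if len(part) > MAX_PART_LENGTH:
            raise ValueError(
                "%s is longer than %d characters" % (name, MAX_PART_LENGTH)
            )
    return (format_id, gtrid, bqual)
```

Rejecting None here also covers the rule that the 3-tuple cannot contain
None values.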


> I personally feel that an Xid() constructor makes things more readable. It
> also means we can have driver specific defaults for the format id rather
> than no default.
>
> tpc_begin(Xid('foo', 'bar', 1))         vs.     tpc_begin(('foo', 'bar', 1))
> tpc_begin(Xid('foo', 'bar'))            vs.     tpc_begin(('foo', 'bar', 1))
> tpc_begin(Xid('foo'))                   vs.     tpc_begin(('foo', '', 1))

I don't know if adapter-specific defaults make sense.  Perhaps pick
the defaults from MySQL?

"""
As indicated by the syntax, bqual and formatID are optional. The
default bqual value is '' if not given. The default formatID value is
1 if not given.
"""

If we do have a transaction ID constructor, I think it should be a
method on the connection.  You can make use of pretty much the entire
DB-API using just a connection as an entry point (especially if the
exceptions are provided as connection attributes).  It seems sensible
to do the same here.
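
A sketch of that arrangement, with the MySQL defaults quoted above baked
in. The Connection class and its xid() method are hypothetical, and the
field order follows proposal 2, (formatID, gtrid, bqual), even though the
arguments are given most-used first.

```python
from collections import namedtuple

# A namedtuple is still a 3-sequence, so it satisfies proposal 2 unchanged.
Xid = namedtuple("Xid", ["format_id", "gtrid", "bqual"])


class Connection:
    """Hypothetical connection; only the transaction ID factory is sketched."""

    def xid(self, gtrid, bqual="", format_id=1):
        # Defaults assumed from the MySQL documentation quoted above.
        return Xid(format_id, gtrid, bqual)
```

Code holding only a connection can then write con.xid('foo') without
importing anything from the driver module.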

James.

From fog at initd.org  Thu Jan 24 09:58:59 2008
From: fog at initd.org (Federico Di Gregorio)
Date: Thu, 24 Jan 2008 09:58:59 +0100
Subject: [DB-SIG] XID format (was: Two-phase commit API proposal)
In-Reply-To: <47983C79.4080700@stuartbishop.net>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<4795D9E4.2030509@egenix.com>
	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>
	<479704EE.8030404@egenix.com>
	<a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>
	<47983C79.4080700@stuartbishop.net>
Message-ID: <1201165139.6139.14.camel@mila.office.dinunzioedigregorio>

The problem here seems to be that simple strings should be supported
(xids are, in the end, simple SQL strings for most backends) while it
should be possible to access the single parts (format, gtrid, bqual) to play
well with the transaction managers. The thing to notice is that even if
you mix the two styles, after you compose the parts in the final xid, no
two xids can be the same string. So, what about using a 4-tuple?

(full, format, gtrid, bqual)

The application layer can pass just the 'full' parameter (a string)
representing the xid directly, or set 'full' to None and let the driver
build the string out of the other three parts (and fill 'full' for later
reference.)

recover() returns a tuple with the 'full' slot always filled in and, if
possible, it also fills the other three slots by parsing the xid.

This way one has access to the full xid and, if it was built from parts,
to the single parts too. A transaction manager can discover if a
recovered transaction belongs to it by checking the 'format' (it can
be None) and there is no need to drop xids from recover() calls.
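
A small sketch of the 4-tuple idea. The '_' separated layout of the 'full'
string is an assumption made up for this example, and it is naive on
purpose: a gtrid or bqual containing '_' would need escaping in a real
driver.

```python
def complete_xid(xid):
    """Fill the 'full' slot from the parts when the caller left it as None."""
    full, format_id, gtrid, bqual = xid
    if full is None:
        full = "%d_%s_%s" % (format_id, gtrid, bqual)
    return (full, format_id, gtrid, bqual)


def recovered_xid(full):
    """Build the 4-tuple for recover(); parts stay None if 'full' cannot be parsed."""
    pieces = full.split("_")
    if len(pieces) == 3:
        try:
            return (full, int(pieces[0]), pieces[1], pieces[2])
        except ValueError:
            pass  # first piece is not a number: treat as an opaque string
    return (full, None, None, None)
```

An ID created elsewhere still comes back from recover(), just with only
the 'full' slot populated.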

federico

-- 
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog at debian.org
INIT.D Developer                                           fog at initd.org
           Purtroppo i creazionisti non si sono ancora estinti. -- vodka

From dieter at handshake.de  Tue Jan 22 19:48:52 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Tue, 22 Jan 2008 19:48:52 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any
	standard	for two phase commit APIs?)
In-Reply-To: <a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
Message-ID: <18326.14996.49402.907419@gargle.gargle.HOWL>

James Henstridge wrote at 2008-1-22 20:33 +0900:
> ...
>I do see a use for the branch qualifier though.  In a distributed
>transaction, each resource should have a different transaction ID

Why?
Why is it not equally good to use a common transaction id for
all resource managers?

>that
>share a common global transaction ID but separate branch qualifiers.
>
>As transaction IDs are global within database clusters for some
>backends (PostgreSQL, MySQL and probably others), the branch qualifier
>is necessary if two databases from the cluster are used in the global
>transaction.

They refer to the same transaction -- even when several databases
in a cluster are affected.

The transaction as a whole will want to get prepared, committed, rolled back...


-- 
Dieter

From james at jamesh.id.au  Thu Jan 24 15:16:29 2008
From: james at jamesh.id.au (James Henstridge)
Date: Thu, 24 Jan 2008 23:16:29 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <18326.18926.708382.939543@gargle.gargle.HOWL>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<18326.14996.49402.907419@gargle.gargle.HOWL>
	<47964482.5030309@egenix.com>
	<18326.18926.708382.939543@gargle.gargle.HOWL>
Message-ID: <a7e835d40801240616u1c7892ebs64bb36c529cbe172@mail.gmail.com>

On 23/01/2008, Dieter Maurer <dieter at handshake.de> wrote:
> M.-A. Lemburg wrote at 2008-1-22 20:31 +0100:
> > ...
> >Branch ids are used for e.g. multiple connections of the same RM
> >engaging in a global transaction. Each of those connections gets
> >its own branch id.
>
> But using multiple connections to the same RM seems to
> be an error in the first place.
>
>   Assume that a resource "R" is locked via connection "C1".
>   Assume then that "R" is requested via connection "C2".
>
>   If "C1 == C2", then the RM can see that the resource is already
>   assigned to the connection and there is no blocking.
>
>   Otherwise, the RM has no chance to recognize this and
>   the request will be blocked until the transaction is committed
>   or rolled back. There is quite a high chance that, since the
>   "R" request is blocked, there will be no commit/roll back....

Here is a concrete example:

1. create two databases on a single PostgreSQL install.
2. write an application that connects to each database (which implies
two connections).
3. try to prepare transactions on each connection using the same
transaction identifier.

One of the transactions will fail with a "transaction identifier is
already in use" error.  While each connection is accessing independent
resources, the transaction ID namespace is shared by all databases in
the cluster.

Now if you include a branch qualifier in the transaction IDs the
problem is avoided.  The MySQL documentation leads me to believe it
behaves similarly.
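
The collision described above can be imitated without a database. The following is only a toy simulation, not PostgreSQL or driver code: `FakeCluster`, `prepare()` and `branch_id()` are names made up for this sketch, and the "gtrid:bqual" string layout is an assumption.

```python
class FakeCluster:
    """Toy stand-in for a cluster whose databases share one ID namespace."""

    def __init__(self):
        self.prepared = {}  # transaction ID -> database name

    def prepare(self, database, transaction_id):
        # The namespace is cluster-wide, so the database name is
        # deliberately not part of the lookup key.
        if transaction_id in self.prepared:
            raise ValueError(
                "transaction identifier %r is already in use" % (transaction_id,))
        self.prepared[transaction_id] = database


def branch_id(gtrid, bqual):
    # Hypothetical way of folding a branch qualifier into a plain string ID.
    return "%s:%s" % (gtrid, bqual)


# Same ID on two databases of one cluster: the second prepare fails.
cluster = FakeCluster()
cluster.prepare("db1", "tx-42")
try:
    cluster.prepare("db2", "tx-42")
    collided = False
except ValueError:
    collided = True

# Same global ID, different branch qualifiers: both prepares succeed.
cluster_with_branches = FakeCluster()
cluster_with_branches.prepare("db1", branch_id("tx-42", "1"))
cluster_with_branches.prepare("db2", branch_id("tx-42", "2"))
```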

James.

From mal at egenix.com  Thu Jan 24 15:33:10 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 24 Jan 2008 15:33:10 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for	two phase commit APIs?)
In-Reply-To: <47982AA8.5090502@stuartbishop.net>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<4795D9E4.2030509@egenix.com>	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>	<479704EE.8030404@egenix.com>	<47973D19.2090904@stuartbishop.net>
	<47974E23.8000009@egenix.com> <47982AA8.5090502@stuartbishop.net>
Message-ID: <4798A1A6.7080701@egenix.com>

On 2008-01-24 07:05, Stuart Bishop wrote:
> M.-A. Lemburg wrote:
> 
>>> If we use an object, these issues go away:
>> I'm not sure I understand... a tuple *is* an object after all :-)
> 
> An object we can't define the constructor of.
> 
>> Why does '' get converted to None on output ?
> 
> Because if, say, on MySQL you do """XA PREPARE 'foo'""", MySQL will fill in
> the branchid and formatid with defaults - in MySQL's case '' and 1
> respectively.
> 
>> The database module
>> should not try to change the object in any way (regardless of whether
>> it's a string, tuple, custom sequence like object, etc.). At least
>> that's the theory.
>>
>> Or is this a side-effect of MySQL doing some internal mapping of
>> the tuple contents to some internal table ?
> 
> The databases that support XA style xids have to be able to round trip with
> the defined C data structure. This structure is the formatid, the length of
> the global transaction id, the length of the branch id, and an array of
> bytes containing the concatenated ids. In this structure there is no way to
> differentiate a NULL from an empty string or a NULL formatid from whatever
> integer you map NULL to.
> 
> I guess validation of the xid could be done by the driver in tpc_begin(),
> tpc_commit(), tpc_rollback() and an exception raised if the driver detects
> that round tripping via the database is not possible.

It is the database module's responsibility to make sure that the
xid can round-trip.

If we restrict the three entries of the xid tuple to be strings,
this should be easily possible by e.g.

 * combining the three strings into one and decoding this
   combination again in .tpc_recover()

 * mapping the components to ids/values that the database
   backend can handle and undoing this mapping in .tpc_recover()

 * not passing the ids to the database backend at all and
   managing the xid at the database module level
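
The first option in the list above (combine the three strings, decode them again on recovery) could look roughly like this. It is a sketch under assumptions: `encode_xid()` and `decode_xid()` are hypothetical helper names, all three components are taken to be strings, and the base64-plus-"." layout is just one possible encoding that keeps empty strings and embedded separators round-trip safe.

```python
import base64


def encode_xid(fid, gid, bid):
    """Combine three string components into one string for the backend."""
    # "." never occurs in URL-safe base64 output, so it is a safe separator.
    return ".".join(
        base64.urlsafe_b64encode(part.encode("utf-8")).decode("ascii")
        for part in (fid, gid, bid)
    )


def decode_xid(encoded):
    """Undo encode_xid(), e.g. inside .tpc_recover()."""
    parts = encoded.split(".")
    if len(parts) != 3:
        raise ValueError("not an ID produced by encode_xid(): %r" % (encoded,))
    return tuple(
        base64.urlsafe_b64decode(part.encode("ascii")).decode("utf-8")
        for part in parts
    )
```

An ID found in the backend that does not decode this way was presumably created by some other application and would need separate handling.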

>> I think we shouldn't restrict the TM by specifying a particular
>> object. After all, the DB-API is about the RM, not the TM.
> 
> I don't follow this. We have to specify what object can be passed to
> tpc_begin and is returned from tpc_recover. The only issue is whether we
> force this to be a 3-tuple or whatever the driver decides to return from a
> module level Xid() method.

The important aspect is that the TM must be able to get
back an object that it can compare against whatever
it originally passed to the database module.

Perhaps we could have the TM do something along these
lines:

# From the TM:
xid = conn.xid(fid, gid, bid)
conn.tpc_begin(xid)
conn.tpc_prepare(xid)
...

# See whether there are pending transactions:
xids = conn.tpc_recover()

# Recover only those transactions that the TM has initiated:
for (fid, gid, bid) in xids:
   if tm_check_xid(fid, gid, bid):
       tm_do_recovery(fid, gid, bid)

>> However, it may be worthwhile to have the RM at least peek
>> into the XID object and that's why I think we should require
>> the XID object to implement the __getitem__ protocol and
>> have the first three positions defined as (format id,
>> global transaction id, branch id).
> 
> I wouldn't say 'may be worthwhile'. I'd go for 'is essential'. If you can't
> inspect the results from tpc_recover(), the method is pointless.

Agreed.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 24 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From mal at egenix.com  Thu Jan 24 15:36:30 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 24 Jan 2008 15:36:30 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<4795D9E4.2030509@egenix.com>	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>	<479704EE.8030404@egenix.com>
	<a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>
Message-ID: <4798A26E.2030201@egenix.com>

On 2008-01-24 02:44, James Henstridge wrote:
> I've had a bit more time to think about this, and have two proposals
> on how to handle transaction IDs.  I think they offer equivalent
> functionality, so the choice comes down to what we want the API to
> look like.
> 
> Proposal 1:
> * Plain string IDs should work fine as transaction identifiers for
>   applications built from scratch with that assumption: they would
>   need to identify the global and branch parts in their own way.
> 
> * A plain string can be stuffed inside an XA style transaction
>   identifier, even if it isn't making use of all the different
>   components.
> 
> * Therefore, all methods accepting transaction IDs should accept
>   strings.
> 
> * As some transaction IDs in the database might not match this simple
>   form, there are two options for the recover() method:
>     1. return a special object that represents the transaction, which
>        will be accepted by commit()/rollback().  How string-like must
>        these objects be?
>     2. omit such transaction IDs from the result.
> 
> * For databases that support more structured transaction IDs (such as
>   those used by XA), the 2PC methods may accept objects other than
>   strings.
> 
> 
> Proposal 2:
> 
> * Many databases follow the XA specification, so it makes sense to use
>   transaction identifiers structured in the same way.
> 
> * For databases that do not use XA-style transaction IDs, it is
>   usually possible to serialise such an ID into a form that it can
>   work with.
> 
> * Therefore, all methods accepting transaction IDs should accept
>   3-sequences of the form (formatID, gtrid, bqual).
> 
> * For databases using non-XA transaction IDs, it is possible that some
>   transaction IDs might exist that do not match the serialised form.
>   The recover() method has two options:
>     1. return a special object representing the ID that will be
>        accepted by commit()/rollback().  Such an object should act
>        like a 3-sequence.
>     2. omit such transaction IDs from the result.
> 
> * For databases not using XA-style transactions, the 2PC methods may
>   accept objects other than 3-sequences as transaction IDs.
> 
> 
> Both of these proposals seem to get rid of the main points of contention:
> * removes the xid() constructor from the spec.
> * allow use of simple objects (strings or tuples) as transaction IDs
> * provides an obvious way to expose database-specific transaction IDs.

I'm coming to agree with Stuart that the conn.xid() might actually
help us with this.

So I'd be in favor of proposal 2 and an .xid() constructor that
returns an object which provides a 3-sequence interface, e.g.

# Wrap the IDs for use by the database module
xid = conn.xid(fid, gid, bid)

# Use the xid
conn.tpc_begin(xid)
conn.tpc_prepare(xid)
...

# Unwrap the IDs:
fid, gid, bid = xid
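
One way an object with such a 3-sequence interface might be written is sketched below. The class name `Xid` and its details are assumptions for illustration, not an agreed part of the proposal; a real database module would add whatever backend-specific state it needs.

```python
class Xid:
    """Hypothetical transaction ID wrapper that behaves like a 3-sequence."""

    def __init__(self, format_id, gtrid, bqual):
        self._parts = (format_id, gtrid, bqual)

    def __len__(self):
        return 3

    def __getitem__(self, index):
        # Positions 0..2 are (format id, global transaction id, branch id).
        return self._parts[index]

    def __eq__(self, other):
        # Compare equal to any 3-sequence with the same contents, so the
        # TM can match recovered IDs against what it originally passed in.
        try:
            return tuple(self) == tuple(other)
        except TypeError:
            return NotImplemented

    def __hash__(self):
        return hash(self._parts)

    def __repr__(self):
        return "Xid(%r, %r, %r)" % self._parts


xid = Xid(1, "global-tx", "branch-1")
fid, gid, bid = xid  # unwrapping works through __getitem__
```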

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Thu Jan 24 15:41:09 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 24 Jan 2008 15:41:09 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <4798A26E.2030201@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<4795D9E4.2030509@egenix.com>	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>	<479704EE.8030404@egenix.com>	<a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>
	<4798A26E.2030201@egenix.com>
Message-ID: <4798A385.80008@egenix.com>

On 2008-01-24 15:36, M.-A. Lemburg wrote:
> On 2008-01-24 02:44, James Henstridge wrote:
>> I've had a bit more time to think about this, and have two proposals
>> on how to handle transaction IDs.  I think they offer equivalent
>> functionality, so the choice comes down to what we want the API to
>> look like.
>>
>> Proposal 1:
>> * Plain string IDs should work fine as transaction identifiers for
>>   applications built from scratch with that assumption: they would
>>   need to identify the global and branch parts in their own way.
>>
>> * A plain string can be stuffed inside an XA style transaction
>>   identifier, even if it isn't making use of all the different
>>   components.
>>
>> * Therefore, all methods accepting transaction IDs should accept
>>   strings.
>>
>> * As some transaction IDs in the database might not match this simple
>>   form, there are two options for the recover() method:
>>     1. return a special object that represents the transaction, which
>>        will be accepted by commit()/rollback().  How string-like must
>>        these objects be?
>>     2. omit such transaction IDs from the result.
>>
>> * For databases that support more structured transaction IDs (such as
>>   those used by XA), the 2PC methods may accept objects other than
>>   strings.
>>
>>
>> Proposal 2:
>>
>> * Many databases follow the XA specification, so it makes sense to use
>>   transaction identifiers structured in the same way.
>>
>> * For databases that do not use XA-style transaction IDs, it is
>>   usually possible to serialise such an ID into a form that it can
>>   work with.
>>
>> * Therefore, all methods accepting transaction IDs should accept
>>   3-sequences of the form (formatID, gtrid, bqual).
>>
>> * For databases using non-XA transaction IDs, it is possible that some
>>   transaction IDs might exist that do not match the serialised form.
>>   The recover() method has two options:
>>     1. return a special object representing the ID that will be
>>        accepted by commit()/rollback().  Such an object should act
>>        like a 3-sequence.
>>     2. omit such transaction IDs from the result.
>>
>> * For databases not using XA-style transactions, the 2PC methods may
>>   accept objects other than 3-sequences as transaction IDs.
>>
>>
>> Both of these proposals seem to get rid of the main points of contention:
>> * removes the xid() constructor from the spec.
>> * allow use of simple objects (strings or tuples) as transaction IDs
>> * provides an obvious way to expose database-specific transaction IDs.
> 
> I'm coming to agree with Stuart that the conn.xid() might actually
> help us with this.
> 
> So I'd be in favor of proposal 2 and an .xid() constructor that
> returns an object which provides a 3-sequence interface, e.g.
> 
> # Wrap the IDs for use by the database module
> xid = conn.xid(fid, gid, bid)
> 
> # Use the xid
> conn.tpc_begin(xid)
> conn.tpc_prepare(xid)
> ...
> 
> # Unwrap the IDs:
> fid, gid, bid = xid

Plus require that all three components are strings to avoid the
None issue.

-- 
Marc-Andre Lemburg
eGenix.com


From dieter at handshake.de  Thu Jan 24 18:47:03 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Thu, 24 Jan 2008 18:47:03 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <a7e835d40801240616u1c7892ebs64bb36c529cbe172@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<18326.14996.49402.907419@gargle.gargle.HOWL>
	<47964482.5030309@egenix.com>
	<18326.18926.708382.939543@gargle.gargle.HOWL>
	<a7e835d40801240616u1c7892ebs64bb36c529cbe172@mail.gmail.com>
Message-ID: <18328.53015.139548.358620@gargle.gargle.HOWL>

James Henstridge wrote at 2008-1-24 23:16 +0900:
>On 23/01/2008, Dieter Maurer <dieter at handshake.de> wrote:
> ...
>Here is a concrete example:
>
>1. create two databases on a single PostgreSQL install.
>2. write an application that connects to each database (which implies
>two connections).
>3. try to prepare transactions on each connection using the same
>transaction identifier.
>
>One of the transactions will fail with a "transaction identifier is
>already in use" error.  While each connection is accessing independent
>resources, the transaction ID namespace is shared by all databases in
>the cluster.
>
>Now if you include a branch qualifier in the transaction IDs the
>problem is avoided.  The MySQL documentation leads me to believe it
>behaves similarly.

This description suggests that the TM provides the "main" transaction
identifier and the resource manager could add the branch part.


-- 
Dieter

From james at jamesh.id.au  Fri Jan 25 01:48:10 2008
From: james at jamesh.id.au (James Henstridge)
Date: Fri, 25 Jan 2008 09:48:10 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <18328.53015.139548.358620@gargle.gargle.HOWL>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<18326.14996.49402.907419@gargle.gargle.HOWL>
	<47964482.5030309@egenix.com>
	<18326.18926.708382.939543@gargle.gargle.HOWL>
	<a7e835d40801240616u1c7892ebs64bb36c529cbe172@mail.gmail.com>
	<18328.53015.139548.358620@gargle.gargle.HOWL>
Message-ID: <a7e835d40801241648r1f2bb650k3dc3d484436ddeaf@mail.gmail.com>

On 25/01/2008, Dieter Maurer <dieter at handshake.de> wrote:
> This description suggests that the TM provides the "main" transaction
> identifier and the resource manager could add the branch part.

I guess the reason why the TM generally assigns the branch qualifiers
in XA systems is that it is in the best place to do so: it can simply
issue sequential numbers to each resource that joins the transaction.
An RM has no knowledge of what other branch qualifiers have been used
so would need to do something more complex.

Now whether the TM or RM generates the branch qualifier, I'd expect
that the TM needs to know all the full transaction IDs if it is to
properly handle recovery.  If the RM is generating the ID, then the TM
would now need some way to retrieve that ID.
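
A rough sketch of the TM side of this, with made-up names (`ToyTransactionManager`, `join()`, `owns()`) and plain tuples standing in for transaction IDs:

```python
import itertools


class ToyTransactionManager:
    """Hands out sequential branch qualifiers and remembers the full IDs."""

    def __init__(self, format_id, gtrid):
        self.format_id = format_id
        self.gtrid = gtrid
        self._counter = itertools.count(1)
        self.branches = {}  # resource name -> full transaction ID

    def join(self, resource_name):
        # Each resource joining the transaction gets the next number.
        bqual = str(next(self._counter))
        xid = (self.format_id, self.gtrid, bqual)
        self.branches[resource_name] = xid
        return xid

    def owns(self, xid):
        # During recovery, only IDs issued by this TM are of interest.
        return tuple(xid) in self.branches.values()
```

Because the TM generated every branch qualifier itself, it already holds the complete list of IDs it may have to recover.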

James.

From james at jamesh.id.au  Fri Jan 25 01:54:40 2008
From: james at jamesh.id.au (James Henstridge)
Date: Fri, 25 Jan 2008 09:54:40 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
	for two phase commit APIs?)
In-Reply-To: <4798A385.80008@egenix.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<4795D9E4.2030509@egenix.com>
	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>
	<479704EE.8030404@egenix.com>
	<a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>
	<4798A26E.2030201@egenix.com> <4798A385.80008@egenix.com>
Message-ID: <a7e835d40801241654m5275b867r36060035aa4df0c@mail.gmail.com>

On 24/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> On 2008-01-24 15:36, M.-A. Lemburg wrote:
> > On 2008-01-24 02:44, James Henstridge wrote:
> >> I've had a bit more time to think about this, and have two proposals
> >> on how to handle transaction IDs.  I think they offer equivalent
> >> functionality, so the choice comes down to what we want the API to
> >> look like.
> >>
> >> Proposal 1:
> >> * Plain string IDs should work fine as transaction identifiers for
> >>   applications built from scratch with that assumption: they would
> >>   need to identify the global and branch parts in their own way.
> >>
> >> * A plain string can be stuffed inside an XA style transaction
> >>   identifier, even if it isn't making use of all the different
> >>   components.
> >>
> >> * Therefore, all methods accepting transaction IDs should accept
> >>   strings.
> >>
> >> * As some transaction IDs in the database might not match this simple
> >>   form, there are two options for the recover() method:
> >>     1. return a special object that represents the transaction, which
> >>        will be accepted by commit()/rollback().  How string-like must
> >>        these objects be?
> >>     2. omit such transaction IDs from the result.
> >>
> >> * For databases that support more structured transaction IDs (such as
> >>   those used by XA), the 2PC methods may accept objects other than
> >>   strings.
> >>
> >>
> >> Proposal 2:
> >>
> >> * Many databases follow the XA specification, so it makes sense to use
> >>   transaction identifiers structured in the same way.
> >>
> >> * For databases that do not use XA-style transaction IDs, it is
> >>   usually possible to serialise such an ID into a form that it can
> >>   work with.
> >>
> >> * Therefore, all methods accepting transaction IDs should accept
> >>   3-sequences of the form (formatID, gtrid, bqual).
> >>
> >> * For databases using non-XA transaction IDs, it is possible that some
> >>   transaction IDs might exist that do not match the serialised form.
> >>   The recover() method has two options:
> >>     1. return a special object representing the ID that will be
> >>        accepted by commit()/rollback().  Such an object should act
> >>        like a 3-sequence.
> >>     2. omit such transaction IDs from the result.
> >>
> >> * For databases not using XA-style transactions, the 2PC methods may
> >>   accept objects other than 3-sequences as transaction IDs.
> >>
> >>
> >> Both of these proposals seem to get rid of the main points of contention:
> >> * removes the xid() constructor from the spec.
> >> * allow use of simple objects (strings or tuples) as transaction IDs
> >> * provides an obvious way to expose database-specific transaction IDs.
> >
> > I'm coming to agree with Stuart that the conn.xid() might actually
> > help us with this.
> >
> > So I'd be in favor of proposal 2 and an .xid() constructor that
> > returns an object which provides a 3-sequence interface, e.g.

So is the 3-sequence behaviour intended to allow application code to
inspect a transaction ID, or are tpc_begin(), etc expected to accept
arbitrary 3-sequences too?

> >
> > # Wrap the IDs for use by the database module
> > xid = conn.xid(fid, gid, bid)
> >
> > # Use the xid
> > conn.tpc_begin(xid)
> > conn.tpc_prepare(xid)
> > ...
> >
> > # Unwrap the IDs:
> > fid, gid, bid = xid
>
> Plus require that all three components are strings to avoid the
> None issue.

If we are going with 3-part XA-style transaction IDs, the format ID
should be a non-negative 32-bit integer and the other two should be
strings with a maximum length of 64 bytes (possibly with some
restrictions on allowed characters?).
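
As a sketch, those limits could be checked like this. `check_xid()` is a hypothetical helper; reading "32-bit" as a signed range and measuring string length as UTF-8 bytes are both assumptions, and no character restrictions are enforced.

```python
MAX_FORMAT_ID = 2**31 - 1  # assumption: "32-bit" read as a signed range
MAX_PART_BYTES = 64


def check_xid(format_id, gtrid, bqual):
    """Return the ID as a tuple, or raise if it breaks the limits."""
    if isinstance(format_id, bool) or not isinstance(format_id, int):
        raise TypeError("format ID must be an integer")
    if not 0 <= format_id <= MAX_FORMAT_ID:
        raise ValueError("format ID out of range: %r" % (format_id,))
    for label, part in (("gtrid", gtrid), ("bqual", bqual)):
        if not isinstance(part, str):
            raise TypeError("%s must be a string" % label)
        # Assumption: the 64-byte limit applies to the UTF-8 encoded form.
        if len(part.encode("utf-8")) > MAX_PART_BYTES:
            raise ValueError(
                "%s is longer than %d bytes" % (label, MAX_PART_BYTES))
    return (format_id, gtrid, bqual)
```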

James.

From mal at egenix.com  Fri Jan 25 10:45:52 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 25 Jan 2008 10:45:52 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard
 for two phase commit APIs?)
In-Reply-To: <a7e835d40801241654m5275b867r36060035aa4df0c@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>	<4795C6CE.8060301@egenix.com>	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>	<4795D9E4.2030509@egenix.com>	<a7e835d40801221718g4b4b04dej33df0eb69fd8f177@mail.gmail.com>	<479704EE.8030404@egenix.com>	<a7e835d40801231744w158b7b61o563df648d5b69781@mail.gmail.com>	<4798A26E.2030201@egenix.com>
	<4798A385.80008@egenix.com>
	<a7e835d40801241654m5275b867r36060035aa4df0c@mail.gmail.com>
Message-ID: <4799AFD0.3000905@egenix.com>

On 2008-01-25 01:54, James Henstridge wrote:
>>>> Proposal 2:
>>>>
>>>> * Many databases follow the XA specification, so it makes sense to use
>>>>   transaction identifiers structured in the same way.
>>>>
>>>> * For databases that do not use XA-style transaction IDs, it is
>>>>   usually possible to serialise such an ID into a form that it can
>>>>   work with.
>>>>
>>>> * Therefore, all methods accepting transaction IDs should accept
>>>>   3-sequences of the form (formatID, gtrid, bqual).
>>>>
>>>> * For databases using non-XA transaction IDs, it is possible that some
>>>>   transaction IDs might exist that do not match the serialised form.
>>>>   The recover() method has two options:
>>>>     1. return a special object representing the ID that will be
>>>>        accepted by commit()/rollback().  Such an object should act
>>>>        like a 3-sequence.
>>>>     2. omit such transaction IDs from the result.
>>>>
>>>> * For databases not using XA-style transactions, the 2PC methods may
>>>>   accept objects other than 3-sequences as transaction IDs.
>>>>
>>>>
>>>> Both of these proposals seem to get rid of the main points of contention:
>>>> * removes the xid() constructor from the spec.
>>>> * allow use of simple objects (strings or tuples) as transaction IDs
>>>> * provides an obvious way to expose database-specific transaction IDs.
>>> I'm coming to agree with Stuart that the conn.xid() might actually
>>> help us with this.
>>>
>>> So I'd be in favor of proposal 2 and an .xid() constructor that
>>> returns an object which provides a 3-sequence interface, e.g.
> 
> So is the 3-sequence behaviour intended to allow application code to
> inspect a transaction ID, or are tpc_begin(), etc expected to accept
> arbitrary 3-sequences too?

I'd say we put the .xid() as interface between the TM and
the .tpc_*() methods, like Stuart suggested.

That way, the TM has a clear interface to construct an XID
object, while the RM has control over what is passed to its
.tpc_*() methods and can also use other means of creating
these objects (if needed).

By using the 3-sequence interface, the TM can also easily
recover the data it passed to the .xid() constructor when
getting back data from .tpc_recover(), so it is round-trip
safe.

>>> # Wrap the IDs for use by the database module
>>> xid = conn.xid(fid, gid, bid)
>>>
>>> # Use the xid
>>> conn.tpc_begin(xid)
>>> conn.tpc_prepare(xid)
>>> ...
>>>
>>> # Unwrap the IDs:
>>> fid, gid, bid = xid
>> Plus require that all three components are strings to avoid the
>> None issue.
> 
> If we are going with 3-part XA-style transaction IDs, the format ID
> should be a non-negative 32-bit integer and the other two should be
> strings with a maximum length of 64 bytes (possibly with some
> restrictions on allowed characters?).

Ok, if that's the GCD of what backends use, let's go with that.

-- 
Marc-Andre Lemburg
eGenix.com


From dieter at handshake.de  Fri Jan 25 20:47:29 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Fri, 25 Jan 2008 20:47:29 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any
	standard	for two phase commit APIs?)
In-Reply-To: <a7e835d40801241648r1f2bb650k3dc3d484436ddeaf@mail.gmail.com>
References: <a7e835d40801210208h6bac5437m70badaf870eb4b0e@mail.gmail.com>
	<a7e835d40801210640h69da9d08y677f9342932ae709@mail.gmail.com>
	<4795C6CE.8060301@egenix.com>
	<a7e835d40801220333g8caacc8v6731ccd81e885923@mail.gmail.com>
	<18326.14996.49402.907419@gargle.gargle.HOWL>
	<47964482.5030309@egenix.com>
	<18326.18926.708382.939543@gargle.gargle.HOWL>
	<a7e835d40801240616u1c7892ebs64bb36c529cbe172@mail.gmail.com>
	<18328.53015.139548.358620@gargle.gargle.HOWL>
	<a7e835d40801241648r1f2bb650k3dc3d484436ddeaf@mail.gmail.com>
Message-ID: <18330.15569.288412.903184@gargle.gargle.HOWL>

James Henstridge wrote at 2008-1-25 09:48 +0900:
>On 25/01/2008, Dieter Maurer <dieter at handshake.de> wrote:
>> This description suggests that the TM provides the "main" transaction
>> identifier and the resource manager could add the branch part.
>
>I guess the reason why the TM generally assigns the branch qualifiers
>in XA systems is that it is in the best place to do so: it can simply
>issue sequential numbers to each resource that joins the transaction.
>An RM has no knowledge of what other branch qualifiers have been used
>so would need to do something more complex.

It could identify itself -- and then there would be no need
to know other branch qualifiers.

>Now whether the TM or RM generates the branch qualifier, I'd expect
>that the TM needs to know all the full transaction IDs if it is to
>properly handle recovery.  If the RM is generating the ID, then the TM
>would now need some way to retrieve that ID.

The "conn.xid" could provide the part identifying "conn".
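
To make the idea concrete, here is a minimal sketch in which the connection fills in the branch part itself. `ToyConnection` and its `name` attribute are invented for illustration; how a real connection would identify itself is left open.

```python
class ToyConnection:
    """Connection whose xid() supplies the branch qualifier itself."""

    def __init__(self, name):
        # Hypothetical identifier, assumed unique per connection/RM.
        self.name = name

    def xid(self, format_id, gtrid):
        # The TM passes only the global part; the branch part
        # identifies this connection.
        return (format_id, gtrid, self.name)


conn_a = ToyConnection("db1")
conn_b = ToyConnection("db2")
xid_a = conn_a.xid(1, "tx-42")
xid_b = conn_b.xid(1, "tx-42")
```

The TM can keep the returned values, so it still knows the full IDs when it later has to handle recovery.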


-- 
Dieter

From szybalski at gmail.com  Thu Jan 31 15:47:26 2008
From: szybalski at gmail.com (Lukasz Szybalski)
Date: Thu, 31 Jan 2008 08:47:26 -0600
Subject: [DB-SIG] db to db layout analysis
Message-ID: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>

Hello,
I just came across this database with over 60 tables and I need some
tool to analyze the tables. (find out keys, fields, properties, show
me relation to other tables etc.)

You guys know of something similar? python or not, command line or not

Thanks,
Lucas

From fabien.coutant at neuf.fr  Thu Jan 31 18:55:05 2008
From: fabien.coutant at neuf.fr (Fabien COUTANT)
Date: Thu, 31 Jan 2008 18:55:05 +0100
Subject: [DB-SIG] db to db layout analysis
In-Reply-To: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
References: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
Message-ID: <20080131185505.929a8fb4.fabien.coutant@neuf.fr>

On Thu, 31 Jan 2008 08:47:26 -0600, Lukasz Szybalski wrote:
> Hello,
> I just came across this database with over 60 tables and I need some
> tool to analyze the tables. (find out keys, fields, properties, show
> me relation to other tables etc.)
> 
> You guys know of something similar? python or not, command line or not

Hi,
Note that this is a general database question, not a Python-specific one
(Python being the subject of this list).
However, I would suggest http://squirrel-sql.sourceforge.net/ if you can
accept a Java/GUI program... I use it daily and I think it has the features
you ask for. There's even a plugin that will draw a diagram of your table
relationships.

-- 
Hope this helps,
Fabien.

From andy47 at halfcooked.com  Thu Jan 31 21:18:13 2008
From: andy47 at halfcooked.com (Andy Todd)
Date: Fri, 01 Feb 2008 07:18:13 +1100
Subject: [DB-SIG] db to db layout analysis
In-Reply-To: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
References: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
Message-ID: <47A22D05.6080008@halfcooked.com>

Lukasz Szybalski wrote:
> Hello,
> I just came across this database with over 60 tables and I need some
> tool to analyze the tables. (find out keys, fields, properties, show
> me relation to other tables etc.)
> 
> You guys know of something similar? python or not, command line or not
> 
> Thanks,
> Lucas

http://halfcooked.com/code/gerald

Regards,
Andy
-- 
 From the desk of Andrew J Todd esq - http://www.halfcooked.com/

From Frederic.VanderElst at phgroup.com  Thu Jan 31 22:55:46 2008
From: Frederic.VanderElst at phgroup.com (Frederic Vander Elst)
Date: Thu, 31 Jan 2008 21:55:46 +0000
Subject: [DB-SIG] db to db layout analysis
In-Reply-To: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
References: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
Message-ID: <47A243E2.3020902@phgroup.com>

Lukasz

I have used Schema Spy (java, produces pretty html docs and drawings, 
interpreting foreign keys, etc), and would heartily recommend it.

See http://schemaspy.sourceforge.net/

-f


Lukasz Szybalski wrote:
> Hello,
> I just came across this database with over 60 tables and I need some
> tool to analyze the tables. (find out keys, fields, properties, show
> me relation to other tables etc.)
>
> You guys know of something similar? python or not, command line or not
>
> Thanks,
> Lucas

-- 
--------------------------
Frederic Vander Elst
pH, an Experian Company
www.phgroup.com
Direct Line: 020 7598 0320
Office Line: 020 7598 0310
Fax: 020 7598 0311
--------------------------