From phd at phd.pp.ru Thu Jan 10 13:27:14 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 10 Jan 2008 15:27:14 +0300 Subject: [DB-SIG] SQLObject 0.7.10 Message-ID: <20080110122714.GC3070@phd.pp.ru> Hello! I'm pleased to announce the 0.7.10 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.7.10 News and changes: http://sqlobject.org/docs/News.html What's New ========== News since 0.7.9 ---------------- * With PySQLite2 do not use encode()/decode() from PySQLite1 - always use base64 for BLOBs. * MySQLConnection doesn't convert query strings to unicode (but allows passing unicode query strings if the user builds them). DB URI parameter sqlobject_encoding is no longer used. For a more complete list, please see the news: http://sqlobject.org/docs/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From phd at phd.pp.ru Thu Jan 10 13:32:54 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 10 Jan 2008 15:32:54 +0300 Subject: [DB-SIG] SQLObject 0.8.7 Message-ID: <20080110123254.GG3070@phd.pp.ru> Hello! I'm pleased to announce the 0.8.7 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes.
SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.8.7 News and changes: http://sqlobject.org/News.html What's New ========== News since 0.8.6 ---------------- * With PySQLite2 do not use encode()/decode() from PySQLite1 - always use base64 for BLOBs. * MySQLConnection doesn't convert query strings to unicode (but allows passing unicode query strings if the user builds them). DB URI parameter sqlobject_encoding is no longer used. For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From phd at phd.pp.ru Thu Jan 10 13:38:25 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 10 Jan 2008 15:38:25 +0300 Subject: [DB-SIG] SQLObject 0.9.3 Message-ID: <20080110123825.GK3070@phd.pp.ru> Hello! I'm pleased to announce the 0.9.3 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB).
Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.9.3 News and changes: http://sqlobject.org/News.html What's New ========== Bug Fixes ~~~~~~~~~ * With PySQLite2 do not use encode()/decode() from PySQLite1 - always use base64 for BLOBs. * MySQLConnection doesn't convert query strings to unicode (but allows passing unicode query strings if the user builds them). DB URI parameter sqlobject_encoding is no longer used. For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From omranju at gmail.com Fri Jan 11 14:51:50 2008 From: omranju at gmail.com (OM Ranju) Date: Fri, 11 Jan 2008 20:51:50 +0700 Subject: [DB-SIG] Requests Message-ID: Kindly clear me about dictionary deepily From carsten at uniqsys.com Fri Jan 11 15:13:44 2008 From: carsten at uniqsys.com (Carsten Haese) Date: Fri, 11 Jan 2008 09:13:44 -0500 Subject: [DB-SIG] Requests In-Reply-To: References: Message-ID: <1200060824.3433.10.camel@dot.uniqsys.com> On Fri, 2008-01-11 at 20:51 +0700, OM Ranju wrote: > Kindly clear me about dictionary deepily This is not understandable as an English sentence. I am guessing that you have a question about a dictionary, but it's not obvious what it is you wish to know or even whether you are asking about a dictionary as a Python data structure or about a dictionary as a list of word definitions/translations. Please rephrase your question and provide more detail about what you need to know, and maybe then we'll be able to help you.
-- Carsten Haese http://informixdb.sourceforge.net From phd at phd.pp.ru Fri Jan 11 16:24:15 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Fri, 11 Jan 2008 18:24:15 +0300 Subject: [DB-SIG] SQLObject 0.10.0b1 Message-ID: <20080111152415.GG26551@phd.pp.ru> Hello! I'm pleased to announce the 0.10.0b1, the first beta release of a new SQLObject branch, 0.10. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.10.0b1 News and changes: http://sqlobject.org/News.html What's New ========== Features & Interface -------------------- * Dropped support for Python 2.2. The minimal version of Python for SQLObject is 2.3 now. * Removed actively deprecated attributes; lowered deprecation level for other attributes to be removed after 0.10. * SQLBuilder Select supports the rest of SelectResults options (reversed, distinct, joins, etc.) * SQLObject.select() (i.e., SelectResults) and DBConnection.queryForSelect() use SQLBuilder Select queries; this make all SELECTs implemented internally via a single mechanism. * SQLBuilder Joins handle SQLExpression tables (not just str/SQLObject/Alias) and properly sqlrepr. * SQLBuilder tablesUsedDict handles sqlrepr'able objects. * Added SQLBuilder ImportProxy. 
It allows one to ignore the circular import issues with referring to SQLObject classes in other files - it uses the classregistry as the string class names for FK/Joins do, but is specifically intended for SQLBuilder expressions. See tests/test_sqlbuilder_importproxy.py. * Added SelectResults.throughTo. It allows one to traverse relationships (FK/Join) via SQL, avoiding the intermediate objects. Additionally, it's a simple mechanism for pre-caching/eager-loading of later FK relationships (e.g., if you are going to loop over a select of somePeople and ask for aPerson.group, first call list(somePeople.throughTo.group) to preload those related groups and use 2 db queries instead of N+1). See tests/test_select_through.py. * Added ViewSQLObject. * Added sqlmeta.getColumns() to get all the columns for a class (including parent classes), excluding the column 'childName' and including the column 'id'. sqlmeta.asDict() now uses getColumns(), so there is no need to override it in the inheritable sqlmeta class; this makes asDict() work properly on inheritable sqlobjects. * Changed the implementation type in BoolCol under SQLite from TINYINT to BOOLEAN and made the fromDatabase machinery recognize it. * Added rich comparison methods; SQLObjects of the same class are considered equal if they have the same id; other methods return NotImplemented. * MySQLConnection (and DB URI) accept a number of SSL-related parameters: ssl_key, ssl_cert, ssl_ca, ssl_capath. For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From mwm at mired.org Fri Jan 11 17:35:33 2008 From: mwm at mired.org (Mike Meyer) Date: Fri, 11 Jan 2008 11:35:33 -0500 Subject: [DB-SIG] Handling an open database connection after a fork? Message-ID: <20080111113533.7ba8fc47@mbook.mired.org> I have an application that's using oracle (via cx_Oracle) to log events (among other things).
It runs in multiple processes, forking new processes as it needs them. I.e. db = cx_Oracle.connect(.....) cu = db.cursor() [do various things, including sql inserts and commits] if fork(): # Parent wants to keep the existing database connection. else: # Child wants a database connection. So the question is - what should the child do to get a database connection? Can it just keep using the existing db & cu variables? If not, does it need to do anything special, or avoid doing anything, in order to not disrupt the parent processes use of those variables? thanx, http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. From mal at egenix.com Sat Jan 12 14:14:03 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 12 Jan 2008 14:14:03 +0100 Subject: [DB-SIG] Handling an open database connection after a fork? In-Reply-To: <20080111113533.7ba8fc47@mbook.mired.org> References: <20080111113533.7ba8fc47@mbook.mired.org> Message-ID: <4788BD1B.4050103@egenix.com> On 2008-01-11 17:35, Mike Meyer wrote: > I have an application that's using oracle (via cx_Oracle) to log > events (among other things). It runs in multiple processes, forking > new processes as it needs them. > > I.e. > > db = cx_Oracle.connect(.....) > cu = db.cursor() > > [do various things, including sql inserts and commits] > > if fork(): > # Parent wants to keep the existing database connection. > else: > # Child wants a database connection. > > So the question is - what should the child do to get a database > connection? Can it just keep using the existing db & cu variables? If > not, does it need to do anything special, or avoid doing anything, in > order to not disrupt the parent processes use of those variables? That depends on the database module you're using. It may be enough to close all connections and reopen them in the fork. In other cases, you need to reload the database module as well (e.g. 
if the module sets up a work environment that holds caches, etc.). In general, it's better to avoid all this and only load the module for the first time after the fork (both in the parent and child processes). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 12 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From mwm-keyword-dbsig.588a7d at mired.org Sat Jan 12 21:27:45 2008 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Sat, 12 Jan 2008 15:27:45 -0500 Subject: [DB-SIG] Handling an open database connection after a fork? In-Reply-To: <4788BD1B.4050103@egenix.com> References: <20080111113533.7ba8fc47@mbook.mired.org> <4788BD1B.4050103@egenix.com> Message-ID: <20080112152745.3c80edda@bhuda.mired.org> On Sat, 12 Jan 2008 14:14:03 +0100 "M.-A. Lemburg" wrote: > On 2008-01-11 17:35, Mike Meyer wrote: > > I have an application that's using oracle (via cx_Oracle) to log > > events (among other things). It runs in multiple processes, forking > > new processes as it needs them. > > > > I.e. > > > > db = cx_Oracle.connect(.....) > > cu = db.cursor() > > > > [do various things, including sql inserts and commits] > > > > if fork(): > > # Parent wants to keep the existing database connection. > > else: > > # Child wants a database connection. > > > > So the question is - what should the child do to get a database > > connection? Can it just keep using the existing db & cu variables? 
If > > not, does it need to do anything special, or avoid doing anything, in > > order to not disrupt the parent processes use of those variables? > > That depends on the database module you're using. As stated, cx_Oracle. > In general, it's better to avoid all this and only load the module > for the first time after the fork (both in the parent and child > processes). Not possible. Which is why I need to find out what to do to make oracle (via cx_Oracle) happy. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. From Dieter.Maurer at Haufe.de Sun Jan 13 13:10:48 2008 From: Dieter.Maurer at Haufe.de (Dieter.Maurer at Haufe.de) Date: Sun, 13 Jan 2008 13:10:48 +0100 Subject: [DB-SIG] Handling an open database connection after a fork? In-Reply-To: <20080111113533.7ba8fc47@mbook.mired.org> References: <20080111113533.7ba8fc47@mbook.mired.org> Message-ID: <18313.65480.265082.886057@gargle.gargle.HOWL> Mike Meyer wrote at 2008-1-11 11:35 -0500: > ... existing connection in forked children ... >So the question is - what should the child do to get a database >connection? Can it just keep using the existing db & cu variables? This is very unlikely. I have had severe problems with different systems (ZODB connections, LDAP connections). Not with Oracle connections, but probably only because I do not use Oracle. When the child is forked, it inherits the connections from the parent -- but the protocols usually do not expect that several processes (parent and child) are using them asynchronously. In a single process, locks are often used to synchronize access to a shared connection from different processes -- but normal locks do not work across different processes -- and shared memory semaphores are not that often used. >If >not, does it need to do anything special, or avoid doing anything, in >order to not disrupt the parent processes use of those variables? Open a new connection in your forked child. 
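The advice to open a new connection in the forked child can be sketched roughly as follows. This is only an illustration under stated assumptions: sqlite3 from the standard library stands in for cx_Oracle so that the sketch runs without a server, and open_connection(), log_event_in_child() and demo() are hypothetical names introduced here. Whether leaving the inherited connection untouched is enough for cx_Oracle is not something this sketch can confirm.

```python
import os
import sqlite3
import tempfile


def open_connection(path):
    # Hypothetical factory; sqlite3 is a stand-in for the real driver
    # (the thread is about cx_Oracle, which is not used here).
    return sqlite3.connect(path)


def log_event_in_child(path, message):
    """Fork; the child opens its own connection, writes one row, exits."""
    pid = os.fork()
    if pid == 0:
        # Child: never touch the connection object inherited from the parent.
        status = 1
        try:
            child_db = open_connection(path)
            child_db.execute("INSERT INTO events (msg) VALUES (?)", (message,))
            child_db.commit()
            child_db.close()
            status = 0
        finally:
            # Leave without running cleanup code inherited from the parent.
            os._exit(status)
    # Parent: wait for the child and report its exit code.
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "events.db")
    parent_db = open_connection(path)
    parent_db.execute("CREATE TABLE events (msg TEXT)")
    parent_db.execute("INSERT INTO events (msg) VALUES (?)", ("parent",))
    parent_db.commit()
    exit_code = log_event_in_child(path, "child")
    # The parent keeps using the connection it opened before the fork.
    rows = [r[0] for r in parent_db.execute("SELECT msg FROM events ORDER BY rowid")]
    parent_db.close()
    return exit_code, rows
```

The child leaves through os._exit() so that no destructor or exit handler inherited from the parent runs against the parent's connection; that choice is part of the sketch, not something the thread prescribes.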
It is not guaranteed that this is sufficient. For the ZODB, I have to take additional precautions. I finally abandoned this approach completely (because LDAP was used deeply in my system and I had no control over the creation of new connections) and am now using "fork+exec". -- Best regards, Dieter Tel: 06881-7327 (landline) or 06881-5590036 (Internet) From mwm-keyword-dbsig.588a7d at mired.org Sun Jan 13 20:07:09 2008 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Sun, 13 Jan 2008 14:07:09 -0500 Subject: [DB-SIG] Handling an open database connection after a fork? In-Reply-To: <18313.65480.265082.886057@gargle.gargle.HOWL> References: <20080111113533.7ba8fc47@mbook.mired.org> <18313.65480.265082.886057@gargle.gargle.HOWL> Message-ID: <20080113140709.72759d28@bhuda.mired.org> On Sun, 13 Jan 2008 13:10:48 +0100 Dieter.Maurer at Haufe.de wrote: > Mike Meyer wrote at 2008-1-11 11:35 -0500: > > ... existing connection in forked children ... > >So the question is - what should the child do to get a database > >connection? Can it just keep using the existing db & cu variables? > This is very unlikely. That's what I expected. > When the child is forked, it inherits the connections from the > parent -- but the protocols usually do not expect that several > processes (parent and child) are using them asynchronously. Right. The question is, what's the right way to handle the connection on the child side of things? > >If > >not, does it need to do anything special, or avoid doing anything, in > >order to not disrupt the parent processes use of those variables? > Open a new connection in your forked child. Obvious. What do I do with the old one? I started out by explicitly closing it, but that seems to make oracle unhappy (internal errors of various kinds). > I finally abandoned this approach completely (because LDAP > was used deeply in my system and I had no control over the creation > of new connections) and am now using "fork+exec".
Oddly enough, fork+exec doesn't make the problem go away, just provides another possible solution. Open fd's can either be closed on exec, or not. Hopefully, it's closed because the python objects that referred to it are lost across the exec. I'm willing to believe that should work. So how do I simulate what happens on exec without actually doing the exec? Thanks, http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. From omranju at gmail.com Mon Jan 14 14:57:51 2008 From: omranju at gmail.com (OM Ranju) Date: Mon, 14 Jan 2008 20:57:51 +0700 Subject: [DB-SIG] Requests Message-ID: Respected Sir, How can I give printer settings to the customer? From mal at egenix.com Mon Jan 14 16:38:30 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 14 Jan 2008 16:38:30 +0100 Subject: [DB-SIG] Requests In-Reply-To: References: Message-ID: <478B81F6.6080805@egenix.com> On 2008-01-14 14:57, OM Ranju wrote: > Respected Sir, > How can I give printer settings to the customer? This list is about Python & databases, not printers. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 14 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From dieter at handshake.de Mon Jan 14 19:34:44 2008 From: dieter at handshake.de (Dieter Maurer) Date: Mon, 14 Jan 2008 19:34:44 +0100 Subject: [DB-SIG] Handling an open database connection after a fork? In-Reply-To: <20080113140709.72759d28@bhuda.mired.org> References: <20080111113533.7ba8fc47@mbook.mired.org> <18313.65480.265082.886057@gargle.gargle.HOWL> <20080113140709.72759d28@bhuda.mired.org> Message-ID: <18315.43844.657101.933850@gargle.gargle.HOWL> Mike Meyer wrote at 2008-1-13 14:07 -0500: > ... >On Sun, 13 Jan 2008 13:10:48 +0100 Dieter.Maurer at Haufe.de wrote: > ... >> >If >> >not, does it need to do anything special, or avoid doing anything, in >> >order to not disrupt the parent processes use of those variables? >> Open a new connection in your forked child. > >Obvious. What do I do with the old one? I started out by explicitly >closing it, but that seems to make oracle unhappy (internal errors of >various kinds). Your best bet is to leave it alone. If you are lucky (!) then this will be sufficient. As I mentioned, for a ZODB it was not sufficient -- because the child intercepted messages destined for the parent and consumed them. If you face similar problems, give up and "exec" in the forked process. >> I finally abandoned this approach completely (because LDAP >> was used deeply in my system and I had no control over the creation >> of new connections) and am now using "fork+exec". > >Oddly enough, fork+exec doesn't make the problem go away, just >provides another possible solution. Maybe you could state precisely what problem you have. Usually, it is not a problem that the execed child has some open fds that it does not need. When it is, you can explicitly close all connections, e.g. before you exec. >Open fd's can either be closed on >exec, or not. Hopefully, it's closed because the python objects that >referred to it are lost across the exec.
No, they remain open (as the complete memory state is replaced -- there is no way Python can intercept the "exec"). >I'm willing to believe that >should work. So how do I simulate what happens on exec without >actually doing the exec? You do the "exec" -- you cannot simulate it (unless you are using deep and very low level system magic, not directly supported by Python). -- Dieter From james at jamesh.id.au Fri Jan 18 09:37:46 2008 From: james at jamesh.id.au (James Henstridge) Date: Fri, 18 Jan 2008 17:37:46 +0900 Subject: [DB-SIG] Any standard for two phase commit APIs? Message-ID: Hello, I've been looking at implementing two phase commit for the psycopg2 driver for PostgreSQL. It was suggested that I bring up the issue on this list to see if there were any suggestions about what form the API should take. The API from the initial patch I produced stuck pretty close to the PostgreSQL API, adding three methods to the connection object: prepare_transaction(xid) - prepare the transaction, using the given ID. This closes off the transaction, allowing a new one to be started (if needed). commit_prepared(xid) - commit a previously prepared transaction. Must be called outside of a transaction (i.e. no execute() calls since the last commit/rollback). rollback_prepared(xid) - rollback a previously prepared transaction. The idea being that this should be enough to plug psycopg2 into a transaction manager such as Zope's transaction module or similar. I understand that this API might not be implementable by other database adapters, which brings up the question: what would be a good API? From a quick search, I found two other adapters implementing 2pc both with incompatible APIs: kinterbasdb implements a Connection.prepare() method, which performs the first phase and causes a subsequent commit() or rollback() to complete that transaction. Transaction identifiers are not exposed by the API. pymqi provides a patch to the DCOracle2 adapter.
It doesn't seem to add any explicit API to the connection object, but DCOracle2 does have an incompatible prepare() method used for prepared statements. So are there any recommendations for what a two-phase commit API should look like? James. From mal at egenix.com Fri Jan 18 10:31:17 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 18 Jan 2008 10:31:17 +0100 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: References: Message-ID: <479071E5.7000000@egenix.com> On 2008-01-18 09:37, James Henstridge wrote: > Hello, > > I've been looking at implementing two phase commit for the psycopg2 > driver for PostgreSQL. It was suggested that I bring up the issue on > this list to see if there were any suggestions about what form the API > should take. > > The API from the initial patch I produced stuck pretty close to the > PostgreSQL API, adding three methods to the connection object: > > prepare_transaction(xid) - prepare the transaction, using the given > ID. This closes off the transaction, allowing a new one to be started > (if needed). > > commit_prepared(xid) - commit a previously prepared transaction . Must > be called outside of a transaction (i.e. no execute() calls since the > last commit/rollback). > > rollback_prepared(xid) - rollback a previously prepared transaction. > > The idea being that this should be enough to plug psycopg2 into a > transaction manager such as Zope's transaction module or similar. Zope doesn't require any specific additional APIs to hook the database module into its transaction mechanism. While you do need a wrapper (the Zope DA), the three methods used by the Zope TM easily map onto the standard .commit() and .rollback() methods of the database interface. > I understand that this API might not be implementable by other > database adapters, which brings up the question: what would be a good > API?
> > From a quick search, I found two other adapters implementing 2pc both > with incompatible APIs: > > kinterbasdb implements a Connection.prepare() method, which performs > the first phase and causes a subsequent commit() or rollback() to > complete that transaction. Transaction identifiers are not exposed by > the API. > > pymqi provides a patch to the DCOracle2 adapter. It doesn't seem to > add any explicit API to the connection object, but DCOracle2 does have > an incompatible prepare() method used for prepared statements. pymqi is a wrapper for IBM MQSeries which can act as XA-compliant two-phase commit transaction manager (TM). For this to work, the underlying database interface has to be compatible to the XA specification, which is essentially a C interface used directly by the TM. Note that XA implements transactions completely outside the normal scope of the Python database module, ie. you may not call .commit() or .rollback() on the connection objects, but instead have to register with the XA TM any action that you plan to have as part of a two-phase commit transaction. BTW, I'm not sure whether you are interpreting the .prepare() correctly: this only prepares a statement for later execution, it doesn't do the first part of a two-phase commit which would be to save the current transaction log and check whether it could potentially be committed without problems. > So are there any recommendations for what a two-phase commit API should > look like? It depends a lot on what you're trying to solve. In general, you usually have to adjust to an existing transaction manager and that then defines the interface to use. There are two industry standards for this: XA (X/Open) and DTC (MS). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 18 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From jeff at taupro.com Fri Jan 18 11:40:55 2008 From: jeff at taupro.com (Jeff Rush) Date: Fri, 18 Jan 2008 04:40:55 -0600 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: <479071E5.7000000@egenix.com> References: <479071E5.7000000@egenix.com> Message-ID: <47908237.6010001@taupro.com> M.-A. Lemburg wrote: > On 2008-01-18 09:37, James Henstridge wrote: >> >> I've been looking at implementing two phase commit for the psycopg2 >> driver for PostgreSQL. It was suggested that I bring up the issue on >> this list to see if there were any suggestions about what form the API >> should take. Thank you! I've been wanting a two-phase commit API for a long time, to use with Zope myself. >> The API from the initial patch I produced stuck pretty close to the >> PostgreSQL API, adding three methods to the connection object: >> >> prepare_transaction(xid) - prepare the transaction, using the given >> ID. This closes off the transaction, allowing a new one to be started >> (if needed). >> >> commit_prepared(xid) - commit a previously prepared transaction . Must >> be called outside of a transaction (i.e. no execute() calls since the >> last commit/rollback). >> >> rollback_prepared(xid) - rollback a previously prepared transaction. >> >> The idea being that this should be enough to plug psycopg2 into a >> transaction manager such as Zope's transaction module or similar. > > Zope doesn't require any specific additional APIs to hook > the database module into its transaction mechanism. 
While > you do need a wrapper (the Zope DA), the three methods used > by the Zope TM easily map onto the standard .commit() and > .rollback() methods of the database interface. To meet the atomicity requirement of ACID, Zope does need additional APIs, to expose hooks into its two-phase mechanism. If you only have access to the conventional .commit() and .rollback() methods of the database interface, you cannot handle this case: 1. You have made a change to the ZODB and to a record in the PostgreSQL database, which are part of a single transaction. 2. The Zope TM invokes the .commit() method of the PostgreSQL interface. 3. Then Zope TM invokes the .commit() method of the ZODB interface, which fails for some reason (say a WriteConflict) -- now it is too late to rollback the PostgreSQL commit and you're hosed. >> kinterbasdb implements a Connection.prepare() method, which performs >> the first phase and causes a subsequent commit() or rollback() to >> complete that transaction. Transaction identifiers are not exposed by >> the API. >> >> pymqi provides a patch to the DCOracle2 adapter. It doesn't seem to >> add any explicit API to the connection object, but DCOracle2 does have >> an incompatible prepare() method used for prepared statements. > > pymqi is a wrapper for IBM MQSeries which can act as XA-compliant > two-phase commit transaction manager (TM). For this to work, the underlying > database interface has to be compatible to the XA specification, > which is essentially a C interface used directly by the TM. > > Note that XA implements transactions completely outside the > normal scope of the Python database module, ie. you may not > call .commit() or .rollback() on the connection objects, but > instead have to register with the XA TM any action that > you plan to have as part of a two-phase commit transaction. 
> > BTW, I'm not sure whether you are interpreting the .prepare() correctly: > this only prepares a statement for later execution, it doesn't > do the first part of a two-phase commit which would be to save > the current transaction log and check whether it could potentially > be committed without problems. Which .prepare() are you referring to as possibly misinterpreted - that for his notes about kinterbasdb, pymqi or PostgreSQL? -Jeff From nand_rathi at yahoo.com Fri Jan 18 12:11:54 2008 From: nand_rathi at yahoo.com (Nand Rathi) Date: Fri, 18 Jan 2008 03:11:54 -0800 (PST) Subject: [DB-SIG] Need help regarding XA Compliant 2PC protocol Message-ID: <682465.98796.qm@web57007.mail.re3.yahoo.com> Hello All Greetings I see a current thread regarding 2PC protocol, but my requirement is little different. I need to write some python programs which will access 2 databases simultaneously (Oracle & Postgresql). I need to use 2PC to maintain the transaction integrity. Is there a python module available which can provide me the 2PC facility? The application doesn't have a need to use Zope though ;-( Can you please guide me appropriately? regards N ____________________________________________________________________________________ Never miss a thing. Make Yahoo your home page. http://www.yahoo.com/r/hs From mal at egenix.com Fri Jan 18 12:29:27 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 18 Jan 2008 12:29:27 +0100 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: <47908237.6010001@taupro.com> References: <479071E5.7000000@egenix.com> <47908237.6010001@taupro.com> Message-ID: <47908D97.1080304@egenix.com> On 2008-01-18 11:40, Jeff Rush wrote: > M.-A. Lemburg wrote: >> On 2008-01-18 09:37, James Henstridge wrote: >>> I've been looking at implementing two phase commit for the psycopg2 >>> driver for PostgreSQL. It was suggested that I bring up the issue on >>> this list to see if there were any suggestions about what form the API >>> should take. 
> > Thank you! I've been wanting a two-phase commit API for a long time, to use > with Zope myself. > > >>> The API from the initial patch I produced stuck pretty close to the >>> PostgreSQL API, adding three methods to the connection object: >>> >>> prepare_transaction(xid) - prepare the transaction, using the given >>> ID. This closes off the transaction, allowing a new one to be started >>> (if needed). >>> >>> commit_prepared(xid) - commit a previously prepared transaction . Must >>> be called outside of a transaction (i.e. no execute() calls since the >>> last commit/rollback). >>> >>> rollback_prepared(xid) - rollback a previously prepared transaction. >>> >>> The idea being that this should be enough to plug psycopg2 into a >>> transaction manager such as Zope's transaction module or similar. >> Zope doesn't require any specific additional APIs to hook >> the database module into its transaction mechanism. While >> you do need a wrapper (the Zope DA), the three methods used >> by the Zope TM easily map onto the standard .commit() and >> .rollback() methods of the database interface. > > To meet the atomicity requirement of ACID, Zope does need additional APIs, to > expose hooks into its two-phase mechanism. If you only have access to the > conventional .commit() and .rollback() methods of the database interface, you > cannot handle this case: > > 1. You have made a change to the ZODB and to a record in the PostgreSQL > database, which are part of a single transaction. > > 2. The Zope TM invokes the .commit() method of the PostgreSQL interface. > > 3. Then Zope TM invokes the .commit() method of the ZODB interface, which > fails for some reason (say a WriteConflict) -- now it is too late to > rollback the PostgreSQL commit and you're hosed. While this would seem desirable, it is not how the Zope TM works. Phase 1 is implemented by doing a vote on the success of the transaction. Phase 2 then finishes or aborts the transaction depending on the vote. 
If something fails in phase 2, there's no guarantee that partial commits can be undone. The .commit()/.rollback() calls on the database interface would be implemented in the phase 2 part. To avoid your scenario, the ZODB would have to detect the conflict during phase 1 (ie. the voting phase). >>> kinterbasdb implements a Connection.prepare() method, which performs >>> the first phase and causes a subsequent commit() or rollback() to >>> complete that transaction. Transaction identifiers are not exposed by >>> the API. >>> >>> pymqi provides a patch to the DCOracle2 adapter. It doesn't seem to >>> add any explicit API to the connection object, but DCOracle2 does have >>> an incompatible prepare() method used for prepared statements. >> pymqi is a wrapper for IBM MQSeries which can act as XA-compliant >> two-phase commit transaction manager (TM). For this to work, the underlying >> database interface has to be compatible with the XA specification, >> which is essentially a C interface used directly by the TM. >> >> Note that XA implements transactions completely outside the >> normal scope of the Python database module, ie. you may not >> call .commit() or .rollback() on the connection objects, but >> instead have to register with the XA TM any action that >> you plan to have as part of a two-phase commit transaction. >> >> BTW, I'm not sure whether you are interpreting the .prepare() correctly: >> this only prepares a statement for later execution, it doesn't >> do the first part of a two-phase commit which would be to save >> the current transaction log and check whether it could potentially >> be committed without problems. > > Which .prepare() are you referring to as possibly misinterpreted - that for > his notes about kinterbasdb, pymqi or PostgreSQL? That of DCOracle2. The cursor.prepare() method is a DB-API extension that we've discussed a couple of times. Its intent is to prepare the execution of an SQL command on the cursor, ie.
parse it, prepare the access path on the server and possibly fetch the parameter type information from the server as well. Using the .prepare() method you can detect errors in the SQL before actually running the statement with data. It also allows setting up a pool of cursor objects that are intended to each only execute one type of SQL command, e.g. to enhance performance for recurring SQL commands. I'm not aware of any discussion on a connection.prepare() method. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 18 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From james at jamesh.id.au Fri Jan 18 12:29:28 2008 From: james at jamesh.id.au (James Henstridge) Date: Fri, 18 Jan 2008 20:29:28 +0900 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: <479071E5.7000000@egenix.com> References: <479071E5.7000000@egenix.com> Message-ID: On 18/01/2008, M.-A. Lemburg wrote: > On 2008-01-18 09:37, James Henstridge wrote: > > Hello, > > > > I've been looking at implementing two phase commit for the psycopg2 > > driver for PostgreSQL. It was suggested that I bring up the issue on > > this list to see if there were any suggestions about what form the API > > should take. > > > > The API from the initial patch I produced stuck pretty close to the > > PostgreSQL API, adding three methods to the connection object: > > > > prepare_transaction(xid) - prepare the transaction, using the given > > ID. 
This closes off the transaction, allowing a new one to be started > > (if needed). > > > > commit_prepared(xid) - commit a previously prepared transaction . Must > > be called outside of a transaction (i.e. no execute() calls since the > > last commit/rollback). > > > > rollback_prepared(xid) - rollback a previously prepared transaction. > > > > The idea being that this should be enough to plug psycopg2 into a > > transaction manager such as Zope's transaction module or similar. > > Zope doesn't require any specific additional APIs to hook > the database module into its transaction mechanism. While > you do need a wrapper (the Zope DA), the three methods used > by the Zope TM easily map onto the standard .commit() and > .rollback() methods of the database interface. I already have an understanding of how the Zope transaction manager works. The point is: 1. The database adapter needs to provide some API for use by a Zope DataManager. This API needs to co-exist with the standard DB-API transaction handling. 2. If the database adapter is going to provide an API used to implement two-phase commit, does it make sense to standardise such an API across different database adapters? This leads on to the question I asked in my previous email: > > I understand that this API might not be implementable by other > > database adapters, which brings up the question: what would be a good > > API? > > > > From a quick search, I found two other adapters implementing 2pc both > > with incompatible APIs: > > > > kinterbasdb implements a Connection.prepare() method, which performs > > the first phase and causes a subsequent commit() or rollback() to > > complete that transaction. Transaction identifiers are not exposed by > > the API. > > > > pymqi provides a patch to the DCOracle2 adapter. It doesn't seem to > > add any explicit API to the connection object, but DCOracle2 does have > > an incompatible prepare() method used for prepared statements.
> > pymqi is a wrapper for IBM MQSeries which can act as XA-compliant > two-phase commit transaction manager (TM). For this to work, the underlying > database interface has to be compatible with the XA specification, > which is essentially a C interface used directly by the TM. > > Note that XA implements transactions completely outside the > normal scope of the Python database module, ie. you may not > call .commit() or .rollback() on the connection objects, but > instead have to register with the XA TM any action that > you plan to have as part of a two-phase commit transaction. Yep. I am not sure how easy it would be to expose a Python level two-phase commit API for DCOracle2 -- I just brought it up as an example of a database adapter that people are using with a transaction manager (albeit at the C level). > BTW, I'm not sure whether you are interpreting the .prepare() correctly: > this only prepares a statement for later execution, it doesn't > do the first part of a two-phase commit which would be to save > the current transaction log and check whether it could potentially > be committed without problems. I guess I was a bit unclear. When I said that DCOracle2 had an incompatible Connection.prepare() method, I meant that it is incompatible with respect to kinterbasdb's Connection.prepare(). Therefore, standardising a prepare() method for use in two-phase commit would be problematic. > > So are there any recommendations for what a two-phase commit API should > > look like? > > It depends a lot on what you're trying to solve. > > In general, you usually have to adjust to an existing > transaction manager and that then defines the interface > to use. There are two industry standards for this: XA (X/Open) > and DTC (MS). I realise that tying a database adapter into a transaction manager will often involve some level of database-specific code. I just wonder if there is enough commonality to justify some level of standardisation.
It seems silly for everyone to do things differently for no good reason. James. From james at jamesh.id.au Fri Jan 18 12:36:44 2008 From: james at jamesh.id.au (James Henstridge) Date: Fri, 18 Jan 2008 20:36:44 +0900 Subject: [DB-SIG] Need help regarding XA Compliant 2PC protocol In-Reply-To: <682465.98796.qm@web57007.mail.re3.yahoo.com> References: <682465.98796.qm@web57007.mail.re3.yahoo.com> Message-ID: On 18/01/2008, Nand Rathi wrote: > Hello All > > Greetings > > I see a current thread regarding the 2PC protocol, but my > requirement is a little different. > > I need to write some Python programs which will access > 2 databases simultaneously (Oracle & PostgreSQL). I > need to use 2PC to maintain transaction integrity. If you are only accessing two databases, you only need 2PC support on one of them. The protocol would be something like this:
1. Prepare the transaction for 2PC on the first connection.
2. If the transaction could not be prepared, roll back both connections.
3. If the transaction could be prepared, commit the second connection.
4. If the second connection committed successfully, complete the transaction on the first connection.
5. If the second connection failed to commit, roll back the prepared transaction on the first connection.
The patch I did for psycopg2 should let you perform 2PC, so could be used as above whether or not the Oracle adapter you are using supports it. James. From james at jamesh.id.au Fri Jan 18 13:05:39 2008 From: james at jamesh.id.au (James Henstridge) Date: Fri, 18 Jan 2008 21:05:39 +0900 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: <47908D97.1080304@egenix.com> References: <479071E5.7000000@egenix.com> <47908237.6010001@taupro.com> <47908D97.1080304@egenix.com> Message-ID: On 18/01/2008, M.-A. Lemburg wrote: > While this would seem desirable, it is not how the Zope TM > works. > > Phase 1 is implemented by doing a vote on the success > of the transaction.
Phase 2 then finishes or aborts the transaction > depending on the vote. > > If something fails in phase 2, there's no guarantee that partial > commits can be undone. > > The .commit()/.rollback() calls on the database interface would > be implemented in the phase 2 part. > > To avoid your scenario, the ZODB would have to detect the conflict > during phase 1 (ie. the voting phase). Looking at the IDataManager API, it looks like the correct way to implement two phase commit would be:
1. if tpc_begin() is called, note that two-phase commit is being used.
2. in commit(), simply prepare the transaction if the two-phase commit flag is set, rather than actually committing. If this fails, the transaction obviously fails.
3. make tpc_vote() a no-op.
4. tpc_finish() commits the prepared transaction
5. abort() and tpc_abort() roll back the prepared transaction (if one was prepared).
James. From mal at egenix.com Fri Jan 18 13:20:32 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 18 Jan 2008 13:20:32 +0100 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: References: <479071E5.7000000@egenix.com> Message-ID: <47909990.3080804@egenix.com> On 2008-01-18 12:29, James Henstridge wrote: > On 18/01/2008, M.-A. Lemburg wrote: >> On 2008-01-18 09:37, James Henstridge wrote: >>> Hello, >>> >>> I've been looking at implementing two phase commit for the psycopg2 >>> driver for PostgreSQL. It was suggested that I bring up the issue on >>> this list to see if there were any suggestions about what form the API >>> should take. >>> >>> The API from the initial patch I produced stuck pretty close to the >>> PostgreSQL API, adding three methods to the connection object: >>> >>> prepare_transaction(xid) - prepare the transaction, using the given >>> ID. This closes off the transaction, allowing a new one to be started >>> (if needed). >>> >>> commit_prepared(xid) - commit a previously prepared transaction .
Must >>> be called outside of a transaction (i.e. no execute() calls since the >>> last commit/rollback). >>> >>> rollback_prepared(xid) - rollback a previously prepared transaction. >>> >>> The idea being that this should be enough to plug psycopg2 into a >>> transaction manager such as Zope's transaction module or similar. >> Zope doesn't require any specific additional APIs to hook >> the database module into its transaction mechanism. While >> you do need a wrapper (the Zope DA), the three methods used >> by the Zope TM easily map onto the standard .commit() and >> .rollback() methods of the database interface. > > I already have an understanding of how the Zope transaction manager > works. The point is: > > 1. The database adapter needs to provide some API for use by a Zope > DataManager. This API needs to co-exist with the standard DB-API > transaction handling. > > 2. If the database adapter is going to provide an API used to > implement two-phase commit, does it make sense to standardise such an > API across different database adapters? This leads on to the > question I asked in my previous email: > >>> I understand that this API might not be implementable by other >>> database adapters, which brings up the question: what would be a good >>> API? > > >>> From a quick search, I found two other adapters implementing 2pc both >>> with incompatible APIs: >>> >>> kinterbasdb implements a Connection.prepare() method, which performs >>> the first phase and causes a subsequent commit() or rollback() to >>> complete that transaction. Transaction identifiers are not exposed by >>> the API. >>> >>> pymqi provides a patch to the DCOracle2 adapter. It doesn't seem to >>> add any explicit API to the connection object, but DCOracle2 does have >>> an incompatible prepare() method used for prepared statements. >> pymqi is a wrapper for IBM MQSeries which can act as XA-compliant >> two-phase commit transaction manager (TM).
For this to work, the underlying >> database interface has to be compatible with the XA specification, >> which is essentially a C interface used directly by the TM. >> >> Note that XA implements transactions completely outside the >> normal scope of the Python database module, ie. you may not >> call .commit() or .rollback() on the connection objects, but >> instead have to register with the XA TM any action that >> you plan to have as part of a two-phase commit transaction. > > Yep. I am not sure how easy it would be to expose a Python level > two-phase commit API for DCOracle2 -- I just brought it up as an > example of a database adapter that people are using with a transaction > manager (albeit at the C level). > >> BTW, I'm not sure whether you are interpreting the .prepare() correctly: >> this only prepares a statement for later execution, it doesn't >> do the first part of a two-phase commit which would be to save >> the current transaction log and check whether it could potentially >> be committed without problems. > > I guess I was a bit unclear. When I said that DCOracle2 had an > incompatible Connection.prepare() method, I meant that it is > incompatible with respect to kinterbasdb's Connection.prepare(). > > Therefore, standardising a prepare() method for use in two-phase > commit would be problematic. Thanks for the clarification. I was thinking of the cursor.prepare() method. >>> So are there any recommendations for what a two-phase commit API should >>> look like? >> It depends a lot on what you're trying to solve. >> >> In general, you usually have to adjust to an existing >> transaction manager and that then defines the interface >> to use. There are two industry standards for this: XA (X/Open) >> and DTC (MS). > > I realise that tying a database adapter into a transaction manager > will often involve some level of database-specific code. > > I just wonder if there is enough commonality to justify some level of > standardisation.
It seems silly for everyone to do things differently > for no good reason. Agreed, but the need for such interfaces only comes up if you plan to implement the transaction manager (TM) itself in Python and need to use the database module as resource manager (RM). I don't know how this would work with DTC (have never used it, but it appears to be similar to XA). With XA, the RM has to provide a C struct defining a set of C function hooks (the XA switch). This is then used by the TM to implement two-phase commits. Now, if a database provides such an XA interface, this could also be made available to a Python TM. You'd then have to open the connection via this XA interface rather than the standard connection constructor (or pass in a parameter to this constructor to make it use the XA open instead of the RM open). Perhaps we could piggy-back the XA-style interface onto the connection interface and its constructor and turn it into an XA DB-API extension ?! XA Spec: http://www.opengroup.org/products/publications/catalog/c193.htm -- Marc-Andre Lemburg eGenix.com From mal at egenix.com Fri Jan 18 13:28:17 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 18 Jan 2008 13:28:17 +0100 Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: References: <479071E5.7000000@egenix.com> <47908237.6010001@taupro.com> <47908D97.1080304@egenix.com> Message-ID: <47909B61.8080501@egenix.com> On 2008-01-18 13:05, James Henstridge wrote: > On 18/01/2008, M.-A. Lemburg wrote: >> While this would seem desirable, it is not how the Zope TM >> works. >> >> Phase 1 is implemented by doing a vote on the success >> of the transaction. Phase 2 then finishes or aborts the transaction >> depending on the vote. >> >> If something fails in phase 2, there's no guarantee that partial >> commits can be undone. >> >> The .commit()/.rollback() calls on the database interface would >> be implemented in the phase 2 part. >> >> To avoid your scenario, the ZODB would have to detect the conflict >> during phase 1 (ie. the voting phase). > > Looking at the IDataManager API, it looks like the > correct way to implement two phase commit would be: > > 1. if tpc_begin() is called, note that two-phase commit is being used. > 2. in commit(), simply prepare the transaction if the two-phase commit > flag is set, rather than actually committing. If this fails, the > transaction obviously fails. > 3. make tpc_vote() a no-op. > 4. tpc_finish() commits the prepared transaction > 5. abort() and tpc_abort() roll back the prepared transaction (if one > was prepared). Agreed, but at least for Zope database adapters, that's not what's implemented (have a look at ZRDB/TM.py). -- Marc-Andre Lemburg eGenix.com From stuart at stuartbishop.net Fri Jan 18 14:07:24 2008 From: stuart at stuartbishop.net (Stuart Bishop) Date: Fri, 18 Jan 2008 20:07:24 +0700 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: References: Message-ID: <4790A48C.2030204@stuartbishop.net> James Henstridge wrote: > So are there any recommendations for what a two-phase commit API should > look like? Here are the three obvious possibilities. The first is what you already have. The other two also allow access to all of PostgreSQL's two phase commit API and are functionally identical. All would work fine for integrating with the transaction managers I'm familiar with (Z2, Z3, Storm). The difference is just spelling. Any opinions?

    conn = connect([...])
    [... do work ...]
    xid = 'xid%f' % random()
    try:
        conn.prepare_transaction(xid)
        [... prepare other participants ...]
    except:
        conn.rollback_prepared(xid)
    else:
        conn.commit_prepared(xid)

    conn = connect([...])
    [... do work ...]
    try:
        trans = conn.prepare_transaction('xid%f' % random())
        [... prepare other participants ...]
    except:
        trans.rollback()
    else:
        trans.commit()

    conn = connect([...])
    [... do work ...]
    try:
        trans = PreparedTransaction(conn, 'xid%f' % random())
        [... prepare other participants ...]
    except:
        trans.rollback()
    else:
        trans.commit()

-- Stuart Bishop http://www.stuartbishop.net/ From stuart at stuartbishop.net Fri Jan 18 14:20:37 2008 From: stuart at stuartbishop.net (Stuart Bishop) Date: Fri, 18 Jan 2008 20:20:37 +0700 Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <47909990.3080804@egenix.com> References: <479071E5.7000000@egenix.com> <47909990.3080804@egenix.com> Message-ID: <4790A7A5.5060104@stuartbishop.net> M.-A. Lemburg wrote: > Now, if a database provides such an XA interface, this could > also be made available to a Python TM. You'd then have to > open the connection via this XA interface rather than the > standard connection constructor (or pass in a parameter > to this constructor to make it use the XA open instead of > the RM open). > > Perhaps we could piggy-back the XA-style interface onto > the connection interface and its constructor and turn it > into an XA DB-API extension ?! If that has more than just 'prepare_transaction', 'commit_transaction' and 'rollback_transaction' it has no place in the DB-API IMO. These three actions are the entirety of what PostgreSQL provides and are the building blocks you need to build anything more complex (including XA). We don't need driver authors to build a transaction manager. We just need driver authors to provide the building blocks for DB-API connections to be integrated with transaction managers. -- Stuart Bishop http://www.stuartbishop.net/ From stuart at stuartbishop.net Fri Jan 18 14:33:19 2008 From: stuart at stuartbishop.net (Stuart Bishop) Date: Fri, 18 Jan 2008 20:33:19 +0700 Subject: [DB-SIG] Need help regarding XA Compliant 2PC protocol In-Reply-To: References: <682465.98796.qm@web57007.mail.re3.yahoo.com> Message-ID: <4790AA9F.5000004@stuartbishop.net> James Henstridge wrote: > The patch I did for psycopg2 should let you perform 2PC, so could be > used as above whether or not the Oracle adapter you are using supports > it.
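A minimal sketch of the protocol referred to in the quote above, where only one of the two connections supports 2PC. It assumes the first connection offers the three methods described earlier in this thread for the psycopg2 patch (prepare_transaction, commit_prepared, rollback_prepared) and that the second offers only the standard commit() and rollback(). The two Fake* classes and the commit_both() helper are hypothetical names that exist only so the sketch runs without a database.

```python
class FakeTwoPhaseConnection:
    """Hypothetical stand-in for a connection with the three 2PC methods."""

    def __init__(self, fail_prepare=False):
        self.fail_prepare = fail_prepare
        self.log = []

    def prepare_transaction(self, xid):
        if self.fail_prepare:
            raise RuntimeError("prepare failed")
        self.log.append(("prepare_transaction", xid))

    def commit_prepared(self, xid):
        self.log.append(("commit_prepared", xid))

    def rollback_prepared(self, xid):
        self.log.append(("rollback_prepared", xid))

    def rollback(self):
        self.log.append(("rollback",))


class FakePlainConnection:
    """Hypothetical stand-in for a connection with only commit()/rollback()."""

    def __init__(self, fail_commit=False):
        self.fail_commit = fail_commit
        self.log = []

    def commit(self):
        if self.fail_commit:
            raise RuntimeError("commit failed")
        self.log.append(("commit",))

    def rollback(self):
        self.log.append(("rollback",))


def commit_both(first, second, xid):
    """Commit work on both connections; only `first` needs 2PC support."""
    try:
        first.prepare_transaction(xid)      # prepare on the 2PC connection
    except Exception:
        first.rollback()                    # nothing was prepared: undo both
        second.rollback()
        raise
    try:
        second.commit()                     # ordinary one-phase commit
    except Exception:
        first.rollback_prepared(xid)        # undo the prepared transaction
        raise
    first.commit_prepared(xid)              # complete the prepared transaction
```

One window remains in this arrangement: if the process dies after second.commit() but before first.commit_prepared(xid), the prepared transaction stays pending on the first database until something resolves it by its identifier.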
You can also do this right now if you don't mind it being ugly:

    con = psycopg.connect('')
    [... do stuff ...]
    xid = 'xid%f' % random()
    cur = con.cursor()
    cur.execute('PREPARE TRANSACTION %s', [xid])
    try:
        [... commit oracle ...]
    except:
        cur.execute('ROLLBACK PREPARED %s', [xid])
    else:
        cur.execute('COMMIT PREPARED %s', [xid])

You might be able to do the same trick with Oracle, allowing you to handle more than 2 Oracle connections safely. -- Stuart Bishop http://www.stuartbishop.net/ From mal at egenix.com Fri Jan 18 14:44:02 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 18 Jan 2008 14:44:02 +0100 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: <4790A7A5.5060104@stuartbishop.net> References: <479071E5.7000000@egenix.com> <47909990.3080804@egenix.com> <4790A7A5.5060104@stuartbishop.net> Message-ID: <4790AD22.9060105@egenix.com> On 2008-01-18 14:20, Stuart Bishop wrote: > M.-A. Lemburg wrote: > >> Now, if a database provides such an XA interface, this could >> also be made available to a Python TM. You'd then have to >> open the connection via this XA interface rather than the >> standard connection constructor (or pass in a parameter >> to this constructor to make it use the XA open instead of >> the RM open). >> >> Perhaps we could piggy-back the XA-style interface onto >> the connection interface and its constructor and turn it >> into an XA DB-API extension ?! > > If that has more than just 'prepare_transaction', 'commit_transaction' and > 'rollback_transaction' it has no place in the DB-API IMO. These three > actions are the entirety of what PostgreSQL provides and are the building > blocks you need to build anything more complex (including XA).
We don't need > driver authors to build a transaction manager. We just need driver authors > to provide the building blocks for DB-API connections to be integrated with > transaction managers. The XA API is a bit more complex than just the three APIs you mention (with "prepare_transaction" meaning "prepare to commit a transaction"): http://www.opengroup.org/onlinepubs/009680699/toc.pdf I'm not suggesting that we need to have all those APIs, but the essential APIs need to be present, ie. you need to be able to:
* put a connection under TM control or associate with a TM transaction (xa_open/xa_start/ax_reg)
* prepare to commit a TM transaction (xa_prepare)
* finally commit a TM transaction (xa_commit)
* finally rollback a TM transaction (xa_rollback)
* release the connection from TM control (xa_close/xa_end/ax_unreg)
-- Marc-Andre Lemburg eGenix.com From james at jamesh.id.au Fri Jan 18 15:11:19 2008 From: james at jamesh.id.au (James Henstridge) Date: Fri, 18 Jan 2008 23:11:19 +0900 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: <47909B61.8080501@egenix.com> References: <479071E5.7000000@egenix.com> <47908237.6010001@taupro.com> <47908D97.1080304@egenix.com> <47909B61.8080501@egenix.com> Message-ID: On 18/01/2008, M.-A.
Lemburg wrote: > > Looking at the IDataManager API, it looks like the > > correct way to implement two phase commit would be: > > > > 1. if tpc_begin() is called, note that two-phase commit is being used. > > 2. in commit(), simply prepare the transaction if the two-phase commit > > flag is set, rather than actually committing. If this fails, the > > transaction obviously fails. > > 3. make tpc_vote() a no-op. > > 4. tpc_finish() commits the prepared transaction > > 5. abort() and tpc_abort() roll back the prepared transaction (if one > > was prepared). > > Agreed, but at least for Zope database adapters, that's not what's > implemented (have a look at ZRDB/TM.py). This looks pretty much the same as the Zope 3 zope.app.rdb case: the default DataManager implementation provided by Zope doesn't support two-phase commit, but it is possible for an adapter to provide its own DataManager implementation. This isn't too surprising when you consider that there is no standard two-phase commit API for database adapters. James. From james at jamesh.id.au Fri Jan 18 15:36:51 2008 From: james at jamesh.id.au (James Henstridge) Date: Fri, 18 Jan 2008 23:36:51 +0900 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: <4790AD22.9060105@egenix.com> References: <479071E5.7000000@egenix.com> <47909990.3080804@egenix.com> <4790A7A5.5060104@stuartbishop.net> <4790AD22.9060105@egenix.com> Message-ID: On 18/01/2008, M.-A. Lemburg wrote: > > If that has more than just 'prepare_transaction', 'commit_transaction' and > > 'rollback_transaction' it has no place in the DB-API IMO. These three > > actions are the entirety of what PostgreSQL provides and are the building > > blocks you need to build anything more complex (including XA). We don't need > > driver authors to build a transaction manager. We just need driver authors > > to provide the building blocks for DB-API connections to be integrated with > > transaction managers.
> > The XA API is a bit more complex than just the three APIs you > mention (with "prepare_transaction" meaning "prepare to commit > a transaction"): [snip] It is worth noting that the JDBC driver implements the Java variant of XA on top of the three primitives Stuart mentioned. The remaining parts are mainly policy about when to use those primitives. As another data point on 2PC APIs, I found that the cx_Oracle driver provides such an API: http://cx-oracle.sourceforge.net/html/connobj.html It is similar to kinterbasdb's in that it uses prepare()/commit()/rollback() methods on the connection, but it also sounds like it requires you to call a begin() method to start a transaction. James. From dieter at handshake.de Fri Jan 18 20:32:37 2008 From: dieter at handshake.de (Dieter Maurer) Date: Fri, 18 Jan 2008 20:32:37 +0100 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: <47908D97.1080304@egenix.com> References: <479071E5.7000000@egenix.com> <47908237.6010001@taupro.com> <47908D97.1080304@egenix.com> Message-ID: <18320.65237.930087.237768@gargle.gargle.HOWL> M.-A. Lemburg wrote at 2008-1-18 12:29 +0100: > ... >While this would seem desirable, it is not how the Zope TM >works. > >Phase 1 is implemented by doing a vote on the success >of the transaction. Phase 2 then finishes or aborts the transaction >depending on the vote. > >If something fails in phase 2, there's no guarantee that partial >commits can be undone. > >The .commit()/.rollback() calls on the database interface would >be implemented in the phase 2 part. > >To avoid your scenario, the ZODB would have to detect the conflict >during phase 1 (ie. the voting phase). It does this indeed. And it assumes that a resource manager accepts a vote only when it can guarantee that the subsequent "commit" will succeed (and does not fail). A resource manager needs to expose both a "vote" (with the above guarantee) and a "commit" in order to be a first class participant of Zope's transaction system.
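One way a relational connection could supply the "vote" described above is to map it onto a prepare step. The sketch below assumes the connection offers the three methods described earlier in this thread for the psycopg2 patch (prepare_transaction, commit_prepared, rollback_prepared). The hook names come from the thread; the single `transaction` argument on each hook, the class names, and the way the xid is passed in are assumptions made for illustration. Elsewhere in this thread the prepare is placed in commit() with tpc_vote() as a no-op, which would work just as well here.

```python
class RecordingConnection:
    """Hypothetical stand-in that records the calls it receives."""

    def __init__(self, can_prepare=True):
        self.can_prepare = can_prepare
        self.calls = []

    def prepare_transaction(self, xid):
        if not self.can_prepare:
            raise RuntimeError("cannot prepare")
        self.calls.append(("prepare_transaction", xid))

    def commit_prepared(self, xid):
        self.calls.append(("commit_prepared", xid))

    def rollback_prepared(self, xid):
        self.calls.append(("rollback_prepared", xid))

    def rollback(self):
        self.calls.append(("rollback",))


class TwoPhaseDataManager:
    """Hypothetical bridge from transaction-manager hooks to a 2PC connection."""

    def __init__(self, conn, xid):
        self.conn = conn
        self.xid = xid
        self.prepared = False

    def tpc_begin(self, transaction):
        self.prepared = False

    def commit(self, transaction):
        # The SQL work has already been sent; nothing is made durable yet.
        pass

    def tpc_vote(self, transaction):
        # Raising here is the "no" vote; returning normally is the "yes" vote.
        self.conn.prepare_transaction(self.xid)
        self.prepared = True

    def tpc_finish(self, transaction):
        self.conn.commit_prepared(self.xid)
        self.prepared = False

    def tpc_abort(self, transaction):
        if self.prepared:
            self.conn.rollback_prepared(self.xid)
            self.prepared = False
        else:
            self.conn.rollback()

    # Aborting outside the two-phase sequence is handled the same way.
    abort = tpc_abort
```

With this shape, a successful tpc_vote() means the database has already accepted the prepared transaction, so tpc_finish() has no ordinary reason to fail.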
Relational database interfaces often lack the equivalent of a "vote".

--
Dieter

From james at jamesh.id.au Mon Jan 21 06:00:42 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 14:00:42 +0900
Subject: [DB-SIG] Any standard for two phase commit APIs?
In-Reply-To: <18320.65237.930087.237768@gargle.gargle.HOWL>
References: <479071E5.7000000@egenix.com> <47908237.6010001@taupro.com> <47908D97.1080304@egenix.com> <18320.65237.930087.237768@gargle.gargle.HOWL>
Message-ID:

On 19/01/2008, Dieter Maurer wrote:
> It does this indeed.
>
> And it assumes that a resource manager accepts a vote only
> when it can guarantee that the subsequent "commit" will succeed (and
> does not fail).
>
> A resource manager needs to expose both a "vote" (with the above guarantee)
> and a "commit" in order to be a first class participant of
> Zope's transaction system.
>
> Relational database interfaces often lack the equivalent of a "vote".

I'd disagree with this description. From the Zope transaction
documentation, the order of methods is:

    tpc_begin
    commit
    tpc_vote
    (tpc_finish | tpc_abort)

From the descriptions of the various methods, a database adapter
supporting 2PC would prepare the transaction at commit(), and commit
or roll back that transaction in tpc_finish or tpc_abort respectively.

After preparing the transaction, the transaction should be committable
under normal circumstances, so it would have no reason to vote no as
part of tpc_vote().

I disagree that the lack of a tpc_vote() method makes the database
adapter a second class citizen: it simply reflects the fact that the
adapter makes up its mind at the commit() stage independent of what
other data managers do.

James.

From james at jamesh.id.au Mon Jan 21 11:08:28 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 19:08:28 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
Message-ID:

On 18/01/2008, James Henstridge wrote:
> So are there any recommendations for what a two-phase commit API should
> look like?

I did a bit of investigation into a few databases, and came up with a
proposal for an extension to the DB-API.

I know that there are a few incomplete portions of the proposal, so
I'd appreciate feedback. If you have knowledge of a database not
covered here, please comment on whether the proposed API would be
workable in that context.

Re: the confusion between "prepared transactions" and "prepared
statements" support, this probably won't conflict, since the prepared
statement extensions I saw used Cursor.prepare() rather than
Connection.prepare(). If it is a problem though, the proposal could
be modified to use prepare_transaction() or similar.

James.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: two-phase-commit.txt
Url: http://mail.python.org/pipermail/db-sig/attachments/20080121/2ffb5544/attachment.txt

From fog at initd.org Mon Jan 21 11:28:55 2008
From: fog at initd.org (Federico Di Gregorio)
Date: Mon, 21 Jan 2008 11:28:55 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To:
References:
Message-ID: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio>

I agree with your analysis; I'll add some comments about the proposal
below.

On Mon, 21/01/2008 at 19.08 +0900, James Henstridge wrote:
> 1. Add a Connection.begin(...) method that explicitly starts a
>    transaction. Some argument (possibly the transaction ID) causes
>    the transaction to use two-phase commit. May raise
>    NotSupportedError if two-phase commit is not supported.

DBAPI always had implicit transaction begin (for backends supporting
transactions) and adding an explicit begin() method would just add
confusion for the user. "Should I always call begin()? Or just when I
want to start a two-phase?".
I'd prefer the two-phase begin method to have a different name.
Let's call it xa_begin() in this discussion.

> 2. Add a Connection.prepare() method that performs the first stage of
>    two-phase commit. May raise NotSupportedError if two-phase commit
>    is not supported, or the transaction was not started in two-phase
>    mode.
>
Ok. (Should be named consistently with the begin method.)

> 3. Calling commit() or rollback() on the connection after prepare()
>    performs the second stage of the commit.
>
Ok.

> 4. Calling commit() or rollback() on the connection prior to
>    prepare() performs a one-phase commit or rollback.
>
IMHO, it should raise an error if the transaction was started for
two-phase. Otherwise I don't see any reason for (1).

> 5. Executing statements after prepare() but before commit() or
>    rollback() results in an error (ProgrammingError?)
>
Ok.

> 6. Closing a connection with a prepared but uncommitted transaction
>    rolls back that transaction.
>
Stuart's comment on psycopg ML made me think about this one. Maybe we
want an option added to xa_begin() to keep the prepared transaction open
even if the connection drops.

federico

--
Federico Di Gregorio                         http://people.initd.org/fog
DISCLAIMER. If I receive a message from you, you are agreeing that:
 1. I am by definition, "the intended recipient".
 2. All information in the email is mine to do with as I see fit and
    make such financial profit, political mileage, or good joke as it
    lends itself to. In particular, I may quote it on USENET or the WWW.
 3. I may take the contents as representing the views of your company.
 4. This overrides any disclaimer or statement of confidentiality that
    may be included on your message.
From mal at egenix.com Mon Jan 21 11:58:49 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 21 Jan 2008 11:58:49 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
Message-ID: <47947AE9.7010101@egenix.com>

On 2008-01-21 11:28, Federico Di Gregorio wrote:
> I agree with your analysis, I'll add some comments about the proposal
> below.
>
> On Mon, 21/01/2008 at 19.08 +0900, James Henstridge wrote:
>> 1. Add a Connection.begin(...) method that explicitly starts a
>>    transaction. Some argument (possibly the transaction ID) causes
>>    the transaction to use two-phase commit. May raise
>>    NotSupportedError if two-phase commit is not supported.
>
> DBAPI always had implicit transaction begin (for backends supporting
> transactions) and adding an explicit begin() method would just add
> confusion onto the user. "Should I always call begin()? Or just when I
> want to start a two-phase?". I'd better like the two-phase begin method
> named otherwise. Let's call it xa_begin() in this discussion.

Agreed.

I also think that we should prefix all of these methods with
"xa_" or something similar: database backends may need to be able to
differentiate whether the user wants to e.g. commit in the context
of a two-phase commit transaction or a regular one, and the two-phase
commit is also likely going to require an argument (the transaction id).

Using a different set of methods would also make it clear to
the reader of the code that a two-phase commit transaction is
happening (which works quite differently from a one-phase one).

>> 2.
Add a Connection.prepare() method that peforms the first stage of >> two-phase commit. May raise NotSupportedError if two-phase commit >> is not supported, or the transaction was not started in two-phase >> mode. >> > Ok. (Should be named accordingly with the begin method.) .xa_prepare(xid) >> 3. Calling commit() or rollback() on the connection after prepare() >> performs the second stage of the commit. >> > Ok. .xa_commit(xid) and .xa_rollback(xid) >> 4. Calling commit() or rollback() on the connection prior to >> prepare() performs a one-phase commit or rollback. >> > IMHO, it should raise an error if the transaction was started for > two-phase. Otherwise I don't see any reason for (1). Agreed. They should raise an error. In fact, when operating in two-phase commit mode, I think using the one-phase methods .commit() and .rollback() should raise an error. Mixing the two is normally not a good idea and may very well result in an undefined state. >> 5. Executing statements after prepare() but before commit() or >> rollback() results in an error (ProgrammingError?) >> > Ok. Agreed. >> 6. Closing a connection with a prepared but uncommitted transaction >> rolls back that transaction. >> > Stuart's comment on psycopg ML made me think about this one. Maybe we > want an option added to xa_begin() to keep the prepared transaction open > even if the connection drops. A connection drop should always trigger an implicit rollback on the server side, so I'm not sure how and where you'd keep the required state to continue processing the transaction in case the connection is reestablished. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 21 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From james at jamesh.id.au Mon Jan 21 12:09:12 2008 From: james at jamesh.id.au (James Henstridge) Date: Mon, 21 Jan 2008 20:09:12 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> Message-ID: On 21/01/2008, Federico Di Gregorio wrote: > I agree with your analisys, I'll add some comments about the proposal > below. > > Il giorno lun, 21/01/2008 alle 19.08 +0900, James Henstridge ha scritto: > > 1. Add a Connection.begin(...) method that explicitly starts a > > transaction. Some argument (possibly the transaction ID) causes > > the transaction to use two-phase commit. May raise > > NotSupportedError if two-phase commit is not supported. > > DBAPI always had implicit transaction begin (for backends supporting > transactions) and adding an explicit begin() method would just add > confusion onto the user. "Should I always call begin()? Or just when I > want to start a two-phase?". I'd better like the two-phase begin method > named otherwise. Let's call it xa_begin() in this discussion. I don't have a strong opinion here. I used begin() in the proposal because the method is currently available in many adapters to explicitly start a transaction (even though they'll implicitly start a transaction otherwise). Extending the method seemed easier, and is what cx_Oracle did. > > 2. Add a Connection.prepare() method that peforms the first stage of > > two-phase commit. 
May raise NotSupportedError if two-phase commit > > is not supported, or the transaction was not started in two-phase > > mode. > > > Ok. (Should be named accordingly with the begin method.) I used prepare() in the proposal because that's what cx_Oracle and kinterbasdb are already doing. > > 3. Calling commit() or rollback() on the connection after prepare() > > performs the second stage of the commit. > > > Ok. > > > 4. Calling commit() or rollback() on the connection prior to > > prepare() performs a one-phase commit or rollback. > > > IMHO, it should raise an error if the transaction was started for > two-phase. Otherwise I don't see any reason for (1). I disagree here. If a problem is detected early in the transaction, calling prepare() before rollback() on the other members of the global transaction is a waste of effort. As for commit(), the transaction manager can use one-phase commit for the last resource without integrity problems. I don't see much value in preventing this optimisation. > > 5. Executing statements after prepare() but before commit() or > > rollback() results in an error (ProgrammingError?) > > > Ok. > > > 6. Closing a connection with a prepared but uncommitted transaction > > rolls back that transaction. > > > Stuart's comment on psycopg ML made me think about this one. Maybe we > want an option added to xa_begin() to keep the prepared transaction open > even if the connection drops. Perhaps. I haven't really thought much about the recovery side of the API. James. From fog at initd.org Mon Jan 21 12:16:17 2008 From: fog at initd.org (Federico Di Gregorio) Date: Mon, 21 Jan 2008 12:16:17 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) 
In-Reply-To:
References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio>
Message-ID: <1200914177.4685.43.camel@mila.office.dinunzioedigregorio>

On Mon, 21/01/2008 at 20.09 +0900, James Henstridge wrote:
> > IMHO, it should raise an error if the transaction was started for
> > two-phase. Otherwise I don't see any reason for (1).
>
> I disagree here. If a problem is detected early in the transaction,
> calling prepare() before rollback() on the other members of the global
> transaction is a waste of effort.
>
> As for commit(), the transaction manager can use one-phase commit for
> the last resource without integrity problems. I don't see much value
> in preventing this optimisation.

I agree on rollback(), not on commit(). If the transaction manager wants
to use one-phase it should do that explicitly. Allowing commit to be
called on a two-phase transaction without first preparing it is
error-prone and can lead to subtle bugs, such as code depending on it
creating a "standard" transaction on some backends and not on others.

federico

--
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog at debian.org
INIT.D Developer                                           fog at initd.org
 When people say things are a lot more complicated than that, they means
  they're getting worried that they won't like the truth.
                                                     -- Granny Weatherwax

From james at jamesh.id.au Mon Jan 21 12:31:38 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 20:31:38 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <47947AE9.7010101@egenix.com> References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <47947AE9.7010101@egenix.com> Message-ID: On 21/01/2008, M.-A. Lemburg wrote: > On 2008-01-21 11:28, Federico Di Gregorio wrote: > > I agree with your analisys, I'll add some comments about the proposal > > below. > > > > Il giorno lun, 21/01/2008 alle 19.08 +0900, James Henstridge ha scritto: > >> 1. Add a Connection.begin(...) method that explicitly starts a > >> transaction. Some argument (possibly the transaction ID) causes > >> the transaction to use two-phase commit. May raise > >> NotSupportedError if two-phase commit is not supported. > > > > DBAPI always had implicit transaction begin (for backends supporting > > transactions) and adding an explicit begin() method would just add > > confusion onto the user. "Should I always call begin()? Or just when I > > want to start a two-phase?". I'd better like the two-phase begin method > > named otherwise. Let's call it xa_begin() in this discussion. > > Agreed. > > I also think that we should prepend all of these methods with > "xa_" or something similar: database backends may need to be to > differentiate whether the user wants to e.g. commit in the context > of a two-phase commit transaction or a regular one and the two-phase > commit is also likely going to require an argument (the transaction id). > > Using a different set of methods would also make it clear to > the reader of the code, that a two-phase commit transaction is > happening (which does work a lot different from a one-phase one). I'm indifferent about this. I don't think using the same commit/rollback methods presents much confusion. > >> 2. Add a Connection.prepare() method that peforms the first stage of > >> two-phase commit. May raise NotSupportedError if two-phase commit > >> is not supported, or the transaction was not started in two-phase > >> mode. > >> > > Ok. (Should be named accordingly with the begin method.) 
> > xa_prepare(xid) In what cases would you pass a different xid to xa_prepare() vs. what was passed to xa_begin()? If not, then I'd leave the argument out: I've already told the connection what the transaction ID is once already. > >> 3. Calling commit() or rollback() on the connection after prepare() > >> performs the second stage of the commit. > >> > > Ok. > > xa_commit(xid) and .xa_rollback(xid) Having these arguments would be quite useful for the recovery use-case. I think it'd be useful to be able to use the methods without an argument to operate on the current transaction too though. > >> 4. Calling commit() or rollback() on the connection prior to > >> prepare() performs a one-phase commit or rollback. > >> > > IMHO, it should raise an error if the transaction was started for > > two-phase. Otherwise I don't see any reason for (1). > > Agreed. They should raise an error. > > In fact, when operating in two-phase commit mode, I think > using the one-phase methods .commit() and .rollback() should > raise an error. Mixing the two is normally not a good idea and > may very well result in an undefined state. If we have separate rollback vs. xa_rollback, then sure. But some rollback method should be allowed before preparing the transaction. The same goes for committing. > >> 5. Executing statements after prepare() but before commit() or > >> rollback() results in an error (ProgrammingError?) > >> > > Ok. > > Agreed. > > >> 6. Closing a connection with a prepared but uncommitted transaction > >> rolls back that transaction. > >> > > Stuart's comment on psycopg ML made me think about this one. Maybe we > > want an option added to xa_begin() to keep the prepared transaction open > > even if the connection drops. > > A connection drop should always trigger an implicit rollback on the > server side, so I'm not sure how and where you'd keep the required > state to continue processing the transaction in case the connection > is reestablished. 
Uncommitted prepared transactions survive the connection in PostgreSQL
and can be committed from another connection. Many 2PC-supporting
databases provide some way of listing existing transactions (e.g.
MySQL's "XA RECOVER" statement), so I doubt PostgreSQL is unique here.

At a minimum it'd be helpful to emit a warning in this case.

James.

From stuart at stuartbishop.net Mon Jan 21 12:31:48 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Mon, 21 Jan 2008 18:31:48 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <47947AE9.7010101@egenix.com>
References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <47947AE9.7010101@egenix.com>
Message-ID: <479482A4.7000504@stuartbishop.net>

M.-A. Lemburg wrote:

> A connection drop should always trigger an implicit rollback on the
> server side, so I'm not sure how and where you'd keep the required
> state to continue processing the transaction in case the connection
> is reestablished.

With PostgreSQL, when you PREPARE TRANSACTION all state is flushed to disk.
If your network drops before you can commit, or even if your server catches
fire, you can still reconnect later and commit the transaction (provided your
disks survive).

As an example, let's say you are dealing with three data stores and an
exception is raised in the second phase whilst committing the 2nd data store.

If the transaction on the 3rd data store is rolled back, then you can only
recover by somehow rolling back the transaction on the 1st and maybe 2nd
data store. Given this is probably a multi-user environment, this may well
involve data loss.

If the transaction on the 3rd data store is not rolled back, then you can
recover if the problem was transient by simply retrying the outstanding
commits once the network glitch or whatever has been fixed. All you need are
the transaction ids you used (which is why meaningful transaction ids can
make your life easier at 2am).
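
The retry scenario above can be sketched with a toy in-memory model. It
is only an illustration under stated assumptions: ToyServer and
commit_all are hypothetical names, no real driver is involved, and the
"disk" is just a dictionary keyed by transaction id that outlives a
dropped connection.

```python
# Toy model only: ToyServer and commit_all are hypothetical and merely
# imitate the behaviour of prepared transactions described in this thread.

class ToyServer:
    """Stands in for one data store that keeps prepared transactions."""

    def __init__(self):
        self.prepared = {}    # xid -> pending changes; survives disconnects
        self.committed = []   # changes that have been made permanent
        self.online = True

    def prepare(self, xid, changes):
        # First phase: persist the pending changes under the xid.
        self.prepared[xid] = changes

    def commit_prepared(self, xid):
        # Second phase: fails if the server cannot be reached.
        if not self.online:
            raise ConnectionError("server unreachable")
        self.committed.append(self.prepared.pop(xid))


def commit_all(servers, xid):
    """Try the second phase everywhere; return the servers still pending."""
    pending = []
    for server in servers:
        if xid not in server.prepared:
            continue  # already committed on an earlier attempt
        try:
            server.commit_prepared(xid)
        except ConnectionError:
            pending.append(server)
    return pending


stores = [ToyServer(), ToyServer(), ToyServer()]
for i, store in enumerate(stores):
    store.prepare("order-42", "change %d" % i)

stores[1].online = False                   # the 2nd store fails mid-commit
pending = commit_all(stores, "order-42")   # 1st and 3rd commit, 2nd stays prepared

stores[1].online = True                    # glitch fixed
pending = commit_all(pending, "order-42")  # retry using only the xid
```

Because nothing was rolled back when the second store failed, the retry
needs only the transaction id to finish the job.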
--
Stuart Bishop
http://www.stuartbishop.net/

From james at jamesh.id.au Mon Jan 21 12:35:43 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 20:35:43 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <1200914177.4685.43.camel@mila.office.dinunzioedigregorio>
References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <1200914177.4685.43.camel@mila.office.dinunzioedigregorio>
Message-ID:

On 21/01/2008, Federico Di Gregorio wrote:
>
> On Mon, 21/01/2008 at 20.09 +0900, James Henstridge wrote:
> > > IMHO, it should raise an error if the transaction was started for
> > > two-phase. Otherwise I don't see any reason for (1).
> >
> > I disagree here. If a problem is detected early in the transaction,
> > calling prepare() before rollback() on the other members of the global
> > transaction is a waste of effort.
> >
> > As for commit(), the transaction manager can use one-phase commit for
> > the last resource without integrity problems. I don't see much value
> > in preventing this optimisation.
>
> I agree on rollback(), not on commit(). If the transaction manager wants
> to use one-phase it should do that explicitly. Allowing to call commit
> on a two-phase transaction without first preparing it is prone to errors
> and can lead to subtle errors like depending on it creating a "standard"
> transaction on some backends and not on others.

MySQL appears to have a special API for performing a one-phase commit
of an XA transaction:

    XA COMMIT xid ONE PHASE

Perhaps an argument to xa_commit() would be appropriate here?

    connection.xa_commit(onephase=True)

Without the argument, the commit would be treated as a
ProgrammingError. That would reduce the chance of programmer error
leading to data corruption.

James.

From fog at initd.org Mon Jan 21 12:53:05 2008
From: fog at initd.org (Federico Di Gregorio)
Date: Mon, 21 Jan 2008 12:53:05 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To:
References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <1200914177.4685.43.camel@mila.office.dinunzioedigregorio>
Message-ID: <1200916385.4685.48.camel@mila.office.dinunzioedigregorio>

On Mon, 21/01/2008 at 20.35 +0900, James Henstridge wrote:
> MySQL appears to have a special API for performing a one-phase commit
> of an XA transaction:
>
>     XA COMMIT xid ONE PHASE
>
> Perhaps an argument to xa_commit() would be appropriate here?
>
>     connection.xa_commit(onephase=True)
>
> Without the argument, the commit would be considered to be a
> ProgrammingError. That would reduce the chance of programmer error
> leading to data corruption.

Let's not make an API that has features useful on a single backend. I
suppose the necessity for a one-phase commit in a two-phase transaction
is rare. A simple API means early adoption by most of the adapters.

federico

--
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog at debian.org
INIT.D Developer                                           fog at initd.org
 We should forget about small efficiencies, say about 97% of the
  time: premature optimization is the root of all evil.    -- D.E.Knuth

From mal at egenix.com Mon Jan 21 12:57:22 2008
From: mal at egenix.com (M.-A.
Lemburg) Date: Mon, 21 Jan 2008 12:57:22 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <479482A4.7000504@stuartbishop.net> References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <47947AE9.7010101@egenix.com> <479482A4.7000504@stuartbishop.net> Message-ID: <479488A2.5000603@egenix.com> On 2008-01-21 12:31, Stuart Bishop wrote: > M.-A. Lemburg wrote: > >> A connection drop should always trigger an implicit rollback on the >> server side, so I'm not sure how and where you'd keep the required >> state to continue processing the transaction in case the connection >> is reestablished. > > With PostgreSQL, when you PREPARE TRANSACTION all state is flushed to disk. > If your network drops before you can commit or even if your server catches > fire you can still reconnect later and commit the transaction (provided your > disks survive). > > As an example, lets say you are dealing with three data stores and an > exception is raised in the second phase whilst committing the 2nd data store. > > If the transaction on the 3rd data store is rolled back then you can only > recover by somehow rolling back the transaction on the 1st and maybe 2nd > data store. Given this is probably a multi user environment this may well > involve data loss. > > If the transaction on the 3rd data store is not rolled back, then you can > recover if the problem was transient by simply retrying the outstanding > commits once the network glitch or whatever has been fixed. All you need are > the transaction ids you used (and why meaningful transaction ids can make > your life easier at 2am). Thanks for the explanations. I was actually thinking of the connection between the TM and the RM (the database backend). The typical behavior of a TM is to cancel the ongoing two-phase commit transaction if an RM becomes unavailable. However, I can see your point. 
If the data stays on the
database server and can be addressed via the XID, then a
dropped connection wouldn't hurt all that much.

Then again: how do you tell the database to forget about
the data stored for an XID?

XA has an xa_forget() API for this, but I'm not sure whether
this is expected to also work across TM-RM reconnects or
whether the TM is actually expected to retry the reconnect
at all.

In MQ Series apps, the typical behavior would be to put the
data back on the queue and retry the whole transaction at some
later point.

--
Marc-Andre Lemburg
eGenix.com

From james at jamesh.id.au Mon Jan 21 13:10:07 2008
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 21 Jan 2008 21:10:07 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <479488A2.5000603@egenix.com>
References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <47947AE9.7010101@egenix.com> <479482A4.7000504@stuartbishop.net> <479488A2.5000603@egenix.com>
Message-ID:

On 21/01/2008, M.-A. Lemburg wrote:
> Thanks for the explanations. I was actually thinking of the
> connection between the TM and the RM (the database backend).
> The typical behavior of a TM is to cancel the ongoing
> two-phase commit transaction if an RM becomes unavailable.

Stuart's use case is when the RM dies during the second phase of the commit.
It is an edge case, but then 2PC is all about edge cases :) If it happens before that point, then rolling back is appropriate. > However, I can see your point. If the data stays on the > database server and can be addressed via the XID, then a > dropped connection wouldn't hurt all that much. > Then again: how do you tell the database to forget about > the data stored for an XID ? You ask for the transaction to be rolled back (e.g. "ROLLBACK PREPARED xid" in PostgreSQL, and "XA ROLLBACK xid" for MySQL). James. From james at jamesh.id.au Mon Jan 21 13:12:17 2008 From: james at jamesh.id.au (James Henstridge) Date: Mon, 21 Jan 2008 21:12:17 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <1200916385.4685.48.camel@mila.office.dinunzioedigregorio> References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <1200914177.4685.43.camel@mila.office.dinunzioedigregorio> <1200916385.4685.48.camel@mila.office.dinunzioedigregorio> Message-ID: On 21/01/2008, Federico Di Gregorio wrote: > > Perhaps an argument to xa_commit() would be appropriate here? > > > > connection.xa_commit(onephase=True) > > > > Without the argument, the commit would be considered to be a > > ProgrammingError. That would reduce the chance of programmer error > > leading to data corruption. > > Lets not make an API that has features useful on a single backend. I > suppose the necessity for a one-phase commit in a two-phase transaction > is rare. A simple API means early adoption by most of the adapters. Well, Postgres lets you commit a 2PC transaction before preparing it too (after all, it doesn't know you are using 2PC until you prepare). Judging by the kinterbasdb and cx_Oracle code, they can do so as well. This isn't just a "single backend" feature. James. From mal at egenix.com Mon Jan 21 13:19:25 2008 From: mal at egenix.com (M.-A. 
Lemburg)
Date: Mon, 21 Jan 2008 13:19:25 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To:
References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <47947AE9.7010101@egenix.com> <479482A4.7000504@stuartbishop.net> <479488A2.5000603@egenix.com>
Message-ID: <47948DCD.6000900@egenix.com>

On 2008-01-21 13:10, James Henstridge wrote:
> On 21/01/2008, M.-A. Lemburg wrote:
>> Thanks for the explanations. I was actually thinking of the
>> connection between the TM and the RM (the database backend).
>> The typical behavior of a TM is to cancel the ongoing
>> two-phase commit transaction if an RM becomes unavailable.
>
> Stuart's use case is if the RM dies during the second phase of the
> commit. It is an edge case, but then 2PC is all about edge cases :)
>
> If it happens before that point, then rolling back is appropriate.
>
>
>> However, I can see your point. If the data stays on the
>> database server and can be addressed via the XID, then a
>> dropped connection wouldn't hurt all that much.
>> Then again: how do you tell the database to forget about
>> the data stored for an XID ?
>
> You ask for the transaction to be rolled back (e.g. "ROLLBACK PREPARED
> xid" in PostgreSQL, and "XA ROLLBACK xid" for MySQL).

Sorry, I wasn't clear enough:

If a connection fails and the transaction XID persists, how do you:

 * identify which XIDs are still pending (xa_recover)

 * tell the RM to drop all resources associated with an XID
   (xa_forget)

once the TM has reconnected. These APIs appear to be needed
in order for the TM to be able to clean up the RM after e.g.
a lost connection.

OTOH, perhaps just doing a rollback with the known XID and
ignoring any errors would do the same without the need for
extra APIs.

--
Marc-Andre Lemburg
eGenix.com
http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From mal at egenix.com Mon Jan 21 13:23:17 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 21 Jan 2008 13:23:17 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <1200914177.4685.43.camel@mila.office.dinunzioedigregorio> <1200916385.4685.48.camel@mila.office.dinunzioedigregorio> Message-ID: <47948EB5.108@egenix.com> On 2008-01-21 13:12, James Henstridge wrote: > On 21/01/2008, Federico Di Gregorio wrote: >>> Perhaps an argument to xa_commit() would be appropriate here? >>> >>> connection.xa_commit(onephase=True) >>> >>> Without the argument, the commit would be considered to be a >>> ProgrammingError. That would reduce the chance of programmer error >>> leading to data corruption. >> Lets not make an API that has features useful on a single backend. I >> suppose the necessity for a one-phase commit in a two-phase transaction >> is rare. A simple API means early adoption by most of the adapters. > > Well, Postgres lets you commit a 2PC transaction before preparing it > too (after all, it doesn't know you are using 2PC until you prepare). > > Judging by the kinterbasdb and cx_Oracle code, they can do so as well. > This isn't just a "single backend" feature. Mixing one-phase and two-phase commits sounds like mixing two concepts that don't belong together, IMHO. 
It would be too easy for an application to issue a .commit() somewhere and thereby break the whole two phase commit idea. I'd rather like to see the two concepts well separated and exceptions raised if you try to mix them. After all, you could still open a second connection if you need one phase transactions for some other purpose. -- Marc-Andre Lemburg eGenix.com From james at jamesh.id.au Mon Jan 21 13:36:25 2008 From: james at jamesh.id.au (James Henstridge) Date: Mon, 21 Jan 2008 21:36:25 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <47948DCD.6000900@egenix.com> References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <47947AE9.7010101@egenix.com> <479482A4.7000504@stuartbishop.net> <479488A2.5000603@egenix.com> <47948DCD.6000900@egenix.com> Message-ID: On 21/01/2008, M.-A. Lemburg wrote: > Sorry, I wasn't clear enough: > > If a connection fails and the transaction XID persists, how do you: > > * identify which XIDs are still pending (xa_recover) > > * tell the RM to drop all resources associated with an XID > (xa_forget) > > once the TM has reconnected. These APIs appear to be needed > in order for the TM to be able to clean up the RM after e.g. > a lost connection.
> > OTOH, perhaps just doing a rollback with the known XID and > ignoring any errors would do the same without the need for > extra APIs. There is nothing in the proposal I sent about recovery as I considered it out of scope for the initial API. Given the interest, it is probably worth adding. Finding out about outstanding transactions could be done with a Connection.xa_recover() method that returns a list of transaction IDs. In PostgreSQL this can be implemented with "SELECT gid FROM pg_prepared_xacts". For MySQL it can be implemented with "XA RECOVER". I don't know about others. For the xa_forget() call, does it differ from rolling back a prepared transaction? James. From stuart at stuartbishop.net Mon Jan 21 14:00:06 2008 From: stuart at stuartbishop.net (Stuart Bishop) Date: Mon, 21 Jan 2008 20:00:06 +0700 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <47948EB5.108@egenix.com> References: <1200911335.4685.24.camel@mila.office.dinunzioedigregorio> <1200914177.4685.43.camel@mila.office.dinunzioedigregorio> <1200916385.4685.48.camel@mila.office.dinunzioedigregorio> <47948EB5.108@egenix.com> Message-ID: <47949756.8010400@stuartbishop.net> M.-A. Lemburg wrote: > Mixing one-phase and two-phase commits sounds like mixing two > concepts that don't belong together, IMHO. > > It would be too easy for an application to issue a .commit() > somewhere and thereby break the whole two phase commit > idea. I'm not sure this is worth worrying about - applications can screw things up right now by issuing COMMITs or ROLLBACKs when they shouldn't. > I'd rather like to see the two concepts well separated and > exceptions raised if you try to mix them. > > After all, you could still open a second connection if you > need one phase transactions for some other purpose. At the start of a transaction, you might not know that only one of your data stores is going to be modified.
Two phase commit imposes an overhead which can be avoided if only one of your data stores turns out to need changes. I believe this is why in PostgreSQL you declare you are using 2PC at the end of your transaction and why MySQL offers you the XA COMMIT xid ONE PHASE option. -- Stuart Bishop http://www.stuartbishop.net/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/db-sig/attachments/20080121/625ae5f0/attachment.pgp From james at jamesh.id.au Mon Jan 21 15:40:23 2008 From: james at jamesh.id.au (James Henstridge) Date: Mon, 21 Jan 2008 23:40:23 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: Message-ID: On 21/01/2008, James Henstridge wrote: > On 18/01/2008, James Henstridge wrote: > > So is there any recommendations for what a two-phase commit API should > > look like? > > I did a bit of investigation into a few databases, and came up with a > proposal for an extension to the DB-API. Here is an updated version of the proposal. It removes the analysis of the different databases, and updates the proposed API to match what we've been discussing here. I've added a section about what the "xid" arguments to the various methods should look like. That could probably do with some more discussion as I am not too sure about it. I've also included support for transaction recovery in the form of an xa_recover() method and calling the xa_commit()/xa_rollback() methods with a transaction ID as an argument. James. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: two-phase-commit-v2.txt Url: http://mail.python.org/pipermail/db-sig/attachments/20080121/82a9fb5f/attachment-0001.txt From dieter at handshake.de Mon Jan 21 18:46:54 2008 From: dieter at handshake.de (Dieter Maurer) Date: Mon, 21 Jan 2008 18:46:54 +0100 Subject: [DB-SIG] Any standard for two phase commit APIs? In-Reply-To: References: <479071E5.7000000@egenix.com> <47908237.6010001@taupro.com> <47908D97.1080304@egenix.com> <18320.65237.930087.237768@gargle.gargle.HOWL> Message-ID: <18324.55950.269176.896724@gargle.gargle.HOWL> James Henstridge wrote at 2008-1-21 14:00 +0900: >On 19/01/2008, Dieter Maurer wrote: >> It does this indeed. >> >> And it assumes that a resource manager accepts a vote only >> when it can guarantee that the subsequent "commit" will succeed (and >> does not fail). >> >> A resource manager needs to expose both a "vote" (with the above guarantee) >> and a "commit" in order to be a first class participant of >> Zope's transaction system. >> >> Relational database interfaces often lack the equivalent of a "vote". > >I'd disagree with this description. From the Zope transaction >documentation, the order of methods is: > > tpc_begin commit tpc_vote (tpc_finish | tpc_abort) > >From the descriptions of the various methods, a database adapter >supporting 2PC would prepare the transaction at commit(), and commit >or roll back that transaction in tpc_finish or tpc_abort respectively. > >After preparing the transaction, the transaction should be committable >under normal circumstances, so it would have no reason to vote no as >part of tpc_vote(). > >I disagree that the lack of a tpc_vote() method makes the database >adapter a second class citizen: it simply reflects the fact that the >adapter makes up its mind at the commit() stage independent of what >other data managers do. I agree with you.
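The mapping James describes in the quoted text (prepare at commit(), nothing left to decide at tpc_vote(), final commit or rollback in tpc_finish()/tpc_abort()) can be sketched roughly as below. This is only an illustration: RecordingConnection and TwoPhaseDataManager are made-up names, the xa_* calls are borrowed from the proposal under discussion rather than from any existing adapter, and opening the 2PC transaction in the constructor is an assumption.

```python
class RecordingConnection:
    """Hypothetical connection that only records the 2PC calls it receives."""

    def __init__(self):
        self.calls = []

    def xa_begin(self, xid):
        self.calls.append(("xa_begin", xid))

    def xa_prepare(self):
        self.calls.append(("xa_prepare",))

    def xa_commit(self):
        self.calls.append(("xa_commit",))

    def xa_rollback(self):
        self.calls.append(("xa_rollback",))


class TwoPhaseDataManager:
    """Toy data manager following the method order quoted above."""

    def __init__(self, conn, xid):
        self.conn = conn
        # Assumption: the 2PC transaction is opened before any work is done.
        conn.xa_begin(xid)

    def tpc_begin(self, transaction):
        pass  # nothing to do, the database transaction is already open

    def commit(self, transaction):
        # First phase: after this the database has promised it can commit.
        self.conn.xa_prepare()

    def tpc_vote(self, transaction):
        pass  # the adapter made up its mind at commit(), so it never votes no

    def tpc_finish(self, transaction):
        self.conn.xa_commit()    # second phase

    def tpc_abort(self, transaction):
        self.conn.xa_rollback()


conn = RecordingConnection()
manager = TwoPhaseDataManager(conn, "tx-1")
for step in (manager.tpc_begin, manager.commit, manager.tpc_vote,
             manager.tpc_finish):
    step(None)  # a real transaction object would be passed here
```

After the loop, conn.calls holds the begin, prepare and commit calls in that order, which is the sequence a 2PC-capable backend would see.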
The distinction between "commit" and "vote" is probably only for historical reasons: Formerly, "objects" registered with the transaction, not resource managers. In the "commit", the registered objects were individually committed, then the "resource managers" were asked for their vote. Nowadays, resource managers register with the transaction and we can freely move functions between "commit" and "vote". What should be clear: despite its name, "commit" is not a true commit, and neither is "commit" followed by "vote". Both together need to prepare the commit, which must eventually succeed when "finish" is called. -- Dieter From dieter at handshake.de Mon Jan 21 19:36:29 2008 From: dieter at handshake.de (Dieter Maurer) Date: Mon, 21 Jan 2008 19:36:29 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: Message-ID: <18324.58925.102352.666954@gargle.gargle.HOWL> James Henstridge wrote at 2008-1-21 23:40 +0900: > ... >= DB-API Two-Phase Commit = > >Many databases have support for two-phase commit. Adapters for some >of these databases expose this support, but often through mutually >incompatible extensions to the DB-API standard. > >Standardising the API for two-phase commit would make it easier for >applications and libraries to support two-phase commit with multiple >databases. > > >== Connection Methods == > >A database adapter that supports two phase commit (2PC) shall provide >the following additional methods on its connection object: > > .xa_begin(xid) > > Begins a 2PC transaction with the given ID. This method > should be called outside of a transaction (i.e. nothing may > have executed since the last .commit() or .rollback()). > > Furthermore, it is an error to call .commit() or .rollback() > within the 2PC transaction (what error?). > > If the database does not support 2PC, a NotSupportedError will > be raised.
> > .xa_prepare() > > Performs the first phase of a transaction started with > xa_begin(). It is an error to call this method outside of a > 2PC transaction. > > After calling xa_prepare(), no statements can be executed > until xa_commit() or xa_rollback() have been called. > > .xa_commit(xid=None, onephase=False) > > When called with no arguments, xa_commit() commits a 2PC > transaction previously prepared with xa_prepare(). > > When called as xa_commit(onephase=True), it may be used to > commit the transaction prior to calling xa_prepare(). This > may occur if only a single resource ends up participating in > the global transaction. > > When called as xa_commit(xid), it commits the given > transaction. If an invalid transaction ID is provided, a > DatabaseError will be raised. This form should be called > outside of a transaction, and is intended for use in recovery. > > On return, the 2PC transaction is ended. > > .xa_rollback(xid=None) > > When called with no arguments, xa_rollback() rolls back a 2PC > transaction. It may be called before or after xa_prepare(). > > When called as xa_rollback(xid), it rolls back the given > transaction. If an invalid transaction ID is provided, a > DatabaseError will be raised. This form should be called > outside of a transaction, and is intended for use in recovery. > > On return, the 2PC transaction is ended. > > .xa_recover() > > Returns a list of pending transaction IDs suitable for use > with xa_commit(xid) or xa_rollback(xid). > > If the database does not support transaction recovery, it may > return an empty list or raise NotSupportedError. I would prefer if * "xa_begin" were optional: the current DB API performs an automatic "begin" when there is a need for it. * the transaction id be chosen automatically, optionally guided by "Connection" configuration (to obtain "readable" transaction ids) * the use of "prepare_transaction" triggers a two phase commit -- otherwise a one phase commit is used.
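To make the call sequence in the quoted proposal concrete, here is a small in-memory sketch of the state machine it describes. It is not a real adapter: ToyServer, ToyConnection and execute() are invented for illustration, the two exception classes only stand in for the DB-API ones, and raising ProgrammingError for a commit before prepare without onephase=True follows the suggestion made earlier in the thread rather than anything settled.

```python
class ProgrammingError(Exception):
    """Stand-in for the DB-API exception of the same name."""


class DatabaseError(Exception):
    """Stand-in for the DB-API exception of the same name."""


class ToyServer:
    """Hypothetical backend: prepared work lives here, not on the connection."""

    def __init__(self):
        self.prepared = {}   # xid -> writes held after xa_prepare()
        self.committed = []  # writes that have been made durable


class ToyConnection:
    """Hypothetical connection following the method list quoted above."""

    def __init__(self, server):
        self.server = server
        self._reset()

    def _reset(self):
        self.xid = None          # xid of the open 2PC transaction, if any
        self.is_prepared = False
        self.pending = []        # writes not yet prepared

    def execute(self, write):
        if self.is_prepared:
            raise ProgrammingError("no statements between prepare and commit")
        self.pending.append(write)

    def xa_begin(self, xid):
        if self.xid is not None or self.pending:
            raise ProgrammingError("xa_begin() must come first")
        self.xid = xid

    def xa_prepare(self):
        if self.xid is None or self.is_prepared:
            raise ProgrammingError("xa_prepare() not valid here")
        self.server.prepared[self.xid] = self.pending
        self.pending = []
        self.is_prepared = True

    def xa_commit(self, xid=None, onephase=False):
        if xid is not None:  # recovery form: commit by ID
            if xid not in self.server.prepared:
                raise DatabaseError("unknown transaction ID")
            self.server.committed.extend(self.server.prepared.pop(xid))
            return
        if self.xid is None:
            raise ProgrammingError("xa_commit() outside a 2PC transaction")
        if self.is_prepared:
            self.server.committed.extend(self.server.prepared.pop(self.xid))
        elif onephase:
            self.server.committed.extend(self.pending)
        else:
            raise ProgrammingError("commit before prepare needs onephase=True")
        self._reset()

    def xa_rollback(self, xid=None):
        if xid is not None:  # recovery form: roll back by ID
            if xid not in self.server.prepared:
                raise DatabaseError("unknown transaction ID")
            del self.server.prepared[xid]
            return
        if self.xid is None:
            raise ProgrammingError("xa_rollback() outside a 2PC transaction")
        self.server.prepared.pop(self.xid, None)
        self._reset()

    def xa_recover(self):
        return list(self.server.prepared)


# A prepared transaction outlives the connection that created it.
server = ToyServer()
first = ToyConnection(server)
first.xa_begin("tx-1")
first.execute("UPDATE accounts ...")
first.xa_prepare()

second = ToyConnection(server)  # e.g. after a reconnect
for xid in second.xa_recover():
    second.xa_commit(xid)
```

The point of keeping the prepared writes on the server object is the recovery case raised earlier in the thread: a second connection can list and finish transactions that the first one prepared.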
-- Dieter From james at jamesh.id.au Tue Jan 22 00:17:18 2008 From: james at jamesh.id.au (James Henstridge) Date: Tue, 22 Jan 2008 08:17:18 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <18324.58925.102352.666954@gargle.gargle.HOWL> References: <18324.58925.102352.666954@gargle.gargle.HOWL> Message-ID: On 22/01/2008, Dieter Maurer wrote: > James Henstridge wrote at 2008-1-21 23:40 +0900: > > ... > >= DB-API Two-Phase Commit = > > > >Many databases have support for two-phase commit. Adapters for some > >of these databases expose this support, but often through mutually > >incompatible extensions to the DB-API standard. > > > >Standardising the API for two-phase commit would make it easier for > >applications and libraries to support two-phase commit with multiple > >databases. > > > > > >== Connection Methods == > > > >A database adapter that supports two phase commit (2PC) shall provide > >the following additional methods on its connection object: > > > > .xa_begin(xid) > > > > Begins a 2PC transaction with the given ID. This method > > should be called outside of a transaction (i.e. nothing may > > have executed since the last .commit() or .rollback()). > > > > Furthermore, it is an error to call .commit() or .rollback() > > within the 2PC transaction (what error?). > > > > If the database does not support 2PC, a NotSupportedError will > > be raised. > > > > .xa_prepare() > > > > Performs the first phase of a transaction started with > > xa_begin(). It is an error to call this method outside of a > > 2PC transaction. > > > > After calling xa_prepare(), no statements can be executed > > until xa_commit() or xa_rollback() have been called. > > > > .xa_commit(xid=None, onephase=False) > > > > When called with no arguments, xa_commit() commits a 2PC > > transaction previously prepared with xa_prepare(). 
> > > > When called as xa_commit(onephase=True), it may be used to > > commit the transaction prior to calling xa_prepare(). This > > may occur if only a single resource ends up participating in > > the global transaction. > > > > When called as xa_commit(xid), it commits the given > > transaction. If an invalid transaction ID is provided, a > > DatabaseError will be raised. This form should be called > > outside of a transaction, and is intended for use in recovery. > > > > On return, the 2PC transaction is ended. > > > > .xa_rollback(xid=None) > > > > When called with no arguments, xa_rollback() rolls back a 2PC > > transaction. It may be called before or after xa_prepare(). > > > > When called as xa_rollback(xid), it rolls back the given > > transaction. If an invalid transaction ID is provided, a > > DatabaseError will be raised. This form should be called > > outside of a transaction, and is intended for use in recovery. > > > > On return, the 2PC transaction is ended. > > > > .xa_recover() > > > > Returns a list of pending transaction IDs suitable for use > > with xa_commit(xid) or xa_rollback(xid). > > > > If the database does not support transaction recovery, it may > > return an empty list or raise NotSupportedError. > > I would prefer if > > * "xa_begin" were optional: > > the current DB API performs an automatic "begin" when there is a > need for it. Please see the notes I wrote about the requirements for 2PC in various databases. For some of them, there is a different set of commands to start a normal transaction and a 2PC transaction. Going with implicit begin would lead to ambiguity about what sort of transaction to start. > * the transaction id be chosen automatically, optionally guided > by "Connection" configuration (to obtain "readable" transaction ids) Note that transaction IDs are per-transaction rather than per-connection, and usually assigned by the transaction manager (so that there is a common portion for the IDs of all participating resources).
I don't think a connection-time setting will cut it. > * the use of "prepare_transaction" triggers a two phase commit -- > otherwise a one phase commit is used. As mentioned earlier, some of the databases want to know that 2PC is a possibility at the start of the transaction. As setting up a 2PC transaction is more expensive, you probably don't want to enable them in cases where they won't be used. Deferring the decision until the prepare() stage essentially forces the application to pay the price with some databases. James. From stuart at stuartbishop.net Tue Jan 22 09:48:50 2008 From: stuart at stuartbishop.net (Stuart Bishop) Date: Tue, 22 Jan 2008 15:48:50 +0700 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: Message-ID: <4795ADF2.1040109@stuartbishop.net> James Henstridge wrote: > Here is an updated version of the proposal. It removes the analysis > of the different databases, and updates the proposed API to match what > we've been discussing here. I'm happy with the design. I personally don't think we should use the xa prefix, as this will make people think that this is an XA interface when it isn't - an XA like API would be something built on top of this. I would think the following would be better names: con.begin_prepared(xid=None) con.prepare_transaction() con.rollback_prepared(xid=None) con.commit_prepared(xid=None) con.list_prepared() > I've added a section about what the "xid" arguments to the various > methods should look like. That could probably do with some more > discussion as I am not too sure about it. > > I've also included support for transaction recovery in the form of an > xa_recover() method and calling the xa_commit()/xa_rollback() methods > with a transaction ID as an argument. It seems that the formatID is unnecessary and just a requirement of the XA C interface. 
Also, the xid() method you propose should be camelcase to match the other type constructors, so Xid(gtrid, bqual=None) or TransactionId(gtrid, bqual=None). If the xa_recover/list_prepared method returns TransactionId objects they can contain platform specific information too, which is great (username, prepared timestamp & database for PostgreSQL for instance). -- Stuart Bishop http://www.stuartbishop.net/ From fog at initd.org Tue Jan 22 10:42:51 2008 From: fog at initd.org (Federico Di Gregorio) Date: Tue, 22 Jan 2008 10:42:51 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <4795ADF2.1040109@stuartbishop.net> References: <4795ADF2.1040109@stuartbishop.net> Message-ID: <1200994971.4091.11.camel@mila.office.dinunzioedigregorio> On Tue, 22/01/2008 at 15:48 +0700, Stuart Bishop wrote: > > I'm happy with the design. > I am very happy too, but.. > > I would think the following would be better names: > con.begin_prepared(xid=None) > con.prepare_transaction() > con.rollback_prepared(xid=None) > con.commit_prepared(xid=None) > con.list_prepared() I don't like "prepared" because it can be interpreted as "prepared transaction" or as "prepared statement". I'd like something that won't confuse users. I agree that xa is too specific. That leaves us with the long prepared_transaction or something generic like twophase (or 2pc or a tpc_ prefix, or?) begin_prepared_transaction() prepare_transaction() rollback_prepared_transaction() ... begin_2pc_transaction() prepare_2pc_transaction() rollback_2pc_transaction() ... begin_twophase() prepare_twophase() rollback_twophase() ...
tpc_begin() tpc_prepare() tpc_rollback() federico -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux Developer fog at debian.org INIT.D Developer fog at initd.org All programmers are optimists. -- Frederick P. Brooks, Jr. From mal at egenix.com Tue Jan 22 11:34:54 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 22 Jan 2008 11:34:54 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: Message-ID: <4795C6CE.8060301@egenix.com> On 2008-01-21 15:40, James Henstridge wrote: > On 21/01/2008, James Henstridge wrote: >> On 18/01/2008, James Henstridge wrote: >>> So is there any recommendations for what a two-phase commit API should >>> look like? >> I did a bit of investigation into a few databases, and came up with a >> proposal for an extension to the DB-API. > > Here is an updated version of the proposal. It removes the analysis > of the different databases, and updates the proposed API to match what > we've been discussing here. > > I've added a section about what the "xid" arguments to the various > methods should look like. That could probably do with some more > discussion as I am not too sure about it. > > I've also included support for transaction recovery in the form of an > xa_recover() method and calling the xa_commit()/xa_rollback() methods > with a transaction ID as an argument. Thanks. I like it a lot, except for making the XID an object - this always appears to be a string in all the backends you've checked and also in the XA standard, so I'd go for a simple string instead of an object (those are always lots of work to do at C level).
Regarding the "xa_" prefix, I'm not much attached to it, but since the interface does indeed look a lot like the XA interface, why not make that reference ? It also makes it clear, that the interface sits on top of the standard DB-API connection API and that those methods form a unit. Plus they are currently not in use by any DB-API module, so don't interfere with existing APIs. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 22 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From james at jamesh.id.au Tue Jan 22 12:33:39 2008 From: james at jamesh.id.au (James Henstridge) Date: Tue, 22 Jan 2008 20:33:39 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <4795C6CE.8060301@egenix.com> References: <4795C6CE.8060301@egenix.com> Message-ID: On 22/01/2008, M.-A. Lemburg wrote: > Thanks. I like it a lot, except for making the XID an object - this > always appears to be a string in all the backends you've checked and > also in the XA standard, so I'd go for a simple string instead of > an object (those are always lots of work to do at C level). In at least MySQL and Oracle, the transaction ID appears to be more than just a string: it is structured into three parts: * a format ID * a global transaction ID * a branch qualifier Stuart has made the argument that the format ID is not important for Python, and I tend to agree (or at least I don't know what situations you'd use it). 
I do see a use for the branch qualifier though. In a distributed transaction, each resource should have a different transaction ID that share a common global transaction ID but separate branch qualifiers. As transaction IDs are global within database clusters for some backends (PostgreSQL, MySQL and probably others), the branch qualifier is necessary if two databases from the cluster are used in the global transaction. I think it is worth making the API such that it is easy to program to best practices. > Regarding the "xa_" prefix, I'm not much attached to it, but since > the interface does indeed look a lot like the XA interface, why not > make that reference ? Stuart's argument is that if the API differs from XA then using the xa_* prefix could be problematic for adapters that want to expose the XA API. As I don't have any experience with using XA, I can't comment one way or the other about this. > It also makes it clear, that the interface > sits on top of the standard DB-API connection API and that those > methods form a unit. Having a common prefix seems sensible. If we don't use xa_*, Federico's suggestion of tpc_* might make sense. > Plus they are currently not in use by any DB-API module, so don't > interfere with existing APIs. So I guess it comes down to the following questions: 1. Are database adapters likely to want to expose more than what is covered by this proposal? 2. Would this proposed API conflict with those extensions? It isn't clear to me that people want to provide a larger API, since the few adapters that have added 2PC support have done so with APIs that are effectively a subset/simplification of this one. James. From james at jamesh.id.au Tue Jan 22 12:36:17 2008 From: james at jamesh.id.au (James Henstridge) Date: Tue, 22 Jan 2008 20:36:17 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) 
In-Reply-To: <4795ADF2.1040109@stuartbishop.net> References: <4795ADF2.1040109@stuartbishop.net> Message-ID: On 22/01/2008, Stuart Bishop wrote: > It seems that the formatID is unnecessary and just a requirement of the XA C > interface. Also, the xid() method you propose should be camelcase to match > the other type constructors, so Xid(gtrid, bqual=None) or > TransactionId(gtrid, bqual=None). If the xa_recover/list_prepared method > returns TransactionId objects they can contain platform specific information > too which is great (username, prepared timestamp & database for PostgreSQL > for instance). Well, the DB-API does not actually expose any classes other than the exceptions. The primary objects you work with are all created by factory functions/methods: * Connections from module.connect() * Cursors from connection.cursor() I was suggesting that transaction ID objects be created by either a module.xid() or connection.xid() factory function and not make the class object part of the API. James. From mal at egenix.com Tue Jan 22 12:56:20 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 22 Jan 2008 12:56:20 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: <4795C6CE.8060301@egenix.com> Message-ID: <4795D9E4.2030509@egenix.com> On 2008-01-22 12:33, James Henstridge wrote: > On 22/01/2008, M.-A. Lemburg wrote: >> Thanks. I like it a lot, except for making the XID an object - this >> always appears to be a string in all the backends you've checked and >> also in the XA standard, so I'd go for a simple string instead of >> an object (those are always lots of work to do at C level). 
> > In at least MySQL and Oracle, the transaction ID appears to be more > than just a string: it is structured into three parts: > * a format ID > * a global transaction ID > * a branch qualifier > > Stuart has made the argument that the format ID is not important for > Python, and I tend to agree (or at least I don't know what situations > you'd use it). The format id is only used to specify the format of the data structure in the XA xid_struct_t: >From http://www.opengroup.org/onlinepubs/009680699/toc.pdf: """ Although "xa.h" constrains the length and byte alignment of the data element within an XID, it does not specify the data's contents. The only requirement is that both gtrid and bqual, taken together, must be globally unique. The recommended way of achieving global uniqueness is to use the naming rules specified for OSI CCR atomic action identifiers (see the referenced OSI CCR specification). If OSI CCR naming is used, then the XID's formatID element should be set to 0; if some other format is used, then the formatID element should be greater than 0. A value of -1 in formatID means that the XID is null. The RM must be able to map the XID to the recoverable work it did for the corresponding branch. RMs may perform bitwise comparisons on the data components of an XID for the lengths specified in the XID structure. Most XA routines pass a pointer to the XID. These pointers are valid only for the duration of the call. If the RM needs to refer to the XID after it returns from the call, it must make a local copy before returning. 
/* * Transaction branch identification: XID and NULLXID: */ #define XIDDATASIZE 128 /* size in bytes */ #define MAXGTRIDSIZE 64 /* maximum size in bytes of gtrid */ #define MAXBQUALSIZE 64 /* maximum size in bytes of bqual */ struct xid_t { long formatID; /* format identifier */ long gtrid_length; /* value 1-64 */ long bqual_length; /* value 1-64 */ char data[XIDDATASIZE]; }; typedef struct xid_t XID; """ So, essentially, only the global transaction id and the branch id are relevant and both are represented in the data string. BTW, there's a nice extension module that let's you hook Python between the TM and RM using XA: http://www.hare.demon.co.uk/pyxasw/ > I do see a use for the branch qualifier though. In a distributed > transaction, each resource should have a different transaction ID that > share a common global transaction ID but separate branch qualifiers. > > As transaction IDs are global within database clusters for some > backends (PostgreSQL, MySQL and probably others), the branch qualifier > is necessary if two databases from the cluster are used in the global > transaction. > > I think it is worth making the API such that it is easy to program to > best practices. The DB-API has always tried to not get in the way of how a particular backends needs its configuration data, so I think we can still have a single string using a database backend specific format. This could then include one or more of the above id parts. The implementation can then decode the string representation of the transaction id components into whatever format is needed by the backend. >> Regarding the "xa_" prefix, I'm not much attached to it, but since >> the interface does indeed look a lot like the XA interface, why not >> make that reference ? > > Stuart's argument is that if the API differs from XA then using the > xa_* prefix could be problematic for adapters that want to expose the > XA API. 
> > As I don't have any experience with using XA, I can't comment one way > or the other about this. Fair enough. The API does resemble XA a lot, but you're right: if there are differences, it's better not to make that link. >> It also makes it clear, that the interface >> sits on top of the standard DB-API connection API and that those >> methods form a unit. > > Having a common prefix seems sensible. If we don't use xa_*, > Federico's suggestion of tpc_* might make sense. Fine, let's use "tpc_". >> Plus they are currently not in use by any DB-API module, so don't >> interfere with existing APIs. > > So I guess it comes down to the following questions: > 1. Are database adapters likely to want to expose more than what is > covered by this proposal? > 2. Would this proposed API conflict with those extensions? > > It isn't clear to me that people want to provide a larger API, since > the few adapters that have added 2PC support have done so with APIs > that are effectively a subset/simplification of this one. If there's more to expose than what's in the API spec, then module authors are free to do so. In general, the DB-API only defines a fully functional common subset of what has to be there to use a database backend. Extensions are possible and welcome. Every now and then, we consider adding those extensions as "standard extensions" to the DB-API. This has proven to work well in the past. The two-phase commit methods would be another set of those extensions. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 22 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 
:::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From stuart at stuartbishop.net Tue Jan 22 13:34:44 2008 From: stuart at stuartbishop.net (Stuart Bishop) Date: Tue, 22 Jan 2008 19:34:44 +0700 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: <4795ADF2.1040109@stuartbishop.net> Message-ID: <4795E2E4.4010504@stuartbishop.net> James Henstridge wrote: > On 22/01/2008, Stuart Bishop wrote: >> It seems that the formatID is unnecessary and just a requirement of the XA C >> interface. Also, the xid() method you propose should be camelcase to match >> the other type constructors, so Xid(gtrid, bqual=None) or >> TransactionId(gtrid, bqual=None). If the xa_recover/list_prepared method >> returns TransactionId objects they can contain platform specific information >> too which is great (username, prepared timestamp & database for PostgreSQL >> for instance). > > Well, the DB-API does not actually expose any classes other than the > exceptions. The primary objects you work with are all created by > factory functions/methods: The camelcase suggestion was to match the other type constructors as documented under "Type Objects & Constructors", such as Date, Time, Timestamp, Binary. > * Connections from module.connect() > * Cursors from connection.cursor() > > I was suggesting that transaction ID objects be created by either a > module.xid() or connection.xid() factory function and not make the > class object part of the API. Sure - the class object doesn't need to be part of the API, but xa_recover needs to return a list of something and the behaviour of those somethings needs to be defined. I imagined that would be an object providing .transaction_id & .branch_qualifier at a minimum, and the driver can add in whatever platform specific attributes or behaviour it wants. 
The xid objects can't be opaque as a transaction manager needs to be able to filter out the relevant from irrelevant. (From the other threads, I'm happy with tpc_ naming). -- Stuart Bishop http://www.stuartbishop.net/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/db-sig/attachments/20080122/ed0f52ec/attachment.pgp From mal at egenix.com Tue Jan 22 13:42:06 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 22 Jan 2008 13:42:06 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <4795E2E4.4010504@stuartbishop.net> References: <4795ADF2.1040109@stuartbishop.net> <4795E2E4.4010504@stuartbishop.net> Message-ID: <4795E49E.9030900@egenix.com> On 2008-01-22 13:34, Stuart Bishop wrote: > James Henstridge wrote: >> On 22/01/2008, Stuart Bishop wrote: >>> It seems that the formatID is unnecessary and just a requirement of the XA C >>> interface. Also, the xid() method you propose should be camelcase to match >>> the other type constructors, so Xid(gtrid, bqual=None) or >>> TransactionId(gtrid, bqual=None). If the xa_recover/list_prepared method >>> returns TransactionId objects they can contain platform specific information >>> too which is great (username, prepared timestamp & database for PostgreSQL >>> for instance). >> Well, the DB-API does not actually expose any classes other than the >> exceptions. The primary objects you work with are all created by >> factory functions/methods: > > The camelcase suggestion was to match the other type constructors as > documented under "Type Objects & Constructors", such as Date, Time, > Timestamp, Binary. 
> >> * Connections from module.connect() >> * Cursors from connection.cursor() >> >> I was suggesting that transaction ID objects be created by either a >> module.xid() or connection.xid() factory function and not make the >> class object part of the API. > > Sure - the class object doesn't need to be part of the API, but xa_recover > needs to return a list of something and the behaviour of those somethings > needs to be defined. It only needs to be defined in the context of the module exposing that recover API, since you'd only pass it back to the methods of that same API. We could just describe the transaction id as object in the spec and then have the modules decide what type this maps to, e.g. one module might want to use a tuple (or even namedtuple) for this, another might not want to bother at all and use the internal representation mapped to a string or bytes object. > I imagined that would be an object providing > .transaction_id & .branch_qualifier at a minimum, and the driver can add in > whatever platform specific attributes or behaviour it wants. The xid objects > can't be opaque as a transaction manager needs to be able to filter out the > relevant from irrelevant. > > (From the other threads, I'm happy with tpc_ naming). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 22 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611

From stuart at stuartbishop.net Tue Jan 22 14:09:58 2008
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Tue, 22 Jan 2008 20:09:58 +0700
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <4795E49E.9030900@egenix.com>
References: <4795ADF2.1040109@stuartbishop.net> <4795E2E4.4010504@stuartbishop.net> <4795E49E.9030900@egenix.com>
Message-ID: <4795EB26.7090300@stuartbishop.net>

M.-A. Lemburg wrote:

> It only needs to be defined in the context of the module exposing
> that recover API, since you'd only pass it back to the methods of
> that same API.
>
> We could just describe the transaction id as object in the spec and
> then have the modules decide what type this maps to, e.g. one module
> might want to use a tuple (or even namedtuple) for this, another
> might not want to bother at all and use the internal representation
> mapped to a string or bytes object.

From the XA pdf you linked to earlier on xa_recover:

"A transaction manager calls xa_recover() during recovery to obtain a
list of transaction branches that are currently in a prepared or
heuristically completed state.

[...]

"It is the transaction manager's responsibility to ignore XIDs that do
not belong to it."

So if you were to implement an XA-like interface around this, how can a
transaction manager filter out the irrelevant XIDs if it cannot interrogate
them?

If the behaviour of the xids returned by tpc_recover is not defined, we need
another method to decompose an xid into its global transaction id and its
branch id.

--
Stuart Bishop
http://www.stuartbishop.net/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/db-sig/attachments/20080122/52a62aeb/attachment.pgp From mal at egenix.com Tue Jan 22 14:23:12 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 22 Jan 2008 14:23:12 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <4795EB26.7090300@stuartbishop.net> References: <4795ADF2.1040109@stuartbishop.net> <4795E2E4.4010504@stuartbishop.net> <4795E49E.9030900@egenix.com> <4795EB26.7090300@stuartbishop.net> Message-ID: <4795EE40.4040704@egenix.com> On 2008-01-22 14:09, Stuart Bishop wrote: > M.-A. Lemburg wrote: > >> It only needs to be defined in the context of the module exposing >> that recover API, since you'd only pass it back to the methods of >> that same API. >> >> We could just describe the transaction id as object in the spec and >> then have the modules decide what type this maps to, e.g. one module >> might want to use a tuple (or even namedtuple) for this, another >> might not want to bother at all and use the internal representation >> mapped to a string or bytes object. > > > From the XA pdf you linked to earlier on xa_recover: > > "A transaction manager calls xa_recover() during recovery to obtain a > list of transaction branches that are currently in a prepared or > heuristically completed state. > > [...] > > "It is the transaction manager's responsibility to ignore XIDs that do > not belong to it. > > So if you where to implement an XA like interface around this, how can a > transaction manager filter out the irrelevant XIDs if is cannot interrogate > them? Good point, but I actually think that this refers to the TM storing the XIDs it knows about and ignoring any other XIDs returned by the recover method. 
I don't think that the TM is required to understand the format of the XID since the resource managers fill in that data and only they have to be able to recognize it. Then again, it may be useful for other purposes. Since there are only two id components that appear to be relevant, how about using a 2-tuple for the transaction id ? > If behaviour of the xids returned by tpc_recover is not defined, we need > another method to decompose an xid into its global transaction id and its > branch id. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 22 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From fog at initd.org Tue Jan 22 14:27:20 2008 From: fog at initd.org (Federico Di Gregorio) Date: Tue, 22 Jan 2008 14:27:20 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <4795EE40.4040704@egenix.com> References: <4795ADF2.1040109@stuartbishop.net> <4795E2E4.4010504@stuartbishop.net> <4795E49E.9030900@egenix.com> <4795EB26.7090300@stuartbishop.net> <4795EE40.4040704@egenix.com> Message-ID: <1201008440.4091.20.camel@mila.office.dinunzioedigregorio> Il giorno mar, 22/01/2008 alle 14.23 +0100, M.-A. Lemburg ha scritto: > Since there are only two id components that appear to be relevant, > how about using a 2-tuple for the transaction id ? ...and modules that want to use a custom object can always implement the tuple interface and stay compatible with the API. 
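A minimal sketch of that idea, assuming a hypothetical Xid class: the class name, the extra keyword attributes and the sample values below are illustrative only and are not part of the DB-API or of any existing driver. It shows an object that carries driver-specific details while still unpacking, indexing and comparing like a (global transaction id, branch id) 2-tuple.

```python
class Xid:
    """Hypothetical transaction id that behaves like a (gtrid, bqual) 2-tuple."""

    def __init__(self, gtrid, bqual, **extra):
        self.gtrid = gtrid        # global transaction id
        self.bqual = bqual        # branch qualifier
        self.extra = dict(extra)  # driver-specific details, e.g. owner

    def _as_tuple(self):
        return (self.gtrid, self.bqual)

    # Sequence protocol: lets callers unpack and index the object like a tuple.
    def __len__(self):
        return 2

    def __getitem__(self, index):
        return self._as_tuple()[index]

    def __iter__(self):
        return iter(self._as_tuple())

    # Compare equal to a plain 2-tuple with the same contents.
    def __eq__(self, other):
        try:
            return self._as_tuple() == tuple(other)
        except TypeError:
            return NotImplemented

    def __hash__(self):
        return hash(self._as_tuple())

    def __repr__(self):
        return "Xid(%r, %r)" % self._as_tuple()


xid = Xid("order-42", "db1", owner="alice")
gtrid, bqual = xid  # unpacks like a 2-tuple
```

Code written against plain 2-tuples keeps working with such an object, while a transaction manager that knows about the driver can still read the additional attributes.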
federico

--
Federico Di Gregorio http://people.initd.org/fog
Debian GNU/Linux Developer fog at debian.org
INIT.D Developer fog at initd.org
 The panda has the digestive system of a carnivore (e.g., that of a bear).
 The panda has chosen to feed exclusively on bamboo shoots. Therefore,
 the panda is the only vegan animal on the planet. The panda deserves
 to go extinct. -- Maria, Alice, Federico

From dieter at handshake.de Tue Jan 22 19:52:10 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Tue, 22 Jan 2008 19:52:10 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <4795E49E.9030900@egenix.com>
References: <4795ADF2.1040109@stuartbishop.net> <4795E2E4.4010504@stuartbishop.net> <4795E49E.9030900@egenix.com>
Message-ID: <18326.15194.365181.823524@gargle.gargle.HOWL>

M.-A. Lemburg wrote at 2008-1-22 13:42 +0100:
> ...
>We could just describe the transaction id as object in the spec and
>then have the modules decide what type this maps to, e.g. one module
>might want to use a tuple (or even namedtuple) for this, another
>might not want to bother at all and use the internal representation
>mapped to a string or bytes object.

I learned (from James' remark) that transaction ids belong to the
transaction manager and not the resource.

Thus, at least the individual "drivers" should not use different
implementations for transaction ids.

--
Dieter

From mal at egenix.com Tue Jan 22 20:26:00 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Jan 2008 20:26:00 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <18326.15194.365181.823524@gargle.gargle.HOWL> References: <4795ADF2.1040109@stuartbishop.net> <4795E2E4.4010504@stuartbishop.net> <4795E49E.9030900@egenix.com> <18326.15194.365181.823524@gargle.gargle.HOWL> Message-ID: <47964348.1050601@egenix.com> On 2008-01-22 19:52, Dieter Maurer wrote: > M.-A. Lemburg wrote at 2008-1-22 13:42 +0100: >> ... >> We could just describe the transaction id as object in the spec and >> then have the modules decide what type this maps to, e.g. one module >> might want to use a tuple (or even namedtuple) for this, another >> might not want to bother at all and use the internal representation >> mapped to a string or bytes object. > > I learned (from James remark) that transaction ids belong to the > transaction manager and not the resource. > > Thus, at least the individual "drivers" should not use different > implementations for transaction ids. You're right. I misunderstood which component manages the transaction id (xid). It's the transaction manager, not the resource manager. And it's the database modules that must accept whatever the TM passes them, not the other way around. Would a tuple (global transaction id, branch id) do the trick or should we have two parameters on each API instead ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 22 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From mal at egenix.com Tue Jan 22 20:31:14 2008 From: mal at egenix.com (M.-A. 
Lemburg) Date: Tue, 22 Jan 2008 20:31:14 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <18326.14996.49402.907419@gargle.gargle.HOWL> References: <4795C6CE.8060301@egenix.com> <18326.14996.49402.907419@gargle.gargle.HOWL> Message-ID: <47964482.5030309@egenix.com> On 2008-01-22 19:48, Dieter Maurer wrote: > James Henstridge wrote at 2008-1-22 20:33 +0900: >> ... >> I do see a use for the branch qualifier though. In a distributed >> transaction, each resource should have a different transaction ID > > Why? > Why is it not equally good to use a common transaction id for > all resource managers? > >> that >> share a common global transaction ID but separate branch qualifiers. >> >> As transaction IDs are global within database clusters for some >> backends (PostgreSQL, MySQL and probably others), the branch qualifier >> is necessary if two databases from the cluster are used in the global >> transaction. > > They refer to the same transaction -- even when several databases > in a cluster are affected. > > The transaction as a whole will want to get prepared, committed, rolledback... Sections 2.2.5 and 2.2.6 explain why you need a global transaction id and a branch id as well: http://www.opengroup.org/onlinepubs/009680699/toc.pdf Branch ids are used for e.g. multiple connections of the same RM engaging in a global transaction. Each of those connections gets its own branch id. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 22 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 
:::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From dieter at handshake.de Tue Jan 22 20:54:22 2008 From: dieter at handshake.de (Dieter Maurer) Date: Tue, 22 Jan 2008 20:54:22 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <47964482.5030309@egenix.com> References: <4795C6CE.8060301@egenix.com> <18326.14996.49402.907419@gargle.gargle.HOWL> <47964482.5030309@egenix.com> Message-ID: <18326.18926.708382.939543@gargle.gargle.HOWL> M.-A. Lemburg wrote at 2008-1-22 20:31 +0100: > ... >Branch ids are used for e.g. multiple connections of the same RM >engaging in a global transaction. Each of those connections gets >its own branch id. But using multiple connections to the same RM seems to be an error in the first place. Assume that a resource "R" is locked via connection "C1". Assume than that "R" is requested via connection "C2". If "C1 == C2", then the RM can see that the resource is already assigned to the connection and there is no blocking. Otherwise, the RM has not chance to recognize this and the request will be blocked until the transaction is commited or rolled back. There is quite a high chance, that since the "R" request is blocked, there will be no commit/roll back.... -- Dieter From mal at egenix.com Tue Jan 22 22:46:24 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 22 Jan 2008 22:46:24 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <18326.18926.708382.939543@gargle.gargle.HOWL> References: <4795C6CE.8060301@egenix.com> <18326.14996.49402.907419@gargle.gargle.HOWL> <47964482.5030309@egenix.com> <18326.18926.708382.939543@gargle.gargle.HOWL> Message-ID: <47966430.7010005@egenix.com> On 2008-01-22 20:54, Dieter Maurer wrote: > M.-A. Lemburg wrote at 2008-1-22 20:31 +0100: >> ... 
>> Branch ids are used for e.g. multiple connections of the same RM >> engaging in a global transaction. Each of those connections gets >> its own branch id. > > But using multiple connections to the same RM seems to > be an error in the first place. > > Assume that a resource "R" is locked via connection "C1". > Assume than that "R" is requested via connection "C2". > > If "C1 == C2", then the RM can see that the resource is already > assigned to the connection and there is no blocking. > > Otherwise, the RM has not chance to recognize this and > the request will be blocked until the transaction is commited > or rolled back. There is quite a high chance, that since the > "R" request is blocked, there will be no commit/roll back.... This situation is well possible, but it's still a rather common case: if an application uses multiple threads, then each of the threads will have its own connection and branch id. It's less common in the Python world (well, maybe for Zope), but very common in Java and C++ applications. Note that it's also possible that even though a connection is registered with the TM, the current global transaction doesn't affect it (e.g. because it's not executing anything at the time). It can then optimize the .tpc_commit()/ .tpc_rollback() method call (by ignoring them). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 22 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From james at jamesh.id.au Wed Jan 23 02:18:48 2008 From: james at jamesh.id.au (James Henstridge) Date: Wed, 23 Jan 2008 10:18:48 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <4795D9E4.2030509@egenix.com> References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> Message-ID: On 22/01/2008, M.-A. Lemburg wrote: > On 2008-01-22 12:33, James Henstridge wrote: > > On 22/01/2008, M.-A. Lemburg wrote: > >> Thanks. I like it a lot, except for making the XID an object - this > >> always appears to be a string in all the backends you've checked and > >> also in the XA standard, so I'd go for a simple string instead of > >> an object (those are always lots of work to do at C level). > > > > In at least MySQL and Oracle, the transaction ID appears to be more > > than just a string: it is structured into three parts: > > * a format ID > > * a global transaction ID > > * a branch qualifier > > > > Stuart has made the argument that the format ID is not important for > > Python, and I tend to agree (or at least I don't know what situations > > you'd use it). > > The format id is only used to specify the format of the data > structure in the XA xid_struct_t: > > From http://www.opengroup.org/onlinepubs/009680699/toc.pdf: > > """ > Although "xa.h" constrains the length and byte alignment of the data element within an > XID, it does not specify the data's contents. The only requirement is that both gtrid and > bqual, taken together, must be globally unique. The recommended way of achieving > global uniqueness is to use the naming rules specified for OSI CCR atomic action > identifiers (see the referenced OSI CCR specification). If OSI CCR naming is used, then > the XID's formatID element should be set to 0; if some other format is used, then the > formatID element should be greater than 0. 
A value of -1 in formatID means that the > XID is null. > The RM must be able to map the XID to the recoverable work it did for the > corresponding branch. RMs may perform bitwise comparisons on the data > components of an XID for the lengths specified in the XID structure. Most XA routines > pass a pointer to the XID. These pointers are valid only for the duration of the call. If > the RM needs to refer to the XID after it returns from the call, it must make a local copy > before returning. > /* > * Transaction branch identification: XID and NULLXID: > */ > #define XIDDATASIZE 128 /* size in bytes */ > #define MAXGTRIDSIZE 64 /* maximum size in bytes of gtrid */ > #define MAXBQUALSIZE 64 /* maximum size in bytes of bqual */ > struct xid_t { > long formatID; /* format identifier */ > long gtrid_length; /* value 1-64 */ > long bqual_length; /* value 1-64 */ > char data[XIDDATASIZE]; > }; > typedef struct xid_t XID; > """ > > So, essentially, only the global transaction id and the branch id > are relevant and both are represented in the data string. One interesting part of that is the "If OSI CCR naming is used, then the XID's formatID element should be set to 0; if some other format is used, then the formatID element should be greater than 0." I took a quick look at a few J2EE servers (which use XA), to see what they do for transaction managers. Neither JBoss or Geronimo seem to use formatID=0, but instead use magic numbers that I presume are intended to determine if they created the transaction ID. That said, the selection of format identifiers seems a bit ad-hoc: Geronimo uses 0x4765526f, which has a byte representation of "GeRo". It seems that you could do pretty much the same thing by getting TMs to check the global ID itself ... > BTW, there's a nice extension module that let's you hook Python > between the TM and RM using XA: > > http://www.hare.demon.co.uk/pyxasw/ > > > I do see a use for the branch qualifier though. 
In a distributed > > transaction, each resource should have a different transaction ID that > > share a common global transaction ID but separate branch qualifiers. > > > > As transaction IDs are global within database clusters for some > > backends (PostgreSQL, MySQL and probably others), the branch qualifier > > is necessary if two databases from the cluster are used in the global > > transaction. > > > > I think it is worth making the API such that it is easy to program to > > best practices. > > The DB-API has always tried to not get in the way of how > a particular backends needs its configuration data, so > I think we can still have a single string using a database > backend specific format. This could then include one or more > of the above id parts. > > The implementation can then decode the string representation > of the transaction id components into whatever format is > needed by the backend. The two reasons I see for using an object to represent transactions that contains a global part and branch part are: 1. round tripping a transaction ID from xa_recover() to xa_commit()/xa_rollback(). 2. Reduced restrictions on the contents of the transaction ID. For (1), using a database adapter defined object means that it can represent transactions that originated elsewhere, or expose more information about those transactions. For (2), if a database is using specially formatted transaction IDs at the Python level that get decoded into the various components, does that mean that the application or transaction manager glue needs to know how to format the IDs. In contrast, it is pretty easy for e.g. a Postgres adapter to serialise/deserialise a multi-part ID (and this is what the JDBC driver does). > >> Regarding the "xa_" prefix, I'm not much attached to it, but since > >> the interface does indeed look a lot like the XA interface, why not > >> make that reference ? 
> > > > Stuart's argument is that if the API differs from XA then using the > > xa_* prefix could be problematic for adapters that want to expose the > > XA API. > > > > As I don't have any experience with using XA, I can't comment one way > > or the other about this. > > Fair enough. The API does resemble XA a lot, but you're right: > if there are differences, it's better not to make that link. > > >> It also makes it clear, that the interface > >> sits on top of the standard DB-API connection API and that those > >> methods form a unit. > > > > Having a common prefix seems sensible. If we don't use xa_*, > > Federico's suggestion of tpc_* might make sense. > > Fine, let's use "tpc_". > > >> Plus they are currently not in use by any DB-API module, so don't > >> interfere with existing APIs. > > > > So I guess it comes down to the following questions: > > 1. Are database adapters likely to want to expose more than what is > > covered by this proposal? > > 2. Would this proposed API conflict with those extensions? > > > > It isn't clear to me that people want to provide a larger API, since > > the few adapters that have added 2PC support have done so with APIs > > that are effectively a subset/simplification of this one. > > If there's more to expose than what's in the API spec, then > module authors are free to do so. > > In general, the DB-API only > defines a fully functional common subset of what has to be > there to use a database backend. Extensions are possible and > welcome. I agree with this, and think it is worth keeping extensibility in mind when designing the API. My suggestion of using an object to represent a transaction ID was to make it easier for an adapter to expose more complex IDs in a fairly localised fashion. > Every now and then, we consider adding those extensions as > "standard extensions" to the DB-API. This has proven to work > well in the past. > > The two-phase commit methods would be another set of those > extensions. Okay. James. 
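The remark above about an adapter serialising and deserialising a multi-part ID can be sketched roughly as follows. This is only an illustration for a backend that accepts a single string identifier: the function names encode_xid and decode_xid, the "_" separator and the base64 encoding are assumptions made for the example, not the format used by the JDBC driver or by any other real adapter.

```python
import base64


def encode_xid(gtrid, bqual):
    """Pack a (gtrid, bqual) pair into one string (hypothetical format)."""
    # Standard base64 never emits "_", so it is safe to use as a separator
    # even when the ids themselves contain underscores.
    parts = [
        base64.b64encode(part.encode("utf-8")).decode("ascii")
        for part in (gtrid, bqual)
    ]
    return "_".join(parts)


def decode_xid(text):
    """Unpack a string produced by encode_xid() into (gtrid, bqual).

    Raises ValueError for strings that are not in this format, for example
    ids of transactions that were prepared by some other tool.
    """
    gtrid_part, bqual_part = text.split("_")
    return tuple(
        base64.b64decode(part, validate=True).decode("utf-8")
        for part in (gtrid_part, bqual_part)
    )


serialised = encode_xid("order-42", "db_1")
gtrid, bqual = decode_xid(serialised)
```

With this kind of scheme the application and the transaction manager only ever see the two components, and the knowledge of how they are packed into the backend's identifier stays inside the adapter.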
From konjkov.vv at gmail.com Wed Jan 23 04:12:20 2008
From: konjkov.vv at gmail.com (Konjkov Vladimir)
Date: Wed, 23 Jan 2008 10:12:20 +0700
Subject: [DB-SIG] PEP 249
Message-ID:

In the definition of

    .execute(operation[,parameters])
    .....
    A reference to the operation will be retained by the
    cursor. If the same operation object is passed in again,
    then the cursor can optimize its behavior.

What does "the same operation object is passed in again" mean?
There's no definition for a class Operation.

Maybe it means

    SameOperation = "something that just a constant!"
    C = cnxn.cursor()
    C.execute("select * from table where a=? and b=?",(1,2))
    C.fetchall()
    C.execute(SameOperation,(3,4))
    C.fetchall()

or not?

From carsten at uniqsys.com Wed Jan 23 05:09:05 2008
From: carsten at uniqsys.com (Carsten Haese)
Date: Tue, 22 Jan 2008 23:09:05 -0500
Subject: [DB-SIG] PEP 249
In-Reply-To:
References:
Message-ID: <1201061345.3323.4.camel@localhost.localdomain>

On Wed, 2008-01-23 at 10:12 +0700, Konjkov Vladimir wrote:
> In the definition of
> .execute(operation[,parameters])
> .....
> A reference to the operation will be retained by the
> cursor. If the same operation object is passed in again,
> then the cursor can optimize its behavior.
>
>
> What does "the same operation object is passed in again" mean?
> There's no definition for a class Operation.
>
> Maybe it means
>
> SameOperation = "something that just a constant!"
> C = cnxn.cursor()
> C.execute("select * from table where a=? and b=?",(1,2))
> C.fetchall()
> C.execute(SameOperation,(3,4))
> C.fetchall()

No, it means this:

    sql = "select * from customers where cust_code = ?"
    C.execute(sql, (1,))
    # ...
    C.execute(sql, (2,))
    # ...

Assuming that "sql" doesn't get rebound, it'll reference the same string
object as before, hence the cursor object may optimize its behavior by
reusing a previously prepared statement for that query instead of
re-preparing the statement.

HTH,

--
Carsten Haese
http://informixdb.sourceforge.net

From james at jamesh.id.au Wed Jan 23 05:18:52 2008
From: james at jamesh.id.au (James Henstridge)
Date: Wed, 23 Jan 2008 13:18:52 +0900
Subject: [DB-SIG] PEP 249
In-Reply-To:
References:
Message-ID:

On 23/01/2008, Konjkov Vladimir wrote:
> In the definition of
> .execute(operation[,parameters])
> .....
> A reference to the operation will be retained by the
> cursor. If the same operation object is passed in again,
> then the cursor can optimize its behavior.

The operation object is the one passed as the first argument to
.execute().

> What does "the same operation object is passed in again" mean?
> There's no definition for a class Operation.

It means that if you pass the same object to multiple execute() calls,
the database adapter may optimise things (then again, it might not).
The following is an example based on yours:

    query = "select * from table where a=? and b=?"
    C.execute(query, (1, 2))
    C.execute(query, (3, 4))

So if the adapter uses prepared statements, it can see that the second
execute() call uses the same query and so reuse the previously prepared
statement.

As an application developer, the thing to take away from this is that
if you are going to execute the same query over and over, consider using
the same string object.

James.

From konjkov.vv at gmail.com Wed Jan 23 07:14:41 2008
From: konjkov.vv at gmail.com (Konjkov Vladimir)
Date: Wed, 23 Jan 2008 13:14:41 +0700
Subject: [DB-SIG] PEP 249
Message-ID:

While implementing in C my Python module for accessing an
ODBC 2.0 database, I can't find a description in PEP 249 of the
case where one .executeXXX follows another on the same cursor object.

I think that after .executeXXX the cursor can only be
fetched (.fetchXXX) or closed.
Re-execution is not permitted and raises an exception.
That's because the .executeXXX method calls SQLPrepare, and the
next SQLPrepare is possible only after SQLCloseCursor() or
SQLFreeStmt() with the SQL_CLOSE option has been called.

But at the C level re-execution is possible.

"Once the application has processed the results from the SQLExecute() call,
it can execute the statement again with new (or the same) parameter
values."

The problem is that there is no .Prepare(Statement) method on the Cursor
object.

I think it would be better if the connection's cursor method did the
SQLPrepare, preparing the statement only when creating a new Python cursor
object
C = cnxn.cursor(STATEMENT),
and C.execute([parameters]) only executed or re-executed the statement
with an optional parameter list.

From james at jamesh.id.au Wed Jan 23 08:31:39 2008
From: james at jamesh.id.au (James Henstridge)
Date: Wed, 23 Jan 2008 16:31:39 +0900
Subject: [DB-SIG] PEP 249
In-Reply-To:
References:
Message-ID:

On 23/01/2008, Konjkov Vladimir wrote:
> While implementing in C my Python module for accessing an
> ODBC 2.0 database, I can't find a description in PEP-0249 of the
> case when one .executeXXX follows another on the same cursor object.
>
> I think that after .executeXXX the cursor can only be
> fetched from (.fetchXXX) or closed. Re-execution is not permitted and
> raises an exception.
> That's because the .executeXXX method calls SQLPrepare, and the
> next SQLPrepare is possible only after SQLCloseCursor() or
> SQLFreeStmt() with the SQL_CLOSE option has been called.

The idea is that on .execute(), the database adapter could prepare the
statement and execute it. The cursor would keep the prepared
statement around afterwards.

On a subsequent .execute() call, if the statement is identical it can
use the previously prepared statement.
If not, then it discards the prepared statement and creates a new one.

> But at the C level re-execution is possible.
>
> "Once the application has processed the results from the SQLExecute() call,
> it can execute the statement again with new (or the same) parameter
> values."
>
> The problem is that there is no .Prepare(Statement) method on the Cursor
> object.

Use of prepared statements is implicit, if the database adapter uses
them at all.

> I think it would be better if the connection's cursor method did the
> SQLPrepare, preparing the statement only when creating a new Python cursor
> object
> C = cnxn.cursor(STATEMENT),
> and C.execute([parameters]) only executed or re-executed the statement
> with an optional parameter list.

What benefits do you see from this design over the existing one?

James.

From mal at egenix.com Wed Jan 23 10:12:14 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 23 Jan 2008 10:12:14 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To:
References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com>
Message-ID: <479704EE.8030404@egenix.com>

On 2008-01-23 02:18, James Henstridge wrote:
>> [XID format used in XA]
>> So, essentially, only the global transaction id and the branch id
>> are relevant and both are represented in the data string.
>
> One interesting part of that is the "If OSI CCR naming is used, then
> the XID's formatID element should be set to 0; if some other format is
> used, then the formatID element should be greater than 0."
>
> I took a quick look at a few J2EE servers (which use XA), to see what
> they do for transaction managers. Neither JBoss or Geronimo seem to
> use formatID=0, but instead use magic numbers that I presume are
> intended to determine if they created the transaction ID.
>
> That said, the selection of format identifiers seems a bit ad-hoc:
> Geronimo uses 0x4765526f, which has a byte representation of "GeRo".
> > It seems that you could do pretty much the same thing by getting TMs > to check the global ID itself ... So we do need to store the "formatID" as well ? >> BTW, there's a nice extension module that let's you hook Python >> between the TM and RM using XA: >> >> http://www.hare.demon.co.uk/pyxasw/ > > > >>> I do see a use for the branch qualifier though. In a distributed >>> transaction, each resource should have a different transaction ID that >>> share a common global transaction ID but separate branch qualifiers. >>> >>> As transaction IDs are global within database clusters for some >>> backends (PostgreSQL, MySQL and probably others), the branch qualifier >>> is necessary if two databases from the cluster are used in the global >>> transaction. >>> >>> I think it is worth making the API such that it is easy to program to >>> best practices. >> The DB-API has always tried to not get in the way of how >> a particular backends needs its configuration data, so >> I think we can still have a single string using a database >> backend specific format. This could then include one or more >> of the above id parts. >> >> The implementation can then decode the string representation >> of the transaction id components into whatever format is >> needed by the backend. > > The two reasons I see for using an object to represent transactions > that contains a global part and branch part are: > > 1. round tripping a transaction ID from xa_recover() to > xa_commit()/xa_rollback(). > 2. Reduced restrictions on the contents of the transaction ID. > > For (1), using a database adapter defined object means that it can > represent transactions that originated elsewhere, or expose more > information about those transactions. > > For (2), if a database is using specially formatted transaction IDs at > the Python level that get decoded into the various components, does > that mean that the application or transaction manager glue needs to > know how to format the IDs. 
> > In contrast, it is pretty easy for e.g. a Postgres adapter to > serialise/deserialise a multi-part ID (and this is what the JDBC > driver does). I have no objections against using an object for this anymore, but let's please use an already existing object such as a tuple instead of having each database module implement its own new type. Given that the formatID is used for some purpose as well (probably just as identification of the TM itself), I guess we'd have to use a 3-tuple (format id, global transaction id, branch id). Modules should only expect to find an object that behaves like a 3-sequence, they should accept whatever object is passed to them and return it for the recover method. This leaves the door open for extensions used by the TM for XID objects. >>>> Regarding the "xa_" prefix, I'm not much attached to it, but since >>>> the interface does indeed look a lot like the XA interface, why not >>>> make that reference ? >>> Stuart's argument is that if the API differs from XA then using the >>> xa_* prefix could be problematic for adapters that want to expose the >>> XA API. >>> >>> As I don't have any experience with using XA, I can't comment one way >>> or the other about this. >> Fair enough. The API does resemble XA a lot, but you're right: >> if there are differences, it's better not to make that link. >> >>>> It also makes it clear, that the interface >>>> sits on top of the standard DB-API connection API and that those >>>> methods form a unit. >>> Having a common prefix seems sensible. If we don't use xa_*, >>> Federico's suggestion of tpc_* might make sense. >> Fine, let's use "tpc_". >> >>>> Plus they are currently not in use by any DB-API module, so don't >>>> interfere with existing APIs. >>> So I guess it comes down to the following questions: >>> 1. Are database adapters likely to want to expose more than what is >>> covered by this proposal? >>> 2. Would this proposed API conflict with those extensions? 
>>> >>> It isn't clear to me that people want to provide a larger API, since >>> the few adapters that have added 2PC support have done so with APIs >>> that are effectively a subset/simplification of this one. >> If there's more to expose than what's in the API spec, then >> module authors are free to do so. >> >> In general, the DB-API only >> defines a fully functional common subset of what has to be >> there to use a database backend. Extensions are possible and >> welcome. > > I agree with this, and think it is worth keeping extensibility in mind > when designing the API. My suggestion of using an object to represent > a transaction ID was to make it easier for an adapter to expose more > complex IDs in a fairly localised fashion. > > >> Every now and then, we consider adding those extensions as >> "standard extensions" to the DB-API. This has proven to work >> well in the past. >> >> The two-phase commit methods would be another set of those >> extensions. > > Okay. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 23 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From stuart at stuartbishop.net Wed Jan 23 14:11:53 2008 From: stuart at stuartbishop.net (Stuart Bishop) Date: Wed, 23 Jan 2008 20:11:53 +0700 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) 
In-Reply-To: <479704EE.8030404@egenix.com> References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> Message-ID: <47973D19.2090904@stuartbishop.net> M.-A. Lemburg wrote: > On 2008-01-23 02:18, James Henstridge wrote: >>> [XID format used in XA] >>> So, essentially, only the global transaction id and the branch id >>> are relevant and both are represented in the data string. >> One interesting part of that is the "If OSI CCR naming is used, then >> the XID's formatID element should be set to 0; if some other format is >> used, then the formatID element should be greater than 0." >> >> I took a quick look at a few J2EE servers (which use XA), to see what >> they do for transaction managers. Neither JBoss or Geronimo seem to >> use formatID=0, but instead use magic numbers that I presume are >> intended to determine if they created the transaction ID. >> >> That said, the selection of format identifiers seems a bit ad-hoc: >> Geronimo uses 0x4765526f, which has a byte representation of "GeRo". >> >> It seems that you could do pretty much the same thing by getting TMs >> to check the global ID itself ... > > So we do need to store the "formatID" as well ? It looks like yes we do. MySQL's syntax for xids allows an optional formatid and this is returned by XA RECOVER. In MySQL, it is a number rather than a string. Assuming that any system that uses more than a simple string for the xid is doing so to map onto the XA specification, we could safely represent xids as a 3-tuple of (unicode, unicode, integer). How to deal with None's and empty strings needs to be thought out though to avoid round trip edge cases: >>> con = connect('') >>> xid = ('g', '', None) >>> con.tpc_begin(xid) >>> con.tpc_prepare() >>> con.tpc_recover() [('g', None, 1)] >>> con.tpc_recover()[0] == xid False '' and None for the gtid and brid would be equivalent, and 1 and None would be equivalent for the format_id (1 is the default format id in MySQL). 
To avoid round trip issues with tuples, only one of these values should be allowed. If we use an object, these issues go away: >>> con = connect('') >>> xid = Xid('g', '') >>> tuple(xid) ('g', None, 1) >>> con.tpc_begin(xid) >>> con.tpc_prepare() >>> con.tpc_recover() [] >>> con.tpc_recover()[0] == xid True > Given that the formatID is used for some purpose as well (probably > just as identification of the TM itself), I guess we'd have > to use a 3-tuple (format id, global transaction id, branch id). > > Modules should only expect to find an object that behaves like > a 3-sequence, they should accept whatever object is passed to > them and return it for the recover method. > > This leaves the door open for extensions used by the TM for XID > objects. I don't see a technical problem with the tuple apart from the round tripping issue above and someone might have a nice solution to that. Subjectively, I think an object reads better though, particularly as in many cases you will only want to bother specifying one or maybe two of the three parts. Xid('foo') vs. ('foo', None, None). Is CamelCase of xid 'Xid' or 'XID' or 'XId' ? -- Stuart Bishop http://www.stuartbishop.net/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/db-sig/attachments/20080123/85777b53/attachment.pgp From mal at egenix.com Wed Jan 23 15:24:35 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 23 Jan 2008 15:24:35 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <47973D19.2090904@stuartbishop.net> References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> <47973D19.2090904@stuartbishop.net> Message-ID: <47974E23.8000009@egenix.com> On 2008-01-23 14:11, Stuart Bishop wrote: > M.-A. 
Lemburg wrote: >> So we do need to store the "formatID" as well ? > > It looks like yes we do. MySQL's syntax for xids allows an optional formatid > and this is returned by XA RECOVER. In MySQL, it is a number rather than a > string. Assuming that any system that uses more than a simple string for the > xid is doing so to map onto the XA specification, we could safely represent > xids as a 3-tuple of (unicode, unicode, integer). > > How to deal with None's and empty strings needs to be thought out though to > avoid round trip edge cases: > >>>> con = connect('') >>>> xid = ('g', '', None) >>>> con.tpc_begin(xid) >>>> con.tpc_prepare() >>>> con.tpc_recover() > [('g', None, 1)] >>>> con.tpc_recover()[0] == xid > False > > '' and None for the gtid and brid would be equivalent, and 1 and None would > be equivalent for the format_id (1 is the default format id in MySQL). To > avoid round trip issues with tuples, only one of these values should be allowed. > > If we use an object, these issues go away: I'm not sure I understand... a tuple *is* an object after all :-) Why does '' get converted to None on output ? The database module should not try to change the object in any way (regardless of whether it's a string, tuple, custom sequence like object, etc.). At least that's the theory. Or is this a side-effect of MySQL doing some internal mapping of the tuple contents to some internal table ? >>>> con = connect('') >>>> xid = Xid('g', '') >>>> tuple(xid) > ('g', None, 1) >>>> con.tpc_begin(xid) >>>> con.tpc_prepare() >>>> con.tpc_recover() > [] >>>> con.tpc_recover()[0] == xid > True > >> Given that the formatID is used for some purpose as well (probably >> just as identification of the TM itself), I guess we'd have >> to use a 3-tuple (format id, global transaction id, branch id). >> >> Modules should only expect to find an object that behaves like >> a 3-sequence, they should accept whatever object is passed to >> them and return it for the recover method. 
>>
>> This leaves the door open for extensions used by the TM for XID
>> objects.
>
> I don't see a technical problem with the tuple apart from the round tripping
> issue above and someone might have a nice solution to that. Subjectively, I
> think an object reads better though, particularly as in many cases you will
> only want to bother specifying one or maybe two of the three parts.
> Xid('foo') vs. ('foo', None, None).

I think we shouldn't restrict the TM by specifying a particular
object. After all, the DB-API is about the RM, not the TM.

However, it may be worthwhile to have the RM at least peek
into the XID object and that's why I think we should require
the XID object to implement the __getitem__ protocol and
have the first three positions defined as (format id,
global transaction id, branch id).

This should leave enough room for the TM.

> Is CamelCase of xid 'Xid' or 'XID' or 'XId' ?

Good question. XID itself is an abbreviation. I tend to leave
those alone and use all-capital-letters for classes.

Note that since the TM will create the XIDs, we don't need
to worry about a method or API to generate them.

--
Marc-Andre Lemburg
eGenix.com

From james at jamesh.id.au Thu Jan 24 02:44:27 2008
From: james at jamesh.id.au (James Henstridge)
Date: Thu, 24 Jan 2008 10:44:27 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <479704EE.8030404@egenix.com> References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> Message-ID: On 23/01/2008, M.-A. Lemburg wrote: > On 2008-01-23 02:18, James Henstridge wrote: > >> [XID format used in XA] > >> So, essentially, only the global transaction id and the branch id > >> are relevant and both are represented in the data string. > > > > One interesting part of that is the "If OSI CCR naming is used, then > > the XID's formatID element should be set to 0; if some other format is > > used, then the formatID element should be greater than 0." > > > > I took a quick look at a few J2EE servers (which use XA), to see what > > they do for transaction managers. Neither JBoss or Geronimo seem to > > use formatID=0, but instead use magic numbers that I presume are > > intended to determine if they created the transaction ID. > > > > That said, the selection of format identifiers seems a bit ad-hoc: > > Geronimo uses 0x4765526f, which has a byte representation of "GeRo". > > > > It seems that you could do pretty much the same thing by getting TMs > > to check the global ID itself ... > > So we do need to store the "formatID" as well ? > > >> BTW, there's a nice extension module that let's you hook Python > >> between the TM and RM using XA: > >> > >> http://www.hare.demon.co.uk/pyxasw/ > > > > > > > >>> I do see a use for the branch qualifier though. In a distributed > >>> transaction, each resource should have a different transaction ID that > >>> share a common global transaction ID but separate branch qualifiers. > >>> > >>> As transaction IDs are global within database clusters for some > >>> backends (PostgreSQL, MySQL and probably others), the branch qualifier > >>> is necessary if two databases from the cluster are used in the global > >>> transaction. > >>> > >>> I think it is worth making the API such that it is easy to program to > >>> best practices. 
> >> The DB-API has always tried to not get in the way of how > >> a particular backends needs its configuration data, so > >> I think we can still have a single string using a database > >> backend specific format. This could then include one or more > >> of the above id parts. > >> > >> The implementation can then decode the string representation > >> of the transaction id components into whatever format is > >> needed by the backend. > > > > The two reasons I see for using an object to represent transactions > > that contains a global part and branch part are: > > > > 1. round tripping a transaction ID from xa_recover() to > > xa_commit()/xa_rollback(). > > 2. Reduced restrictions on the contents of the transaction ID. > > > > For (1), using a database adapter defined object means that it can > > represent transactions that originated elsewhere, or expose more > > information about those transactions. > > > > For (2), if a database is using specially formatted transaction IDs at > > the Python level that get decoded into the various components, does > > that mean that the application or transaction manager glue needs to > > know how to format the IDs. > > > > In contrast, it is pretty easy for e.g. a Postgres adapter to > > serialise/deserialise a multi-part ID (and this is what the JDBC > > driver does). > > I have no objections against using an object for this anymore, > but let's please use an already existing object such as a > tuple instead of having each database module implement its own > new type. > > Given that the formatID is used for some purpose as well (probably > just as identification of the TM itself), I guess we'd have > to use a 3-tuple (format id, global transaction id, branch id). > > Modules should only expect to find an object that behaves like > a 3-sequence, they should accept whatever object is passed to > them and return it for the recover method. > > This leaves the door open for extensions used by the TM for XID > objects. 
I've had a bit more time to think about this, and have two proposals on how to handle transaction IDs. I think they offer equivalent functionality, so the choice comes down to what we want the API to look like. Proposal 1: * Plain string IDs should work fine as transaction identifiers for applications built from scratch with that assumption: they would need to identify the global and branch parts in their own way. * A plain string can be stuffed inside an XA style transaction identifier, even if it isn't making use of all the different components. * Therefore, all methods accepting transaction IDs should accept strings. * As some transaction IDs in the database might not match this simple form, there are two options for the recover() method: 1. return a special object that represents the transaction, which will be accepted by commit()/rollback(). How string-like must these objects be? 2. omit such transaction IDs from the result. * For databases that support more structured transaction IDs (such as those used by XA), the 2PC methods may accept objects other than strings. Proposal 2: * Many databases follow the XA specification, so it makes sense to use transaction identifiers structured in the same way. * For databases that do not use XA-style transaction IDs, it is usually possible to serialise such an ID into a form that it can work with. * Therefore, all methods accepting transaction IDs should accept 3-sequences of the form (formatID, gtrid, bqual). * For databases using non-XA transaction IDs, it is possible that some transaction IDs might exist that do not match the serialised form. The recover() method has two options: 1. return a special object representing the ID that will be accepted by commit()/rollback(). Such an object should act like a 3-sequence. 2. omit such transaction IDs from the result. * For databases not using XA-style transactions, the 2PC methods may accept objects other than 3-sequences as transaction IDs. 
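To make Proposal 2 a little more concrete, here is a minimal sketch of how an adapter for a backend with plain-string transaction IDs might serialise a (formatID, gtrid, bqual) 3-sequence and read it back for recover(). The names serialise_xid, deserialise_xid and ForeignXid, and the "formatID_base64_base64" layout, are assumptions made up for this illustration; they are not taken from any existing driver, and the proposal itself does not prescribe a serialised form.

```python
import base64


class ForeignXid(tuple):
    """Hypothetical stand-in for an ID this adapter did not serialise.

    Acts like a 3-sequence (option 1 for recover()) and keeps the raw
    database string so it can be handed back to commit()/rollback().
    """

    def __new__(cls, raw):
        self = super().__new__(cls, (None, raw, None))
        self.raw = raw
        return self


def _b64(text):
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def serialise_xid(xid):
    """Turn a (formatID, gtrid, bqual) 3-sequence into one string."""
    if isinstance(xid, ForeignXid):
        return xid.raw  # pass foreign IDs through untouched
    format_id, gtrid, bqual = xid
    if format_id < 0:
        raise ValueError("formatID must be >= 0")
    # "_" never occurs in standard base64 output, so it is a safe separator.
    return "%d_%s_%s" % (format_id, _b64(gtrid), _b64(bqual))


def deserialise_xid(raw):
    """Inverse of serialise_xid(); anything else becomes a ForeignXid."""
    try:
        format_part, gtrid_part, bqual_part = raw.split("_")
        format_id = int(format_part)
        if format_id < 0:
            raise ValueError("negative formatID")
        gtrid = base64.b64decode(gtrid_part, validate=True).decode("utf-8")
        bqual = base64.b64decode(bqual_part, validate=True).decode("utf-8")
    except ValueError:
        # Wrong number of parts, bad integer, bad base64 or bad UTF-8.
        return ForeignXid(raw)
    return (format_id, gtrid, bqual)
```

With this sketch, deserialise_xid(serialise_xid((1, "g", ""))) gives back (1, "g", ""), while a string written by some other tool comes back as a ForeignXid that can still be passed to commit() or rollback().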
Both of these proposals seem to get rid of the main points of contention: * removes the xid() constructor from the spec. * allow use of simple objects (strings or tuples) as transaction IDs * provides an obvious way to expose database-specific transaction IDs. James. From stuart at stuartbishop.net Thu Jan 24 07:05:28 2008 From: stuart at stuartbishop.net (Stuart Bishop) Date: Thu, 24 Jan 2008 13:05:28 +0700 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <47974E23.8000009@egenix.com> References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> <47973D19.2090904@stuartbishop.net> <47974E23.8000009@egenix.com> Message-ID: <47982AA8.5090502@stuartbishop.net> M.-A. Lemburg wrote: >> If we use an object, these issues go away: > > I'm not sure I understand... a tuple *is* an object after all :-) An object we can't define the constructor of. > Why does '' get converted to None on output ? Because if, say, on MySQL you do """XA PREPARE 'foo'""", MySQL will fill in the branchid and formatid with defaults - in MySQL's case '' and 1 respectively. > The database module > should not try to change the object in any way (regardless of whether > it's a string, tuple, custom sequence like object, etc.). At least > that's the theory. > > Or is this a side-effect of MySQL doing some internal mapping of > the tuple contents to some internal table ? The databases that support XA style xids have to be able to round trip with the defined C data structure. This structure is the formatid, the length of the global transaction id, the length of the branch id, and an array of bytes containing the concatenated ids. In this structure there is no way to differentiate a NULL from an empty string or a NULL formatid from whatever integer you map NULL to. 
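The round-trip problem can be illustrated with a small sketch that packs the parts the way the structure described above stores them. The helper names pack_xid and unpack_xid are hypothetical, and the layout string assumes 32-bit little-endian longs purely for the illustration (the real C structure uses the platform's native long); the default format id of 1 is the MySQL default mentioned earlier in this thread.

```python
import struct

# Assumption: 32-bit little-endian longs. The real C struct uses the
# platform's native "long", so sizes and byte order vary by platform.
XID_LAYOUT = "<lll128s"
DEFAULT_FORMAT_ID = 1  # the MySQL default mentioned in this thread


def pack_xid(gtrid, bqual=None, format_id=None):
    """Pack the parts the way the C structure stores them."""
    g = (gtrid or "").encode("utf-8")
    b = (bqual or "").encode("utf-8")
    if len(g) > 64 or len(b) > 64:
        raise ValueError("gtrid and bqual are limited to 64 bytes each")
    if format_id is None:
        format_id = DEFAULT_FORMAT_ID
    # struct pads the 128-byte data field with NUL bytes.
    return struct.pack(XID_LAYOUT, format_id, len(g), len(b), g + b)


def unpack_xid(raw):
    """Recover (gtrid, bqual, format_id) from the packed structure."""
    format_id, glen, blen, data = struct.unpack(XID_LAYOUT, raw)
    gtrid = data[:glen].decode("utf-8")
    bqual = data[glen:glen + blen].decode("utf-8")
    return (gtrid, bqual, format_id)


# None and '' (and None and the default format id) pack identically,
# so the original spelling cannot be recovered after a round trip.
assert pack_xid("g", None, None) == pack_xid("g", "", 1)
print(unpack_xid(pack_xid("g", None, None)))  # prints ('g', '', 1)
```

Whatever the caller passed in, only one spelling can come back out, which is why a plain tuple containing None cannot compare equal to what recover() returns.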
I guess validation of the xid could be done by the driver in tpc_begin(), tpc_commit(), tpc_rollback() and an exception raised if the driver detects that round tripping via the database is not possible. > I think we shouldn't restrict the TM by specifying a particular > object. After all, the DB-API is about the RM, not the TM. I don't follow this. We have to specify what object can be passed to tpc_begin and is returned from tpc_recover. The only issue is if it is if we force this to be a 3-tuple or whatever the driver decides to return from a module level Xid() method. > However, it may be worthwhile to have the RM at least peek > into the XID object and that's why I think we should require > the XID object to implement the __getitem__ protocol and > have the first three positions defined as (format id, > global transaction id, branch id). I wouldn't say 'may be worthwhile'. I'd go for 'is essential'. If you can't inspect the results from tpc_recover(), the method is pointless. -- Stuart Bishop http://www.stuartbishop.net/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/db-sig/attachments/20080124/ce425f43/attachment.pgp From stuart at stuartbishop.net Thu Jan 24 08:21:29 2008 From: stuart at stuartbishop.net (Stuart Bishop) Date: Thu, 24 Jan 2008 14:21:29 +0700 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> Message-ID: <47983C79.4080700@stuartbishop.net> James Henstridge wrote: > Proposal 1: > * Plain string IDs should work fine as transaction identifiers for > applications built from scratch with that assumption: they would > need to identify the global and branch parts in their own way. 
> > * A plain string can be stuffed inside an XA style transaction > identifier, even if it isn't making use of all the different > components. > > * Therefore, all methods accepting transaction IDs should accept > strings. > > * As some transaction IDs in the database might not match this simple > form, there are two options for the recover() method: > 1. return a special object that represents the transaction, which > will be accepted by commit()/rollback(). How string-like must > these objects be? > 2. omit such transaction IDs from the result. > > * For databases that support more structured transaction IDs (such as > those used by XA), the 2PC methods may accept objects other than > strings. > > Proposal 2: > > * Many databases follow the XA specification, so it makes sense to use > transaction identifiers structured in the same way. > > * For databases that do not use XA-style transaction IDs, it is > usually possible to serialise such an ID into a form that it can > work with. > > * Therefore, all methods accepting transaction IDs should accept > 3-sequences of the form (formatID, gtrid, bqual). > > * For databases using non-XA transaction IDs, it is possible that some > transaction IDs might exist that do not match the serialised form. > The recover() method has two options: > 1. return a special object representing the ID that will be > accepted by commit()/rollback(). Such an object should act > like a 3-sequence. > 2. omit such transaction IDs from the result. > > * For databases not using XA-style transactions, the 2PC methods may > accept objects other than 3-sequences as transaction IDs. > > > Both of these proposals seem to get rid of the main points of contention: > * removes the xid() constructor from the spec. > * allow use of simple objects (strings or tuples) as transaction IDs > * provides an obvious way to expose database-specific transaction IDs. I wouldn't call any of these a point of contention. They where points of discussion. 
Attempting to remove the xid() constructor from the spec is
premature when people were just considering if tuples can be used instead.

I don't think omitting transaction ids from tpc_recover() is acceptable.
Doing so means you can't write a transaction manager that plays nicely in a
more complex environment where components may not be under our direct
control, let alone written in Python and using this API. My use case here is
a reaper script that detects and handles or reports lost transactions.

Here is an edge case with proposal 1. Here, con happens to be a connection
to a MySQL database. Which Xid represents the prepared transaction?

>>> con.tpc_begin('foo')
>>> con.tpc_prepare()
>>> con.tpc_recover()
[, , ]

You could try fixing this by returning a heterogeneous list, but I think
this is just making the hole deeper:

>>> con.tpc_begin('foo')
>>> con.tpc_prepare()
>>> con.tpc_recover()
['foo', , ]

Proposal 2 seems the better option. I think we need to specify that the
3-tuple cannot contain None values.

I personally feel that an Xid() constructor makes things more readable. It
also means we can have driver-specific defaults for the format id rather
than no default.

tpc_begin(Xid('foo', 'bar', 1)) vs. tpc_begin(('foo', 'bar', 1))
tpc_begin(Xid('foo', 'bar')) vs. tpc_begin(('foo', 'bar', 1))
tpc_begin(Xid('foo')) vs. tpc_begin(('foo', '', 1))

--
Stuart Bishop
http://www.stuartbishop.net/

From james at jamesh.id.au Thu Jan 24 09:50:32 2008
From: james at jamesh.id.au (James Henstridge)
Date: Thu, 24 Jan 2008 17:50:32 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <47983C79.4080700@stuartbishop.net> References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> <47983C79.4080700@stuartbishop.net> Message-ID: On 24/01/2008, Stuart Bishop wrote: > James Henstridge wrote: > > > Proposal 1: > > * Plain string IDs should work fine as transaction identifiers for > > applications built from scratch with that assumption: they would > > need to identify the global and branch parts in their own way. > > > > * A plain string can be stuffed inside an XA style transaction > > identifier, even if it isn't making use of all the different > > components. > > > > * Therefore, all methods accepting transaction IDs should accept > > strings. > > > > * As some transaction IDs in the database might not match this simple > > form, there are two options for the recover() method: > > 1. return a special object that represents the transaction, which > > will be accepted by commit()/rollback(). How string-like must > > these objects be? > > 2. omit such transaction IDs from the result. > > > > * For databases that support more structured transaction IDs (such as > > those used by XA), the 2PC methods may accept objects other than > > strings. > > > > Proposal 2: > > > > * Many databases follow the XA specification, so it makes sense to use > > transaction identifiers structured in the same way. > > > > * For databases that do not use XA-style transaction IDs, it is > > usually possible to serialise such an ID into a form that it can > > work with. > > > > * Therefore, all methods accepting transaction IDs should accept > > 3-sequences of the form (formatID, gtrid, bqual). > > > > * For databases using non-XA transaction IDs, it is possible that some > > transaction IDs might exist that do not match the serialised form. > > The recover() method has two options: > > 1. return a special object representing the ID that will be > > accepted by commit()/rollback(). 
Such an object should act > > like a 3-sequence. > > 2. omit such transaction IDs from the result. > > > > * For databases not using XA-style transactions, the 2PC methods may > > accept objects other than 3-sequences as transaction IDs. > > > > > > Both of these proposals seem to get rid of the main points of contention: > > * removes the xid() constructor from the spec. > > * allow use of simple objects (strings or tuples) as transaction IDs > > * provides an obvious way to expose database-specific transaction IDs. > > I wouldn't call any of these a point of contention. They where points of > discussion. Attempting to remove the xid() constructor from the spec is > premature when people where just considering if tuples can be used instead. > > I don't think omitting transaction ids from tpc_recover() is acceptable. > Doing so means you can't write a transaction manager that plays nicely in a > more complex environment where components may not be under our direct > control, let alone written in Python and using ths API. My use case here is > a reaper script that detects and handles or reports lost transactions. > > Here is an edge case with proposal 1. Here, con happens to be a connection > to a MySQL database. Which Xid represents the prepared transaction? > > >>> con.tpc_begin('foo') > >>> con.tpc_prepare() > >>> con.tpc_recover() > [, , ] If we were going with proposal 1 (defaulting to strings as transaction IDs), it would be the one that compares equal to "foo". The exact answer would depend on how the database adapter was implemented. > You could try fixing this by returning a heterogeneous list, but I think > this is just making the hole deeper: > > >>> con.tpc_begin('foo') > >>> con.tpc_prepare() > >>> con.tpc_recover() > ['foo', , ] In this case, the answer is still "the one that compares equal to 'foo'". > Proposal 2 seems the better option. I think we need to specify that the > 3-tuple cannot contain None values. 
I suppose working with transaction IDs that couldn't be deserialised
might be easier with proposal 2. For example, it could provide the raw
ID in one part and leave the other two None.

For proposal 2, I think we should stick to XA-compatible IDs. That is,
formatID a number >= 0, and the global ID and branch qualifier as
strings no longer than 64 characters each.

> I personally feel that an Xid() constructor makes things more readable. It
> also means we can have driver specific defaults for the format id rather
> than no default.
>
> tpc_begin(Xid('foo', 'bar', 1)) vs. tpc_begin(('foo', 'bar', 1))
> tpc_begin(Xid('foo', 'bar')) vs. tpc_begin(('foo', 'bar', 1))
> tpc_begin(Xid('foo')) vs. tpc_begin(('foo', '', 1))

I don't know if adapter-specific defaults make sense. Perhaps pick the
defaults from MySQL?

"""
As indicated by the syntax, bqual and formatID are optional. The
default bqual value is '' if not given. The default formatID value is
1 if not given.
"""

If we do have a transaction ID constructor, I think it should be a
method on the connection. You can make use of pretty much the entire
DB-API using just a connection as an entry point (especially if the
exceptions are provided as connection attributes). It seems sensible
to do the same here.

James.

From fog at initd.org Thu Jan 24 09:58:59 2008
From: fog at initd.org (Federico Di Gregorio)
Date: Thu, 24 Jan 2008 09:58:59 +0100
Subject: [DB-SIG] XID format (was: Two-phase commit API proposal)
In-Reply-To: <47983C79.4080700@stuartbishop.net>
References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> <47983C79.4080700@stuartbishop.net>
Message-ID: <1201165139.6139.14.camel@mila.office.dinunzioedigregorio>

The problem here seems to be that simple strings should be supported
(xids are, in the end, simple SQL strings for most backends) while it
should be possible to access the single parts (format, gtrid, bqual) to
play well with the transaction managers.
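
As a rough illustration of the two views described above (one opaque
string for the backend, three separate parts for the transaction manager),
here is a minimal Python sketch. The function names serialise_xid and
parse_xid and the "format_hex_hex" encoding are assumptions made up for
this example; no driver is known to use them. The limits checked are the
XA-style ones mentioned in this thread.

```python
import binascii
import re

MAX_PART_LEN = 64  # XA-style limit on gtrid and bqual, as discussed above


def _hex(text):
    """Hex-encode a string so the '_' separator can never clash with it."""
    return binascii.hexlify(text.encode("utf-8")).decode("ascii")


def serialise_xid(format_id, gtrid, bqual):
    """Pack the three parts into one plain string (hypothetical encoding)."""
    if not isinstance(format_id, int) or format_id < 0:
        raise ValueError("format_id must be an integer >= 0")
    for name, part in (("gtrid", gtrid), ("bqual", bqual)):
        if not isinstance(part, str) or len(part) > MAX_PART_LEN:
            raise ValueError("%s must be a string of at most %d characters"
                             % (name, MAX_PART_LEN))
    return "%d_%s_%s" % (format_id, _hex(gtrid), _hex(bqual))


def parse_xid(full):
    """Return (format_id, gtrid, bqual), or None for a foreign ID that
    was not produced by serialise_xid()."""
    match = re.fullmatch(r"([0-9]+)_([0-9a-f]*)_([0-9a-f]*)", full)
    if match is None:
        return None
    try:
        gtrid = bytes.fromhex(match.group(2)).decode("utf-8")
        bqual = bytes.fromhex(match.group(3)).decode("utf-8")
    except ValueError:  # odd-length hex or bytes that are not valid UTF-8
        return None
    return (int(match.group(1)), gtrid, bqual)


full = serialise_xid(1, "foo", "")
print(full)               # 1_666f6f_
print(parse_xid(full))    # (1, 'foo', '')
print(parse_xid("foo"))   # None: a string some other tool created
```

An ID that does not parse is still a usable string, so nothing has to be
hidden from the caller; it simply has no recoverable parts.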
The thing to notice is that even if you mix the two styles, after you
compose the parts into the final xid, no two xids can be the same string.
So, what about using a 4-tuple?

    (full, format, gtrid, bqual)

The application layer can pass just the 'full' parameter (a string)
representing the xid directly, or set 'full' to None and let the driver
build the string out of the other three parts (and fill in 'full' for
later reference). recover() returns a tuple with the 'full' slot always
populated and, if possible, it also fills the other three slots by
parsing the xid. This way one has access to the full xid and, if it was
built from parts, to the single parts too. A transaction manager can
discover whether a recovered transaction belongs to it by checking the
'format' (it can be None), and there is no need to drop xids from
recover() calls.

federico

--
Federico Di Gregorio http://people.initd.org/fog
Debian GNU/Linux Developer fog at debian.org
INIT.D Developer fog at initd.org
Unfortunately, the creationists have not gone extinct yet.
-- vodka

From dieter at handshake.de Tue Jan 22 19:48:52 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Tue, 22 Jan 2008 19:48:52 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To:
References: <4795C6CE.8060301@egenix.com>
Message-ID: <18326.14996.49402.907419@gargle.gargle.HOWL>

James Henstridge wrote at 2008-1-22 20:33 +0900:
> ...
>I do see a use for the branch qualifier though. In a distributed
>transaction, each resource should have a different transaction ID

Why? Why is it not equally good to use a common transaction id
for all resource managers?
>that >share a common global transaction ID but separate branch qualifiers. > >As transaction IDs are global within database clusters for some >backends (PostgreSQL, MySQL and probably others), the branch qualifier >is necessary if two databases from the cluster are used in the global >transaction. They refer to the same transaction -- even when several databases in a cluster are affected. The transaction as a whole will want to get prepared, committed, rolledback... -- Dieter From james at jamesh.id.au Thu Jan 24 15:16:29 2008 From: james at jamesh.id.au (James Henstridge) Date: Thu, 24 Jan 2008 23:16:29 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <18326.18926.708382.939543@gargle.gargle.HOWL> References: <4795C6CE.8060301@egenix.com> <18326.14996.49402.907419@gargle.gargle.HOWL> <47964482.5030309@egenix.com> <18326.18926.708382.939543@gargle.gargle.HOWL> Message-ID: On 23/01/2008, Dieter Maurer wrote: > M.-A. Lemburg wrote at 2008-1-22 20:31 +0100: > > ... > >Branch ids are used for e.g. multiple connections of the same RM > >engaging in a global transaction. Each of those connections gets > >its own branch id. > > But using multiple connections to the same RM seems to > be an error in the first place. > > Assume that a resource "R" is locked via connection "C1". > Assume than that "R" is requested via connection "C2". > > If "C1 == C2", then the RM can see that the resource is already > assigned to the connection and there is no blocking. > > Otherwise, the RM has not chance to recognize this and > the request will be blocked until the transaction is commited > or rolled back. There is quite a high chance, that since the > "R" request is blocked, there will be no commit/roll back.... Here is a concrete example: 1. create two databases on a single PostgreSQL install. 2. write an application that connects to each database (which implies two connections). 3. 
try to prepare transactions on each connection using the same transaction identifier. One of the transactions will fail with a "transaction identifier is already in use" error. While each connection is accessing independent resources, the transaction ID namespace is shared by all databases in the cluster. Now if you include a branch qualifier in the transaction IDs the problem is avoided. The MySQL documentation leads me to believe it behaves similarly. James. From mal at egenix.com Thu Jan 24 15:33:10 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 24 Jan 2008 15:33:10 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <47982AA8.5090502@stuartbishop.net> References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> <47973D19.2090904@stuartbishop.net> <47974E23.8000009@egenix.com> <47982AA8.5090502@stuartbishop.net> Message-ID: <4798A1A6.7080701@egenix.com> On 2008-01-24 07:05, Stuart Bishop wrote: > M.-A. Lemburg wrote: > >>> If we use an object, these issues go away: >> I'm not sure I understand... a tuple *is* an object after all :-) > > An object we can't define the constructor of. > >> Why does '' get converted to None on output ? > > Because if, say, on MySQL you do """XA PREPARE 'foo'""", MySQL will fill in > the branchid and formatid with defaults - in MySQL's case '' and 1 > respectively. > >> The database module >> should not try to change the object in any way (regardless of whether >> it's a string, tuple, custom sequence like object, etc.). At least >> that's the theory. >> >> Or is this a side-effect of MySQL doing some internal mapping of >> the tuple contents to some internal table ? > > The databases that support XA style xids have to be able to round trip with > the defined C data structure. 
> This structure is the formatid, the length of
> the global transaction id, the length of the branch id, and an array of
> bytes containing the concatenated ids. In this structure there is no way to
> differentiate a NULL from an empty string or a NULL formatid from whatever
> integer you map NULL to.
>
> I guess validation of the xid could be done by the driver in tpc_begin(),
> tpc_commit(), tpc_rollback() and an exception raised if the driver detects
> that round tripping via the database is not possible.

It is the database module's responsibility to make sure that
the xid can round-trip. If we restrict the three entries of the
xid tuple to be strings, this should be easily possible by e.g.

 * combining the three strings into one and decoding this
   combination again in .tpc_recover()

 * mapping the components to ids/values that the database
   backend can handle and undoing this mapping in .tpc_recover()

 * not passing the ids to the database backend at all and
   managing the xid at the database module level

>> I think we shouldn't restrict the TM by specifying a particular
>> object. After all, the DB-API is about the RM, not the TM.
>
> I don't follow this. We have to specify what object can be passed to
> tpc_begin and is returned from tpc_recover. The only issue is if it is if we
> force this to be a 3-tuple or whatever the driver decides to return from a
> module level Xid() method.

The important aspect is that the TM must be able to get back
an object that it can compare against whatever it originally
passed to the database module.

Perhaps we could have the TM do something along these lines:

    # From the TM:
    xid = conn.xid(fid, gid, bid)
    conn.tpc_begin(xid)
    conn.tpc_prepare(xid)
    ...

    # See whether there are pending transactions:
    xids = conn.tpc_recover()

    # Recover only those transactions that the TM has initiated:
    for (fid, gid, bid) in xids:
        if tm_check_xid(fid, gid, bid):
            tm_do_recovery(fid, gid, bid)

>> However, it may be worthwhile to have the RM at least peek
>> into the XID object and that's why I think we should require
>> the XID object to implement the __getitem__ protocol and
>> have the first three positions defined as (format id,
>> global transaction id, branch id).
>
> I wouldn't say 'may be worthwhile'. I'd go for 'is essential'. If you can't
> inspect the results from tpc_recover(), the method is pointless.

Agreed.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Jan 24 2008)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

From mal at egenix.com Thu Jan 24 15:36:30 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 24 Jan 2008 15:36:30 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To:
References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com>
Message-ID: <4798A26E.2030201@egenix.com>

On 2008-01-24 02:44, James Henstridge wrote:
> I've had a bit more time to think about this, and have two proposals
> on how to handle transaction IDs. I think they offer equivalent
> functionality, so the choice comes down to what we want the API to
> look like.
> > Proposal 1: > * Plain string IDs should work fine as transaction identifiers for > applications built from scratch with that assumption: they would > need to identify the global and branch parts in their own way. > > * A plain string can be stuffed inside an XA style transaction > identifier, even if it isn't making use of all the different > components. > > * Therefore, all methods accepting transaction IDs should accept > strings. > > * As some transaction IDs in the database might not match this simple > form, there are two options for the recover() method: > 1. return a special object that represents the transaction, which > will be accepted by commit()/rollback(). How string-like must > these objects be? > 2. omit such transaction IDs from the result. > > * For databases that support more structured transaction IDs (such as > those used by XA), the 2PC methods may accept objects other than > strings. > > > Proposal 2: > > * Many databases follow the XA specification, so it makes sense to use > transaction identifiers structured in the same way. > > * For databases that do not use XA-style transaction IDs, it is > usually possible to serialise such an ID into a form that it can > work with. > > * Therefore, all methods accepting transaction IDs should accept > 3-sequences of the form (formatID, gtrid, bqual). > > * For databases using non-XA transaction IDs, it is possible that some > transaction IDs might exist that do not match the serialised form. > The recover() method has two options: > 1. return a special object representing the ID that will be > accepted by commit()/rollback(). Such an object should act > like a 3-sequence. > 2. omit such transaction IDs from the result. > > * For databases not using XA-style transactions, the 2PC methods may > accept objects other than 3-sequences as transaction IDs. > > > Both of these proposals seem to get rid of the main points of contention: > * removes the xid() constructor from the spec. 
> * allow use of simple objects (strings or tuples) as transaction IDs > * provides an obvious way to expose database-specific transaction IDs. I'm coming to agree with Stuart that the conn.xid() might actually help us with this. So I'd be in favor of proposal 2 and an .xid() constructor that returns an object which provides a 3-sequence interface, e.g. # Wrap the IDs for use by the database module xid = conn.xid(fid, gid, bid) # Use the xid conn.tpc_begin(xid) conn.tpc_prepare(xid) ... # Unwrap the IDs: fid, gid, bid = xid -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 24 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From mal at egenix.com Thu Jan 24 15:41:09 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 24 Jan 2008 15:41:09 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <4798A26E.2030201@egenix.com> References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> <4798A26E.2030201@egenix.com> Message-ID: <4798A385.80008@egenix.com> On 2008-01-24 15:36, M.-A. Lemburg wrote: > On 2008-01-24 02:44, James Henstridge wrote: >> I've had a bit more time to think about this, and have two proposals >> on how to handle transaction IDs. I think they offer equivalent >> functionality, so the choice comes down to what we want the API to >> look like. 
>> >> Proposal 1: >> * Plain string IDs should work fine as transaction identifiers for >> applications built from scratch with that assumption: they would >> need to identify the global and branch parts in their own way. >> >> * A plain string can be stuffed inside an XA style transaction >> identifier, even if it isn't making use of all the different >> components. >> >> * Therefore, all methods accepting transaction IDs should accept >> strings. >> >> * As some transaction IDs in the database might not match this simple >> form, there are two options for the recover() method: >> 1. return a special object that represents the transaction, which >> will be accepted by commit()/rollback(). How string-like must >> these objects be? >> 2. omit such transaction IDs from the result. >> >> * For databases that support more structured transaction IDs (such as >> those used by XA), the 2PC methods may accept objects other than >> strings. >> >> >> Proposal 2: >> >> * Many databases follow the XA specification, so it makes sense to use >> transaction identifiers structured in the same way. >> >> * For databases that do not use XA-style transaction IDs, it is >> usually possible to serialise such an ID into a form that it can >> work with. >> >> * Therefore, all methods accepting transaction IDs should accept >> 3-sequences of the form (formatID, gtrid, bqual). >> >> * For databases using non-XA transaction IDs, it is possible that some >> transaction IDs might exist that do not match the serialised form. >> The recover() method has two options: >> 1. return a special object representing the ID that will be >> accepted by commit()/rollback(). Such an object should act >> like a 3-sequence. >> 2. omit such transaction IDs from the result. >> >> * For databases not using XA-style transactions, the 2PC methods may >> accept objects other than 3-sequences as transaction IDs. 
>> >> >> Both of these proposals seem to get rid of the main points of contention: >> * removes the xid() constructor from the spec. >> * allow use of simple objects (strings or tuples) as transaction IDs >> * provides an obvious way to expose database-specific transaction IDs. > > I'm coming to agree with Stuart that the conn.xid() might actually > help us with this. > > So I'd be in favor of proposal 2 and an .xid() constructor that > returns an object which provides a 3-sequence interface, e.g. > > # Wrap the IDs for use by the database module > xid = conn.xid(fid, gid, bid) > > # Use the xid > conn.tpc_begin(xid) > conn.tpc_prepare(xid) > ... > > # Unwrap the IDs: > fid, gid, bid = xid Plus require that all three components are strings to avoid the None issue. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 24 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From dieter at handshake.de Thu Jan 24 18:47:03 2008 From: dieter at handshake.de (Dieter Maurer) Date: Thu, 24 Jan 2008 18:47:03 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: <4795C6CE.8060301@egenix.com> <18326.14996.49402.907419@gargle.gargle.HOWL> <47964482.5030309@egenix.com> <18326.18926.708382.939543@gargle.gargle.HOWL> Message-ID: <18328.53015.139548.358620@gargle.gargle.HOWL> James Henstridge wrote at 2008-1-24 23:16 +0900: >On 23/01/2008, Dieter Maurer wrote: > ... >Here is a concrete example: > >1. 
create two databases on a single PostgreSQL install.
>2. write an application that connects to each database (which implies
>two connections).
>3. try to prepare transactions on each connection using the same
>transaction identifier.
>
>One of the transactions will fail with a "transaction identifier is
>already in use" error. While each connection is accessing independent
>resources, the transaction ID namespace is shared by all databases in
>the cluster.
>
>Now if you include a branch qualifier in the transaction IDs the
>problem is avoided. The MySQL documentation leads me to believe it
>behaves similarly.

This description suggests that the TM provides the "main" transaction
identifier and the resource manager could add the branch part.

--
Dieter

From james at jamesh.id.au Fri Jan 25 01:48:10 2008
From: james at jamesh.id.au (James Henstridge)
Date: Fri, 25 Jan 2008 09:48:10 +0900
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To: <18328.53015.139548.358620@gargle.gargle.HOWL>
References: <4795C6CE.8060301@egenix.com> <18326.14996.49402.907419@gargle.gargle.HOWL> <47964482.5030309@egenix.com> <18326.18926.708382.939543@gargle.gargle.HOWL> <18328.53015.139548.358620@gargle.gargle.HOWL>
Message-ID:

On 25/01/2008, Dieter Maurer wrote:
> This description suggests that the TM provides the "main" transaction
> identifier and the resource manager could add the branch part.

I guess the reason why the TM generally assigns the branch qualifiers
in XA systems is that it is in the best place to do so: it can simply
issue sequential numbers to each resource that joins the transaction.
An RM has no knowledge of what other branch qualifiers have been used,
so it would need to do something more complex.

Now whether the TM or RM generates the branch qualifier, I'd expect
that the TM needs to know all the full transaction IDs if it is to
properly handle recovery.
If the RM is generating the ID, then the TM would now need some way to retrieve that ID. James. From james at jamesh.id.au Fri Jan 25 01:54:40 2008 From: james at jamesh.id.au (James Henstridge) Date: Fri, 25 Jan 2008 09:54:40 +0900 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: <4798A385.80008@egenix.com> References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> <4798A26E.2030201@egenix.com> <4798A385.80008@egenix.com> Message-ID: On 24/01/2008, M.-A. Lemburg wrote: > On 2008-01-24 15:36, M.-A. Lemburg wrote: > > On 2008-01-24 02:44, James Henstridge wrote: > >> I've had a bit more time to think about this, and have two proposals > >> on how to handle transaction IDs. I think they offer equivalent > >> functionality, so the choice comes down to what we want the API to > >> look like. > >> > >> Proposal 1: > >> * Plain string IDs should work fine as transaction identifiers for > >> applications built from scratch with that assumption: they would > >> need to identify the global and branch parts in their own way. > >> > >> * A plain string can be stuffed inside an XA style transaction > >> identifier, even if it isn't making use of all the different > >> components. > >> > >> * Therefore, all methods accepting transaction IDs should accept > >> strings. > >> > >> * As some transaction IDs in the database might not match this simple > >> form, there are two options for the recover() method: > >> 1. return a special object that represents the transaction, which > >> will be accepted by commit()/rollback(). How string-like must > >> these objects be? > >> 2. omit such transaction IDs from the result. > >> > >> * For databases that support more structured transaction IDs (such as > >> those used by XA), the 2PC methods may accept objects other than > >> strings. 
> >> > >> > >> Proposal 2: > >> > >> * Many databases follow the XA specification, so it makes sense to use > >> transaction identifiers structured in the same way. > >> > >> * For databases that do not use XA-style transaction IDs, it is > >> usually possible to serialise such an ID into a form that it can > >> work with. > >> > >> * Therefore, all methods accepting transaction IDs should accept > >> 3-sequences of the form (formatID, gtrid, bqual). > >> > >> * For databases using non-XA transaction IDs, it is possible that some > >> transaction IDs might exist that do not match the serialised form. > >> The recover() method has two options: > >> 1. return a special object representing the ID that will be > >> accepted by commit()/rollback(). Such an object should act > >> like a 3-sequence. > >> 2. omit such transaction IDs from the result. > >> > >> * For databases not using XA-style transactions, the 2PC methods may > >> accept objects other than 3-sequences as transaction IDs. > >> > >> > >> Both of these proposals seem to get rid of the main points of contention: > >> * removes the xid() constructor from the spec. > >> * allow use of simple objects (strings or tuples) as transaction IDs > >> * provides an obvious way to expose database-specific transaction IDs. > > > > I'm coming to agree with Stuart that the conn.xid() might actually > > help us with this. > > > > So I'd be in favor of proposal 2 and an .xid() constructor that > > returns an object which provides a 3-sequence interface, e.g. So is the 3-sequence behaviour intended to allow application code to inspect a transaction ID, or are tpc_begin(), etc expected to accept arbitrary 3-sequences too? > > > > # Wrap the IDs for use by the database module > > xid = conn.xid(fid, gid, bid) > > > > # Use the xid > > conn.tpc_begin(xid) > > conn.tpc_prepare(xid) > > ... > > > > # Unwrap the IDs: > > fid, gid, bid = xid > > Plus require that all three components are strings to avoid the > None issue. 
If we are going with 3-part XA-style transaction IDs, the format ID should be a non-negative 32-bit integer and the other two should be strings with a maximum length of 64 bytes (possibly with some restrictions on allowed characters?). James. From mal at egenix.com Fri Jan 25 10:45:52 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 25 Jan 2008 10:45:52 +0100 Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?) In-Reply-To: References: <4795C6CE.8060301@egenix.com> <4795D9E4.2030509@egenix.com> <479704EE.8030404@egenix.com> <4798A26E.2030201@egenix.com> <4798A385.80008@egenix.com> Message-ID: <4799AFD0.3000905@egenix.com> On 2008-01-25 01:54, James Henstridge wrote: >>>> Proposal 2: >>>> >>>> * Many databases follow the XA specification, so it makes sense to use >>>> transaction identifiers structured in the same way. >>>> >>>> * For databases that do not use XA-style transaction IDs, it is >>>> usually possible to serialise such an ID into a form that it can >>>> work with. >>>> >>>> * Therefore, all methods accepting transaction IDs should accept >>>> 3-sequences of the form (formatID, gtrid, bqual). >>>> >>>> * For databases using non-XA transaction IDs, it is possible that some >>>> transaction IDs might exist that do not match the serialised form. >>>> The recover() method has two options: >>>> 1. return a special object representing the ID that will be >>>> accepted by commit()/rollback(). Such an object should act >>>> like a 3-sequence. >>>> 2. omit such transaction IDs from the result. >>>> >>>> * For databases not using XA-style transactions, the 2PC methods may >>>> accept objects other than 3-sequences as transaction IDs. >>>> >>>> >>>> Both of these proposals seem to get rid of the main points of contention: >>>> * removes the xid() constructor from the spec. 
>>>> * allow use of simple objects (strings or tuples) as transaction IDs >>>> * provides an obvious way to expose database-specific transaction IDs. >>> I'm coming to agree with Stuart that the conn.xid() might actually >>> help us with this. >>> >>> So I'd be in favor of proposal 2 and an .xid() constructor that >>> returns an object which provides a 3-sequence interface, e.g. > > So is the 3-sequence behaviour intended to allow application code to > inspect a transaction ID, or are tpc_begin(), etc expected to accept > arbitrary 3-sequences too? I'd say we put the .xid() as interface between the TM and the .tpc_*() methods, like Stuart suggested. That way, the TM has a clear interface to construct an XID interface, while the RM has control over what is passed to its .tpc_*() methods and can also use other means of creating these object (if needed). By using the 3-sequence interface, the TM can also easily recover the data it passed to the .xid() constructor when getting back data from .tpc_recover(), so it is round-trip safe. >>> # Wrap the IDs for use by the database module >>> xid = conn.xid(fid, gid, bid) >>> >>> # Use the xid >>> conn.tpc_begin(xid) >>> conn.tpc_prepare(xid) >>> ... >>> >>> # Unwrap the IDs: >>> fid, gid, bid = xid >> Plus require that all three components are strings to avoid the >> None issue. > > If we are going with 3-part XA-style transaction IDs, the format ID > should be a non-negative 32-bit integer and the other two should be > strings with a maximum length of 64 bytes (possibly with some > restrictions on allowed characters?). Ok, if that's the GCD of what backends use, let's go with that. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 25 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/
________________________________________________________________________
:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

From dieter at handshake.de Fri Jan 25 20:47:29 2008
From: dieter at handshake.de (Dieter Maurer)
Date: Fri, 25 Jan 2008 20:47:29 +0100
Subject: [DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)
In-Reply-To:
References: <4795C6CE.8060301@egenix.com> <18326.14996.49402.907419@gargle.gargle.HOWL> <47964482.5030309@egenix.com> <18326.18926.708382.939543@gargle.gargle.HOWL> <18328.53015.139548.358620@gargle.gargle.HOWL>
Message-ID: <18330.15569.288412.903184@gargle.gargle.HOWL>

James Henstridge wrote at 2008-1-25 09:48 +0900:
>On 25/01/2008, Dieter Maurer wrote:
>> This description suggests that the TM provides the "main" transaction
>> identifier and the resource manager could add the branch part.
>
>I guess the reason why the TM generally assigns the branch qualifiers
>in XA systems is that it is in the best place to do so: it can simply
>issue sequential numbers to each resource that joins the transaction.
>An RM has no knowledge of what other branch qualifiers have been used
>so would need to do something more comlpex.

It could identify itself -- and then there would be no need
to know other branch qualifiers.

>Now whether the TM or RM generates the branch qualifier, I'd expect
>that the TM needs to know all the full transaction IDs if it is to
>properly handle recovery. If the RM is generating the ID, then the TM
>would now need some way to retrieve that ID.

The "conn.xid" could provide the part identifying "conn".
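
A minimal sketch of that idea, assuming a hypothetical ToyConnection whose
xid() method fills in a branch qualifier naming the connection when the TM
does not supply one. ToyCluster only imitates a server with one shared
namespace for prepared transaction IDs (as described for PostgreSQL earlier
in the thread); none of these classes belong to a real driver.

```python
from collections import namedtuple

# A 3-sequence, so callers can unpack it: fid, gid, bid = xid
Xid = namedtuple("Xid", "format_id gtrid bqual")


class ToyCluster:
    """Imitates a server where all databases share one ID namespace."""

    def __init__(self):
        self.prepared = set()


class ToyConnection:
    """Hypothetical connection; only the pieces needed for the example."""

    def __init__(self, cluster, name):
        self.cluster = cluster
        self.name = name
        self._current = None

    def xid(self, format_id, gtrid, bqual=None):
        # Assumption: with no branch qualifier given, the connection
        # identifies itself, so the TM only supplies the global part.
        if bqual is None:
            bqual = self.name
        return Xid(format_id, gtrid, bqual)

    def tpc_begin(self, xid):
        self._current = tuple(xid)

    def tpc_prepare(self):
        if self._current in self.cluster.prepared:
            raise RuntimeError("transaction identifier is already in use")
        self.cluster.prepared.add(self._current)

    def tpc_recover(self):
        return sorted(Xid(*parts) for parts in self.cluster.prepared)


cluster = ToyCluster()
conn_a = ToyConnection(cluster, "db_a")
conn_b = ToyConnection(cluster, "db_b")

# The TM hands the same global ID to both connections; the IDs still
# differ because each connection contributes its own branch qualifier.
for conn in (conn_a, conn_b):
    conn.tpc_begin(conn.xid(1, "txn-42"))
    conn.tpc_prepare()

for fid, gid, bid in conn_a.tpc_recover():
    print(fid, gid, bid)
```

Because xid() returns the complete ID, the TM can keep it for recovery
without needing a separate way to ask the RM which qualifier it chose.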
--
Dieter

From szybalski at gmail.com Thu Jan 31 15:47:26 2008
From: szybalski at gmail.com (Lukasz Szybalski)
Date: Thu, 31 Jan 2008 08:47:26 -0600
Subject: [DB-SIG] db to db layout analysis
Message-ID: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>

Hello,
I just came across this database with over 60 tables and I need some
tool to analyze the tables. (find out keys, fields, properties, show
me relation to other tables etc.)

You guys know of something similar? python or not, command line or not

Thanks,
Lucas

From fabien.coutant at neuf.fr Thu Jan 31 18:55:05 2008
From: fabien.coutant at neuf.fr (Fabien COUTANT)
Date: Thu, 31 Jan 2008 18:55:05 +0100
Subject: [DB-SIG] db to db layout analysis
In-Reply-To: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
References: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
Message-ID: <20080131185505.929a8fb4.fabien.coutant@neuf.fr>

On Thu, 31 Jan 2008 08:47:26 -0600, Lukasz Szybalski wrote:

> Hello,
> I just came across this database with over 60 tables and I need some
> tool to analyze the tables. (find out keys, fields, properties, show
> me relation to other tables etc.)
>
> You guys know of something similar? python or not, command line or not

Hi,

Note that this is a general database question, not a Python-specific one
(which is the subject of this list).

However, I will suggest http://squirrel-sql.sourceforge.net/ if you can
accept a Java/GUI program... I use it daily and I think it has the
features you ask for. There's even a plugin that will make a drawing of
your table relationships.

--
Hope this helps,
Fabien.
From andy47 at halfcooked.com Thu Jan 31 21:18:13 2008
From: andy47 at halfcooked.com (Andy Todd)
Date: Fri, 01 Feb 2008 07:18:13 +1100
Subject: [DB-SIG] db to db layout analysis
In-Reply-To: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
References: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
Message-ID: <47A22D05.6080008@halfcooked.com>

Lukasz Szybalski wrote:
> Hello,
> I just came across this database with over 60 tables and I need some
> tool to analyze the tables. (find out keys, fields, properties, show
> me relation to other tables etc.)
>
> You guys know of something similar? python or not, command line or not
>
> Thanks,
> Lucas
> _______________________________________________
> DB-SIG maillist - DB-SIG at python.org
> http://mail.python.org/mailman/listinfo/db-sig

http://halfcooked.com/code/gerald

Regards,
Andy
--
From the desk of Andrew J Todd esq - http://www.halfcooked.com/

From Frederic.VanderElst at phgroup.com Thu Jan 31 22:55:46 2008
From: Frederic.VanderElst at phgroup.com (Frederic Vander Elst)
Date: Thu, 31 Jan 2008 21:55:46 +0000
Subject: [DB-SIG] db to db layout analysis
In-Reply-To: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
References: <804e5c70801310647sddae776tf69f28763477c177@mail.gmail.com>
Message-ID: <47A243E2.3020902@phgroup.com>

Lukasz

I have used Schema Spy (java, produces pretty html docs and drawings,
interpreting foreign keys, etc), and would heartily recommend it.

See http://schemaspy.sourceforge.net/

-f

Lukasz Szybalski wrote:
> Hello,
> I just came across this database with over 60 tables and I need some
> tool to analyze the tables. (find out keys, fields, properties, show
> me relation to other tables etc.)
>
> You guys know of something similar? python or not, command line or not
>
> Thanks,
> Lucas
> _______________________________________________
> DB-SIG maillist - DB-SIG at python.org
> http://mail.python.org/mailman/listinfo/db-sig
>

--
--------------------------
Frederic Vander Elst
pH, an Experian Company
www.phgroup.com
Direct Line: 020 7598 0320
Office Line: 020 7598 0310
Fax: 020 7598 0311
--------------------------