From gafton at rpath.com  Tue Sep 5 23:39:28 2006
From: gafton at rpath.com (Cristian Gafton)
Date: Tue, 5 Sep 2006 17:39:28 -0400 (EDT)
Subject: [DB-SIG] New take on PostgreSQL bindings for Python
Message-ID: 

A few days ago I started working on the PyGreSQL bindings by adding support for bind parameters, prepared statements and server side cursors.

To make a longer story shorter, as I progressed through this task, the changes I have been making to PyGreSQL became more and more extensive, modifying the original's behavior in incompatible ways. I didn't solve the challenge of preserving PyGreSQL's existing behavior, seeing how the native PostgreSQL bind parameter syntax ($1, $2, etc.) is very different from what PyGreSQL simulated (Python-style %s or %(var)s), although it would be possible to do with some regexp foo.

I have put the full details about what I did on my page here:
    http://blogs.conary.com/index.php/gafton

The bindings themselves (also including a demo.py that shows how I think they should be used) can be downloaded from here:
    http://people.rpath.com/~gafton/pgsql/

I have seen the recent discussions about the DB API 2.0 shortcomings when it comes to working with modern databases. I tend to agree that the current API serves more as a guideline - it specifies the bare minimum, on top of which everybody keeps reinventing the same extensions.

So, what I would like to see covered by a future DB API spec:

- bind parameters. The current pythonesque %(name)s specification is not ideal. It requires various levels of pain in escaping the arguments passed to a query. Most database backends accept bind parameters, albeit with different syntaxes. For code portability, I'd rather parse the SQL query and rewrite it to use the proper element for the bound parameter than do the crazy escaping on the user input that is currently being done.

  (As a side note, my bindings use PostgreSQL's native support to do things like cu.execute("select * from foo where id = $1", x), without having to worry about escaping x)

- parsed statements. On large loops this is a real gain. For lack of a better specification, I currently use something like:

      prepcu = db.prepare("select * from foo where id = $1")
      prepcu.execute(2)

  The prepared cursor statement has the same fetchone(), fetchall() and other properties of the regular cursors, except it only accepts parameters to its execute() and executemany() calls.

- server side cursors. Currently, most bindings for most databases have to decide what to do after a cursor.execute() call - do they automatically retrieve all the resulting rows in the client's memory, or do they retrieve it row by row, pinging the server before every retrieval to get more data (hey, not everybody is using Oracle ;^). The DB API has no support for controlling this in a consistent fashion, even though Python has solved the issue of dict.items() vs dict.iteritems() a long time ago. The application writers should have a choice on how the cursors will behave.

  (Again, in my bindings, I have added support for db.iteritems() to get a cursor that will retrieve rows from the server side in chunks instead of all at once. I left db.cursor() to return a cursor which will download all results into client memory after an execute - which seems to be the prevailing default)

Cristian

--
Cristian Gafton
rPath, Inc.
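To make the three items concrete, here is a rough sketch of how they read from application code, written against the experimental pgsql bindings described above (the connection arguments, the "foo" table and its columns are invented; db.prepare() and db.iteritems() are the names used in this post, not part of any released API or of the DB API spec):

    # Sketch only - "pgsql" refers to the experimental bindings above;
    # connection details and the sample table are made up.
    import pgsql

    db = pgsql.connect(host="localhost", dbname="test", user="test")

    # 1. Native bind parameters: the value travels separately from the SQL,
    #    so nothing in the argument needs to be escaped on the client side.
    cu = db.cursor()
    cu.execute("select * from foo where id = $1", 42)
    print cu.fetchall()

    # 2. Prepared statements: parse and plan once, execute many times.
    prepcu = db.prepare("insert into foo (id, name) values ($1, $2)")
    prepcu.executemany([(i, "row %d" % i) for i in range(1000)])

    # 3. Chunked (server side) cursor: rows are pulled from the server in
    #    batches instead of being loaded into client memory by execute();
    #    iteration over the cursor is assumed here for illustration.
    itcu = db.iteritems()
    itcu.execute("select * from foo")
    for row in itcu:
        print row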
From fog at initd.org  Wed Sep 6 00:37:56 2006
From: fog at initd.org (Federico Di Gregorio)
Date: Wed, 06 Sep 2006 00:37:56 +0200
Subject: [DB-SIG] New take on PostgreSQL bindings for Python
In-Reply-To: 
References: 
Message-ID: <1157495876.3792.16.camel@localhost>

On Tue, 05/09/2006 at 17.39 -0400, Cristian Gafton wrote:
> - server side cursors. Currently, most bindings for most databases have to
> decide what to do after an cursor.execute() call - do they automatically
> retrieve all the resulting rows in the client's memory, or do they
> retrieve it row by row, pinging the server before every retrieval to get
> more data (hey, not everybody using Oracle ;^). DB API has no support
> for controlling this in a consistent fashion, even though Python has
> solved the issue of dict.items() vs dict.iteritems() a long time ago.
> The application writers should have a choice on how the cursors will
> behave.

Sure. psycopg 2 uses a little extension to the dbapi and adds to cursor() an extra parameter: "name". If a cursor is named then a server side cursor with that name is automatically generated (and destroyed at the end of the current transaction) else, if name is None, a normal cursor is created. Then fetchXXX() methods do the right thing without the need to introduce extra methods.

federico

--
Federico Di Gregorio                    http://people.initd.org/fog
Debian GNU/Linux Developer              fog at debian.org
INIT.D Developer                        fog at initd.org
  Sei una bergogna. Vergonga. Vergogna.              -- Valentina

From gafton at rpath.com  Wed Sep 6 01:08:24 2006
From: gafton at rpath.com (Cristian Gafton)
Date: Tue, 5 Sep 2006 19:08:24 -0400 (EDT)
Subject: [DB-SIG] New take on PostgreSQL bindings for Python
In-Reply-To: <1157495876.3792.16.camel@localhost>
References: <1157495876.3792.16.camel@localhost>
Message-ID: 

On Wed, 6 Sep 2006, Federico Di Gregorio wrote:

>> - server side cursors. Currently, most bindings for most databases have to
>> decide what to do after an cursor.execute() call - do they automatically
>> retrieve all the resulting rows in the client's memory, or do they
>> retrieve it row by row, pinging the server before every retrieval to get
>> more data (hey, not everybody using Oracle ;^). DB API has no support
>> for controlling this in a consistent fashion, even though Python has
>> solved the issue of dict.items() vs dict.iteritems() a long time ago.
>> The application writers should have a choice on how the cursors will
>> behave.
>
> Sure. psycopg 2 uses a little extension to the dbapi and adds to
> cursor() an extra parameter: "name". If a cursor is named then a server
> side cursor with that name is automatically generated (and destroyed at
> the end of the current transaction) else, if name is None, a normal
> cursor is created. Then fetchXXX() methods do the right thing without
> the need to introduce extra methods.

I did not go that route because of the potential confusion on named parameters:

    cu.execute("select * from products "
               "where pname =:name and store =:store",
               name = "foo", store = 37)

That - to me - feels much more natural, and makes it easy for the execute() method to treat both *args and **kwargs simply as bind parameters.

Cristian

--
Cristian Gafton
rPath, Inc.
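For reference, the psycopg 2 extension Federico describes looks roughly like this from application code (a sketch: the DSN, the table and the process() helper are invented; the name argument to cursor() is the extension being discussed):

    import psycopg2

    conn = psycopg2.connect("dbname=test user=test")

    # A named cursor becomes a server side cursor; an unnamed one stays a
    # normal client side cursor.  The fetch methods are the same either way.
    cur = conn.cursor(name="big_scan")
    cur.execute("select * from big_table")

    while True:
        rows = cur.fetchmany(1000)      # pull the next chunk from the server
        if not rows:
            break
        for row in rows:
            process(row)                # invented application callback

    cur.close()
    conn.commit()   # the named cursor only lives until the transaction ends

The same fetchXXX() calls work on both kinds of cursor; the choice between client side and server side is made once, when the cursor is created.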
From gafton at rpath.com Wed Sep 6 01:18:14 2006 From: gafton at rpath.com (Cristian Gafton) Date: Tue, 5 Sep 2006 19:18:14 -0400 (EDT) Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: <1157495876.3792.16.camel@localhost> Message-ID: On Tue, 5 Sep 2006, Cristian Gafton wrote: >> cursor() an extra parameter: "name". If a cursor is named then a server >> side cursor with that name is automatically generated (and destroyed at > > I did not go that rounte because of the potential confusion on named > parameters: Oops, sorry - you wrote cursor() and I read execute() ;-) Yeah, setting the cursor behavior at its creation time also works. However, asking to name it I think ties the spec a bit too close to the driver details - I can see an app writer wanting somethging like "I don't care, don't suck it all the result sets in the client memory". The itercursor() way has the advantage that can be easily aliased to the standard cursor() method for backends where it doesn't really make sense (sqlite, Oracle, etc). Cristian -- Cristian Gafton rPath, Inc. From mal at egenix.com Wed Sep 6 10:42:33 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 06 Sep 2006 10:42:33 +0200 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: Message-ID: <44FE89F9.5000202@egenix.com> Cristian Gafton wrote: > I have seen the recent discussions about the DB API 2.0 shortcomings > when it comes to working with modern databases. I tend to agree that the > current API serves more of a guideline - it specifies the bare minimum, on > top of which everybody keeps reinventing the same extensions. > > So, what I would like to see covered by a future DB API spec: > > - bind parameters. The current pythonesque %(name)s specification is not > ideal. It requires various levels of pain in escaping the arguments passed > to a query. Most database backends accept bind parameters, albeit with > different syntaxes. For code portability, I'd rather parse the SQL query > and rewrite it to use the proper element for the bound parameter than do > the crazy escaping on the user input that is currently being done. > > (As a side note, my bindings use PostgreSQL's native support to do things > like cu.execute("select * from foo where id = $1", x), without having to > worry about escaping x) I'm not sure I understand your comment on escaping things - you normally pass the statement (with the binding parameter markers) and the binding parameters separately to the database. This allows the database to create an query plan for the statement and then apply the parameters to this query plan one or more times. The main benefit is that you don't have to do any escaping in the SQL statement, which as a side-effect, also prevent the typical SQL injection vulnerabilities. > - parsed statements. On large loops this is a real gain. For lack of a > better specification, I currently use something like: > prepcu = db.prepare("select * from foo where id = $1") > prepcu.execute(2) > The prepared cursor statement has the same fetchone(), fetchall() and > other properties of the regular cursors, except it only accepts parameters > to its execute() and executemany() calls. The DB API specifies that the driver should try to cache the prepared statement based on the statement string. 
Since in some cases, you may need to do the prepare step without actually executing anything, I've added the following in mxODBC which is in line with the DB API spec: cursor.prepare(command) Prepare the statement for execution and set the cursor.command attribute to command. The programmer can then pass cursor.command to the .executexxx() methods, e.g. cursor.execute(cursor.command, params) which the interface will notice and then use the prepared statement. > - server side cursors. Currently, most bindings for most databases have to > decide what to do after an cursor.execute() call - do they automatically > retrieve all the resulting rows in the client's memory, or do they > retrieve it row by row, pinging the server before every retrieval to get > more data (hey, not everybody using Oracle ;^). DB API has no support > for controlling this in a consistent fashion, even though Python has > solved the issue of dict.items() vs dict.iteritems() a long time ago. > The application writers should have a choice on how the cursors will > behave. > > (Again, in my bindings, I have added support for db.iteritems() to get a > cursor that will retrieve rows from the server side in chunks instead of > all at once. I left db.cursor() to return a cursor which will download all > results in the client memory after an execute - which seems to be the > prevailing default) Server side cursors vs. client side cursors is usually something that's implemented and managed by the database driver - why should the Python programmer have to think about this detail ? The Python programmer can use .fetchone() or .fetchmany() to indicate whether she wants to read rows in chunks or one-by-one. The Python interface can then map these requests to whatever the database driver has to offer. Something that's missing from the DB API spec is a way to define the cursor's name. In mxODBC I've added an optional name parameter to connection.cursor([name]) which predefines the name of the cursor. While cursors usually automatically get a name assigned by the database, it is sometimes useful to know this name in advance and then use server side cursors by explicitly coding the SQL statements to refer to the opened cursor, e.g. for updates based on the cursor position. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 06 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From andychambers2002 at yahoo.co.uk Wed Sep 6 10:41:46 2006 From: andychambers2002 at yahoo.co.uk (Andy Chambers) Date: Wed, 6 Sep 2006 09:41:46 +0100 (BST) Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: Message-ID: <20060906084146.93454.qmail@web26907.mail.ukl.yahoo.com> > - bind parameters. The current pythonesque %(name)s specification is not > ideal. It requires various levels of pain in escaping the arguments passed > to a query. Most database backends accept bind parameters, albeit with > different syntaxes. For code portability, I'd rather parse the SQL query > and rewrite it to use the proper element for the bound parameter than do > the crazy escaping on the user input that is currently being done. I agree about using the native parameter format where possible. 
For what it's worth, my interpretation of the DBAPI is that the paramstyles define a "class" of parameter styles rather than a hard and fast rule of what the parameters should be. I believe that the Postgres $1, $2 format could be directly used without any query rewriting. It is just one character away from the numeric style. Yes, you lose in portability, but if you can rewrite the query every time you execute one, then you can rewrite your source once if you have to.

Incidentally, the webpy module has an interesting solution to this. In their db wrapper, there is a method aparam(), which returns the parameter style for the db currently being used. If someone wants to write portable code, they could implement this themselves and instead of writing

    "select *
     from table
     where param = $1"

..they write this

    "select *
     from table
     where param = %s" % (aparam(),)

Then if you change databases, you only need to redefine aparam().

> - parsed statements. On large loops this is a real gain. For lack of a
> better specification, I currently use something like:
>     prepcu = db.prepare("select * from foo where id = $1")
>     prepcu.execute(2)
> The prepared cursor statement has the same fetchone(), fetchall() and
> other properties of the regular cursors, except it only accepts parameters
> to its execute() and executemany() calls.

Have you seen how much this actually improves performance? I myself tried writing a dbapi that made use of prepared queries but found that there was no improvement over psycopg.

As I understand it, what you win from not parsing the query, you lose in sub-optimal execution path for many types of query. This is because in Postgres the planner uses information in the query to decide which type of scan it should use in searching the respective tables. By using PQprepare, you make the plan without all possible information and then keep using that sub-optimal plan.

Regards,
Andy
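A sketch of what such an aparam() helper could look like, assuming the driver module advertises its paramstyle the way PEP 249 modules do (the MySQLdb import and the query are only for illustration, and named paramstyles would need different handling):

    def aparam(dbmodule, n=1):
        # Map the driver's declared paramstyle to a positional placeholder.
        style = dbmodule.paramstyle
        if style == "qmark":        # e.g. many ODBC-based drivers
            return "?"
        elif style == "format":     # e.g. MySQLdb
            return "%s"
        elif style == "numeric":    # :1, :2, ... style
            return ":%d" % n
        raise ValueError("unsupported paramstyle: %s" % style)

    import MySQLdb
    sql = "select * from table where param = %s" % aparam(MySQLdb)
    # cursor.execute(sql, (value,))   # the driver binds/escapes value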
From misa+db-sig at redhat.com  Wed Sep 6 17:22:25 2006
From: misa+db-sig at redhat.com (Mihai Ibanescu)
Date: Wed, 6 Sep 2006 11:22:25 -0400
Subject: [DB-SIG] New take on PostgreSQL bindings for Python
In-Reply-To: 
References: 
Message-ID: <20060906152225.GG30293@abulafia.devel.redhat.com>

On Tue, Sep 05, 2006 at 05:39:28PM -0400, Cristian Gafton wrote:
>
> To make a longer story shorter, as I progressed through this task, the
> changes I have been making to PyGreSQL became more and more extensive,
> modifying the original's behavior in incompatible ways. I didn't solve the
> challenge of preserving PyGreSQL's existing behavior, seeing how the
> native PostgreSQL bind parameters syntax ($1, $2, etc) is very different
> from what PyGreSQL simulated (python-style %s or %(var)s). (although it
> would be possible to do with some regexp foo)

Or even better, with my RiggedDict class :-)

    class RiggedDict(dict):
        """Dictionary that will always return an object generated by a counter"""
        def __init__(self):
            dict.__init__(self)
            self._counter = 0
            self._map = {}

        def __getitem__(self, k):
            if k in self._map:
                return self._map[k]
            # Generate a value for this key, store it and return it
            self._counter += 1
            val = self._map[k] = '$%d' % self._counter
            return val

        def get_map(self):
            # utility function to check the maps that were generated so far
            return self._map

    s = "select %(foo1)s, %(foo2)s, %(foo1)s from dual"
    d = RiggedDict()
    print s % d

Misa

From gafton at rpath.com  Wed Sep 6 17:53:31 2006
From: gafton at rpath.com (Cristian Gafton)
Date: Wed, 6 Sep 2006 11:53:31 -0400 (EDT)
Subject: [DB-SIG] New take on PostgreSQL bindings for Python
In-Reply-To: <44FE89F9.5000202@egenix.com>
References: <44FE89F9.5000202@egenix.com>
Message-ID: 

On Wed, 6 Sep 2006, M.-A. Lemburg wrote:

> I'm not sure I understand your comment on escaping things - you
> normally pass the statement (with the binding parameter markers)
> and the binding parameters separately to the database.

That's not how the MySQL and PostgreSQL bindings I looked at work. Given a cursor.execute(query, args), they go through various pains to escape the args tuple, then simply do "query % args" and pass the resulting string as a single query without parameters to the backends.

> This allows the database to create an query plan for the statement
> and then apply the parameters to this query plan one or more times.
>
> The main benefit is that you don't have to do any escaping in the
> SQL statement, which as a side-effect, also prevent the typical
> SQL injection vulnerabilities.

> Server side cursors vs. client side cursors is usually something
> that's implemented and managed by the database driver - why should
> the Python programmer have to think about this detail ?

Because it is only the programmer that knows "I am expecting 1 million rows out of this query, you'd better not load it all up in RAM at once".

> The Python programmer can use .fetchone() or .fetchmany() to
> indicate whether she wants to read rows in chunks or one-by-one.
> The Python interface can then map these requests to whatever
> the database driver has to offer.

Not all database drivers are rich enough, or smart enough, or sufficiently evolved (MySQL and PostgreSQL are such examples); you either retrieve all results at once at a cost of client memory or you retrieve in chunks using FETCH at the cost of speed. Again, it is the application programmer that knows which is appropriate for which case.

Cristian

--
Cristian Gafton
rPath, Inc.

From gafton at rpath.com  Wed Sep 6 18:09:20 2006
From: gafton at rpath.com (Cristian Gafton)
Date: Wed, 6 Sep 2006 12:09:20 -0400 (EDT)
Subject: [DB-SIG] New take on PostgreSQL bindings for Python
In-Reply-To: <20060906084146.93454.qmail@web26907.mail.ukl.yahoo.com>
References: <20060906084146.93454.qmail@web26907.mail.ukl.yahoo.com>
Message-ID: 

On Wed, 6 Sep 2006, Andy Chambers wrote:

> could implement this themselves and instead of writing
>
> "select *
> from table
> where param = $1"
>
> ..they write this
>
> "select *
> from table
> where param = %s" % (aparam(),)
>
> Then if you change databases, you only need to redefine aparam().

That's not gonna help much for backends that like named bind parameters, like Oracle...
But you are right, the current effort spend nowadays to escape the arguments and not use bind parameters could be spent rewriting the query string to the bind parameter format required by the backend. > Have you seen how much this actually improves performance? I myself tried > writing a dbapi that made use of prepared queries but found that there > was no improvement over psycopg. For Postgres, using a prepared statement on a call like executemany() gives you roughly 2-2.5x times faster execution for simple queries. Probably not much, but for other backends it is more dramatic. In MySQL it comes down to ~20x; In Oracle that's even sweeter, because Oracle caches the execution plans of the prepared statements and looks them up whenever you "prepare" them again in its internal cache, with very dramatic effects on the execution speed. > As I understand it, what you win from not parsing the query, you lose > in sub-optimal execution path for many types of query. This is because > in postgres, the planner uses information in the query to decide which type > of scan it should use in searching the respective tables. By using > PQPrepare, you make the plan without all possible information then keep using > that sub-optimal plan That might be true and it is a limitation of PostgreSQL; however, in my experience, most prepared statements tend to be quite simple inserts or straight join selects (probably with the exception of Oracle, where you will have a DBA jumping down your throat for not preparing everything and messing up his database's statement plan cache faster than you can say "oops"). I think what you are saying might be a reason not to use it in certain cases with PostgreSQL, not a reason for the DB API not to define it in reasonable way. Cristian -- Cristian Gafton rPath, Inc. From mal at egenix.com Wed Sep 6 20:30:34 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 06 Sep 2006 20:30:34 +0200 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: <44FE89F9.5000202@egenix.com> Message-ID: <44FF13CA.8070806@egenix.com> Cristian Gafton wrote: > On Wed, 6 Sep 2006, M.-A. Lemburg wrote: > >> I'm not sure I understand your comment on escaping things - you >> normally pass the statement (with the binding parameter markers) >> and the binding parameters separately to the database. > > That's not how the MySQL and PostgreSQL bindings I looked at work. Given a > cursor.execute(query, args), they go through various pains to escape the > args tuple, then simply do "query % args" and pass the resulting string as > a single query without parameters to the backends. If the database drivers don't provide a mechanism to pass in statements and parameters separately, that's a possible way to implement bound parameters. It's not efficient, though, since the database will have to parse the complete SQL statement for each row of data you pass to .executexxx(). >> This allows the database to create an query plan for the statement >> and then apply the parameters to this query plan one or more times. >> >> The main benefit is that you don't have to do any escaping in the >> SQL statement, which as a side-effect, also prevent the typical >> SQL injection vulnerabilities. > >> Server side cursors vs. client side cursors is usually something >> that's implemented and managed by the database driver - why should >> the Python programmer have to think about this detail ? 
> > Because it is only the programmer that knows "I am expecting 1 million > rows out of this query, you'd better now load it all up in RAM at once" Right, but in that case, the programmer would just do a .fetchall(), so the interface can infer this from the type of .fetchxxx() method. >> The Python programmer can use .fetchone() or .fetchmany() to >> indicate whether she wants to read rows in chunks or one-by-one. >> The Python interface can then map these requests to whatever >> the database driver has to offer. > > Not all database drivers are rich enough, or smart enough, or sufficiently > envolved (MySQL and PostgreSQL are such examples); you either retrieve all > results at once at a cost of client memory or you retrieve in chunks using > FETCH at the cost of speed. Again, it is the application programmer that > knows which is appropiate for which case. Maybe I'm missing something, but doesn't the programmer let the database module know by using either .fetchmany() or .fetchall() ?! Database drivers normally do not fetch any rows from a result set until you actually make a call to do so. In some cases, they don't even execute the SQL statement until you do. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 06 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From gafton at rpath.com Wed Sep 6 20:47:17 2006 From: gafton at rpath.com (Cristian Gafton) Date: Wed, 6 Sep 2006 14:47:17 -0400 (EDT) Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: <44FF13CA.8070806@egenix.com> References: <44FE89F9.5000202@egenix.com> <44FF13CA.8070806@egenix.com> Message-ID: On Wed, 6 Sep 2006, M.-A. Lemburg wrote: > If the database drivers don't provide a mechanism to pass in > statements and parameters separately, that's a possible way to > implement bound parameters. But the database drivers in most cases *do* provide such a mechanism. I tend to blame the DB API's lack of clear specification on how to handle bind parameters that has made some take the "easy" way out. >> Because it is only the programmer that knows "I am expecting 1 million >> rows out of this query, you'd better now load it all up in RAM at once" > > Right, but in that case, the programmer would just do a .fetchall(), > so the interface can infer this from the type of .fetchxxx() method. You're missing the point. Usually a programmer does something like: cursor.execute(...) cursor.fetchXXX() The problem is that in some cases the entire result set is downloaded in the client RAM before returning from the *execute* call. the fetchXXX calls come too late to infer anything. You have to know before the execute if you want to open up a server side cursor or you want the execute call to return and malloc a whole bunch of memory on your local stack. >> Not all database drivers are rich enough, or smart enough, or sufficiently >> envolved (MySQL and PostgreSQL are such examples); you either retrieve all >> results at once at a cost of client memory or you retrieve in chunks using >> FETCH at the cost of speed. Again, it is the application programmer that >> knows which is appropiate for which case. 
> > Maybe I'm missing something, but doesn't the programmer let the > database module know by using either .fetchmany() or > .fetchall() ?! It doesn't. The C level APIs of the databases are written in such a way that at the end of the low level on-the-wire execute() call you are making you get returned the entire result set. There is nothing fetchXXX can do to help you there. > Database drivers normally do not fetch any rows from a result set > until you actually make a call to do so. In some cases, they don't > even execute the SQL statement until you do. You've probably been spoiled by Oracle... Cristian -- Cristian Gafton rPath, Inc. From mal at egenix.com Wed Sep 6 22:30:17 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 06 Sep 2006 22:30:17 +0200 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: <44FE89F9.5000202@egenix.com> <44FF13CA.8070806@egenix.com> Message-ID: <44FF2FD9.8080807@egenix.com> Cristian Gafton wrote: > On Wed, 6 Sep 2006, M.-A. Lemburg wrote: > >> If the database drivers don't provide a mechanism to pass in >> statements and parameters separately, that's a possible way to >> implement bound parameters. > > But the database drivers in most cases *do* provide such a mechanism. I > tend to blame the DB API's lack of clear specification on how to handle > bind parameters that has made some take the "easy" way out. I suppose that the authors who did had good reasons in doing so. mxODBC certainly doesn't use that approach, but some ODBC drivers do because the wire protocols don't provide a way to separate the statement from the parameters. >>> Because it is only the programmer that knows "I am expecting 1 million >>> rows out of this query, you'd better now load it all up in RAM at once" >> Right, but in that case, the programmer would just do a .fetchall(), >> so the interface can infer this from the type of .fetchxxx() method. > > You're missing the point. Usually a programmer does something like: > cursor.execute(...) > cursor.fetchXXX() > > The problem is that in some cases the entire result set is downloaded in > the client RAM before returning from the *execute* call. the fetchXXX > calls come too late to infer anything. You have to know before the execute > if you want to open up a server side cursor or you want the execute call > to return and malloc a whole bunch of memory on your local stack. Sounds rather specific to a certain database backend. Most databases we work with (MS SQL Server, Oracle, SAP/Max DB, Sybase, DB2 to name a few) tend to postpone execution and sending of the results until the very last moment. In database applications you rarely want huge result sets in one go. You typically try to read in the data in chunks where the actual fetching of the chunks is done using multiple SQL statements, keeping the complete result set on the server side and only transferring the data you need. >>> Not all database drivers are rich enough, or smart enough, or sufficiently >>> envolved (MySQL and PostgreSQL are such examples); you either retrieve all >>> results at once at a cost of client memory or you retrieve in chunks using >>> FETCH at the cost of speed. Again, it is the application programmer that >>> knows which is appropiate for which case. >> Maybe I'm missing something, but doesn't the programmer let the >> database module know by using either .fetchmany() or >> .fetchall() ?! > > It doesn't. 
> The C level APIs of the databases are written in such a way
> that at the end of the low level on-the-wire execute() call you are making
> you get returned the entire result set. There is nothing fetchXXX can do
> to help you there.

If that's the case for PostgreSQL, perhaps you need to add a non-standard method to choose the fetch strategy before doing the .executexxx() ?!

>> Database drivers normally do not fetch any rows from a result set
>> until you actually make a call to do so. In some cases, they don't
>> even execute the SQL statement until you do.
>
> You've probably been spoiled by Oracle...

Not really :-) I've worked with many ODBC drivers during the last few years and some have exhibited really funny behavior, e.g. I remember MS SQL Server once complaining about a syntax error when querying the number of rows in the result set - not during the prepare step of the statement where you would normally expect this to happen.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 06 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From nasr974 at hotmail.com  Wed Sep 6 23:15:02 2006
From: nasr974 at hotmail.com (Nasr Y.M.J.O.)
Date: Wed, 06 Sep 2006 21:15:02 +0000
Subject: [DB-SIG] Python-db select
Message-ID: 

Dear All,

I'm new to this list. I'm trying to run a Python script (Linux 2.6.9) in order to configure a cluster filesystem. When I ran the script I had the following error:

-------------------------------------------------------------------------
    Traceback (most recent call last):
      File "/opt/rocks-lustre/bin/rocksfs-config-lustre.py", line 318, in ?
        main()
      File "/opt/rocks-lustre/bin/rocksfs-config-lustre.py", line 315, in main
        app.run()
      File "/opt/rocks-lustre/bin/rocksfs-config-lustre.py", line 306, in run
        self.config_lustre()
      File "/opt/rocks-lustre/bin/rocksfs-config-lustre.py", line 229, in config_lustre
        osslist = self.get_OSS()
      File "/opt/rocks-lustre/bin/rocksfs-config-lustre.py", line 99, in get_OSS
        query = 'select nodes.id, nodes.name from nodes where '\
    TypeError: int argument required
-------------------------------------------------------------------------

Here are the lines:

    def get_OSS(self):
        #grabs the OSS servers from the database
line 99 --------> query = 'select nodes.id, nodes.name from nodes where '\
                          'nodes.membership=%d ' \
                          'or nodes.membership=%d' \
                          % (self.memberships['Compute-oss'],
                             self.memberships['Lustre-oss'])

        osslist = []
        rc = self.rocksdb.execute(query)
        if rc:
            for row in self.rocksdb.fetchall():
                nodeid = int(row[0])
                nodename = row[1]
                osslist.append((nodeid, nodename))
        return osslist
-------------------------------------------------------------------------

I'm using MySQL 4.1.12.3 and Python 2.3.4. Any suggestions of how to solve this is greatly appreciated.

Many thanks,
-nasr
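As a side note, the error above is easy to reproduce: the %d conversion raises exactly this TypeError as soon as one of the two membership lookups evaluates to something that is not an integer (None, for example). A two-line illustration - the value here is invented, the real culprit is whatever self.memberships contains:

    membership = None                     # e.g. a missing 'Compute-oss' entry
    query = 'nodes.membership=%d' % membership
    # TypeError: int argument required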
From gafton at rpath.com  Thu Sep 7 00:02:16 2006
From: gafton at rpath.com (Cristian Gafton)
Date: Wed, 6 Sep 2006 18:02:16 -0400 (EDT)
Subject: [DB-SIG] New take on PostgreSQL bindings for Python
In-Reply-To: <44FF2FD9.8080807@egenix.com>
References: <44FE89F9.5000202@egenix.com> <44FF13CA.8070806@egenix.com> <44FF2FD9.8080807@egenix.com>
Message-ID: 

On Wed, 6 Sep 2006, M.-A. Lemburg wrote:

>> But the database drivers in most cases *do* provide such a mechanism. I
>> tend to blame the DB API's lack of clear specification on how to handle
>> bind parameters that has made some take the "easy" way out.
>
> I suppose that the authors who did had good reasons in doing so.

As I have just mentioned at the beginning of this thread, I started off with PyGreSQL, which does not support bind parameters, and changed it. I assume in this particular case it's because of historical reasons - older versions of PostgreSQL did not have support for bind parameters.

I can not figure out a single good reason why currently none of the MySQL bindings support bind parameters and rely instead on awful parameter escaping techniques.

> Sounds rather specific to a certain database backend. Most databases
> we work with (MS SQL Server, Oracle, SAP/Max DB, Sybase, DB2 to name
> a few) tend to postpone execution and sending of the results until
> the very last moment.

MySQL can do that too, but then the query takes over the connection - you can not run other queries while you retrieve the result set using the so-called "use result" strategy.
Which is why in most cases probably it is safer to download everything in one shot and free up the connection for other work as soon as possible.

> In database applications you rarely want huge result sets
> in one go. You typically try to read in the data in chunks where the
> actual fetching of the chunks is done using multiple SQL statements,

That's my point exactly - the only one that knows what to expect back from the backend is the application writer, because on the more popular databases (like MySQL and PostgreSQL), fetching results in chunks adds a sizeable cost in extra round trips and speed of retrieving the results.

>>> Maybe I'm missing something, but doesn't the programmer let the
>>> database module know by using either .fetchmany() or
>>> .fetchall() ?!
>>
>> It doesn't. The C level APIs of the databases are written in such a way
>> that at the end of the low level on-the-wire execute() call you are making
>> you get returned the entire result set. There is nothing fetchXXX can do
>> to help you there.
>
> If that's the case for PostgreSQL, perhaps you need to add a
> non-standard method to choose the fetch strategy before doing
> the .executexxx() ?!

Okay, that's what I was proposing with

    cursor = db.itercursor()

which would set up the cursor to iterate through the results, in a similar fashion to what dict.iteritems() does compared to dict.items(). I take it you agree with that approach then?

Cristian

--
Cristian Gafton
rPath, Inc.

From kev at drule.org  Wed Sep 6 23:50:45 2006
From: kev at drule.org (Kevin)
Date: Wed, 6 Sep 2006 15:50:45 -0600 (MDT)
Subject: [DB-SIG] Python-db select
In-Reply-To: 
References: 
Message-ID: <40436.209.244.4.106.1157579445.squirrel@www.drule.org>

> Dear All,
>
> Im new to this list. Im trying to run a python script (Linux 2.6.9) in order
> to configure a cluster filesystem. When I ran the script I had the following
> error:

One of the following variables is not an integer:

    self.memberships['Compute-oss']
    self.memberships['Lustre-oss']

This really isn't the right forum for this question. Try here:
http://www.rocksclusters.org/wordpress/?page_id=6

--
Kevin

From farcepest at gmail.com  Thu Sep 7 16:07:42 2006
From: farcepest at gmail.com (Andy Dustman)
Date: Thu, 7 Sep 2006 10:07:42 -0400
Subject: [DB-SIG] Python-db select
In-Reply-To: 
References: 
Message-ID: <9826f3800609070707j77d37686v6d688c2938a051a5@mail.gmail.com>

On 9/6/06, Nasr Y.M.J.O. wrote:

> def get_OSS(self):
>     #grabs the OSS servers from the database
> line 99 --------> query = 'select nodes.id, nodes.name from nodes where '\
>                           'nodes.membership=%d ' \
>                           'or nodes.membership=%d' \
>                           % (self.memberships['Compute-oss'],
>                              self.memberships['Lustre-oss'])
>
>     osslist = []
>     rc = self.rocksdb.execute(query)

Don't do that; you should only use %s placeholders with MySQLdb, and you aren't passing your parameters to execute(). Please read the documentation, and particularly PEP-249. This is what you should be doing:

    def get_OSS(self):
        #grabs the OSS servers from the database
        query = """select nodes.id, nodes.name from nodes
                   where nodes.membership=%s or nodes.membership=%s"""
        parameters = (self.memberships['Compute-oss'],
                      self.memberships['Lustre-oss'])

        osslist = []
        rc = self.rocksdb.execute(query, parameters)

--
This message has been scanned for memes and dangerous content by MindScanner, and is believed to be unclean.
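The same placeholder rule carries over to bulk operations; a short sketch (connection details, table and rows are invented) of inserting several rows with MySQLdb's format-style parameters:

    import MySQLdb

    conn = MySQLdb.connect(db="cluster", user="rocks", passwd="secret")
    cur = conn.cursor()

    rows = [(1, 'compute-0-0'), (2, 'compute-0-1')]   # invented sample data
    # The query string never changes; MySQLdb binds and escapes each value.
    cur.executemany("insert into nodes (membership, name) values (%s, %s)",
                    rows)
    conn.commit()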
From farcepest at gmail.com Thu Sep 7 16:12:59 2006 From: farcepest at gmail.com (Andy Dustman) Date: Thu, 7 Sep 2006 10:12:59 -0400 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: <44FE89F9.5000202@egenix.com> <44FF13CA.8070806@egenix.com> <44FF2FD9.8080807@egenix.com> Message-ID: <9826f3800609070712u2bf66309y5cffe32bc14e301@mail.gmail.com> On 9/6/06, Cristian Gafton wrote: > I can not figure out a single good reason why currently none of the MySQL > bindings support bind paramaters and rely instead on awful parameter > escaping techniques. Time. Effort. That sort of thing. Parameter binding for MySQLdb is in the works for 2.0, and I may actually have a co-developer to work on it. However, not all MySQL SQL statements can be used with the prepared statements API, or so says the documentation, which complicates things, so in some cases it is necessary to fall back to doing parameter substitution on the client side. -- This message has been scanned for memes and dangerous content by MindScanner, and is believed to be unclean. From carsten at uniqsys.com Thu Sep 7 16:34:11 2006 From: carsten at uniqsys.com (Carsten Haese) Date: Thu, 07 Sep 2006 10:34:11 -0400 Subject: [DB-SIG] Python-db select In-Reply-To: <9826f3800609070707j77d37686v6d688c2938a051a5@mail.gmail.com> References: <9826f3800609070707j77d37686v6d688c2938a051a5@mail.gmail.com> Message-ID: <1157639651.11918.21.camel@dot.uniqsys.com> On Thu, 2006-09-07 at 10:07, Andy Dustman wrote: > On 9/6/06, Nasr Y.M.J.O. wrote: > > > def get_OSS(self): > > #grabs the OSS servers from the database > > line 99 --------> query = 'select nodes.id, nodes.name from nodes where '\ > > 'nodes.membership=%d ' \ > > 'or nodes.membership=%d' \ > > % (self.memberships['Compute-oss'], > > self.memberships['Lustre-oss']) > > > > osslist = [] > > rc = self.rocksdb.execute(query) > > Don't do that; you should only use %s placeholders with MySQLdb, and > you aren't passing your parameters to execute().. Please read the > documentation, and particularly PEP-249. Right speech, wrong audience. I gather from the context of the question that Nasr didn't write this script, he's just a user. -Carsten From gafton at rpath.com Thu Sep 7 19:01:02 2006 From: gafton at rpath.com (Cristian Gafton) Date: Thu, 7 Sep 2006 13:01:02 -0400 (EDT) Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: <9826f3800609070712u2bf66309y5cffe32bc14e301@mail.gmail.com> References: <44FE89F9.5000202@egenix.com> <44FF13CA.8070806@egenix.com> <44FF2FD9.8080807@egenix.com> <9826f3800609070712u2bf66309y5cffe32bc14e301@mail.gmail.com> Message-ID: On Thu, 7 Sep 2006, Andy Dustman wrote: > Parameter binding for MySQLdb is in the works for 2.0, and I may > actually have a co-developer to work on it. However, not all MySQL SQL > statements can be used with the prepared statements API, or so says > the documentation, which complicates things, so in some cases it is > necessary to fall back to doing parameter substitution on the client > side. Yeah, that's a real bitch. Looks like in MySQL most of the DDL statements can not be sent to the server with bind parameters. That being said, some simple .startswith() tests on the Python side can sort out the DML statements - most exposed to SQL injection attacks and make those use bind params. Cristian -- Cristian Gafton rPath, Inc. From mal at egenix.com Thu Sep 7 19:05:28 2006 From: mal at egenix.com (M.-A. 
Lemburg) Date: Thu, 07 Sep 2006 19:05:28 +0200 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: <44FE89F9.5000202@egenix.com> <44FF13CA.8070806@egenix.com> <44FF2FD9.8080807@egenix.com> Message-ID: <45005158.6030707@egenix.com> Cristian Gafton wrote: >> In database applications you rarely want huge result sets >> in one go. You typically try to read in the data in chunks where the >> actual fetching of the chunks is done using multiple SQL statements, > > That's my point exactly - the only one that knows what to expect back from > the backend is the application writer, because on the more popular > databases (like MySQL and PostgreSQL), fetching results in chunks adds a > sizeable cost in extra rountrips and speed of retrieving the results. Right, but this isn't something for the DB API to define. You have to use the SQL of a particular database backend and its features (such as server side cursors, ability to limit/offset the query result set, etc.). A module author can make things a little easier for the programmer by providing this functionality via the cursor.scroll() method, e.g. cursor.execute('select * from mytable') cursor.scroll(9999) rs = cursor.fetchmany(100) to fetch 100 rows at offset 9999 of the result set. >>>> Maybe I'm missing something, but doesn't the programmer let the >>>> database module know by using either .fetchmany() or >>>> .fetchall() ?! >>> It doesn't. The C level APIs of the databases are written in such a way >>> that at the end of the low level on-the-wire execute() call you are making >>> you get returned the entire result set. There is nothing fetchXXX can do >>> to help you there. >> If that's the case for PostgreSQL, perhaps you need to add a >> non-standard method to choose the fetch strategy before doing >> the .executexxx() ?! > > Okay, that's what I was proposing with > cursor = db.itercursor() > > which would set up the cursor to iterate through the results, in a similar > fashion to what dict.iteritems() does compared to dict.items(). I take it > you agree with that approach then? It would be better to define the fetching strategy on a regular cursor object, e.g. cursor.setprefetchsize(1024) to have the database module prepare fetches of 1024 rows or cursor.setprefetchsize(sys.maxint) to always read the whole result set. You could also use the cursor.arraysize attribute on cursors as indicator of how many rows to pre-fetch. cursor.arraysize defines the default number of rows to fetch using cursor.fetchmany(). Aside: cursors can optionally implement the iteration protocol, so you can write: for row in cursor: print row -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 07 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From gafton at rpath.com Thu Sep 7 19:55:33 2006 From: gafton at rpath.com (Cristian Gafton) Date: Thu, 7 Sep 2006 13:55:33 -0400 (EDT) Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: <45005158.6030707@egenix.com> References: <44FE89F9.5000202@egenix.com> <44FF13CA.8070806@egenix.com> <44FF2FD9.8080807@egenix.com> <45005158.6030707@egenix.com> Message-ID: On Thu, 7 Sep 2006, M.-A. 
Lemburg wrote: > Right, but this isn't something for the DB API to define. You > have to use the SQL of a particular database backend and its > features (such as server side cursors, ability to limit/offset > the query result set, etc.). > > A module author can make things a little easier for the programmer > by providing this functionality via the cursor.scroll() > method, e.g. In most cases, opening up a server side cursor means rewriting the user's SQL query (ie, to insert a "DECLARE CURSOR " in front of it) or various other tricks that make cursor.scroll() useless. > It would be better to define the fetching strategy on a regular > cursor object, e.g. cursor.setprefetchsize(1024) to have the > database module prepare fetches of 1024 rows or > cursor.setprefetchsize(sys.maxint) to always read the whole > result set. > > You could also use the cursor.arraysize attribute on cursors as > indicator of how many rows to pre-fetch. cursor.arraysize defines > the default number of rows to fetch using cursor.fetchmany(). I don't like this because it is not only imprecise and contraining, it is confusing as well. If I start a query using arraysize=1, should I infer from that that I would like to retrieve the results one by one and automatically take the speed penalty of a server side cursor? What does it mean when I change the arraysize or call setprefetchsize() in the middle of my cursor.fetchone() loop? Every time we overload the standard cursor() with some more variables and methods, we have to define the behavior of what happens when any combination of those is set by the user. I tend to like simpler objects, with a single and well defined behavior that do not surprize the user with "magic". In the itercursor() case, an iteration cursor attempts to extract data from the server in chunks and not load up the entire result set in RAM. It's simple, straightforward, you know what you get and at what cost. > Aside: cursors can optionally implement the iteration > protocol, so you can write: > > for row in cursor: > print row Now you're asking the Cursor class to know at the instantiation or execute() time how one is gonna loop over the results?! Remember, by the time you hit fetchXXX() stage, it is too late to change your mind in this server-side or not business. Cristian -- Cristian Gafton rPath, Inc. From ricardo.b at zmail.pt Thu Sep 7 22:08:27 2006 From: ricardo.b at zmail.pt (Ricardo Bugalho) Date: Thu, 07 Sep 2006 21:08:27 +0100 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: Message-ID: <1157659707.24308.88.camel@ezquiel> Hello, I think you're mixing up DB-API specs with implementation. Mostly, the beneficts of what you pointed out could be implemented by smarter bindings, without changing the DP-API spec. And some already do. On Tue, 2006-09-05 at 17:39 -0400, Cristian Gafton wrote: > > - bind parameters. The current pythonesque %(name)s specification is not > ideal. It requires various levels of pain in escaping the arguments passed > to a query. Most database backends accept bind parameters, albeit with > different syntaxes. For code portability, I'd rather parse the SQL query > and rewrite it to use the proper element for the bound parameter than do > the crazy escaping on the user input that is currently being done. You don't need to change the DB-API for this. Your binding could take pythonic parametrized queries and coverts them into PostgreSQL style queries with a bit of string processing. 
For example cursor.execute("SELECT * FROM table WHERE foo = %(foo)s AND bar = %(bar)s", locals()) could be executed as something like result = PQexecParams(...,"SELECT * FROM table WHERE foo = $1 and bar = $2", ...) > - parsed statements. On large loops this is a real gain. For lack of a > better specification, I currently use something like: > prepcu = db.prepare("select * from foo where id = $1") > prepcu.execute(2) > The prepared cursor statement has the same fetchone(), fetchall() and > other properties of the regular cursors, except it only accepts parameters > to its execute() and executemany() calls. Explicit prepared statments should be a good addition for the DB-API. But you can also take advantage of prepared statments by caching query requests. In a simple way, you could cache the last statement executed in that session. Thus, you could implement cursor.execute("SELECT * FROM foo WHERE id = %(id)s", locals()) as something like // currentStatment = "SELECT FROM foo WHERE id = $1" if (strcmp(lastStatment, currentStatement) != 0) then { lastStatement = currentStatement preparedStatment = PQprepare(..., " ", currentStatment, ...); } result = PQexecPrepared(..., " ", ...); Or you can aim for a more complex caching strategy. > - server side cursors. Currently, most bindings for most databases have to > decide what to do after an cursor.execute() call - do they automatically > retrieve all the resulting rows in the client's memory, or do they > retrieve it row by row, pinging the server before every retrieval to get > more data (hey, not everybody using Oracle ;^). DB API has no support > for controlling this in a consistent fashion, even though Python has > solved the issue of dict.items() vs dict.iteritems() a long time ago. > The application writers should have a choice on how the cursors will > behave. DB-API's cursors have always been supposed to be based server side cursors. That's the whole point about having cursors. If some bindings don't use server side cursors when they're available, it's their own problem. From gafton at rpath.com Fri Sep 8 01:25:50 2006 From: gafton at rpath.com (Cristian Gafton) Date: Thu, 7 Sep 2006 19:25:50 -0400 (EDT) Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: <1157659707.24308.88.camel@ezquiel> References: <1157659707.24308.88.camel@ezquiel> Message-ID: On Thu, 7 Sep 2006, Ricardo Bugalho wrote: > I think you're mixing up DB-API specs with implementation. > Mostly, the beneficts of what you pointed out could be implemented by > smarter bindings, without changing the DP-API spec. And some already do. In a certain way, you are right. My point is, the very fact that we have such varied assumptions and implementations of "DB API 2.0" bindings, points to the fact that the spec as a whole needs refining and it needs to be more precise. >> - parsed statements. On large loops this is a real gain. > > Explicit prepared statments should be a good addition for the DB-API. > But you can also take advantage of prepared statments by caching query > requests. > In a simple way, you could cache the last statement executed in that > session. 
Thus, you could implement > cursor.execute("SELECT * FROM foo WHERE id = %(id)s", locals()) > as something like > // currentStatment = "SELECT FROM foo WHERE id = $1" > if (strcmp(lastStatment, currentStatement) != 0) then { > lastStatement = currentStatement > preparedStatment = PQprepare(..., " ", currentStatment, ...); > } > result = PQexecPrepared(..., " ", ...); As somebody else already pointed out, blindly doing this automatically for some backends can result in performance losses. Of course, in the end it is just a matter of coding; I don't find it particularly appealing because the bindings author is required to implement some magic behavior which can hurt the application developer. We don't need more assumed semantics when it would be fairly easy to provide the application writer with the control he needs over the behavior of the bindings he's using. >> - server side cursors. Currently, most bindings for most databases have to >> decide what to do after an cursor.execute() call - do they automatically >> retrieve all the resulting rows in the client's memory, or do they >> retrieve it row by row, pinging the server before every retrieval > > DB-API's cursors have always been supposed to be based server side > cursors. That's the whole point about having cursors. > If some bindings don't use server side cursors when they're available, > it's their own problem. Well, then it should be spelled out in that fashion. Current wording in the specification leaves things kind of unclear. I don't believe the driver authors are to blame - just take a look around at any bindings for any backend and count the number of extensions people felt compelled to add in. The desire to provide a richer functionality and more control to the application developer is certainly there. The problem is that DB API leaves too many questions unanswered and skips many implementation details and guidelines. Cristian -- Cristian Gafton rPath, Inc. From carsten at uniqsys.com Fri Sep 8 14:27:40 2006 From: carsten at uniqsys.com (Carsten Haese) Date: Fri, 08 Sep 2006 08:27:40 -0400 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: <1157659707.24308.88.camel@ezquiel> Message-ID: <1157718460.17144.9.camel@dot.uniqsys.com> On Thu, 2006-09-07 at 19:25, Cristian Gafton wrote: > On Thu, 7 Sep 2006, Ricardo Bugalho wrote: > > DB-API's cursors have always been supposed to be based server side > > cursors. That's the whole point about having cursors. > > If some bindings don't use server side cursors when they're available, > > it's their own problem. > > Well, then it should be spelled out in that fashion. http://www.python.org/dev/peps/pep-0249/ says: """ .cursor() Return a new Cursor Object using the connection. If the database does not provide a direct cursor concept, the module will have to emulate cursors using other means to the extent needed by this specification. """ The fact that server-side cursors should be used whenever possible is not stated in so many words, but it is very much implied in the above. -Carsten From mal at egenix.com Fri Sep 8 15:33:58 2006 From: mal at egenix.com (M.-A. 
From mal at egenix.com Fri Sep 8 15:33:58 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 08 Sep 2006 15:33:58 +0200 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: <1157659707.24308.88.camel@ezquiel> Message-ID: <45017146.3040702@egenix.com> Cristian Gafton wrote: > > > I don't believe the > driver authors are to blame - just take a look around at any bindings for > any backend and count the number of extensions people felt compelled to > add in. The desire to provide a richer functionality and more control to > the application developer is certainly there. The problem is that DB API > leaves too many questions unanswered and skips many implementation details > and guidelines. That's a misunderstanding on your part, I believe. The DB-API was designed to provide a specification for an interface in such a way that it is possible to write database modules for a variety of backends, including more and less capable ones. In doing so, the spec deliberately leaves certain details unspecified and only requires certain interfaces to be present and having semantics of a predefined nature. The general principle is keeping the DB-API spec simple while leaving enough room for module authors to extend the interface in ways which may be appropriate for the chosen database backend. Over time, some of these extensions have made it into the DB-API spec as standardized optional extensions (see PEP 249) which will then make their way into DB-API 3.0. If you have questions regarding the DB-API spec, just ask here and we'll try to help. This may even result in new standard extensions to the DB-API. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 08 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ricardo.b at zmail.pt Fri Sep 8 22:18:51 2006 From: ricardo.b at zmail.pt (Ricardo Bugalho) Date: Fri, 08 Sep 2006 21:18:51 +0100 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: Message-ID: <1157746731.24308.122.camel@ezquiel> In reply to Cristian Gafton's post on Fri Sep 8 (which I didn't receive, it seems I'm missing out on some db-sig posts) Cristian, when one writes a specification, one should assume that it will be implemented to its semantic limits and implementers will seek to make it as performant and robust as they see fit and are able to. To the extent that specification requirements aren't a limiting factor to the performance or robustness of the implementation, such details are best left off the specification. Those issues are best discussed outside the specification, in places like this mailing list. At least, that is my theory regarding writing specifications. Anyone is welcome to disagree. :) If some bindings' developers didn't take advantage of some backend feature because they misinterpreted the DB-API spec, then maybe we should go to them and state that it could be better implemented. If they didn't because they didn't have the time or skill to do so, then we should contribute some code that does. Only in the case that they didn't because the specification makes it unfeasible for them to do so should we look for a new extension that can take advantage of feature X in a feasible way.
-- Ricardo From phd at phd.pp.ru Fri Sep 8 21:43:10 2006 From: phd at phd.pp.ru (Oleg Broytmann) Date: Fri, 8 Sep 2006 23:43:10 +0400 Subject: [DB-SIG] ANN: SQLObject 0.7.1rc1 Message-ID: <20060908194310.GC16995@phd.pp.ru> Hello! I'm pleased to announce the 0.7.1rc1 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.7.1rc1 News and changes: http://sqlobject.org/docs/News.html What's New ========== Features & Interface -------------------- * Added support for psycopg2 and MSSQL. * Added TimeCol. * Implemented RLIKE (regular expression LIKE). Small Features -------------- * Select over RelatedJoin. * SQLite foreign keys. * Postgres DB URIs with a non-default path to unix socket. * Allow the use of foreign keys in selects. * Implemented addColumn() for SQLite. * With PySQLite2 use encode()/decode() from PySQLite1 for BLOBCol if available; else use base64. Bug Fixes --------- * Fixed a longstanding problem with UnicodeCol - at last you can use unicode strings in .select() and .selectBy() queries. There are some limitations, though. * Cull patch (clear cache). * .destroySelf() inside a transaction. * Synchronize main connection cache during transaction commit. * Ordering joins with NULLs. * Fixed bugs with plain/non-plain setters. * Lots of other bug fixes. For a more complete list, please see the news: http://sqlobject.org/docs/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From sanxiyn at gmail.com Mon Sep 11 06:11:01 2006 From: sanxiyn at gmail.com (Sanghyeon Seo) Date: Mon, 11 Sep 2006 13:11:01 +0900 Subject: [DB-SIG] Survey of Python database interfaces Message-ID: <5b0248170609102111v533db1a4q2c4126715650b289@mail.gmail.com> Hello, db-sig, I catalogued Python database interfaces supporting DB-API v2, and I have found 28 of them. The links are collected here: http://sparcs.kaist.ac.kr/~tinuviel/python/database.html The list goes: MySQLdb, psycopg1, psycopg2, pyPgSQL, PyGreSQL, PoPy, PostgresPy, pgasync, bpgsql, pysqlite, cx_Oracle, DCOracle, DCOracle2, mssql, pymssql, PyDB2, KInterbasDB, sybase, sapdb, InformixDB, ingresdbi, pyodbc, mxODBC, ODBTPAPI, adodbapi, Gadfly, SnakeSQL. If there's something missing or wrong, please let me know. -- Seo Sanghyeon From sanxiyn at gmail.com Mon Sep 11 06:12:32 2006 From: sanxiyn at gmail.com (Sanghyeon Seo) Date: Mon, 11 Sep 2006 13:12:32 +0900 Subject: [DB-SIG] Survey of Python database interfaces In-Reply-To: <5b0248170609102111v533db1a4q2c4126715650b289@mail.gmail.com> References: <5b0248170609102111v533db1a4q2c4126715650b289@mail.gmail.com> Message-ID: <5b0248170609102112r24ef2baax83223496dc8113e2@mail.gmail.com> 2006/9/11, Sanghyeon Seo : > I catalogued Python database interfaces supporting DB-API v2, and I > have found 28 of them. 
The links are collected here: > http://sparcs.kaist.ac.kr/~tinuviel/python/database.html The same information is also available on PythonInfo wiki: I edited the page. http://wiki.python.org/moin/DatabaseInterfaces Seo Sanghyeon From mfrasca at zonnet.nl Mon Sep 11 08:52:51 2006 From: mfrasca at zonnet.nl (Mario Frasca) Date: Mon, 11 Sep 2006 08:52:51 +0200 Subject: [DB-SIG] Survey of Python database interfaces In-Reply-To: <5b0248170609102112r24ef2baax83223496dc8113e2@mail.gmail.com> References: <5b0248170609102111v533db1a4q2c4126715650b289@mail.gmail.com> <5b0248170609102112r24ef2baax83223496dc8113e2@mail.gmail.com> Message-ID: <20060911065250.GA24187@kruiskruid.demon.nl> On 2006-0911 13:12:32, Sanghyeon Seo wrote: > 2006/9/11, Sanghyeon Seo : > > I catalogued Python database interfaces supporting DB-API v2, and I > > have found 28 of them. The links are collected here: > > http://sparcs.kaist.ac.kr/~tinuviel/python/database.html > > The same information is also available on PythonInfo wiki: I edited the page. > http://wiki.python.org/moin/DatabaseInterfaces since you (we) are at it, why not add some information about the development status of each project? things like "DCOracle2 is currently unmaintained, and no support is available." or "the original psycopg 1.1.x (now obsoleted by psycopg 2)" I would say: two columns for "product status (alpha, beta, stable, obsoleted)" and "project status (active, maintained, unmaintained)". regards, thanks, Mario Frasca -- Windows NT encountered the following error: The operation completed successfully. From victory_vasudha at yahoo.co.in Mon Sep 11 10:53:09 2006 From: victory_vasudha at yahoo.co.in (Vasu dha) Date: Mon, 11 Sep 2006 09:53:09 +0100 (BST) Subject: [DB-SIG] reg bind variable does not exist exception Message-ID: <20060911085309.26602.qmail@web8504.mail.in.yahoo.com> Hi, when I execute the code below: if(userrole_prefix!=null) { int length = userrole_prefix.length; System.out.println("**************start of query"); qry = select s_2090_1_user.currval from dual; logger.info(qry); result=pStmt.executeQuery(qry); System.out.println("**************Query execution"); while(result.next()) { userid = result.getString(1); } System.out.println("User id:1:***********"+userid); qry = insert into ppsr_user_prop_mst (user_prop_mst_id,prop_id,userid) values(user_prop_mst_seq.nextval,?,?); for(int i=0;i Hiya folks, Now that Python 2.5 is officially out and has introduced "with" blocks (http://docs.python.org/whatsnew/pep-343.html), I'm wondering if we (i.e. the module authors) should standardize how DB-API compliant modules leverage this functionality. One obvious possibility is that connection and cursor objects could return themselves in __enter__ and close themselves on __exit__. This would allow the user to write something like this: with module.connect(...) as conn: with conn.cursor() as cur: # ... which is much easier to read than the equivalent: conn = module.connect(...) try: cur = conn.cursor() try: # ... finally: cur.close() finally: conn.close() I've already implemented this behavior for InformixDB in CVS, but if people have better ideas, I'm definitely open to suggestions. -Carsten
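A minimal sketch of the __enter__/__exit__ behavior described above (illustrative only, not the actual InformixDB code); a module could mix something like this into its connection and cursor types, and callers of modules without it can already get the same effect from contextlib.closing, which is part of Python 2.5:

    class ClosingContextMixin(object):
        """Make an object usable in a with-block, closing it on exit (assumes a close() method)."""
        def __enter__(self):
            return self
        def __exit__(self, exc_type, exc_value, traceback):
            self.close()
            return False  # never swallow exceptions raised inside the block

    # Equivalent effect for modules that don't implement __enter__/__exit__ themselves:
    #   from contextlib import closing
    #   with closing(module.connect(...)) as conn:
    #       with closing(conn.cursor()) as cur:
    #           ...  # both objects are closed on the way out, even on errors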
From mal at egenix.com Mon Sep 25 10:27:16 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 25 Sep 2006 10:27:16 +0200 Subject: [DB-SIG] Standardized "with" block behavior? In-Reply-To: <1158955246.11108.91.camel@dot.uniqsys.com> References: <1158955246.11108.91.camel@dot.uniqsys.com> Message-ID: <451792E4.7020607@egenix.com> Carsten Haese wrote: > Hiya folks, > > Now that Python 2.5 is officially out and has introduced "with" blocks > (http://docs.python.org/whatsnew/pep-343.html), I'm wondering if we > (i.e. the module authors) should standardize how DB-API compliant > modules leverage this functionality. > > One obvious possibility is that connection and cursor objects could > return themselves in __enter__ and close themselves on __exit__. This > would allow the user to write something like this: > > with module.connect(...) as conn: > with conn.cursor() as cur: > # ... > > which is much easier to read than the equivalent: > > conn = module.connect(...) > try: > cur = conn.cursor() > try: > # ... > finally: > cur.close() > finally: > conn.close() > > I've already implemented this behavior for InformixDB in CVS, but if > people have better ideas, I'm definitely open to suggestions. Sounds reasonable, though strictly speaking, the with-container is not necessary, since cursors and connections will close themselves when garbage collected, ie. as soon as they go out of scope. In some cases it may not even be desirable to close the cursor or connection, since this prevents effective debugging (the traceback passed to outer scopes will contain a reference to the cursor and connection, keeping them open). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 25 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mfrasca at zonnet.nl Mon Sep 25 11:57:42 2006 From: mfrasca at zonnet.nl (Mario Frasca) Date: Mon, 25 Sep 2006 11:57:42 +0200 Subject: [DB-SIG] Standardized "with" block behavior? In-Reply-To: <451792E4.7020607@egenix.com> References: <1158955246.11108.91.camel@dot.uniqsys.com> <451792E4.7020607@egenix.com> Message-ID: <20060925095742.GA9777@kruiskruid.demon.nl> On 2006-0925 10:27:16, M.-A. Lemburg wrote: > Sounds reasonable, though strictly speaking, the with-container > is not necessary, since cursors and connections will close themselves > when garbage collected, ie. as soon as they go out of scope. this (garbage collected as soon as they go out of scope) is more implementation dependent than absolutely true... the garbage collector is activated "when it feels like", and will collect all garbage, or maybe just as much garbage as it is given time to..., again depending on the implementation and the situation. I don't know how much determinism is desired, but the with construct can introduce quite a lot of it. > In some cases it may not even be desirable to close the cursor or > connection, since this prevents effective debugging (the traceback > passed to outer scopes will contain a reference to the cursor and > connection, keeping them open). in this case the user would not use the with construct and take care that the connection / cursor do not get garbage collected... as usual, getting lost in details. I really like this new construct and taking advantage of it seems to me the right thing to do. Mario -- "S'il y a un Dieu, l'athéisme doit lui sembler une moindre injure que la religion."
-- Edmond et Jules de Goncourt From mal at egenix.com Mon Sep 25 12:28:16 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 25 Sep 2006 12:28:16 +0200 Subject: [DB-SIG] Standardized "with" block behavior? In-Reply-To: <20060925095742.GA9777@kruiskruid.demon.nl> References: <1158955246.11108.91.camel@dot.uniqsys.com> <451792E4.7020607@egenix.com> <20060925095742.GA9777@kruiskruid.demon.nl> Message-ID: <4517AF40.5070809@egenix.com> Mario Frasca wrote: > On 2006-0925 10:27:16, M.-A. Lemburg wrote: >> Sounds reasonable, though strictly speaking, the with-container >> is not necessary, since cursors and connections will close themselves >> when garbage collected, ie. as soon as they go out of scope. > > this (garbage collected as soon as they go out of scope) is more > implementation dependent than absolutely true... > > the garbage collector is activated "when it feels like", and will > collect all garbage, or maybe just as much garbage as it is given time > to..., again depending on the implementation and the situation. This is only true for Jython and perhaps IronPython (I don't know how objects are managed there). In CPython, an object is GCed as soon as the reference count falls to zero. The Python garbage collector is only needed for situations where you've created circular references keeping a closed set of objects alive. The Python GC is run every now and then or explicitly by calling gc.collect(). > I don't know how much determinism is desired, but the with construct can > introduce quite a lot of it. > >> In some cases it may not even be desirable to close the cursor or >> connection, since this prevents effective debugging (the traceback >> passed to outer scopes will contain a reference to the cursor and >> connection, keeping them open). > > in this case the user would not use the with construct and take care > that the connection / cursor do not get garbage collected... > > as usual, getting lost in details. > > I really like this new construct and taking advantage of it seems to me > the right thing to do. No objections there :-) It certainly saves a few try-finally enclosures. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 25 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From phd at phd.pp.ru Mon Sep 25 20:07:34 2006 From: phd at phd.pp.ru (Oleg Broytmann) Date: Mon, 25 Sep 2006 22:07:34 +0400 Subject: [DB-SIG] SQLObject 0.7.1 final release Message-ID: <20060925180734.GD6330@phd.pp.ru> Hello! I'm pleased to announce the 0.7.1 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). 
Where is SQLObject ================== Site: http://sqlobject.org Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.7.1 News and changes: http://sqlobject.org/docs/News.html What's New ========== Features & Interface -------------------- * Added support for psycopg2 and MSSQL. * Added TimeCol. * RelatedJoin and SQLRelatedJoin objects have a createRelatedTable keyword argument. * Implemented RLIKE (regular expression LIKE). Small Features -------------- * Select over RelatedJoin. * SQLite foreign keys. * Postgres DB URIs with a non-default path to unix socket. * Allow the use of foreign keys in selects. * Implemented addColumn() for SQLite. Bug Fixes --------- * Fixed a longstanding problem with UnicodeCol - at last you can use unicode strings in .select() and .selectBy() queries. There are some limitations, though. * Cull patch (clear cache). * Synchronize main connection cache during transaction commit. * Ordering joins with NULLs. * Fixed bugs with plain/non-plain setters. For a more complete list, please see the news: http://sqlobject.org/docs/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From ricardo.b at zmail.pt Mon Sep 25 21:38:54 2006 From: ricardo.b at zmail.pt (Ricardo Bugalho) Date: Mon, 25 Sep 2006 20:38:54 +0100 Subject: [DB-SIG] Standardized "with" block behavior? In-Reply-To: <4517AF40.5070809@egenix.com> References: <1158955246.11108.91.camel@dot.uniqsys.com> <451792E4.7020607@egenix.com> <20060925095742.GA9777@kruiskruid.demon.nl> <4517AF40.5070809@egenix.com> Message-ID: <1159213134.8740.25.camel@ezquiel> On Mon, 2006-09-25 at 12:28 +0200, M.-A. Lemburg wrote: > Mario Frasca wrote: > > On 2006-0925 10:27:16, M.-A. Lemburg wrote: > >> Sounds reasonable, though strictly speaking, the with-container > >> is not necessary, since cursors and connections will close themselves > >> when garbage collected, ie. as soon as they go out of scope. > > > > this (garbage collected as soon as they go out of scope) is more > > implementation dependent than absolutely true... > > > > the garbage collector is activated "when it feels like", and will > > collect all garbage, or maybe just as much garbage as it is given time > > to..., again depending on the implementation and the situation. > > This is only true for Jython and perhaps IronPython (I don't know > how objects are managed there). > > In CPython, an object is GCed as soon as the reference count > falls to zero. That could change the day someone decides to write a CPython with better multi-threaded performanc From ricardo.b at zmail.pt Mon Sep 25 21:49:42 2006 From: ricardo.b at zmail.pt (Ricardo Bugalho) Date: Mon, 25 Sep 2006 20:49:42 +0100 Subject: [DB-SIG] Standardized "with" block behavior? In-Reply-To: <4517AF40.5070809@egenix.com> References: <1158955246.11108.91.camel@dot.uniqsys.com> <451792E4.7020607@egenix.com> <20060925095742.GA9777@kruiskruid.demon.nl> <4517AF40.5070809@egenix.com> Message-ID: <1159213782.8740.37.camel@ezquiel> Sorry about the last post. Clicked on send instead of clicking in another application. On Mon, 2006-09-25 at 12:28 +0200, M.-A. Lemburg wrote: > Mario Frasca wrote: > > On 2006-0925 10:27:16, M.-A. 
Lemburg wrote: > >> Sounds reasonable, though strictly speaking, the with-container > >> is not necessary, since cursors and connections will close > themselves > >> when garbage collected, ie. as soon as they go out of scope. > > > > this (garbage collected as soon as they go out of scope) is more > > implementation dependent than absolutely true... > > > > the garbage collector is activated "when it feels like", and will > > collect all garbage, or maybe just as much garbage as it is given > time > > to..., again depending on the implementation and the situation. > > This is only true for Jython and perhaps IronPython (I don't know > how objects are managed there). > > In CPython, an object is GCed as soon as the reference count > falls to zero. > > The Python garbage collector is only needed for situations where > you've created circular references keeping a closed set of objects > alive. The Python GC is run every now and then or explicitly > by calling gc.collect(). That could change the day someone decides to re-write CPython with better multi-threaded performance. To be strict, reference counting is a garbage collecting method itself, although it needs to be complemented with a circular-reference detector. It's also a garbage collecting method that generally performs poorly in multi-threaded environments, thus we might want to replace it eventually. One should not write code that depends on the garbage collector. Even with reference counting, it's not 100% deterministic since reference counting won't free circularly referenced objects. Remember: garbage collecting is about freeing unused memory, not other resources. From MReddy at Bear.com Tue Sep 26 17:59:58 2006 From: MReddy at Bear.com (Reddy, Murali (Exchange)) Date: Tue, 26 Sep 2006 11:59:58 -0400 Subject: [DB-SIG] SQLObject 0.7.1 final release Message-ID: <62ED5A1ADB86CC4CA3541F70693E7E160CBF10@whexchmb01.bsna.bsroot.bear.com> Hello, I am passing a variable to a Python script. However, it does not take anything after a space. Example: Test.sh "testing 123" But it only takes everything up to the space as the first argument. Can anybody suggest what to do? *********************************************************************** Bear Stearns is not responsible for any recommendation, solicitation, offer or agreement or any information about any transaction, customer account or account activity contained in this communication. *********************************************************************** From anthony.tuininga at gmail.com Wed Sep 27 01:19:24 2006 From: anthony.tuininga at gmail.com (Anthony Tuininga) Date: Tue, 26 Sep 2006 17:19:24 -0600 Subject: [DB-SIG] cx_Oracle 4.2.1 Message-ID: <703ae56b0609261619u7f288fcfqa7b51c4d400cd3ac@mail.gmail.com> What is cx_Oracle? cx_Oracle is a Python extension module that allows access to Oracle and conforms to the Python database API 2.0 specifications with a few exceptions. Where do I get it? http://starship.python.net/crew/atuining What's new? 1) Added additional type (NCLOB) to handle CLOBs that use the national character set as requested by Chris Dunscombe. 2) Added support for returning cursors from functions as requested by Daniel Steinmann. 3) Added support for getting/setting the "get" mode on session pools as requested by Anand Aiyer. 4) Added support for binding subclassed cursors. 5) Fixed binding of decimal objects with absolute values less than 0.1.
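A hedged sketch of how item 2 (cursors returned from functions) might be used, assuming the usual Cursor.callfunc() route with cx_Oracle.CURSOR as the return type; the connect string and the PL/SQL function get_emps (returning a SYS_REFCURSOR) are made up for the example:

    import cx_Oracle

    connection = cx_Oracle.connect("user/password@tns")   # made-up credentials
    cursor = connection.cursor()
    # Ask for the function's return value to come back as another cursor object.
    ref_cursor = cursor.callfunc("get_emps", cx_Oracle.CURSOR, [20])
    for row in ref_cursor.fetchall():
        print row   # Python 2 print statement, matching the module's era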
From andy47 at halfcooked.com Wed Sep 27 10:30:17 2006 From: andy47 at halfcooked.com (Andy Todd) Date: Wed, 27 Sep 2006 18:30:17 +1000 Subject: [DB-SIG] New take on PostgreSQL bindings for Python In-Reply-To: References: <20060906084146.93454.qmail@web26907.mail.ukl.yahoo.com> Message-ID: <451A3699.5090906@halfcooked.com> Cristian Gafton wrote: > On Wed, 6 Sep 2006, Andy Chambers wrote: > >> could implement this themselves and instead of writing >> >> "select * >> from table >> where param = $1" >> >> ..they write this >> >> "select * >> from table >> where param = %s" % (aparam(),) >> >> Then if you change databases, you only need to redefine aparam(). > > That's not gonna help much for backends that like named bind parameters, > like Oracle... But you are right, the current effort spend nowadays to > escape the arguments and not use bind parameters could be spent rewriting > the query string to the bind parameter format required by the backend. > Actually, in Oracle (via cx_Oracle) you can use the same parameter name for each placeholder and the values you provide will be assigned in order. e.g. you can use; >>> stmt = "SELECT * FROM table WHERE col1 = :param AND col2 = :param" >>> cursor.execute(stmt, '1', '2') And its quite valid. >> Have you seen how much this actually improves performance? I myself tried >> writing a dbapi that made use of prepared queries but found that there >> was no improvement over psycopg. > > For Postgres, using a prepared statement on a call like executemany() > gives you roughly 2-2.5x times faster execution for simple > queries. Probably not much, but for other backends it is more dramatic. In > MySQL it comes down to ~20x; In Oracle that's even sweeter, because Oracle > caches the execution plans of the prepared statements and looks them up > whenever you "prepare" them again in its internal cache, with very > dramatic effects on the execution speed. > [snip] > > Cristian Err, actually all you save in Oracle with bind parameters is the parse component. This can be of the order of several CPU cycles but doesn't materially affect the execution speed of the fetch part of the statement which is usually much larger. What it *does* do is stop the SGA filling up with multiple variations of the same explain plan (as you indicate). That is what causes DBAs to get militant. I think we can all agree that using bind parameters is a good thing. However, I disagree with your conclusions about how they are implemented by certain DB-API modules. The DB-API doesn't need changing (apart from the removal of the 'format' paramstyle) but some of the modules that implement it do. Regards, Andy -- -------------------------------------------------------------------------------- From the desk of Andrew J Todd esq - http://www.halfcooked.com/
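As a closing illustration of the aparam() idea quoted above, a small helper can be driven off the paramstyle attribute that every DB-API module is required to expose; this is a sketch for the positional styles only (the named and pyformat styles need a parameter name rather than a bare marker), and the module object in the usage comment is hypothetical:

    def aparam(module, position=1):
        """Return a positional placeholder suitable for the given DB-API module."""
        style = module.paramstyle
        if style == 'qmark':
            return '?'
        if style == 'numeric':
            return ':%d' % position
        if style == 'format':
            return '%s'
        raise ValueError("paramstyle %r needs named placeholders" % style)

    # Example:
    #   sql = "select * from foo where param = " + aparam(dbmodule)
    #   cursor.execute(sql, (value,))

The query text still has to be assembled once per target module, but the values themselves always travel through the second argument of execute(), so no manual escaping of user input is involved.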