From stephane at is-webdesign.com Wed Aug 1 16:49:02 2007
From: stephane at is-webdesign.com (KLEIN Stéphane)
Date: Wed, 1 Aug 2007 14:49:02 +0000 (UTC)
Subject: [DB-SIG] I wonder if there are module, class to perform table copy ?
Message-ID:

Hello,

I wonder if there are module,class to perform table copy ? Copying table
structure and/or table data from one database to other or same.

Thanks for your answer.
Stephane

From carsten at uniqsys.com Wed Aug 1 17:27:39 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Wed, 01 Aug 2007 11:27:39 -0400
Subject: [DB-SIG] I wonder if there are module, class to perform table copy ?
In-Reply-To:
References:
Message-ID: <1185982059.3370.18.camel@dot.uniqsys.com>

On Wed, 2007-08-01 at 14:49 +0000, KLEIN Stéphane wrote:
> Hello,
>
> I wonder if there are module,class to perform table copy ? Copying table
> structure and/or table data from one database to other or same.

You have a much better chance of getting a helpful answer if you told us
what your database server is. Oracle, Sql server, Informix, PostgreSQL,
MySQL, ...?

--
Carsten Haese
http://informixdb.sourceforge.net

From stephane at is-webdesign.com Wed Aug 1 17:52:19 2007
From: stephane at is-webdesign.com (KLEIN Stéphane)
Date: Wed, 1 Aug 2007 15:52:19 +0000 (UTC)
Subject: [DB-SIG] I wonder if there are module, class to perform table copy ?
References: <1185982059.3370.18.camel@dot.uniqsys.com>
Message-ID:

On Wed, 01 Aug 2007 11:27:39 -0400, Carsten Haese wrote:

> On Wed, 2007-08-01 at 14:49 +0000, KLEIN Stéphane wrote:
>> Hello,
>>
>> I wonder if there are module,class to perform table copy ? Copying table
>> structure and/or table data from one database to other or same.
>
> You have a much better chance of getting a helpful answer if you told us
> what your database server is. Oracle, Sql server, Informix, PostgreSQL,
> MySQL, ...?

Currently, it's MySQL and PostgreSQL. At first, I was thinking of
database-independent tools.
Stephane

From mal at egenix.com Wed Aug 1 19:03:09 2007
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 01 Aug 2007 19:03:09 +0200
Subject: [DB-SIG] I wonder if there are module, class to perform table copy ?
In-Reply-To:
References:
Message-ID: <46B0BCCD.9050602@egenix.com>

On 2007-08-01 16:49, KLEIN Stéphane wrote:
> Hello,
>
> I wonder if there are module,class to perform table copy ? Copying table
> structure and/or table data from one database to other or same.

Copying data is usually easy: just do a SELECT * and use the result
set as input for the INSERT.

Copying table structures is hard, since the database interfaces often
don't provide enough information to do exact replicas of the schema.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Aug 01 2007)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

From andy47 at halfcooked.com Fri Aug 3 10:15:35 2007
From: andy47 at halfcooked.com (Andy Todd)
Date: Fri, 03 Aug 2007 18:15:35 +1000
Subject: [DB-SIG] I wonder if there are module, class to perform table copy ?
In-Reply-To:
References: <1185982059.3370.18.camel@dot.uniqsys.com>
Message-ID: <46B2E427.3050203@halfcooked.com>

KLEIN Stéphane wrote:
> On Wed, 01 Aug 2007 11:27:39 -0400, Carsten Haese wrote:
>
>> On Wed, 2007-08-01 at 14:49 +0000, KLEIN Stéphane wrote:
>>> Hello,
>>>
>>> I wonder if there are module,class to perform table copy ? Copying table
>>> structure and/or table data from one database to other or same.
>> You have a much better chance of getting a helpful answer if you told us
>> what your database server is. Oracle, Sql server, Informix, PostgreSQL,
>> MySQL, ...?
>
> Currently, it's MySQL and PostgreSQL. At first, I was thinking of
> database-independent tools.
>
> Stephane
>
> _______________________________________________
> DB-SIG maillist - DB-SIG at python.org
> http://mail.python.org/mailman/listinfo/db-sig

Most databases that I've used support the statement "CREATE TABLE AS";
for example, take a look at page 890 of the PostgreSQL 8.1 manual.

Regards,
Andy
--
From the desk of Andrew J Todd esq - http://www.halfcooked.com/

From anthony.tuininga at gmail.com Fri Aug 3 16:02:26 2007
From: anthony.tuininga at gmail.com (Anthony Tuininga)
Date: Fri, 3 Aug 2007 08:02:26 -0600
Subject: [DB-SIG] cx_Oracle 4.3.2
Message-ID: <703ae56b0708030702r47e396f8o22c660e32be4e5e2@mail.gmail.com>

What is cx_Oracle?

cx_Oracle is a Python extension module that allows access to Oracle and
conforms to the Python database API 2.0 specifications with a few
exceptions.

Where do I get it?

http://cx-oracle.sourceforge.net

NOTE: I have changed providers. Please update any links.

What's new?

1) Added methods open(), close(), isopen() and getchunksize() in order to
   improve performance of reading/writing LOB values in chunks.
2) Fixed support for native doubles and floats in Oracle 10g; added new
   type NATIVE_FLOAT to allow specification of a variable of that specific
   type where desired. Thanks to D.R. Boxhoorn for pointing out the fact
   that this was not working properly when the arraysize was anything
   other than 1.
3) When calling connection.begin(), only create a new transaction handle
   if one is not already associated with the connection. Thanks to Andreas
   Mock for discovering this and to Amaury Forgeot d'Arc for diagnosing
   the problem and pointing the way to a solution.
4) Added attribute cursor.rowfactory which allows a method to be called
   for each row that is returned; this is about 20% faster than calling
   the method in Python using the idiom [method(*r) for r in cursor].
5) Attempt to locate an Oracle installation by looking at the PATH if the
   environment variable ORACLE_HOME is not set; this is of primary use on
   Windows where this variable should not normally be set.
6) Added support for autocommit mode as requested by Ian Kelly.
7) Added support for connection.stmtcachesize which allows for both
   reading and writing the size of the statement cache. This parameter
   can make a huge difference with the length of time taken to prepare
   statements. Added support for setting the statement tag when preparing
   a statement. Both of these were requested by Bjorn Sandberg who also
   provided an initial patch.
8) When copying the value of a variable, copy the return code as well.

Anthony Tuininga

From szybalski at gmail.com Fri Aug 3 17:23:12 2007
From: szybalski at gmail.com (Lukasz Szybalski)
Date: Fri, 3 Aug 2007 10:23:12 -0500
Subject: [DB-SIG] How to get table names from ODBC?
Message-ID: <804e5c70708030823y73bb84a5le295c993150689@mail.gmail.com>

Hello,
I am using python win32 extensions.

I get connected via odbc to database files that are sitting in the folder.

import dbi,odbc
db=odbc.odbc('dbfiles')
cursor=db.cursor()
cursor.execute('select * from tableabcd')
print cursor.description

This way I am able to find column names.
How do I find the available table names? I have 20 tables available to
me. How do I list their names??

Lucas

From mal at egenix.com Fri Aug 3 18:54:31 2007
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 03 Aug 2007 18:54:31 +0200
Subject: [DB-SIG] How to get table names from ODBC?
In-Reply-To: <804e5c70708030823y73bb84a5le295c993150689@mail.gmail.com>
References: <804e5c70708030823y73bb84a5le295c993150689@mail.gmail.com>
Message-ID: <46B35DC7.8090608@egenix.com>

On 2007-08-03 17:23, Lukasz Szybalski wrote:
> Hello,
> I am using python win32 extensions.
>
> I get connected via odbc to database files that are sitting in the folder.
>
> import dbi,odbc
> db=odbc.odbc('dbfiles')
> cursor=db.cursor()
> cursor.execute('select * from tableabcd')
> print cursor.description
>
> This way I am able to find column names.
> How do I find the available table names? I have 20 tables available to
> me. How do I list their names??

With the old odbc module this is not possible without knowing
the database internals.

mxODBC has a .tables() catalog method which makes this kind of
introspection easy across all database backends:

.tables(qualifier=None, owner=None, table=None, type=None)

    Catalog method which generates a result set having the
    following schema...

See page 56 in the documentation for details on the schema:

http://www.egenix.com/products/python/mxODBC/mxODBC.pdf

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Aug 03 2007)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

From szybalski at gmail.com Tue Aug 7 18:58:38 2007
From: szybalski at gmail.com (Lukasz Szybalski)
Date: Tue, 7 Aug 2007 11:58:38 -0500
Subject: [DB-SIG] How to escape special field name, mysql?
Message-ID: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>

Hello,
I have installed mysqldb python bindings and I am using python to
write to mysql.

I have a field called "Desc" in a database (short for description )
But this name is used by mysql for sorting DESC

When i do:
insert into tablename(id,desc)VALUES(1,'some text')

How do I escape 'desc'?

Or maybe I do something else?
Lucas

From carsten at uniqsys.com Tue Aug 7 19:17:14 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Tue, 07 Aug 2007 13:17:14 -0400
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
Message-ID: <1186507034.3370.31.camel@dot.uniqsys.com>

On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
> Hello,
> I have installed mysqldb python bindings and I am using python to
> write to mysql.
>
> I have a field called "Desc" in a database (short for description )
> But this name is used by mysql for sorting DESC
>
> When i do:
> insert into tablename(id,desc)VALUES(1,'some text')
>
> How do I escape 'desc'?

insert into tablename(id,`desc`) ...

HTH,

--
Carsten Haese
http://informixdb.sourceforge.net

From szybalski at gmail.com Tue Aug 7 20:06:40 2007
From: szybalski at gmail.com (Lukasz Szybalski)
Date: Tue, 7 Aug 2007 13:06:40 -0500
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <1186507034.3370.31.camel@dot.uniqsys.com>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
Message-ID: <804e5c70708071106k206816bfka67f49a14a0637ee@mail.gmail.com>

On 8/7/07, Carsten Haese wrote:
> On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
> > Hello,
> > I have installed mysqldb python bindings and I am using python to
> > write to mysql.
> >
> > I have a field called "Desc" in a database (short for description )
> > But this name is used by mysql for sorting DESC
> >
> > When i do:
> > insert into tablename(id,desc)VALUES(1,'some text')
> >
> > How do I escape 'desc'?
>
> insert into tablename(id,`desc`) ...
>

>>> conn=MySQLdb.connect( SERVER, USER, PASS, DB )
>>> c=conn.cursor()
>>> c.execute("insert into tablename('desc')Values('sss')")
Traceback (most recent call last):
  File "", line 1, in
  File "c:\Python25\lib\site-packages\MySQLdb\cursors.py", line 166, in execute
    self.errorhandler(self, exc, value)
  File "c:\Python25\lib\site-packages\MySQLdb\connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''desc')Values('sss')' at line 1")

I was able to escape it on ODBC by doing 'DESC$' but that doesn't work
for mysql

From carsten at uniqsys.com Tue Aug 7 20:10:53 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Tue, 07 Aug 2007 14:10:53 -0400
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <804e5c70708071106k206816bfka67f49a14a0637ee@mail.gmail.com>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
 <804e5c70708071106k206816bfka67f49a14a0637ee@mail.gmail.com>
Message-ID: <1186510253.3370.35.camel@dot.uniqsys.com>

On Tue, 2007-08-07 at 13:06 -0500, Lukasz Szybalski wrote:
> On 8/7/07, Carsten Haese wrote:
> > On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
> > > Hello,
> > > I have installed mysqldb python bindings and I am using python to
> > > write to mysql.
> > >
> > > I have a field called "Desc" in a database (short for description )
> > > But this name is used by mysql for sorting DESC
> > >
> > > When i do:
> > > insert into tablename(id,desc)VALUES(1,'some text')
> > >
> > > How do I escape 'desc'?
> >
> > insert into tablename(id,`desc`) ...
> >
> >>> conn=MySQLdb.connect( SERVER, USER, PASS, DB )
> >>> c=conn.cursor()
> >>> c.execute("insert into tablename('desc')Values('sss')")

You are quoting the name in apostrophes (ascii character 39). You should
be using backwards apostrophes (ascii character 96).

--
Carsten Haese
http://informixdb.sourceforge.net

From szybalski at gmail.com Tue Aug 7 20:40:05 2007
From: szybalski at gmail.com (Lukasz Szybalski)
Date: Tue, 7 Aug 2007 13:40:05 -0500
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <1186510253.3370.35.camel@dot.uniqsys.com>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
 <804e5c70708071106k206816bfka67f49a14a0637ee@mail.gmail.com>
 <1186510253.3370.35.camel@dot.uniqsys.com>
Message-ID: <804e5c70708071140tf9d02a3o6d571f9b9cf30265@mail.gmail.com>

On 8/7/07, Carsten Haese wrote:
> On Tue, 2007-08-07 at 13:06 -0500, Lukasz Szybalski wrote:
> > On 8/7/07, Carsten Haese wrote:
> > > On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
> > > > Hello,
> > > > I have installed mysqldb python bindings and I am using python to
> > > > write to mysql.
> > > >
> > > > I have a field called "Desc" in a database (short for description )
> > > > But this name is used by mysql for sorting DESC
> > > >
> > > > When i do:
> > > > insert into tablename(id,desc)VALUES(1,'some text')
> > > >
> > > > How do I escape 'desc'?
> > >
> > > insert into tablename(id,`desc`) ...
> > >
> > >>> conn=MySQLdb.connect( SERVER, USER, PASS, DB )
> > >>> c=conn.cursor()
> > >>> c.execute("insert into tablename('desc')Values('sss')")
>
> You are quoting the name in apostrophes (ascii character 39).
> You should be using backwards apostrophes (ascii character 96).

Thank you,
That worked.
Lucas

From paul at boddie.org.uk Wed Aug 8 00:40:29 2007
From: paul at boddie.org.uk (Paul Boddie)
Date: Wed, 08 Aug 2007 00:40:29 +0200
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <1186507034.3370.31.camel@dot.uniqsys.com>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
Message-ID: <200708080040.29332.paul@boddie.org.uk>

On Tuesday 07 August 2007 19:17, Carsten Haese wrote:
> On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
> >
> > When i do:
> > insert into tablename(id,desc)VALUES(1,'some text')
> >
> > How do I escape 'desc'?
>
> insert into tablename(id,`desc`) ...

Obviously MySQL supports the above, but I believe that the standard way is
to use double quotes:

insert into tablename (id, "desc") values (1, 'some text')

For your information, of course.

Paul

From carsten at uniqsys.com Wed Aug 8 03:37:10 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Tue, 07 Aug 2007 21:37:10 -0400
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <200708080040.29332.paul@boddie.org.uk>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
 <200708080040.29332.paul@boddie.org.uk>
Message-ID: <1186537030.3253.7.camel@localhost.localdomain>

On Wed, 2007-08-08 at 00:40 +0200, Paul Boddie wrote:
> On Tuesday 07 August 2007 19:17, Carsten Haese wrote:
> > On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
> > >
> > > When i do:
> > > insert into tablename(id,desc)VALUES(1,'some text')
> > >
> > > How do I escape 'desc'?
> >
> > insert into tablename(id,`desc`) ...
>
> Obviously MySQL supports the above, but I believe that the standard way is to
> use double quotes:
>
> insert into tablename (id, "desc") values (1, 'some text')

That is the PostgreSQL way.
The standard way (at least as far as Informix understands it) is not to
quote table/column names at all and let the parser worry about determining
whether the word it's looking at is the name of a thing or a keyword. And
worry it will, if you are insane enough to write queries like "select
select as as from from where where = 1", which is valid SQL given the
right schema, and it's the reason why writing a standards compliant SQL
parser is a pain in the neck.

--
Carsten Haese
http://informixdb.sourceforge.net

From carsten at uniqsys.com Wed Aug 8 03:58:22 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Tue, 07 Aug 2007 21:58:22 -0400
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <1186537030.3253.7.camel@localhost.localdomain>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
 <200708080040.29332.paul@boddie.org.uk>
 <1186537030.3253.7.camel@localhost.localdomain>
Message-ID: <1186538302.3253.12.camel@localhost.localdomain>

On Tue, 2007-08-07 at 21:37 -0400, Carsten Haese wrote:
> [...] The standard way (at least as far as
> Informix understands it) is not to quote table/column names at all and
> let the parser worry about determining whether the word it's looking at
> is the name of a thing or a keyword. And worry it will, if you are
> insane enough to write queries like "select select as as from from where
> where = 1", which is valid SQL given the right schema, and it's the
> reason why writing a standards compliant SQL parser is a pain in the
> neck.

Here's proof that this insane query really works, as long as you take my
word for it that this transcript isn't doctored:

Python 2.5 (r25:51908, Oct 28 2006, 12:26:14)
[GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import informixdb
>>> conn = informixdb.connect("ifxtest")
>>> cur = conn.cursor()
>>> cur.execute("create temp table from (select int, where int)")
-1
>>> cur.execute("insert into from values(1,1)")
1
>>> cur.execute("insert into from values(2,2)")
1
>>> cur.execute("select select as as from from where where = 1")
>>> cur.fetchall()
[(1,)]

--
Carsten Haese
http://informixdb.sourceforge.net

From andy47 at halfcooked.com Wed Aug 8 01:14:09 2007
From: andy47 at halfcooked.com (Andy Todd)
Date: Wed, 08 Aug 2007 09:14:09 +1000
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <804e5c70708071140tf9d02a3o6d571f9b9cf30265@mail.gmail.com>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
 <804e5c70708071106k206816bfka67f49a14a0637ee@mail.gmail.com>
 <1186510253.3370.35.camel@dot.uniqsys.com>
 <804e5c70708071140tf9d02a3o6d571f9b9cf30265@mail.gmail.com>
Message-ID: <46B8FCC1.4080609@halfcooked.com>

Lukasz Szybalski wrote:
> On 8/7/07, Carsten Haese wrote:
>> On Tue, 2007-08-07 at 13:06 -0500, Lukasz Szybalski wrote:
>>> On 8/7/07, Carsten Haese wrote:
>>>> On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
>>>>> Hello,
>>>>> I have installed mysqldb python bindings and I am using python to
>>>>> write to mysql.
>>>>>
>>>>> I have a field called "Desc" in a database (short for description )
>>>>> But this name is used by mysql for sorting DESC
>>>>>
>>>>> When i do:
>>>>> insert into tablename(id,desc)VALUES(1,'some text')
>>>>>
>>>>> How do I escape 'desc'?
>>>> insert into tablename(id,`desc`) ...
>>>>
>>>>>> conn=MySQLdb.connect( SERVER, USER, PASS, DB )
>>>>>> c=conn.cursor()
>>>>>> c.execute("insert into tablename('desc')Values('sss')")
>> You are quoting the name in apostrophes (ascii character 39). You should
>> be using backwards apostrophes (ascii character 96).
>
> Thank you,
> That worked.
> Lucas
> _______________________________________________
> DB-SIG maillist - DB-SIG at python.org
> http://mail.python.org/mailman/listinfo/db-sig

And as a general rule of thumb, don't use reserved words as column names
in your database. Section 9.3 of the MySQL 5.1 reference manual lists
all of these reserved words. I believe there is a similar section in the
manuals for the other releases although I don't have any to hand at the
moment.

Regards,
Andy
--
From the desk of Andrew J Todd esq - http://www.halfcooked.com/

From paul at snake.net Wed Aug 8 05:10:22 2007
From: paul at snake.net (Paul DuBois)
Date: Tue, 07 Aug 2007 22:10:22 -0500
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <46B8FCC1.4080609@halfcooked.com>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
 <804e5c70708071106k206816bfka67f49a14a0637ee@mail.gmail.com>
 <1186510253.3370.35.camel@dot.uniqsys.com>
 <804e5c70708071140tf9d02a3o6d571f9b9cf30265@mail.gmail.com>
 <46B8FCC1.4080609@halfcooked.com>
Message-ID: <46B9341E.1010908@snake.net>

Andy Todd wrote:
> Lukasz Szybalski wrote:
>> On 8/7/07, Carsten Haese wrote:
>>> On Tue, 2007-08-07 at 13:06 -0500, Lukasz Szybalski wrote:
>>>> On 8/7/07, Carsten Haese wrote:
>>>>> On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
>>>>>> Hello,
>>>>>> I have installed mysqldb python bindings and I am using python to
>>>>>> write to mysql.
>>>>>>
>>>>>> I have a field called "Desc" in a database (short for description )
>>>>>> But this name is used by mysql for sorting DESC
>>>>>>
>>>>>> When i do:
>>>>>> insert into tablename(id,desc)VALUES(1,'some text')
>>>>>>
>>>>>> How do I escape 'desc'?
>>>>> insert into tablename(id,`desc`) ...
>>>>>
>>>>>>> conn=MySQLdb.connect( SERVER, USER, PASS, DB )
>>>>>>> c=conn.cursor()
>>>>>>> c.execute("insert into tablename('desc')Values('sss')")
>>> You are quoting the name in apostrophes (ascii character 39).
>>> You should be using backwards apostrophes (ascii character 96).
>> Thank you,
>> That worked.
>> Lucas
>> _______________________________________________
>> DB-SIG maillist - DB-SIG at python.org
>> http://mail.python.org/mailman/listinfo/db-sig
>
> And as a general rule of thumb, don't use reserved words as column names
> in your database. Section 9.3 of the MySQL 5.1 reference manual lists
> all of these reserved words. I believe there is a similar section in the
> manuals for the other releases although I don't have any to hand at the
> moment.

http://dev.mysql.com/doc/refman/4.1/en/reserved-words.html
http://dev.mysql.com/doc/refman/5.0/en/reserved-words.html
http://dev.mysql.com/doc/refman/5.1/en/reserved-words.html

The identifier-quoting rules are here:

http://dev.mysql.com/doc/refman/5.0/en/identifiers.html

which says, basically, that identifiers can be quoted with backticks.
If the ANSI_QUOTES SQL mode is enabled, you can quote identifiers with
backticks or double quotes.

From carsten at uniqsys.com Wed Aug 8 15:00:09 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Wed, 08 Aug 2007 09:00:09 -0400
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <1186537030.3253.7.camel@localhost.localdomain>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
 <200708080040.29332.paul@boddie.org.uk>
 <1186537030.3253.7.camel@localhost.localdomain>
Message-ID: <1186578009.3383.16.camel@dot.uniqsys.com>

On Tue, 2007-08-07 at 21:37 -0400, Carsten Haese wrote:
> On Wed, 2007-08-08 at 00:40 +0200, Paul Boddie wrote:
> > On Tuesday 07 August 2007 19:17, Carsten Haese wrote:
> > > On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
> > > >
> > > > When i do:
> > > > insert into tablename(id,desc)VALUES(1,'some text')
> > > >
> > > > How do I escape 'desc'?
> > >
> > > insert into tablename(id,`desc`) ...
> >
> > Obviously MySQL supports the above, but I believe that the standard way is to
> > use double quotes:
> >
> > insert into tablename (id, "desc") values (1, 'some text')
>
> That is the PostgreSQL way. The standard way (at least as far as
> Informix understands it) is not to quote table/column names at all and
> let the parser worry about determining whether the word it's looking at
> is the name of a thing or a keyword.

I looked up the standard and must admit that Informix's behavior in this
regard is non-standard, at least according to SQL92. SQL92 states quite
unequivocally that "The identifier body of a regular identifier [...]
shall not be equal [...] to any reserved word." It furthermore states
that delimited identifiers are delimited by double quotes.

Maybe Informix is keeping an artifact from pre-SQL times for backwards
compatibility. Anyway, I just wanted to set the record straight. Paul
was correct in stating that the standard way of quoting identifiers is
to use double quotes.

--
Carsten Haese
http://informixdb.sourceforge.net

From aprotin at research.att.com Wed Aug 8 14:50:36 2007
From: aprotin at research.att.com (Art Protin)
Date: Wed, 08 Aug 2007 08:50:36 -0400
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <1186538302.3253.12.camel@localhost.localdomain>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
 <200708080040.29332.paul@boddie.org.uk>
 <1186537030.3253.7.camel@localhost.localdomain>
 <1186538302.3253.12.camel@localhost.localdomain>
Message-ID: <46B9BC1C.5070303@research.att.com>

Dear folks,

The database system I am working with here uses the caret, "^", as in
"SELECT ^Select^ AS ^AS^ FROM ^FROM^ WHERE ^Where^ = 1"
and this is why JDBC has the method
DatabaseMetaData.getIdentifierQuoteString().

So, is JDBC ugly because it is Java or because it is SQL or both?

Thank you Carsten for such an illustrative example.
I will have to share it with all my co-workers here.

Carsten Haese wrote:

> On Tue, 2007-08-07 at 21:37 -0400, Carsten Haese wrote:
>
>> [...] The standard way (at least as far as
>> Informix understands it) is not to quote table/column names at all and
>> let the parser worry about determining whether the word it's looking at
>> is the name of a thing or a keyword. And worry it will, if you are
>> insane enough to write queries like "select select as as from from where
>> where = 1", which is valid SQL given the right schema, and it's the
>> reason why writing a standards compliant SQL parser is a pain in the
>> neck.
>
> Here's proof that this insane query really works, as long as you take my
> word for it that this transcript isn't doctored:
>
> Python 2.5 (r25:51908, Oct 28 2006, 12:26:14)
> [GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import informixdb
> >>> conn = informixdb.connect("ifxtest")
> >>> cur = conn.cursor()
> >>> cur.execute("create temp table from (select int, where int)")
> -1
> >>> cur.execute("insert into from values(1,1)")
> 1
> >>> cur.execute("insert into from values(2,2)")
> 1
> >>> cur.execute("select select as as from from where where = 1")
> >>> cur.fetchall()
> [(1,)]

Thank you all,
Art Protin

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/db-sig/attachments/20070808/800fd473/attachment.html

From anthony.tuininga at gmail.com Wed Aug 8 15:54:37 2007
From: anthony.tuininga at gmail.com (Anthony Tuininga)
Date: Wed, 8 Aug 2007 07:54:37 -0600
Subject: [DB-SIG] ceODBC 1.1
Message-ID: <703ae56b0708080654i7b06ff12sc1193edc155c8d5f@mail.gmail.com>

What is ceODBC?

ceODBC is a Python extension module that enables access to databases using
the ODBC API and conforms to the Python database API 2.0 specifications
with a few additions.
I have tested this on Windows against SQL Server, Access, dBASE and
Oracle. On Linux I have tested this against PostgreSQL.

Where do I get it?

http://ceodbc.sourceforge.net

What's new?

1) Added support for determining the columns, column privileges, foreign
   keys, primary keys, procedures, procedure columns, tables and table
   privileges available in the catalog as requested by Dmitry Selitsky.
2) Added support for getting/setting the autocommit flag for connections.
3) Added support for getting/setting the cursor name which is useful for
   performing positioned updates and deletes (as in delete from X where
   current of cursorname).
4) Explicitly set end of rows when SQL_NO_DATA is returned from SQLFetch()
   as some drivers do not properly set the number of rows fetched.

Anthony

From paul at snake.net Wed Aug 8 16:32:09 2007
From: paul at snake.net (Paul DuBois)
Date: Wed, 08 Aug 2007 09:32:09 -0500
Subject: [DB-SIG] How to escape special field name, mysql?
In-Reply-To: <1186578009.3383.16.camel@dot.uniqsys.com>
References: <804e5c70708070958u5ae57264i32fd530ced6124b2@mail.gmail.com>
 <1186507034.3370.31.camel@dot.uniqsys.com>
 <200708080040.29332.paul@boddie.org.uk>
 <1186537030.3253.7.camel@localhost.localdomain>
 <1186578009.3383.16.camel@dot.uniqsys.com>
Message-ID: <46B9D3E9.1060907@snake.net>

Carsten Haese wrote:
> On Tue, 2007-08-07 at 21:37 -0400, Carsten Haese wrote:
>> On Wed, 2007-08-08 at 00:40 +0200, Paul Boddie wrote:
>>> On Tuesday 07 August 2007 19:17, Carsten Haese wrote:
>>>> On Tue, 2007-08-07 at 11:58 -0500, Lukasz Szybalski wrote:
>>>>> When i do:
>>>>> insert into tablename(id,desc)VALUES(1,'some text')
>>>>>
>>>>> How do I escape 'desc'?
>>>> insert into tablename(id,`desc`) ...
>>> Obviously MySQL supports the above, but I believe that the standard way is to
>>> use double quotes:
>>>
>>> insert into tablename (id, "desc") values (1, 'some text')
>> That is the PostgreSQL way.
>> The standard way (at least as far as Informix understands it) is not to
>> quote table/column names at all and let the parser worry about
>> determining whether the word it's looking at is the name of a thing or
>> a keyword.
>
> I looked up the standard and must admit that Informix's behavior in this
> regard is non-standard, at least according to SQL92. SQL92 states quite
> unequivocally that "The identifier body of a regular identifier [...]
> shall not be equal [...] to any reserved word." It furthermore states
> that delimited identifiers are delimited by double quotes.
>
> Maybe Informix is keeping an artifact from pre-SQL times for backwards
> compatibility. Anyway, I just wanted to set the record straight. Paul
> was correct in stating that the standard way of quoting identifiers is
> to use double quotes.
>

Note that although MySQL does support using double quotes for quoted
identifiers (if the ANSI_QUOTES SQL mode is enabled), a difference
between MySQL and standard SQL in this case is that quoted identifiers
in standard SQL are case sensitive, whereas in MySQL they are not.

From mwm-keyword-dbsig.588a7d at mired.org Fri Aug 10 22:11:49 2007
From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer)
Date: Fri, 10 Aug 2007 16:11:49 -0400
Subject: [DB-SIG] In praise of pyformat
Message-ID: <20070810161149.049743ea@bhuda.mired.org>

Maybe this is late, and the issues are already settled, but it seemed
that nobody spoke up for them, so I figured I ought to.

There are two downsides given for the format and pyformat
paramstyles. They are:

1) It confuses newbies, and they wind up building queries in Python
   instead of using parameters.

2) You have to use '%%' to get a real '%' into the query.

#1 is a bad thing. But the reason it happens is that Python
programmers are already familiar with the syntax from Python. This is
a *good* thing.
While being friendly to newbies is a good thing, it's not clear to me that introducing yet another syntax that has to be learned as a cure isn't worse than the disease. Sure, if you're a db person instead of a Python person and the format is the one your db uses, then this cuts the other way - but in that case, you should also know the difference between using parameters and trying to build the query from untrustworthy data yourself. As for #2, yes, you have to use '%%' to insert a single '%'. How do the other paramstyles deal with wanting to get their significant character into the query? From reading the last few months of archives, it seems that they don't. The db module author either expects that that won't happen, or is expected to recognize those characters in some db-dependent way. If the goal is to be able to write code that will port between db modules - or even databases - without modification, then having a defined way to deal with this issue is clearly better than punting to the module authors. After all, it's the latter that created the current situation. As a final note, on backwards compatibility - that was amusing. The two sides were trying to define two things with one definition. Let me try with two definitions: A module is backwards compatible if code that used the documented APIs of the previous version of the module will work unchanged with the current version. A specification is backwards compatible if it is possible for an implementation of the current version to be backwards compatible with an implementation of the previous version. I think both of these are desirable properties. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. 
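To make downside #2 concrete, here is a minimal sketch using the bundled sqlite3 module (qmark paramstyle). The table `items` and the variable names are made up for illustration. The last step uses Python's own `%` operator purely to show how `%%` collapses to a single `%`; it is not how any particular DB-API module is known to process pyformat queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table items (name text)")
cur.executemany("insert into items (name) values (?)",
                [("apple",), ("avocado",), ("banana",)])

# qmark style: the '%' sits inside an SQL string literal, so it is simply
# a LIKE wildcard and needs no escaping. Only the bare '?' is a marker.
cur.execute(
    "select name from items where name like 'a%' and name <> ? order by name",
    ("avocado",),
)
rows = cur.fetchall()

# The same query written for a format/pyformat module must double the
# literal percent sign, because '%' is that paramstyle's significant character.
pyformat_query = (
    "select name from items where name like 'a%%' and name <> %(skip)s"
)

# Illustration only: Python's % operator turns '%%' into '%' and replaces
# the named marker, here with a qmark placeholder rather than a value.
rendered = pyformat_query % {"skip": "?"}
```

The `rendered` string ends up as the qmark form of the query, which shows that the `%%` rule is explicit and mechanical, at the cost of the application author having to remember it.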
From carsten at uniqsys.com Sat Aug 11 02:24:17 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Fri, 10 Aug 2007 20:24:17 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070810161149.049743ea@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> Message-ID: <1186791857.3256.36.camel@localhost.localdomain> On Fri, 2007-08-10 at 16:11 -0400, Mike Meyer wrote: > Maybe this is late, and the issues are already settled, but it seemed > that nobody spoke up for them, so I figured I ought to. > > There are two downsides given for the format and pyformat > paramstyles. They are: > > 1) It confuses newbies, and they wind up building queries in Python > instead of using parameters. > > 2) You have to use '%%' to get a real '%' into the query. > > #1 is a bad thing. [...] > > As for #2, yes, you have to use '%%' to insert a single '%'. How do > the other paramstyles deal with wanting to get their significant > character into the query? From reading the last few months of > archives, it seems that they don't. Actually, they do (or at least they should, and InformixDB certainly does). SQL already has a standard way of treating parameter markers as literals: Apostrophes. A question mark is a literal question mark if and only if it occurs inside apostrophes. Otherwise, it's a parameter marker. The same principle is true for colon-identifier tokens. The same principle would also apply to percent signs, but somehow it didn't occur to anybody to require (py)format modules authors to make accommodations for percent signs appearing inside literal strings, instead placing the burden on application developers. > The db module author either > expects that that won't happen, or is expected to recognize those > characters in some db-dependent way. It does happen. For example, Informix has a "matches" operator that uses literal question marks as wild cards for single characters, similar to glob matching. 
Hence, both select * from persons where name matches '?' and select * from persons where name matches ? are valid queries with very different meanings. In the former, any single-character name is matched, and the question mark is not a parameter marker. In the latter, the name is matched to whatever pattern is given as the parameter. > If the goal is to be able to > write code that will port between db modules - or even databases - > without modification, then having a defined way to deal with this > issue is clearly better than punting to the module authors. After all, > it's the latter that created the current situation. As I said, there is a defined way: Don't treat things that look like parameter markers as parameter markers if they appear inside apostrophes. This may require a simple parser in the API module, but I prefer placing a burden on a dozen API module authors over placing a burden on thousands of application developers. -- Carsten Haese http://informixdb.sourceforge.net From mwm-keyword-dbsig.588a7d at mired.org Sat Aug 11 10:25:17 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Sat, 11 Aug 2007 04:25:17 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <1186791857.3256.36.camel@localhost.localdomain> References: <20070810161149.049743ea@bhuda.mired.org> <1186791857.3256.36.camel@localhost.localdomain> Message-ID: <20070811042517.5597d2bf@bhuda.mired.org> On Fri, 10 Aug 2007 20:24:17 -0400 Carsten Haese wrote: > On Fri, 2007-08-10 at 16:11 -0400, Mike Meyer wrote: > > 2) You have to use '%%' to get a real '%' into the query. > > As for #2, yes, you have to use '%%' to insert a single '%'. How do > > the other paramstyles deal with wanting to get their significant > > character into the query? From reading the last few months of > > archives, it seems that they don't. > Actually, they do (or at least they should, and InformixDB certainly > does). 
SQL already has a standard way of treating parameter markers as > literals: Apostrophes. A question mark is a literal question mark if and > only if it occurs inside apostrophes. Otherwise, it's a parameter > marker. The same principle is true for colon-identifier tokens. Well, InformixDB has a way that works. I can think of at least two other databases that it won't work for - or at least, will have surprising consequences for users. That's not the same thing as the paramstyle having a way. > > If the goal is to be able to > > write code that will port between db modules - or even databases - > > without modification, then having a defined way to deal with this > > issue is clearly better than punting to the module authors. After all, > > it's the latter that created the current situation. > As I said, there is a defined way: Don't treat things that look like > parameter markers as parameter markers if they appear inside > apostrophes. This may require a simple parser in the API module, but I > prefer placing a burden on a dozen API module authors over placing a > burden on thousands of application developers. Well, that's a way. The problem is, it's not defined in the PEP. Other modules don't do that (since pysqlite is bundled these days, it's easy to verify that indeed, it doesn't behave this way). I do agree that the module authors ought to have to deal with this rather than the users. But letting each module author do it however they feel is best only helps users who are writing for a specific module; users trying to write code that's portable between modules and/or databases are better served by a specification that applies to all modules, even if it adds the burden of flagging significant characters as literal. If a new version of the PEP is going to require supporting a parameter style - which I believe is a good thing - it should be one that's at least as explicit as the most explicit of the current parameter styles. 
Which means the PEP needs to lay out the rules for when the parameters are recognized as such and when they aren't. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. From carsten at uniqsys.com Sat Aug 11 15:13:57 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 11 Aug 2007 09:13:57 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070811042517.5597d2bf@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186791857.3256.36.camel@localhost.localdomain> <20070811042517.5597d2bf@bhuda.mired.org> Message-ID: <1186838037.3257.9.camel@localhost.localdomain> On Sat, 2007-08-11 at 04:25 -0400, Mike Meyer wrote: > On Fri, 10 Aug 2007 20:24:17 -0400 Carsten Haese wrote: > > As I said, there is a defined way: Don't treat things that look like > > parameter markers as parameter markers if they appear inside > > apostrophes. This may require a simple parser in the API module, but I > > prefer placing a burden on a dozen API module authors over placing a > > burden on thousands of application developers. > > Well, that's a way. The problem is, it's not defined in the PEP. Other > modules don't do that (since pysqlite is bundled these days, it's easy > to verify that indeed, it doesn't behave this way). Sure, let's verify: >>> import sqlite3 >>> conn = sqlite3.connect(":memory:") >>> cur = conn.cursor() >>> cur.execute("create table t1(c1 varchar(20))") >>> cur.execute("insert into t1(c1) values('?')") >>> cur.execute("select * from t1").fetchall() [(u'?',)] >>> cur.execute("insert into t1(c1) values(?)") Traceback (most recent call last): File "", line 1, in sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 1, and there are 0 supplied. 
Maybe I'm misunderstanding what you mean by "behave this way", but it certainly looks like it's doing what I described as "Don't treat things that look like parameter markers as parameter markers if they appear inside apostrophes." -- Carsten Haese http://informixdb.sourceforge.net From mwm-keyword-dbsig.588a7d at mired.org Sat Aug 11 18:12:57 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Sat, 11 Aug 2007 12:12:57 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <1186838037.3257.9.camel@localhost.localdomain> References: <20070810161149.049743ea@bhuda.mired.org> <1186791857.3256.36.camel@localhost.localdomain> <20070811042517.5597d2bf@bhuda.mired.org> <1186838037.3257.9.camel@localhost.localdomain> Message-ID: <20070811121257.628d257f@bhuda.mired.org> On Sat, 11 Aug 2007 09:13:57 -0400 Carsten Haese wrote: > On Sat, 2007-08-11 at 04:25 -0400, Mike Meyer wrote: > > On Fri, 10 Aug 2007 20:24:17 -0400 Carsten Haese wrote: > > > As I said, there is a defined way: Don't treat things that look like > > > parameter markers as parameter markers if they appear inside > > > apostrophes. This may require a simple parser in the API module, but I > > > prefer placing a burden on a dozen API module authors over placing a > > > burden on thousands of application developers. > > > > Well, that's a way. The problem is, it's not defined in the PEP. Other > > modules don't do that (since pysqlite is bundled these days, it's easy > > to verify that indeed, it doesn't behave this way). > > Sure, let's verify: > > >>> import sqlite3 > >>> conn = sqlite3.connect(":memory:") > >>> cur = conn.cursor() > >>> cur.execute("create table t1(c1 varchar(20))") > > >>> cur.execute("insert into t1(c1) values('?')") > > >>> cur.execute("select * from t1").fetchall() > [(u'?',)] > >>> cur.execute("insert into t1(c1) values(?)") > Traceback (most recent call last): > File "", line 1, in > sqlite3.ProgrammingError: Incorrect number of bindings supplied. 
The
> current statement uses 1, and there are 0 supplied.
>
> Maybe I'm misunderstanding what you mean by "behave this way", but it
> certainly looks like it's doing what I described as "Don't treat things
> that look like parameter markers as parameter markers if they appear
> inside apostrophes."

The definition was vague, and I assumed "treat all other things that
look like parameter markers as parameter markers." Possibly that's
wrong, and there are other cases. But that's what breaks:

>>> c.execute('insert into foo (id) values ("?")', [23])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 0, and there are 1 supplied.

Likewise, I understand that Informix has a couple of cases where it
uses colons outside of literals. So if I need one of those, we're back
to the question of "How do I make the apostrophes go away?". A rule
that only needs to be applied sometimes, where those times vary from
module to module, isn't much better than no rule at all.

Actually, I think that highlights the whole problem with trying to
take this burden off the user: you're making the question of when
things that look like parameter markers really are parameter markers
implicit rather than explicit. If there's a simple-to-apply rule
(that's in the spec so we can depend on every module to use it and the
users know what it is), this isn't so bad. But there also needs to be
a way to deal with cases where you need a parameter marker that's not
covered by the rule.

The format/pyformat rule - "If you need a bare %, put in two" - is not
only simple for the module implementors, it's simple for the users to
deal with.

http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.
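The single-quote half of this exchange can be reproduced as a script rather than an interactive session. This is a minimal sketch against the bundled sqlite3 module only; the table `t1` follows the session quoted above, the other names are illustrative. It deliberately leaves out the double-quoted `"?"` case, since how SQLite treats double-quoted strings can depend on how the library was built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table t1 (c1 varchar(20))")

# A question mark inside apostrophes is a literal: no parameters are expected.
cur.execute("insert into t1 (c1) values ('?')")

# A bare question mark is a parameter marker: exactly one value must be bound.
cur.execute("insert into t1 (c1) values (?)", ("bound",))

# Omitting the value for a bare marker is reported as a binding error.
try:
    cur.execute("insert into t1 (c1) values (?)")
    binding_error = None
except sqlite3.ProgrammingError as exc:
    binding_error = exc

rows = cur.execute("select c1 from t1 order by c1").fetchall()
```

Only the two successful inserts reach the table; the third statement fails before anything is written.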
From paul at boddie.org.uk Sat Aug 11 18:14:16 2007 From: paul at boddie.org.uk (Paul Boddie) Date: Sat, 11 Aug 2007 18:14:16 +0200 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070811042517.5597d2bf@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186791857.3256.36.camel@localhost.localdomain> <20070811042517.5597d2bf@bhuda.mired.org> Message-ID: <200708111814.16924.paul@boddie.org.uk> On Saturday 11 August 2007 10:25, Mike Meyer wrote: > > Well, InformixDB has a way that works. I can think of at least two > other databases that it won't work for - or at least, will have > surprising consequences for users. That's not the same thing as the > paramstyle having a way. Can you name those database systems? The DB-API should support parameters without additional quoting. If, say, PostgreSQL doesn't support parameters internally, the module should prevent the programmer from knowing about that. [...] > I do agree that the module authors ought to have to deal with this > rather than the users. But letting each module author do it however > they feel is best only helps users who are writing for a specific > module; users trying to write code that's portable between modules > and/or databases are better served by a specification that applies to > all modules, even if it adds the burden of flagging significant > characters as literal. As you may have seen, I wrote some code which attempts to mark parts of queries as literal and non-literal text: http://www.python.org/pypi/sqlliterals This should help module developers know whether a parameter marker is genuine or not. > If a new version of the PEP is going to require supporting a parameter > style - which I believe is a good thing - it should be one that's at > least as explicit as the most explicit of the current parameter > styles. Which means the PEP needs to lay out the rules for when the > parameters are recognized as such and when they aren't. 
You previously asked the following question: "How do the other paramstyles deal with wanting to get their significant character into the query?" The answer is that at the application level you never have the problem of getting, for example, a question mark into the query: it's either inside a string literal, meaning that it's protected from any interpretation as a parameter marker (1), or it's supplied as part of a value which is passed to the database system in association with a parameter (2). Examples: # (1) cursor.execute("select name from addresses where status = '?' order by name") # (2) cursor.execute("select name from addresses where status = ? order by name", ("?",)) And where a question mark (or other parameter marker) appears outside a string literal, it's unambiguously interpreted as a parameter marker since there should be no other interpretation of that character outside literals. Of course, choosing a character sequence which may plausibly be part of some other syntactic feature would defeat such simple measures for knowing whether that sequence was a parameter marker or not, but the question mark should be a satisfactory choice according to the standard. If one considers parameters as locations for string substitution, then I can see how your question arises, but they are not mere substitutions, even if a database module or system might use that mechanism to achieve the effect (due to a lack of "proper" parameter support internally). The pyformat paramstyle is confusing because it's used to achieve something that resembles the effect of another mechanism - string substitution - but is actually somewhat different (as the above explanation attempts to make clear). Meanwhile, some of the other paramstyles are legitimate standards that can be seen in a number of widespread technologies. 
Paul From mwm-keyword-dbsig.588a7d at mired.org Sat Aug 11 21:32:53 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Sat, 11 Aug 2007 15:32:53 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <200708111814.16924.paul@boddie.org.uk> References: <20070810161149.049743ea@bhuda.mired.org> <1186791857.3256.36.camel@localhost.localdomain> <20070811042517.5597d2bf@bhuda.mired.org> <200708111814.16924.paul@boddie.org.uk> Message-ID: <20070811153253.171663a7@bhuda.mired.org> On Sat, 11 Aug 2007 18:14:16 +0200 Paul Boddie wrote: > On Saturday 11 August 2007 10:25, Mike Meyer wrote: > > > > Well, InformixDB has a way that works. I can think of at least two > > other databases that it won't work for - or at least, will have > > surprising consequences for users. That's not the same thing as the > > paramstyle having a way. > Can you name those database systems? The DB-API should support parameters > without additional quoting. If, say, PostgreSQL doesn't support parameters > internally, the module should prevent the programmer from knowing about that. A number of open source db systems allow both single and double quotes for literals, which breaks the "not inside apostrophes" rule that was proposed. And from what's in the dbsig archives, informix uses ':' in a couple of different places. > > I do agree that the module authors ought to have to deal with this > > rather than the users. But letting each module author do it however > > they feel is best only helps users who are writing for a specific > > module; users trying to write code that's portable between modules > > and/or databases are better served by a specification that applies to > > all modules, even if it adds the burden of flagging significant > > characters as literal. 
> > As you may have seen, I wrote some code which attempts to mark parts of > queries as literal and non-literal text: > > http://www.python.org/pypi/sqlliterals > > This should help module developers know whether a parameter marker is genuine > or not. I'm not all that worried about what module developers have to do - I'm more worried about what I, as a user of the module, have to do. If I need to understand your sqlliterals code to use parameter quoting to guard against data insertion attacks, have I really gained anything? > > If a new version of the PEP is going to require supporting a parameter > > style - which I believe is a good thing - it should be one that's at > > least as explicit as the most explicit of the current parameter > > styles. Which means the PEP needs to lay out the rules for when the > > parameters are recognized as such and when they aren't. > You previously asked the following question: "How do the other paramstyles > deal with wanting to get their significant character into the query?" The > answer is that at the application level you never have the problem of > getting, for example, a question mark into the query: it's either inside a > string literal, meaning that it's protected from any interpretation as a > parameter marker (1), or it's supplied as part of a value which is passed to > the database system in association with a parameter (2). Actually, I think you can get a bit tighter than that. They aren't merely "part of a value", they are a token in the value. I.e., if you define token as "characters valid in a name plus ? and :" then any token that isn't either a single ? or .startswith(':') isn't a parameter. I'm not sure how that works with the informix uses, though. > And where a question mark (or other parameter marker) appears outside a string > literal, it's unambiguously interpreted as a parameter marker since there > should be no other interpretation of that character outside literals. 
Except there are cases where that's not so. As previously mentioned, informix apparently uses ':' for a couple of things. And consider this: >>> c.execute("insert into 'FOO' ('ID') values ('hello')") This query has three string literals. Let's start replacing them... >>> c.execute("insert into 'FOO' ('ID') values (?)", ['hello']) ok so far. >>> c.execute("insert into 'FOO' (?) values (?)", ['ID', 'hello']) Traceback (most recent call last): File "", line 1, in sqlite3.OperationalError: near "?": syntax error Whoops. >>> c.execute("insert into ? ('ID') values (?)", ['FOO', 'hello']) Traceback (most recent call last): File "", line 1, in sqlite3.OperationalError: near "?": syntax error And again it doesn't substitute. So these literal strings aren't values? Even though in some cases I'm forced to quote them (and the cases vary from db to db, and how you quote them varies, and .....)? Or maybe with cx_Oracle: >>> c.execute("""select count(*) from log_metric where "METRIC_STRING" = 'pages swapped out'""") [] >>> c.fetchall() [(189,)] >>> c.execute("""select count(*) from log_metric where :colname = 'pages swapped out'""", dict(colname='METRIC_STRING')) [] >>> c.fetchall() [(0,)] Hmm. Wrong answer. Let's try a different tack: >>> c.execute("""select count(*) from log_metric where ":colname" = 'pages swapped out'""", dict(colname='METRIC_STRING')) Traceback (most recent call last): File "", line 1, in cx_Oracle.DatabaseError: ORA-01036: illegal variable name/number Nope. 
And for completeness: >>> c.execute("""select count(*) from :tabname where :colname = 'pages swapped out'""", dict(colname='METRIC_STRING', tabname="LOG_METRIC")) Traceback (most recent call last): File "", line 1, in cx_Oracle.DatabaseError: ORA-00903: invalid table name >>> c.execute("""select count(*) from :tabname where :colname = 'pages swapped out'""", dict(colname='METRIC_STRING', tabname='"LOG_METRIC"')) Traceback (most recent call last): File "", line 1, in cx_Oracle.DatabaseError: ORA-00903: invalid table name >>> c.execute("""select count(*) from ":tabname" where :colname = 'pages swapped out'""", dict(colname='METRIC_STRING', tabname="LOG_METRIC")) Traceback (most recent call last): File "", line 1, in cx_Oracle.DatabaseError: ORA-01036: illegal variable name/number Uh... ugh. > If one considers parameters as locations for string substitution, then I can > see how your question arises I'm looking at trying to use it in automatically generated SQL code. For example, I want to use an sql database as the "desktop" half of pda database. The pda database is designed to go well with the pda, so table/field names have different rules, and I have to deal with that. Or (the need that's got me looking into this) I'm logging errors from an application via a systems monitoring tool, and the table/field names have to work with the systems monitoring tool. So the SQL - including the table and column names in the definition - is getting built on the fly. I can see that if I were writing the queries by hand and knew all the table/column names in advance, none of this would matter. But I'm not, so it does. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. 
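For generated SQL of the kind described here, one possible approach is to build the statement structure (table and column names) as text and bind only the values. The sketch below assumes sqlite3 and standard double-quote identifier delimiting; `quote_ident` and `insert_row` are hypothetical helpers, not part of the DB-API or of any module, and the table name `log metric` is made up.

```python
import sqlite3


def quote_ident(name):
    """Hypothetical helper: wrap a name as an SQL delimited identifier.

    Embedded double quotes are doubled, as in standard SQL.
    """
    return '"' + name.replace('"', '""') + '"'


def insert_row(cur, table, values):
    """Build the statement structure as text, then bind only the values."""
    columns = list(values)
    col_sql = ", ".join(quote_ident(c) for c in columns)
    marks = ", ".join("?" for _ in columns)
    query = "insert into %s (%s) values (%s)" % (
        quote_ident(table), col_sql, marks)
    cur.execute(query, [values[c] for c in columns])
    return query


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Names that would not be legal as regular identifiers: a space, a keyword.
cur.execute('create table "log metric" ("desc" text, "value" integer)')

query = insert_row(cur, "log metric",
                   {"desc": "pages swapped out", "value": 189})
rows = cur.execute('select "desc", "value" from "log metric"').fetchall()
```

The identifier quoting rule is database-specific in practice (MySQL defaults to backticks, for instance), so the helper would need adjusting per backend; the value binding part is what stays portable.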
From carsten at uniqsys.com Sat Aug 11 23:10:34 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 11 Aug 2007 17:10:34 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070811153253.171663a7@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186791857.3256.36.camel@localhost.localdomain> <20070811042517.5597d2bf@bhuda.mired.org> <200708111814.16924.paul@boddie.org.uk> <20070811153253.171663a7@bhuda.mired.org> Message-ID: <1186866634.3318.48.camel@localhost.localdomain> On Sat, 2007-08-11 at 15:32 -0400, Mike Meyer wrote: > On Sat, 11 Aug 2007 18:14:16 +0200 Paul Boddie wrote: > > On Saturday 11 August 2007 10:25, Mike Meyer wrote: > > > > > > Well, InformixDB has a way that works. I can think of at least two > > > other databases that it won't work for - or at least, will have > > > surprising consequences for users. That's not the same thing as the > > > paramstyle having a way. > > Can you name those database systems? The DB-API should support parameters > > without additional quoting. If, say, PostgreSQL doesn't support parameters > > internally, the module should prevent the programmer from knowing about that. > > A number of open source db systems allow both single and double quotes > for literals, which breaks the "not inside apostrophes" rule that was > proposed. And from what's in the dbsig archives, informix uses ':' in > a couple of different places. Apparently I've confused you by simplifying the situation. The iron-clad, all-encompassing, golden rule is this: If something looks like a parameter marker and occurs in a place where a parameter marker is syntactically allowed, it's a parameter marker. Anything else isn't. And in case that's not clear, parameter markers are not allowed inside string literals or delimited column names. Informix does use colons as part of the syntax in two places: 1) Datetime literals e.g. 
DATETIME(12:34:56) HOUR TO SECOND 2) Remote table references of the form databasename:tablename. To protect from false positives, the parser in the InformixDB API sees colons that follow alphanumeric characters as literal parts of the query, and not as parameter placeholders. The basic rule above still holds true, because you can't have a parameter placeholder inside a datetime literal or in a table name. The parser just uses a heuristic because writing a full SQL parser would be insane. > [...] > > And where a question mark (or other parameter marker) appears outside a string > > literal, it's unambiguously interpreted as a parameter marker since there > > should be no other interpretation of that character outside literals. > > Except there are cases where that's not so. As previously mentioned, > informix apparently uses ':' for a couple of things. And consider this: > > >>> c.execute("insert into 'FOO' ('ID') values ('hello')") > > > This query has three string literals. Let's start replacing them... The fact that the query works with string literals for table names and column names is mildly surprising. That must be an Sqlite peculiarity. It's definitely not standard SQL. > >>> c.execute("insert into 'FOO' ('ID') values (?)", ['hello']) > > > ok so far. > > >>> c.execute("insert into 'FOO' (?) values (?)", ['ID', 'hello']) > Traceback (most recent call last): > File "", line 1, in > sqlite3.OperationalError: near "?": syntax error > > Whoops. Whoops indeed. A column name must be an identifier. Parameter markers are not identifiers. See the golden rule above. > >>> c.execute("insert into ? ('ID') values (?)", ['FOO', 'hello']) > Traceback (most recent call last): > File "", line 1, in > sqlite3.OperationalError: near "?": syntax error > > And again it doesn't substitute. Again, table names must be identifiers, yada yada. Also, I object to your use of the word "substitute", since it implies that parameter binding is a string formatting exercise. 
In most database engines, parameters are not substituted into the query text. Instead, the parameter values are "bound" to the placeholders in the query and transmitted to the database separately from the query. > So these literal strings aren't values? Even though in some cases I'm > forced to quote them (and the cases vary from db to db, and how you > quote them varies, and .....)? > > Or maybe with cx_Oracle: > > >>> c.execute("""select count(*) from log_metric where "METRIC_STRING" = 'pages swapped out'""") > [] > >>> c.fetchall() > [(189,)] > >>> c.execute("""select count(*) from log_metric where :colname = 'pages swapped out'""", dict(colname='METRIC_STRING')) > [] > >>> c.fetchall() > [(0,)] > > Hmm. Wrong answer. Let's try a different tack: Right answer, wrong question. In the first case, you're counting where the contents of the column METRIC_STRING equal 'pages swapped out'. In the second case, you're counting where the string 'METRIC_STRING' (bound as a parameter value) equals 'pages swapped out'. Different question, different answer. > >>> c.execute("""select count(*) from log_metric where ":colname" = 'pages swapped out'""", dict(colname='METRIC_STRING')) > Traceback (most recent call last): > File "", line 1, in > cx_Oracle.DatabaseError: ORA-01036: illegal variable name/number > > Nope. And for completeness: That's because ":colname" is not a parameter placeholder, it is a delimited column name, referencing a non-existent column called :colname. Additionally, you're supplying a parameter value that has no corresponding placeholder, which cx_Oracle doesn't allow. 
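The "right answer, wrong question" effect is not specific to Oracle. Here is a minimal sketch of the same comparison with the bundled sqlite3 module; the table and data are made up for illustration and only mirror the names used in the cx_Oracle session.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table log_metric (metric_string text)")
cur.executemany(
    "insert into log_metric (metric_string) values (?)",
    [("pages swapped out",), ("pages swapped out",), ("load average",)],
)

# The column is named in the query text: each row's column value is compared.
by_column = cur.execute(
    "select count(*) from log_metric "
    "where metric_string = 'pages swapped out'"
).fetchone()[0]

# The column *name* is bound as a value: the constant string
# 'metric_string' is compared to 'pages swapped out' for every row,
# which is never true.
by_param = cur.execute(
    "select count(*) from log_metric where ? = 'pages swapped out'",
    ("metric_string",),
).fetchone()[0]
```

The second query is valid SQL and runs without error, which is why the mistake shows up as a zero count rather than as an exception.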
> >>> c.execute("""select count(*) from :tabname where :colname = 'pages swapped out'""", dict(colname='METRIC_STRING', tabname="LOG_METRIC")) > Traceback (most recent call last): > File "", line 1, in > cx_Oracle.DatabaseError: ORA-00903: invalid table name > >>> c.execute("""select count(*) from :tabname where :colname = 'pages swapped out'""", dict(colname='METRIC_STRING', tabname='"LOG_METRIC"')) > Traceback (most recent call last): > File "", line 1, in > cx_Oracle.DatabaseError: ORA-00903: invalid table name > >>> c.execute("""select count(*) from ":tabname" where :colname = 'pages swapped out'""", dict(colname='METRIC_STRING', tabname="LOG_METRIC")) > Traceback (most recent call last): > File "", line 1, in > cx_Oracle.DatabaseError: ORA-01036: illegal variable name/number > > Uh... ugh. And all of these examples are covered by my previous explanations. > I'm looking at trying to use it in automatically generated SQL > code. For example, I want to use an sql database as the "desktop" half > of pda database. The pda database is designed to go well with the pda, > so table/field names have different rules, and I have to deal with > that. Or (the need that's got me looking into this) I'm logging errors > from an application via a systems monitoring tool, and the table/field > names have to work with the systems monitoring tool. So the SQL - > including the table and column names in the definition - is getting > built on the fly. I can see that if I were writing the queries by hand > and knew all the table/column names in advance, none of this would > matter. But I'm not, so it does. Well, as you hopefully know by now, in real SQL databases you can't use parameter binding to get a table name or column name into a query. The explanation for this is relatively simple. 
One major reason for the
existence of dynamic parameters is the ability to run prepared queries:
Let the database parse and plan the query once, and then execute the
query many times over with different actual values. Since the table
names and column names affect the query plan, they must not be supplied
by parameters.

The fact that (py)format implementations of DB-API generally do allow
parameter markers for identifiers is an implementation artifact of using
string formatting to achieve poor-man's parameter binding.

To execute a query where table names and column names are variable, you
should use string formatting yourself to build the structure of the
query, placing parameter markers into the structure where necessary, and
then use parameter binding to supply the actual values. For example:

def insert_row(cur, tablename, valuesdict):
    columns = valuesdict.keys()
    params = [':%s' % c for c in columns]
    query = "insert into %s(%s) values (%s)" % (
        tablename, ','.join(columns), ','.join(params))
    cur.execute(query, valuesdict)

Hope this helps,

--
Carsten Haese
http://informixdb.sourceforge.net

From mwm-keyword-dbsig.588a7d at mired.org Sun Aug 12 00:33:27 2007
From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer)
Date: Sat, 11 Aug 2007 18:33:27 -0400
Subject: [DB-SIG] In praise of pyformat
In-Reply-To: <1186866634.3318.48.camel@localhost.localdomain>
References: <20070810161149.049743ea@bhuda.mired.org> <1186791857.3256.36.camel@localhost.localdomain> <20070811042517.5597d2bf@bhuda.mired.org> <200708111814.16924.paul@boddie.org.uk> <20070811153253.171663a7@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain>
Message-ID: <20070811183327.0f4b16bc@bhuda.mired.org>

On Sat, 11 Aug 2007 17:10:34 -0400 Carsten Haese wrote:
> On Sat, 2007-08-11 at 15:32 -0400, Mike Meyer wrote:
> > On Sat, 11 Aug 2007 18:14:16 +0200 Paul Boddie wrote:
> > > On Saturday 11 August 2007 10:25, Mike Meyer wrote:
> > > >
> > > > Well, InformixDB has a way that works.
I can think of at least two > > > > other databases that it won't work for - or at least, will have > > > > surprising consequences for users. That's not the same thing as the > > > > paramstyle having a way. > > > Can you name those database systems? The DB-API should support parameters > > > without additional quoting. If, say, PostgreSQL doesn't support parameters > > > internally, the module should prevent the programmer from knowing about that. > > A number of open source db systems allow both single and double quotes > > for literals, which breaks the "not inside apostrophes" rule that was > > proposed. And from what's in the dbsig archives, informix uses ':' in > > a couple of different places. > Apparently I've confused you by simplifying the situation. No, you haven't confused me - you've just simplified things. I'm trying to get things *precise*, not simple. When you simplified, you lost precision. > The iron-clad, all-encompassing, golden rule is this: If something looks > like a parameter marker and occurs in a place where a parameter marker > is syntactically allowed, it's a parameter marker. Anything else isn't. Ok, I'll accept that that's iron-clad and all-encompassing. It's also circular. I suspect it also changes depending on the underlying database. > And in case that's not clear, parameter markers are not allowed inside > string literals or delimited column names. In every dbapi 2 module for every available database? Or just most of them? > The basic rule above still holds true, because you can't have a > parameter placeholder inside a datetime literal or in a table name. The > parser just uses a heuristic because writing a full SQL parser would be > insane. Let's see if I've got this: the exact rule is that you only do binding where a parameter is allowed. Figuring out where a parameter is allowed requires a large enough portion of a full SQL parser that writing one would be insane. 
So doing what I'm asking - providing a precise rule - would be insane. I think the key word is "insane". > > [...] > > > And where a question mark (or other parameter marker) appears outside a string > > > literal, it's unambiguously interpreted as a parameter marker since there > > > should be no other interpretation of that character outside literals. > > > > Except there are cases where that's not so. As previously mentioned, > > informix apparently uses ':' for a couple of things. And consider this: > > > > >>> c.execute("insert into 'FOO' ('ID') values ('hello')") > > > > > > This query has three string literals. Let's start replacing them... > > The fact that the query works with string literals for table names and > column names is mildly surprising. That must be an Sqlite peculiarity. > It's definitely not standard SQL. sqlite allows you to use either single and double quotes for string literals and identifier delimiters. > Also, I object to your use of the word "substitute", since it implies > that parameter binding is a string formatting exercise. In most database > engines, parameters are not substituted into the query text. Instead, > the parameter values are "bound" to the placeholders in the query and > transmitted to the database separately from the query. Fair enough - it is a lot closer to binding than string substitution. > > So these literal strings aren't values? Even though in some cases I'm > > forced to quote them (and the cases vary from db to db, and how you > > quote them varies, and .....)? > > > > Or maybe with cx_Oracle: > > > > >>> c.execute("""select count(*) from log_metric where "METRIC_STRING" = 'pages swapped out'""") > > [] > > >>> c.fetchall() > > [(189,)] > > >>> c.execute("""select count(*) from log_metric where :colname = 'pages swapped out'""", dict(colname='METRIC_STRING')) > > [] > > >>> c.fetchall() > > [(0,)] > > > > Hmm. Wrong answer. Let's try a different tack: > > Right answer, wrong question. 
In the first case, you're counting where > the contents of the column METRIC_STRING equal 'pages swapped out'. In > the second case, you're counting where the string 'METRIC_STRING' (bound > as a parameter value) equals 'pages swapped out'. Different question, > different answer. The answer isn't what I wanted, and hence wrong. The bug is in my code - it doesn't ask the question I wanted to ask. > To execute a query where table names and column names are variable, you > should use string formatting yourself to build the structure of the > query, placing parameter markers into the structure where necessary, and > then using parameter binding to supply the actual values. But isn't one of the arguments against pyformat/format is that they lead to people doing string formatting to build the query, which is a bad thing because, unless done very carefully, they leave you exposed to all kinds of data injection attacks? If the other styles wind up *requiring* you to build the query with string formatting, how can they possibly be considered superior? http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. 
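
To make the "right answer, wrong question" exchange above concrete, here is a minimal sketch. It uses the standard-library sqlite3 module rather than cx_Oracle, purely so that it runs anywhere, and the table and rows are invented for illustration. It shows a bound parameter being compared as a value, and then the same comparison with the column name placed into the query text and only the value bound.

```python
import sqlite3

# Illustrative in-memory table; names and data are assumptions, not from the thread.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table log_metric (metric_string text)")
cur.executemany("insert into log_metric values (?)",
                [("pages swapped out",), ("pages swapped in",)])

# Bound parameter: the database compares the *string value* 'metric_string'
# with 'pages swapped out'; it never looks at the column of that name.
cur.execute("select count(*) from log_metric where :colname = 'pages swapped out'",
            {"colname": "metric_string"})
bound_count = cur.fetchone()[0]

# Identifier placed by string formatting (from a name the program controls),
# value supplied through parameter binding.
column = "metric_string"
query = 'select count(*) from log_metric where "%s" = :value' % column
cur.execute(query, {"value": "pages swapped out"})
formatted_count = cur.fetchone()[0]

print(bound_count, formatted_count)
```

The first query matches no rows because it asks whether two different strings are equal; the second asks about the column's contents. The column name here comes from the program itself, not from user input, which is the case the string-formatting advice assumes.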
From carsten at uniqsys.com Sun Aug 12 03:45:21 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Sat, 11 Aug 2007 21:45:21 -0400
Subject: [DB-SIG] In praise of pyformat
In-Reply-To: <20070811183327.0f4b16bc@bhuda.mired.org>
References: <20070810161149.049743ea@bhuda.mired.org> <1186791857.3256.36.camel@localhost.localdomain> <20070811042517.5597d2bf@bhuda.mired.org> <200708111814.16924.paul@boddie.org.uk> <20070811153253.171663a7@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org>
Message-ID: <1186883121.3096.58.camel@localhost.localdomain>

On Sat, 2007-08-11 at 18:33 -0400, Mike Meyer wrote:
> On Sat, 11 Aug 2007 17:10:34 -0400 Carsten Haese wrote:
> > The iron-clad, all-encompassing, golden rule is this: If something looks
> > like a parameter marker and occurs in a place where a parameter marker
> > is syntactically allowed, it's a parameter marker. Anything else isn't.
>
> Ok, I'll accept that that's iron-clad and all-encompassing. It's also
> circular.

How so?

> I suspect it also changes depending on the underlying
> database.

Not so much, especially if you restrict yourself to qmark-style (the SQL standard) for cross-database compatibility. No SQL compliant database will allow question marks outside of quotation marks or apostrophes for anything other than parameter markers.

> > And in case that's not clear, parameter markers are not allowed inside
> > string literals or delimited column names.
>
> In every dbapi 2 module for every available database? Or just most of
> them?

In any correct DB-API implementation for any SQL compliant database. In SQL's grammar, literals and dynamic parameter specifications are disjoint productions, so you'll never find one inside the other. Here are the pertinent productions in BNF:

<unsigned value specification> ::=
      <unsigned literal>
    | <general value specification>

<general value specification> ::=
      <parameter specification>
    | <dynamic parameter specification>
    | <variable specification>
    | USER
    | CURRENT_USER
    | SESSION_USER
    | SYSTEM_USER
    | VALUE

<dynamic parameter specification> ::= <question mark>

> > The basic rule above still holds true, because you can't have a
> > parameter placeholder inside a datetime literal or in a table name.
The > > parser just uses a heuristic because writing a full SQL parser would be > > insane. > > Let's see if I've got this: the exact rule is that you only do binding > where a parameter is allowed. Figuring out where a parameter is > allowed requires a large enough portion of a full SQL parser that > writing one would be insane. So doing what I'm asking - providing a > precise rule - would be insane. I think the key word is "insane". You're mixing two different levels. One is the definition, one is the practical implementation. Implementing the definition to the letter is insane, which is why the practical implementation uses a heuristic that provides the same result. > > To execute a query where table names and column names are variable, you > > should use string formatting yourself to build the structure of the > > query, placing parameter markers into the structure where necessary, and > > then using parameter binding to supply the actual values. > > But isn't one of the arguments against pyformat/format is that they > lead to people doing string formatting to build the query, which is a > bad thing because, unless done very carefully, they leave you exposed > to all kinds of data injection attacks? If the other styles wind up > *requiring* you to build the query with string formatting, how can > they possibly be considered superior? It forces the application programmer to understand and appreciate the difference between USER-SUPPLIED VALUES and SYNTAX ELEMENTS in a database query. User-supplied values should *always* be provided via parameter binding. Syntax elements should *never* (and in standard SQL, can't) be provided via parameter binding. In real database applications, the syntax elements of queries vary rarely, and should never come from user input. That's why requiring the application developer to put together those parts with string formatting is acceptable. 
String formatting and parameter binding are two different tools for two different tasks, and that's why parameter styles that distinguish the two are superior. -- Carsten Haese http://informixdb.sourceforge.net From paul at boddie.org.uk Sun Aug 12 02:30:45 2007 From: paul at boddie.org.uk (Paul Boddie) Date: Sun, 12 Aug 2007 02:30:45 +0200 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070811183327.0f4b16bc@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> Message-ID: <200708120230.45958.paul@boddie.org.uk> On Sunday 12 August 2007 00:33, Mike Meyer wrote: > On Sat, 11 Aug 2007 17:10:34 -0400 Carsten Haese wrote: > > > > The iron-clad, all-encompassing, golden rule is this: If something looks > > like a parameter marker and occurs in a place where a parameter marker > > is syntactically allowed, it's a parameter marker. Anything else isn't. > > Ok, I'll accept that that's iron-clad and all-encompassing. It's also > circular. I suspect it also changes depending on the underlying > database. How is the above circular? If we have something, for example "?", which looks like a parameter marker (because we've defined "?" to be a parameter marker), and it appears in a place where such parameter markers are syntactically allowed/possible/recognised (see the SQL specifications available on the Internet), then our software notes that "?" in that place is a parameter marker. Otherwise, the occurrence of "?" just forms part of some other sequence of characters and is not considered to be such a parameter marker by our software. > > And in case that's not clear, parameter markers are not allowed inside > > string literals or delimited column names. > > In every dbapi 2 module for every available database? Or just most of > them? By an appropriate choice of marker in conjunction with the standards. [...] 
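
The recognition rule discussed above, treating "?" as a marker only when it falls outside quoted spans, can be sketched in a few lines. This is an illustration only and not the sqlliterals implementation; qmark_to_numeric is a hypothetical name, and the sketch deliberately ignores SQL comments and database-specific quoting rules.

```python
def qmark_to_numeric(sql):
    """Rewrite each '?' found outside quotes as :1, :2, ... (hypothetical helper)."""
    out = []
    count = 0
    quote = None  # the quote character we are currently inside, if any
    for ch in sql:
        if quote:
            # A doubled quote ('' or "") closes and immediately reopens the
            # span, so plain toggling keeps us "inside" across it.
            if ch == quote:
                quote = None
            out.append(ch)
        elif ch in ("'", '"'):
            quote = ch
            out.append(ch)
        elif ch == "?":
            count += 1
            out.append(":%d" % count)
        else:
            out.append(ch)
    return "".join(out)


converted = qmark_to_numeric(
    "select * from t where a = ? and b = 'what?' and \"c?\" = ?")
print(converted)
```

A scanner like this does not decide whether a marker is legal where it appears; a "?" used in place of a table name would still be rewritten and it would be left to the database to reject it.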
> Let's see if I've got this: the exact rule is that you only do binding > where a parameter is allowed. Figuring out where a parameter is > allowed requires a large enough portion of a full SQL parser that > writing one would be insane. So doing what I'm asking - providing a > precise rule - would be insane. I think the key word is "insane". The standard doesn't permit parameter usage in various places because it is not a simple string substitution that occurs. So, the first party to follow the standard must be the application developer, although they may be lucky with some database systems and modules if misusing parameters and knowing that the module/database just does a substitution. However, a database module or library which converts from a standard marker to a non-standard one, for example, would need to know where to find the genuine occurrences of the standard marker. That's the job of something like sqlliterals, and you shouldn't need a full parser for this in many cases, thanks to the design of the SQL grammar. Really, the aim of software like sqlliterals would be to preserve the correctness of converted queries - not necessarily to forbid incorrect syntax - so one might attempt to use a parameter to substitute column names, and it would be the job of the database system to complain. [...] > > The fact that the query works with string literals for table names and > > column names is mildly surprising. That must be an Sqlite peculiarity. > > It's definitely not standard SQL. > > sqlite allows you to use either single and double quotes for string > literals and identifier delimiters. So it doesn't follow SQL-92, at least. [...] > > To execute a query where table names and column names are variable, you > > should use string formatting yourself to build the structure of the > > query, placing parameter markers into the structure where necessary, and > > then using parameter binding to supply the actual values. 
> > But isn't one of the arguments against pyformat/format is that they > lead to people doing string formatting to build the query, which is a > bad thing because, unless done very carefully, they leave you exposed > to all kinds of data injection attacks? If the other styles wind up > *requiring* you to build the query with string formatting, how can > they possibly be considered superior? Because for a lot of applications, people don't need to build up queries from parts, or if they do, they can do it without mixing in user-supplied values. Distinguishing parameter usage from string substitution reduces confusion for beginners who don't tend to do such advanced stuff, and it makes them more aware of the issues when they finally start doing so. Paul From mwm-keyword-dbsig.588a7d at mired.org Sun Aug 12 19:07:17 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Sun, 12 Aug 2007 13:07:17 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <200708120230.45958.paul@boddie.org.uk> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> Message-ID: <20070812130717.0ac1987c@bhuda.mired.org> On Sun, 12 Aug 2007 02:30:45 +0200 Paul Boddie wrote: > On Sunday 12 August 2007 00:33, Mike Meyer wrote: > > On Sat, 11 Aug 2007 17:10:34 -0400 Carsten Haese > wrote: > > > > > > The iron-clad, all-encompassing, golden rule is this: If something looks > > > like a parameter marker and occurs in a place where a parameter marker > > > is syntactically allowed, it's a parameter marker. Anything else isn't. > > > > Ok, I'll accept that that's iron-clad and all-encompassing. It's also > > circular. I suspect it also changes depending on the underlying > > database. > How is the above circular? If we have something, for example "?", which looks > like a parameter marker (because we've defined "?" 
to be a parameter marker), > and it appears in a place where such parameter markers are syntactically > allowed/possible/recognised (see the SQL specifications available on the > Internet), then our software notes that "?" in that place is a parameter > marker. Otherwise, the occurrence of "?" just forms part of some other > sequence of characters and is not considered to be such a parameter marker by > our software. How is "We only recognize parameter markers where we recognize parameter markers" *not* circular? > > > To execute a query where table names and column names are variable, you > > > should use string formatting yourself to build the structure of the > > > query, placing parameter markers into the structure where necessary, and > > > then using parameter binding to supply the actual values. > > > > But isn't one of the arguments against pyformat/format is that they > > lead to people doing string formatting to build the query, which is a > > bad thing because, unless done very carefully, they leave you exposed > > to all kinds of data injection attacks? If the other styles wind up > > *requiring* you to build the query with string formatting, how can > > they possibly be considered superior? > > Because for a lot of applications, people don't need to build up queries from > parts, or if they do, they can do it without mixing in user-supplied values. Sorry, not buying that one. "Most people don't need foo" isn't an argument as to why something that doesn't do foo might be considered superior to something that does. > Distinguishing parameter usage from string substitution reduces confusion for > beginners who don't tend to do such advanced stuff, and it makes them more > aware of the issues when they finally start doing so. I think you just pinpointed the problem: parameter substitution in dbapi is being advertised as the solution to a problem it's not really adequate to solve.
IIUC, it's restricted by the underlying SQL implementation (and inherits portability problems from there as well). That some underlying SQL implementations may not support it at all further complicates things.

So, in the spirit of having the module authors instead of the users do the work, how about taking this burden away from execute/executemany, and providing a tool that is adequate to the job? Here's a clean slate design proposal:

execute's parameters are for access to the underlying SQL engine's parameter mechanism. It takes a string and a dict or list (just as it does now). The string is passed to the database unchanged.

executemany is related to execute the same way it is now.

paramstyles is a list of strings indicating what parameter styles the underlying database supports. The currently recognized styles are 'qmark', 'named' and 'numeric'. An empty list means the database doesn't support parameter substitution, and is a perfectly valid value. The module author doesn't have to provide a mechanism for doing this so people can build statements from untrusted data because we also provide:

build_statement is a tool for safely creating SQL statements from untrusted data. The signature is build_statement(basestring, *args, **kwargs). Providing both *args and **kwargs is undefined.

build_statement is a substitution mechanism, but not a simple string substitution. Instead, the substitution markers indicate where we substitute SQL tokens in the statement. Since tokens have types that can't be determined from the type of the value - in particular, a delimited identifier for a column name vs. a string literal in an expression - we have to have type information for the marker. So we're going to use the familiar %-notation to provide it. The possible type indicators are:

%s - produce a string literal. Values will be coerced to strings.
%d,f,g - the usual numeric substitutions.
%i - produce an identifier. Values must be strings.
%t - produce a time literal from a datetime.datetime or None. None means to have the database substitute the current time when the statement is executed. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. From carsten at uniqsys.com Sun Aug 12 23:05:44 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sun, 12 Aug 2007 17:05:44 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070812130717.0ac1987c@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> Message-ID: <1186952744.3092.13.camel@localhost.localdomain> On Sun, 2007-08-12 at 13:07 -0400, Mike Meyer wrote: > I think you just pinpointed the problem: parameter substitution in > dbapi is being advertised as the solution to a problem it's not really > adequate to solve. For the millionth time, it's parameter binding, not parameter substitution. And it solves exactly the problem it's designed to solve by the makers of the SQL standard: supplying variable VALUES for a query. > [snip proposal...] -1. The problem that your proposal is trying to solve doesn't exist. For supplying variable values, parameter binding as it is (with the addition of making qmark and named mandatory as was decided recently) is perfectly adequate. For supplying variable table names, column names, where clauses, and other syntax elements, string formatting seems perfectly adequate. Maybe you should illustrate the kinds of problems you're encountering in whatever it is you're doing that makes you feel that the existing way is inadequate. Right now, you're coming off somewhat trollish by making proposals for solving non-existent problems. 
-- Carsten Haese http://informixdb.sourceforge.net From carl at personnelware.com Sun Aug 12 23:47:37 2007 From: carl at personnelware.com (Carl Karsten) Date: Sun, 12 Aug 2007 16:47:37 -0500 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070810161149.049743ea@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> Message-ID: <46BF7FF9.3020403@personnelware.com> > As a final note, on backwards compatibility - that was amusing. The > two sides were trying to define two things with one definition. Let me > try with two definitions: > > A module is backwards compatible if code that used the documented APIs > of the previous version of the module will work unchanged with the > current version. > > A specification is backwards compatible if it is possible for an > implementation of the current version to be backwards compatible with > an implementation of the previous version. > > I think both of these are desirable properties. Thanks, I was wondering why I was having trouble getting my head around the whole thing. I think I will give up my crusade to eradicate the 'extra' formats. I am not completely convinced it isn't a good idea, but I haven't been able to put together an example of how it would work, and maybe it won't be so bad after all. Carl K From mwm-keyword-dbsig.588a7d at mired.org Mon Aug 13 00:12:44 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Sun, 12 Aug 2007 18:12:44 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <1186952744.3092.13.camel@localhost.localdomain> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> Message-ID: <20070812181244.395232c7@bhuda.mired.org> On Sun, 12 Aug 2007 17:05:44 -0400 Carsten Haese wrote: > -1. 
The problem that your proposal is trying to solve doesn't exist. For > supplying variable values, parameter binding as it is (with the addition > of making qmark and named mandatory as was decided recently) is > perfectly adequate. Maybe. Maybe not. Not at issue. > For supplying variable table names, column names, > where clauses, and other syntax elements, string formatting seems > perfectly adequate. It may seem adequate, but it isn't. Table/column names from external sources have to deal with the exact same set of data injection issues that values from external sources do. > Maybe you should illustrate the kinds of problems you're encountering in > whatever it is you're doing that makes you feel that the existing way is > inadequate. Right now, you're coming off somewhat trollish by making > proposals for solving non-existent problems. I didn't say the existing way is inadequate; it can be fine (except for portability issues). The proposed changes - basically including killing off the mechanisms that do work - are what aren't fine. You're right, in that the existing mechanisms *can* deal with the issues. However, two of the points that comes up over and over again here is "use parameters, don't build the query strings yourself" and "we would rather the module authors do the work than the users". I'm trying to figure out how *either* of those is miscible with "Just use pythons string substitutions for table/column names", much less *both* of them. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. 
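
One way to narrow the gap described above between "use parameters" and "format the names yourself" can be sketched as follows, under stated assumptions. quote_identifier is a hypothetical helper, not part of DB-API or of any module discussed in this thread; it renders a name as a standard SQL delimited identifier by doubling embedded double quotes. insert_row follows the shape of the example posted earlier in the thread, but uses qmark markers and the standard-library sqlite3 module so that it is runnable. Whether a given database accepts delimited identifiers in this form, and how it treats case and length, remains database-specific.

```python
import sqlite3


def quote_identifier(name):
    """Hypothetical helper: render name as a delimited identifier ("...")."""
    if not isinstance(name, str) or not name or "\x00" in name:
        raise ValueError("invalid identifier: %r" % (name,))
    # Inside a delimited identifier, a double quote is escaped by doubling it.
    return '"' + name.replace('"', '""') + '"'


def insert_row(cur, tablename, valuesdict):
    """Build the statement structure by formatting; bind the values."""
    columns = list(valuesdict)
    query = "insert into %s (%s) values (%s)" % (
        quote_identifier(tablename),
        ", ".join(quote_identifier(c) for c in columns),
        ", ".join("?" for _ in columns))
    cur.execute(query, [valuesdict[c] for c in columns])
    return query


# Illustrative table with names that need delimiting (assumed for the demo).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute('create table "log metric" ("metric string" text, "count" integer)')
query = insert_row(cur, "log metric",
                   {"metric string": "pages swapped out", "count": 189})
rows = cur.execute('select "metric string", "count" from "log metric"').fetchall()

# A hostile name stays inside one identifier token instead of ending it.
hostile = quote_identifier('x"; drop table t; --')
print(query, rows, hostile)
```

Quoting only keeps a name from breaking out of its token. It does not check that the table or column exists or that the caller should be allowed to touch it, so names that truly come from outside the program would still want checking against a list of known identifiers.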
From carsten at uniqsys.com Mon Aug 13 03:51:33 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sun, 12 Aug 2007 21:51:33 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070812181244.395232c7@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> Message-ID: <1186969893.3155.37.camel@localhost.localdomain> On Sun, 2007-08-12 at 18:12 -0400, Mike Meyer wrote: > You're right, in that the existing mechanisms *can* deal with the > issues. However, two of the points that comes up over and over again > here is "use parameters, don't build the query strings yourself" and > "we would rather the module authors do the work than the users". I'm > trying to figure out how *either* of those is miscible with "Just use > pythons string substitutions for table/column names", much less *both* > of them. You're right, not having a cross-database mechanism for building queries out of variable parts does not quite jibe with those two principles. Keep in mind, however, that of the two problems of filling variable values into a query and filling variable table/column names into a query are two very different problems, and in most applications, the first one is more common than the second one by a factor of about a million to one or so. Let's say hypothetically we add something to the DB-API for plugging table names and column names into a query. Older versions of Informix restrict identifiers to 18 characters. What is the DB-API supposed to do if you try to plug in a table name that's longer? Truncate the name? Raise an exception? Something else? 
Even if we find a solution that fits every use case, simply having a mechanism for plugging table and column names into a query is not enough for a general framework for writing cross-database applications. What about syntax differences in "limit" queries? What about different names for built-in functions, etc. Any solution to this problem will either be incomplete or such a behemoth that it will be next to impossible to implement a compliant API module. The sheer size of the problem of database abstraction and the fact that there is no one solution that fits all is the reason why there are many different solutions such as SQLObject, SQLAlchemy, Dabo, Django, etc already in the wild. It's also the reason why I doubt that DB-API is going to grow such a query construction toolkit layer any time soon. -- Carsten Haese http://informixdb.sourceforge.net From paul at boddie.org.uk Sun Aug 12 20:34:46 2007 From: paul at boddie.org.uk (Paul Boddie) Date: Sun, 12 Aug 2007 20:34:46 +0200 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070812130717.0ac1987c@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> Message-ID: <200708122034.47076.paul@boddie.org.uk> On Sunday 12 August 2007 19:07, Mike Meyer wrote: > > How is "We only recognize parameter markers where we recognize > parameter markers" *not* circular? The SQL specifications dictate where parameter markers can be used. Please search for "SQL-92" and examine the specifications document for further details. It should be noted that what I have proposed previously is merely the introduction of the standard parameter marker as adopted by other major standards such as ODBC and JDBC. [...] > I think you just pinpointed the problem: parameter substitution in > dbapi is being advertised as the solution to a problem it's not really > adequate to solve. 
IIUC, it's restricted by the underlying SQL > implementation (and inherits portability problems from there as > well). That some underlying SQL implementations may not support it at > all further complicates things. PEP 249 (DB-API 2.0) refers to binding parameter values, not substituting values into the query, even if the latter does happen with some systems (after some modification of the values). So, no-one is really advertising parameters in the way you've described. [...] I agree that for some kinds of applications, there's a need for improved query building tools, especially for those of us who don't really see the attraction of object-relational mapping technology. Meanwhile, we still need a better way of dealing with parameters. Having a database module tell me at runtime that its paramstyle is "xyz" is not particularly useful if I've already written my queries, and having lots of different modules with different paramstyles makes for a tedious exercise in providing queries for them all. And the pyformat style is just an anachronism providing a false convenience for the dubious benefit of module writers. Paul From carsten at uniqsys.com Mon Aug 13 14:16:28 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Mon, 13 Aug 2007 08:16:28 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <200708122034.47076.paul@boddie.org.uk> References: <20070810161149.049743ea@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <200708122034.47076.paul@boddie.org.uk> Message-ID: <1187007388.3376.8.camel@dot.uniqsys.com> On Sun, 2007-08-12 at 20:34 +0200, Paul Boddie wrote: > Meanwhile, we still need a better way of dealing with parameters. Having a > database module tell me at runtime that its paramstyle is "xyz" is not > particularly useful if I've already written my queries True. That's why we decided not too long ago to make qmark and named mandatory in the next version of DB-API. 
-- Carsten Haese http://informixdb.sourceforge.net From paul at boddie.org.uk Mon Aug 13 20:57:12 2007 From: paul at boddie.org.uk (Paul Boddie) Date: Mon, 13 Aug 2007 20:57:12 +0200 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <1187007388.3376.8.camel@dot.uniqsys.com> References: <20070810161149.049743ea@bhuda.mired.org> <200708122034.47076.paul@boddie.org.uk> <1187007388.3376.8.camel@dot.uniqsys.com> Message-ID: <200708132057.12257.paul@boddie.org.uk> On Monday 13 August 2007 14:16, Carsten Haese wrote: > On Sun, 2007-08-12 at 20:34 +0200, Paul Boddie wrote: > > Meanwhile, we still need a better way of dealing with parameters. Having > > a database module tell me at runtime that its paramstyle is "xyz" is not > > particularly useful if I've already written my queries > > True. That's why we decided not too long ago to make qmark and named > mandatory in the next version of DB-API. It isn't me that needs convincing. ;-) I've uploaded a new version of sqlliterals which has a replace function for the conversion of favourite paramstyles to "legacy" paramstyles in SQL statements: http://www.python.org/pypi/sqlliterals Paul From aprotin at research.att.com Mon Aug 13 22:55:34 2007 From: aprotin at research.att.com (Art Protin) Date: Mon, 13 Aug 2007 16:55:34 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070812181244.395232c7@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> Message-ID: <46C0C546.3070705@research.att.com> Dear folks, Mike Meyer wrote: >On Sun, 12 Aug 2007 17:05:44 -0400 Carsten Haese wrote: > > > >>-1. The problem that your proposal is trying to solve doesn't exist. 
For >>supplying variable values, parameter binding as it is (with the addition >>of making qmark and named mandatory as was decided recently) is >>perfectly adequate. >> >> > >Maybe. Maybe not. Not at issue. > > > >>For supplying variable table names, column names, >>where clauses, and other syntax elements, string formatting seems >>perfectly adequate. >> >> > >It may seem adequate, but it isn't. Table/column names from external >sources have to deal with the exact same set of data injection issues >that values from external sources do. > > >

It is a mistake to say "the exact same set of data injection issues" for while they are data injection issues, they are not exactly the same. Table names are a different type than column names, which are in turn different from strings or integers. A valid string is any string of characters shorter than some maximum that uses characters from some acceptable character set. Processing a string parameter involves simply checking the length and character set. Processing an integer involves simply checking sign and size constraints. A column name is much more constrained than that: not only must its characters be from a more limited alphabet, the name is not case sensitive, and the name must, depending on the operation, either exist or not exist within a specified table definition.

Further, it is generally (but not universally) recognized that while

    SELECT * FROM SUPPLIER WHERE City = 'St. Paul'

and

    SELECT * FROM SUPPLIER WHERE City = 'New York'

are structurally the same query looking at the same fields of the same table and doing the same things with them, the query

    SELECT * FROM AIRPORT WHERE City = 'St. Paul'

has only a very superficial similarity and next to none of the operations to answer the first two would be applicable to the third.
The differences with the third include: the number of columns involved, the names of the columns involved, the types of the columns involved, the indices for the column City (there may be more than one), the storage location of the table (which files, which partitions, which disks). In short, just about everything. >>Maybe you should illustrate the kinds of problems you're encountering in >>whatever it is you're doing that makes you feel that the existing way is >>inadequate. Right now, you're coming off somewhat trollish by making >>proposals for solving non-existent problems. >> >> > >I didn't say the existing way is inadequate; it can be fine (except >for portability issues). The proposed changes - basically including >killing off the mechanisms that do work - are what aren't fine. > >You're right, in that the existing mechanisms *can* deal with the >issues. However, two of the points that comes up over and over again >here is "use parameters, don't build the query strings yourself" and >"we would rather the module authors do the work than the users". I'm >trying to figure out how *either* of those is miscible with "Just use >pythons string substitutions for table/column names", much less *both* >of them. > > > OK, you win this detail. The advice is misstated, using common assumptions. You should use parameters for everything that parameters will work for, namely, as a stand-in for literals, for values that when changed do not structurally alter the query. While I vaguely recollect doing an application that did construct queries, using data from the GUI to select tables and columns, I have only done that once and do not remember why that seemed appropriate at the time. If this were a more common practice, we would be further along in defining how to do it in a "standard" way. Sorry. Thank you all, Art Protin
From mwm at mired.org Mon Aug 13 16:37:49 2007 From: mwm at mired.org (Mike Meyer) Date: Mon, 13 Aug 2007 10:37:49 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <1186969893.3155.37.camel@localhost.localdomain> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> Message-ID: <20070813103749.09f6e19f@mbook.mired.org> On Sun, 12 Aug 2007 21:51:33 -0400 Carsten Haese wrote: > On Sun, 2007-08-12 at 18:12 -0400, Mike Meyer wrote: > > You're right, in that the existing mechanisms *can* deal with the > > issues. However, two of the points that comes up over and over again > > here is "use parameters, don't build the query strings yourself" and > > "we would rather the module authors do the work than the users". I'm > > trying to figure out how *either* of those is miscible with "Just use > > pythons string substitutions for table/column names", much less *both* > > of them. > You're right, not having a cross-database mechanism for building queries > out of variable parts does not quite jibe with those two principles. > Keep in mind, however, that of the two problems of filling variable > values into a query and filling variable table/column names into a query > are two very different problems, and in most applications, the first one > is more common than the second one by a factor of about a million to one > or so. While I think your order is a little exaggerated, I'll merely point out that it's a common thing to see when you're writing code that writes code.
SQL pretty much sucks for this, but Python isn't too bad - and it's one of the most powerful programming techniques available - I seem to use it in every other application. So I'd expect it to become more common, not less. > Even if we find a solution that fits every use case, simply having a > mechanism for plugging table and column names into a query is not enough > for a general framework for writing cross-database applications. What > about syntax differences in "limit" queries? What about different names > for built-in functions, etc. Any solution to this problem will either be > incomplete or such a behemoth that it will be next to impossible to > implement a compliant API module. You're right - we can't provide a perfect solution. Fortunately, we don't have to, as the goal is progress, not perfection. That's why there was a dbapi, and a second version, and now a proposal for a third version with progress along two lines: 1) requiring qmark. This will be easy for most modules, since an SQL engine that supports the standard will already have it, so this is making module authors paper over a hole in the underlying SQL engine they support. 2) requiring named. This isn't a standard, but is in wide use because it provides a desperately needed bit of functionality. So this is making module authors paper over a hole in the SQL standards. Note that these two solutions aren't complete abstractions by any means; they depend on the vagaries of the SQL implementation and/or the module authors. What they are is a portable way to access the parameter mechanism of the underlying database, or an emulation of that if it doesn't exist. Adding a real tool to let code write queries - as opposed to abusing parameter binding for the job - is another case of making module authors paper over a hole in the SQL standards.
Since the examples of qmark and named show that we're not requiring perfect portability, but merely progress, we can look at your question (I've moved it) with that in mind. > Let's say hypothetically we add something to the DB-API for plugging > table names and column names into a query. Older versions of Informix > restrict identifiers to 18 characters. What is the DB-API supposed to do > if you try to plug in a table name that's longer? Truncate the name? > Raise an exception? Something else? Oracle has a similar - though not quite so unreasonable - restriction. I believe it's still in place. All of the above seem to be reasonable choices for this case. But what does DB-API say happens if I use such an identifier in code that's written by hand instead of by other code? I can't find any reference to this. Why should the DB-API spec have to deal with what are essentially errors in the SQL for SQL written by code differently than it does code written by hand? > The sheer size of the problem of database abstraction and the fact that > there is no one solution that fits all is the reason why there are many > different solutions such as SQLObject, SQLAlchemy, Dabo, Django, etc > already in the wild. It's also the reason why I doubt that DB-API is > going to grow such a query construction toolkit layer any time soon. I'm not asking for a complete abstraction, or any kind of abstraction. I'm asking for a portable way to apply one of the most powerful tools in the programmers toolbox to SQL. Sure, the resulting SQL might not be portable - that can't be helped. But at least I don't have to tweak the Python every time I change databases or modules, just the SQL. That's progress - and that's the goal. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. 
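For readers who have not met the two paramstyles named in the message above, the following is a minimal sketch of what "qmark" and "named" look like in use. It assumes the standard-library sqlite3 module, which happens to accept both styles; the supplier table and its rows are invented for illustration and are not taken from anyone's code in this thread.

```python
import sqlite3

# In-memory database with a made-up table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO supplier (name, city) VALUES (?, ?)",
    [("Acme", "St. Paul"), ("Globex", "New York")],
)

# qmark style: positional "?" placeholders, values passed as a sequence.
qmark_rows = conn.execute(
    "SELECT name FROM supplier WHERE city = ?", ("St. Paul",)
).fetchall()

# named style: ":name" placeholders, values passed as a mapping.
named_rows = conn.execute(
    "SELECT name FROM supplier WHERE city = :city", {"city": "New York"}
).fetchall()
```

In both cases only the value varies; the table and column names are fixed in the SQL text, which is the limitation the rest of the thread argues about.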
From carl at personnelware.com Tue Aug 14 02:11:15 2007 From: carl at personnelware.com (Carl Karsten) Date: Mon, 13 Aug 2007 19:11:15 -0500 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070813103749.09f6e19f@mbook.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> Message-ID: <46C0F323.3090303@personnelware.com> Mike Meyer wrote: > On Sun, 12 Aug 2007 21:51:33 -0400 Carsten Haese wrote: >> On Sun, 2007-08-12 at 18:12 -0400, Mike Meyer wrote: >>> You're right, in that the existing mechanisms *can* deal with the >>> issues. However, two of the points that comes up over and over again >>> here is "use parameters, don't build the query strings yourself" and >>> "we would rather the module authors do the work than the users". I'm >>> trying to figure out how *either* of those is miscible with "Just use >>> pythons string substitutions for table/column names", much less *both* >>> of them. >> You're right, not having a cross-database mechanism for building queries >> out of variable parts does not quite jibe with those two principles. >> Keep in mind, however, that of the two problems of filling variable >> values into a query and filling variable table/column names into a query >> are two very different problems, and in most applications, the first one >> is more common than the second one by a factor of about a million to one >> or so. > > While I think your order is a little exaggerated, I'll merely point > out that it's a common thing to see when you're writing code that > writes code. 
SQL pretty much sucks for this, but Python isn't to bad - > and it's one of the most powerful programming technics available - I > seem to use it in every other application. So I'd expect it to become > more common, not less. about a million to one seems realistic to me. How often does an identifier come from an untrusted source? The only place I really see a use for this is exposing an API to a 2nd developer(s) that is untrusted. isn't it easy enough to validate the identifier against a list pulled from the DB? (as in, you can get a list of valid column names, so any supplied column name has to be in that list.) unless you are allowing ALTER TABLE. hmm... still seems million to one. could be because I have never heard of such a thing, and even have trouble imagining it, so to me it is more like 100 to 0. Carl K From mwm-keyword-dbsig.588a7d at mired.org Tue Aug 14 02:45:54 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Mon, 13 Aug 2007 20:45:54 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <46C0C546.3070705@research.att.com> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <46C0C546.3070705@research.att.com> Message-ID: <20070813204554.7fc71981@bhuda.mired.org> On Mon, 13 Aug 2007 16:55:34 -0400 Art Protin wrote: > >It may seem adequate, but it isn't. Table/column names from external > >sources have to deal with the exact same set of data injection issues > >that values from external sources do. > It is a mistake to say "the exact same set of data injection issues" for > while they > are data injection issues, they are not exactly the same. > Table names are a different type than are column names > which are different from strings or integers. 
> A valid string is any string of characters shorter than some maximum > that uses characters from some acceptable character set. > Processing a string parameter involves simply checking the > length and character set. > Processing an integer involves simply checking sign and size constraints. > > A column name is much more constrained than that, > not only must its characters be from a more limited alphabet, > the name is not case sensitive, and the name must, depending on the > operation, > either exist or not exist within a specified table definition. You can use delimited identifiers for column and table names in most SQL dialects. If you do that, pretty much everything you just said is false. > While I vaguely recollect doing an application that did construct queries, > using data from the GUI to select tables and columns, I have only done that > once and do not remember why that seemed appropriate at the time. In this case, you control the set of column and table names, so checking them for validity is trivial - you make sure the one the user gave you is in the set you expect (presumably using the same list you used to generate whatever the user selected from), and you're done. The real issue is when you're trying to build a database where the user - not you - controls the table and column names (because, for instance, you're interfacing with some non-SQL data storage and want to reuse their names - data on PDAs and in systems monitoring tools are the ones I've run into). SQL makes that possible via the delimited identifier mechanism. And this is where that set of data injection problems arises. > If this were a more common practice, we would be further along in defining > how to do it in a "standard" way. Sorry. So is there any chance of getting that written up as something to appear in dbapi 3? I've already written one proposal, and would be glad to create more....
http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. From mwm-keyword-dbsig.588a7d at mired.org Tue Aug 14 16:18:17 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Tue, 14 Aug 2007 10:18:17 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <46C0F323.3090303@personnelware.com> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> Message-ID: <20070814101817.685ba5b0@bhuda.mired.org> On Mon, 13 Aug 2007 19:11:15 -0500 Carl Karsten wrote: > Mike Meyer wrote: > > While I think your order is a little exaggerated, I'll merely point > > out that it's a common thing to see when you're writing code that > > writes code. SQL pretty much sucks for this, but Python isn't to bad - > > and it's one of the most powerful programming technics available - I > > seem to use it in every other application. So I'd expect it to become > > more common, not less. > about a million to one seems realistic to me. In my experience, its more like every other application that needs this. > How often does an identifier come from an untrusted source? Um, how about in every web-based app that has a real search facility? One that lets the user specify which column(s) they want to check, or that can search multiple tables? I seem to be involved in working on one of those every few years: an SGML document search engine, a user database search engine, a webmail client, a workflow management system, and a software change tracking system are what I can recall now. > hmm... still seems million to one. 
could be because I > have never heard of such a thing, and even have trouble imagining it, so to me > it is more like 100 to 0. Um, SQL has no facility to do this kind of thing, and the Python APIs in general have no support for it, so this sort of reminds me of a C programmer saying that objects don't have any use. Of course, technically you're both right. Anything I can do with objects, I can do without them. And when I start looking at how to make the identifiers not come from an untrusted source, I can see ways to do it. However, they all involve me doing lots of copy-n-paste "development", or adding tables to deal with indirection to my queries and presumably slowing them down - or both. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. From carl at personnelware.com Tue Aug 14 19:27:19 2007 From: carl at personnelware.com (Carl Karsten) Date: Tue, 14 Aug 2007 12:27:19 -0500 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070814101817.685ba5b0@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> Message-ID: <46C1E5F7.40609@personnelware.com> Mike Meyer wrote: > On Mon, 13 Aug 2007 19:11:15 -0500 Carl Karsten wrote: >> Mike Meyer wrote: >>> While I think your order is a little exaggerated, I'll merely point >>> out that it's a common thing to see when you're writing code that >>> writes code.
SQL pretty much sucks for this, but Python isn't too bad - >>> and it's one of the most powerful programming techniques available - I >>> seem to use it in every other application. So I'd expect it to become >>> more common, not less. >> about a million to one seems realistic to me. > > In my experience, it's more like every other application that needs > this. > >> How often does an identifier come from an untrusted source? > > Um, how about in every web-based app that has a real search facility? > One that lets the user specify which column(s) they want to check, or > that can search multiple tables? I seem to be involved in working on > one of those every few years: an SGML document search engine, a user > database search engine, a webmail client, a workflow management > system, and a software change tracking system are what I can recall > now. Hmm, I think I see it. Even if you provide a list of valid identifiers to the browser, there is nothing to prevent that being replaced. Got the URL of one of these so I can examine it? Carl K From mwm-keyword-dbsig.588a7d at mired.org Tue Aug 14 20:10:46 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Tue, 14 Aug 2007 14:10:46 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <46C1E5F7.40609@personnelware.com> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> <46C1E5F7.40609@personnelware.com> Message-ID: <20070814141046.10aaf8ad@bhuda.mired.org> On Tue, 14 Aug 2007 12:27:19 -0500 Carl Karsten wrote: > >> How often does an identifier come from an untrusted source?
> > > > Um, how about in every web-based app that has a real search facility? > > One that lets the user specify which column(s) they want to check, or > > that can search multiple tables? I seem to be involved in working on > > one of those every few years: an SGML document search engine, a user > > database search engine, a webmail client, a workflow management > > system, and a software change tracking system are what I can recall > > now. > > hmm, I think I see it. Even if you provide a list of valid identifiers to the > browser, there is nothing to prevent that being replaced. Exactly. In this case, it's fairly straightforward to check that the identifier is valid, but that's not always been the case for me. > Got the URL of one of these so I can examine it? None of the ones I've worked on that are still up are accessible to the public. However, Bugzilla is a typical example of this type of interface (built on top of MySQL): https://bugzilla.mozilla.org/query.cgi Even better, the source is available. I haven't checked it to see if the HTTP query includes column names or not, though. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information.
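The "check that the identifier is valid" case mentioned above, where the application knows which table is being searched, can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard-library sqlite3 module, the bugs table and the search() helper are hypothetical names, and it is not how Bugzilla or any poster's application actually does it. The set of allowed columns is read from the DB-API cursor.description attribute, and the value still goes through ordinary parameter binding.

```python
import sqlite3

# Made-up table standing in for a searchable bug database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bugs (id INTEGER, summary TEXT, status TEXT)")
conn.execute("INSERT INTO bugs VALUES (1, 'crash on start', 'open')")

# Column names as the database reports them. The table name is fixed by
# the application, so nothing untrusted is in this statement.
allowed = {d[0] for d in conn.execute("SELECT * FROM bugs LIMIT 0").description}

def search(conn, column, value):
    """Hypothetical helper: search one user-chosen column for a value."""
    # Reject anything that is not a known column before it reaches the SQL text.
    if column not in allowed:
        raise ValueError("unknown column: %r" % (column,))
    # The column name is now known to be safe; the value is bound as usual.
    sql = 'SELECT id FROM bugs WHERE "%s" = ?' % column
    return conn.execute(sql, (value,)).fetchall()

rows = search(conn, "status", "open")
```

This only covers the situation where the application controls the set of names; it does nothing for the harder case raised in the thread, where the names themselves come from outside.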
From carsten at uniqsys.com Wed Aug 15 04:07:34 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Tue, 14 Aug 2007 22:07:34 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070814101817.685ba5b0@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> Message-ID: <1187143654.3906.36.camel@localhost.localdomain> On Tue, 2007-08-14 at 10:18 -0400, Mike Meyer wrote: > > How often does an identifier come from an untrusted source? > > Um, how about in every web-based app that has a real search facility? > One that lets the user specify which column(s) they want to check, or > that can search multiple tables? Even if you take an identifier directly from an untrusted source, nobody is forcing you to stick it into a query unchecked. Anyway, I don't doubt that you often need to put unchecked identifiers from an untrusted source into queries, but I think you're in a very small minority compared to the general population of database application developers. I don't think that the DB-API spec should be weighed down by requiring a feature of such little general use, but you're welcome to write a reusable toolkit module that lives outside of and on top of DB-API. Of course you'll need to code some per-database logic that defines whether the database accepts delimited identifiers and what the delimiter is, but you only need to do this once for every database you plan on supporting. Keep in mind that this is just my opinion, and I don't speak for the entire DB-SIG community. 
It's your right to post a proposal and ask for a vote. -- Carsten Haese http://informixdb.sourceforge.net From mwm-keyword-dbsig.588a7d at mired.org Wed Aug 15 06:04:37 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Wed, 15 Aug 2007 00:04:37 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <1187143654.3906.36.camel@localhost.localdomain> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> <1187143654.3906.36.camel@localhost.localdomain> Message-ID: <20070815000437.4baaf469@bhuda.mired.org> On Tue, 14 Aug 2007 22:07:34 -0400 Carsten Haese wrote: > weighed down by requiring a feature of such little general use, but > you're welcome to write a reusable toolkit module that lives outside of > and on top of DB-API. Of course you'll need to code some per-database > logic that defines whether the database accepts delimited identifiers > and what the delimiter is, but you only need to do this once for every > database you plan on supporting. I'm well aware of that - and it's the tack I've taken, because - well, there's no better choice. However, this is just another toolkit on top of dbapi, so like most such, it won't work with many of them, so it'll be of interest to an even smaller set of users than something in dbapi. 
For the ones it does work for, it'll almost certainly have bugs that the developers of dbapi modules - presumably much more familiar with the db's in question than I am - would have caught (just one of the reasons supporting the argument for having module authors do things rather than users: the former group is much more likely to get it right). If there's not enough interest in this problem to generate interest here, there's certainly not enough for me to spend time on getting my clients to let me release it. > Keep in mind that this is just my opinion, and I don't speak for the > entire DB-SIG community. It's your right to post a proposal and ask for > a vote. I posted a proposal. The total vote was -1.... http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. From mal at egenix.com Wed Aug 15 10:30:38 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 15 Aug 2007 10:30:38 +0200 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <20070815000437.4baaf469@bhuda.mired.org> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> <1187143654.3906.36.camel@localhost.localdomain> <20070815000437.4baaf469@bhuda.mired.org> Message-ID: <46C2B9AE.3000603@egenix.com> On 2007-08-15 06:04, Mike Meyer wrote: >> Keep in mind that this is just my opinion, and I don't speak for the >> entire DB-SIG community. It's your right to post a proposal and ask for >> a vote. > > I posted a proposal. The total vote was -1.... You can add another -1. 
We've been through these discussions many times before, so in theory you'd have to add all those votes as well :-) The whole purpose of having bind parameters is that the database can build a query/execution plan without having to know the values you intend to use in the query beforehand. This results in a massive speedup compared to having to reparse and build the query/execution plan for every single combination of parameter values. Another nice side-effect is the prevention of SQL injection attacks due to the fact that SQL commands and values are separated, much as code and data are separated by today's OSes for applications. -- Marc-Andre Lemburg eGenix.com
From aprotin at research.att.com Wed Aug 15 15:44:56 2007 From: aprotin at research.att.com (Art Protin) Date: Wed, 15 Aug 2007 09:44:56 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <1187143654.3906.36.camel@localhost.localdomain> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> <1187143654.3906.36.camel@localhost.localdomain> Message-ID: <46C30358.2070306@research.att.com> Dear folks, Carsten Haese wrote: >On Tue, 2007-08-14 at 10:18 -0400, Mike Meyer wrote: > > >>>How often does an identifier come from an untrusted source? >>> >>> >>Um, how about in every web-based app that has a real search facility? >>One that lets the user specify which column(s) they want to check, or >>that can search multiple tables? >> >> > >Even if you take an identifier directly from an untrusted source, nobody >is forcing you to stick it into a query unchecked. > > The better question is why is anybody letting him. It is the worst form of programming to use unchecked data. It is garbage like that that makes so many systems insecure and the Net into a mass of botnets. So is he arguing that he needs tools to check & validate the values before using them as table or column names? >Anyway, I don't doubt that you often need to put unchecked identifiers >from an untrusted source into queries, but I think you're in a very >small minority compared to the general population of database >application developers.
I don't think that the DB-API spec should be >weighed down by requiring a feature of such little general use, but >you're welcome to write a reusable toolkit module that lives outside of >and on top of DB-API. Of course you'll need to code some per-database >logic that defines whether the database accepts delimited identifiers >and what the delimiter is, but you only need to do this once for every >database you plan on supporting. > >Keep in mind that this is just my opinion, and I don't speak for the >entire DB-SIG community. It's your right to post a proposal and ask for >a vote. > > > Sorry, if I seem a little harsh, but this is apparently yet another of my hot buttons. Thank you all, Art Protin From mwm-keyword-dbsig.588a7d at mired.org Wed Aug 15 16:38:26 2007 From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer) Date: Wed, 15 Aug 2007 10:38:26 -0400 Subject: [DB-SIG] In praise of pyformat In-Reply-To: <46C2B9AE.3000603@egenix.com> References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> <1187143654.3906.36.camel@localhost.localdomain> <20070815000437.4baaf469@bhuda.mired.org> <46C2B9AE.3000603@egenix.com> Message-ID: <20070815103826.4494404e@bhuda.mired.org> On Wed, 15 Aug 2007 10:30:38 +0200 "M.-A. Lemburg" wrote: > On 2007-08-15 06:04, Mike Meyer wrote: > >> Keep in mind that this is just my opinion, and I don't speak for the > >> entire DB-SIG community.
It's your right to post a proposal and ask for > >> a vote. > > > > I posted a proposal. The total vote was -1.... > > You can add another -1. We've been through these discussions many > times before, so in theory you'd have to add all those votes > as well :-) Just to be clear, this proposal was for adding functionality to the dbapi to help people who need to build queries that include untrusted data that isn't values. It was unrelated to parameter binding. I didn't see any previous discussion of such. Can you provide pointer(s)? > Another nice side-effect is the prevention of SQL injection attacks > due to the fact that SQL commands and values are separated, much > like code and data is separated by today's OSes for applications. Never forget that one person's code is someone else's data. The Python interpreter is code to us and data to the CPU; our Python code is the interpreter's data; and our string data is SQL code. The problem I'm trying to deal with comes about when the code is built from data that's not necessarily trustworthy. The design goals espoused in the discussion of parameter binding - most notably "It's better for the module authors than users to write the code" and "don't build queries with string substitution" - would seem to indicate that this should be done by the db modules, but the dbapi has no tools for doing so. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information.
From mwm-keyword-dbsig.588a7d at mired.org Wed Aug 15 16:55:38 2007
From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer)
Date: Wed, 15 Aug 2007 10:55:38 -0400
Subject: [DB-SIG] In praise of pyformat
In-Reply-To: <46C30358.2070306@research.att.com>
References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> <1187143654.3906.36.camel@localhost.localdomain> <46C30358.2070306@research.att.com>
Message-ID: <20070815105538.6109e40e@bhuda.mired.org>

On Wed, 15 Aug 2007 09:44:56 -0400 Art Protin wrote:
> Carsten Haese wrote:
> >On Tue, 2007-08-14 at 10:18 -0400, Mike Meyer wrote:
> >>>How often does an identifier come from an untrusted source?
> >>Um, how about in every web-based app that has a real search facility?
> >>One that lets the user specify which column(s) they want to check, or
> >>that can search multiple tables?
> >Even if you take an identifier directly from an untrusted source, nobody
> >is forcing you to stick it into a query unchecked.
> The better question is why is anybody letting him.
> It is the worst form of programming to use unchecked data.
> So is he arguing that he needs tools to check & validate the values before
> using them as table or column names?

Not quite. I'm asking for a tool that will safely insert identifiers from an untrusted source into a query, much the same way that parameter binding lets me insert values from an untrusted source.

thanks,
http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.
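
One way to read the request above is as a helper that turns any string into a delimited identifier. The sketch below assumes the SQL-standard form (double quotes, with embedded double quotes doubled), which sqlite3 accepts. quote_identifier and search are hypothetical names, not part of DB-API or of any driver, and the items table is made up. MySQL in its default mode uses backticks instead, so a real toolkit would need the per-database logic mentioned earlier in the thread.

```python
import sqlite3


def quote_identifier(name):
    """Hypothetical helper: return name as an SQL-standard delimited identifier.

    Embedded double quotes are doubled, so the result is always parsed as a
    single identifier token. It does not check that the name exists.
    """
    if "\x00" in name:
        raise ValueError("NUL is not allowed in an identifier")
    return '"' + name.replace('"', '""') + '"'


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE items (title TEXT, body TEXT)")
cur.execute("INSERT INTO items VALUES (?, ?)", ("spam", "eggs"))


def search(column, value):
    # The identifier goes through quote_identifier; the value is still bound.
    sql = "SELECT title FROM items WHERE %s = ?" % quote_identifier(column)
    return cur.execute(sql, (value,)).fetchall()


rows = search("title", "spam")

# A column "name" that tries to smuggle in extra SQL. Depending on the SQLite
# build, the unknown name is either rejected or treated as a string literal;
# in both cases it is not executed as SQL.
hostile_column = 'title" = title OR "1'
try:
    hostile_rows = search(hostile_column, "x")
except sqlite3.OperationalError:
    hostile_rows = []

print(rows, hostile_rows)
```

This matches the "safely insert" reading rather than the "validate" one: the helper only guarantees that the string is parsed as an identifier, and leaves it to the database to complain if no such column exists.
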
From carl at personnelware.com Wed Aug 15 19:35:55 2007
From: carl at personnelware.com (Carl Karsten)
Date: Wed, 15 Aug 2007 12:35:55 -0500
Subject: [DB-SIG] In praise of pyformat
In-Reply-To: <20070815105538.6109e40e@bhuda.mired.org>
References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> <1187143654.3906.36.camel@localhost.localdomain> <46C30358.2070306@research.att.com> <20070815105538.6109e40e@bhuda.mired.org>
Message-ID: <46C3397B.9030509@personnelware.com>

Mike Meyer wrote:
> On Wed, 15 Aug 2007 09:44:56 -0400 Art Protin wrote:
>> Carsten Haese wrote:
>>> On Tue, 2007-08-14 at 10:18 -0400, Mike Meyer wrote:
>>>>> How often does an identifier come from an untrusted source?
>>>> Um, how about in every web-based app that has a real search facility?
>>>> One that lets the user specify which column(s) they want to check, or
>>>> that can search multiple tables?
>>> Even if you take an identifier directly from an untrusted source, nobody
>>> is forcing you to stick it into a query unchecked.
>
>> The better question is why is anybody letting him.
>> It is the worst form of programming to use unchecked data.
>> So is he arguing that he needs tools to check & validate the values before
>> using them as table or column names?
>
> Not quite. I'm asking for a tool that will safely insert identifiers
> from an untrusted source into a query, much the same way that
> parameter binding lets me insert values from an untrusted source.
>

I would like to point out a big difference between the two: parameters are a feature of the db engine's API that has to be dealt with in the Python dbapi module in order to be used. Validating identifier names does not require anything in dbapi. This distinction may be a reason against adding additional functionality into dbapi.

Carl K

From mwm-keyword-dbsig.588a7d at mired.org Wed Aug 15 21:52:03 2007
From: mwm-keyword-dbsig.588a7d at mired.org (Mike Meyer)
Date: Wed, 15 Aug 2007 15:52:03 -0400
Subject: [DB-SIG] In praise of pyformat
In-Reply-To: <46C3397B.9030509@personnelware.com>
References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> <1187143654.3906.36.camel@localhost.localdomain> <46C30358.2070306@research.att.com> <20070815105538.6109e40e@bhuda.mired.org> <46C3397B.9030509@personnelware.com>
Message-ID: <20070815155203.78b7ced7@bhuda.mired.org>

On Wed, 15 Aug 2007 12:35:55 -0500 Carl Karsten wrote:

> Mike Meyer wrote:
> > On Wed, 15 Aug 2007 09:44:56 -0400 Art Protin wrote:
> >> Carsten Haese wrote:
> >>> On Tue, 2007-08-14 at 10:18 -0400, Mike Meyer wrote:
> >>>>> How often does an identifier come from an untrusted source?
> >>>> Um, how about in every web-based app that has a real search facility?
> >>>> One that lets the user specify which column(s) they want to check, or
> >>>> that can search multiple tables?
> >>> Even if you take an identifier directly from an untrusted source, nobody
> >>> is forcing you to stick it into a query unchecked.
> >
> >> The better question is why is anybody letting him.
> >> It is the worst form of programming to use unchecked data.
> >> So is he arguing that he needs tools to check & validate the values before
> >> using them as table or column names?
> >
> > Not quite. I'm asking for a tool that will safely insert identifiers
> > from an untrusted source into a query, much the same way that
> > parameter binding lets me insert values from an untrusted source.
> >
>
> I would like to point out a big difference between the two: parameters are a
> feature of the db engine's API that has to be dealt with in the Python dbapi
> module in order to be used. Validating identifier names does not require
> anything in dbapi. This distinction may be a reason against adding additional
> functionality into dbapi.

Major nit: I didn't say "validate identifier names", I said "safely insert identifiers". To me (and my opinion is the one that matters, 'cause it's my statement :-) there's a big difference. You can "safely insert identifiers" by creating a delimited identifier that can be mapped back to the given string, no matter what's in it. There's no implication that the identifier is legal for the database at hand, just that it'll be parsed as an identifier by the underlying database (though modules should be free to do more). "Validate identifier names", on the other hand, implies that you'll tell me whether or not the identifier is a legal identifier in context - that it doesn't violate any constraints the underlying SQL engine has, and refers to an entity that actually exists, etc. If that's not what you meant, I'm sorry.

This matches my understanding - and experience - with parameter binding. If I pass in a parameter that's nominally of the correct type but violates a database constraint (either formally, or on the type of the column), I don't expect dbapi to "fix" it, I expect an exception representing an error from the underlying SQL engine.

And yes, dbapi has to have parameter binding to access that facility in some databases.
But it doesn't for others - because they don't have parameter binding. But the dbapi spec requires it anyway, and for good reason. I believe those same reasons apply in this case.

http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

From carl at personnelware.com Thu Aug 16 14:48:36 2007
From: carl at personnelware.com (Carl Karsten)
Date: Thu, 16 Aug 2007 07:48:36 -0500
Subject: [DB-SIG] In praise of pyformat
In-Reply-To: <20070815155203.78b7ced7@bhuda.mired.org>
References: <20070810161149.049743ea@bhuda.mired.org> <1186866634.3318.48.camel@localhost.localdomain> <20070811183327.0f4b16bc@bhuda.mired.org> <200708120230.45958.paul@boddie.org.uk> <20070812130717.0ac1987c@bhuda.mired.org> <1186952744.3092.13.camel@localhost.localdomain> <20070812181244.395232c7@bhuda.mired.org> <1186969893.3155.37.camel@localhost.localdomain> <20070813103749.09f6e19f@mbook.mired.org> <46C0F323.3090303@personnelware.com> <20070814101817.685ba5b0@bhuda.mired.org> <1187143654.3906.36.camel@localhost.localdomain> <46C30358.2070306@research.att.com> <20070815105538.6109e40e@bhuda.mired.org> <46C3397B.9030509@personnelware.com> <20070815155203.78b7ced7@bhuda.mired.org>
Message-ID: <46C447A4.4060005@personnelware.com>

Mike Meyer wrote:
> On Wed, 15 Aug 2007 12:35:55 -0500 Carl Karsten wrote:
>
>> Mike Meyer wrote:
>>> On Wed, 15 Aug 2007 09:44:56 -0400 Art Protin wrote:
>>>> Carsten Haese wrote:
>>>>> On Tue, 2007-08-14 at 10:18 -0400, Mike Meyer wrote:
>>>>>>> How often does an identifier come from an untrusted source?
>>>>>> Um, how about in every web-based app that has a real search facility?
>>>>>> One that lets the user specify which column(s) they want to check, or
>>>>>> that can search multiple tables?
>>>>> Even if you take an identifier directly from an untrusted source, nobody
>>>>> is forcing you to stick it into a query unchecked.
>>>> The better question is why is anybody letting him.
>>>> It is the worst form of programming to use unchecked data.
>>>> So is he arguing that he needs tools to check & validate the values before
>>>> using them as table or column names?
>>> Not quite. I'm asking for a tool that will safely insert identifiers
>>> from an untrusted source into a query, much the same way that
>>> parameter binding lets me insert values from an untrusted source.
>>>
>> I would like to point out a big difference between the two: parameters are a
>> feature of the db engine's API that has to be dealt with in the Python dbapi
>> module in order to be used. Validating identifier names does not require
>> anything in dbapi. This distinction may be a reason against adding additional
>> functionality into dbapi.
>
> Major nit: I didn't say "validate identifier names", I said "safely
> insert identifiers". To me (and my opinion is the one that matters,
> 'cause it's my statement :-) there's a big difference. You can "safely
> insert identifiers" by creating a delimited identifier that can be
> mapped back to the given string, no matter what's in it. There's no
> implication that the identifier is legal for the database at hand,
> just that it'll be parsed as an identifier by the underlying database
> (though modules should be free to do more). "Validate identifier
> names", on the other hand, implies that you'll tell me whether or not
> the identifier is a legal identifier in context - that it doesn't
> violate any constraints the underlying SQL engine has, and refers to
> an entity that actually exists, etc. If that's not what you meant, I'm
> sorry.
>
> This matches my understanding - and experience - with parameter
> binding. If I pass in a parameter that's nominally of the correct type
> but violates a database constraint (either formally, or on the type of
> the column), I don't expect dbapi to "fix" it, I expect an exception
> representing an error from the underlying SQL engine.
>
> And yes, dbapi has to have parameter binding to access that facility
> in some databases. But it doesn't for others - because they don't have
> parameter binding. But the dbapi spec requires it anyway, and for good
> reason. I believe those same reasons apply in this case.
>

Even with your correction, it is still something that is not a feature of any db engine's API that has to be dealt with in the Python dbapi module in order to be used.

And I don't see why it matters that some DBs don't support parameters.

Btw - what DBs don't support parameters?

Carl K

From jeff at taupro.com Thu Aug 16 16:23:18 2007
From: jeff at taupro.com (Jeff Rush)
Date: Thu, 16 Aug 2007 09:23:18 -0500
Subject: [DB-SIG] Seeking a Volunteer Speaker for a 5-Min Recorded Talk
Message-ID: <46C45DD6.3010808@taupro.com>

Greetings. As Python Advocacy Coordinator, I'm looking for someone who can record a 5-minute screencast about the DB-API interface. The audience is those who have not yet adopted Python or are just getting started with it.

It will be part of a series of talks, "5-Minutes with Python", intended to give ten 5-minute talks on diverse topics about the wonderful aspects of Python. In this case it would be to raise awareness in the IT world of the strength of Python in the database arena and its diversity of DB connectors.

Screencasting is a lot of fun and a valuable skill, in being able to explain a technical subject in a short time, and yet it lets you reshoot it to perfection, unlike a live talk. Talks can be either slideshows or demonstrations - for the DB-API probably a demonstration at the Python prompt would be best but we're open to ideas.

You can check out other Python screencasts, at:

    http://www.showmedo.com/videos/python

and the 5-minutes with Python series at:

    http://www.showmedo.com/videos/series?name=L3dNy3tjR

Producing a screencast just requires a microphone on your computer, and any of several desktop screen recording packages, most of which are free.
You set up your screen, talk and type. For 5 minutes you don't need video editing or special post-processing.

If you might be interested, on DB-API or any other topic, please let me know. It's also a way to get a bit of fame in the Python community, with the talks being hosted on www.python.org:

    http://www.python.org/doc/av/5minutes/

Thanks,

-Jeff

From info at egenix.com Wed Aug 22 14:59:22 2007
From: info at egenix.com (eGenix Team: M.-A. Lemburg)
Date: Wed, 22 Aug 2007 14:59:22 +0200
Subject: [DB-SIG] ANN: eGenix mxODBC Distribution 3.0.1 (mxODBC Database Interface)
Message-ID: <46CC332A.9000306@egenix.com>

________________________________________________________________________

ANNOUNCING

eGenix.com mxODBC Database Interface

Version 3.0.1

Our commercially supported Python extension providing ODBC database connectivity to Python applications on Windows and Unix platforms

This announcement is also available on our web-site for online reading:
http://www.egenix.com/company/news/eGenix-mxODBC-Distribution-3.0.1-GA.html

________________________________________________________________________

INTRODUCTION

The mxODBC Database Interface allows users to easily connect Python applications to just about any database on the market today - on both Windows and Unix platforms in a highly portable and convenient way. This makes mxODBC the ideal basis for writing cross-platform database programs and utilities in Python.

mxODBC is included in the eGenix.com mxODBC Distribution for Python, which is part of the eGenix.com mx Extension Series - a collection of professional quality software tools aimed at enhancing Python's usability in many important areas such as database connectivity, fast text processing, date/time processing and web site programming.

The package has proven its stability and usefulness in many mission critical applications and various commercial settings all around the world. It's been used in production for almost 10 years now.

* About Python:
Python is an object-oriented Open Source programming language which runs on all modern platforms (http://www.python.org/). By integrating ease-of-use, clarity in coding, enterprise application connectivity and rapid application design, Python establishes an ideal programming platform for today's IT challenges.

* About eGenix:
eGenix is a consulting and software product company focused on providing professional quality services and products to Python users and developers (http://www.egenix.com/).

________________________________________________________________________

NEWS

mxODBC 3.0.1 is a patch-level release and includes the following updates:

Enhanced SQL Server ODBC driver support:

* work-around for using Unicode parameters together with cursor.executedirect()
* bug-fix for cursor.stringformat setting when used with cursor.executedirect()
* improved compatibility by using non-padding character binding codes

Documentation:

* added more documentation and an example of how to use the new connection and cursor error handlers. Error handlers were introduced in mxODBC 3.0.0 and allow for much greater control over how low-level errors in the interface are to be dealt with.

For the full set of changes please check the mxODBC change log.

________________________________________________________________________

DOWNLOADS

The download archives and instructions for installing the package can be found at:

    http://www.egenix.com/products/python/mxODBC/

IMPORTANT: In order to use the eGenix mxODBC package you will first need to install the eGenix mx Base package:

    http://www.egenix.com/products/python/mxBase/

________________________________________________________________________

UPGRADING

You are encouraged to upgrade to this latest mxODBC release, especially if you are using MS SQL Server as database server.

Customers who have purchased mxODBC 3.0 licenses can download and install this patch-level release on top of their existing installations.
The licenses will continue to work with version 3.0.1.

Users of mxODBC 2.0 will have to purchase new licenses from our online shop in order to upgrade to mxODBC 3.0.1.

You can request 30-day evaluation licenses by writing to sales at egenix.com, stating your name (or the name of the company) and the number of eval licenses that you need. We will then issue you licenses and send them to you by email. Please make sure that you can receive ZIP file attachments at the email address you specify in the request, since the license files are sent out as ZIP attachments.

________________________________________________________________________

SUPPORT

Commercial support for these packages is available from eGenix.com. Please see

    http://www.egenix.com/services/support/

for details about our support offerings.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Aug 22 2007)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611