[DB-SIG] In praise of pyformat

Mon Aug 13 16:37:49 CEST 2007

On Sun, 12 Aug 2007 21:51:33 -0400 Carsten Haese <carsten at uniqsys.com> wrote:
> On Sun, 2007-08-12 at 18:12 -0400, Mike Meyer wrote:
> > You're right, in that the existing mechanisms *can* deal with the
> > issues. However, two of the points that comes up over and over again
> > here is "use parameters, don't build the query strings yourself" and
> > "we would rather the module authors do the work than the users". I'm
> > trying to figure out how *either* of those is miscible with "Just use
> > pythons string substitutions for table/column names", much less *both*
> > of them.
> You're right, not having a cross-database mechanism for building queries
> out of variable parts does not quite jibe with those two principles.
> Keep in mind, however, that of the two problems of filling variable
> values into a query and filling variable table/column names into a query
> are two very different problems, and in most applications, the first one
> is more common than the second one by a factor of about a million to one
> or so.

While I think your order is a little exaggerated, I'll merely point
out that it's a common thing to see when you're writing code that
writes code. SQL pretty much sucks for this, but Python isn't to bad -
and it's one of the most powerful programming technics available - I
seem to use it in every other application. So I'd expect it to become
more common, not less.

> Even if we find a solution that fits every use case, simply having a
> mechanism for plugging table and column names into a query is not enough
> for a general framework for writing cross-database applications. What
> about syntax differences in "limit" queries? What about different names
> for built-in functions, etc. Any solution to this problem will either be
> incomplete or such a behemoth that it will be next to impossible to
> implement a compliant API module.

You're right - we can't provide a perfect solution. Fortunately, we
don't have to, as the goal is progress, not perfection. That being the
case is why there was a dbapi, and a second version, and now a
proposal for a third version with progress along two lines:

1) requiring qmark. This will be easy for most modules, since an SQL
engine that supports the standard will already have it, so this is
making module authors paper over a hole in the underlying SQL engine
they support.

2) requiring named. This isn't a standard, but is in wide use because
it provides a desperately needed bit of functionality. So this is
making module authors paper over a hole in the SQL standards.

Note that these two solutions aren't complete abstractions by any
means; they depend on the vagaries of the SQL implementation and/or
the module authors. What they are is a portable way to access the
parameter mechanism of the underlying database, or an emulation of
that if it doesn't exist.

Adding a real tool to let code write queries - as opposed to abusing
parameter binding for the job - is another case of making module
authors paper over a hole in the SQL standards.

Since the examples of qmark and named show that we're not requiring
perfect portability, but merely progress, we can look at your question
(I've moved it) with that in mind.

> Let's say hypothetically we add something to the DB-API for plugging
> table names and column names into a query. Older versions of Informix
> restrict identifiers to 18 characters. What is the DB-API supposed to do
> if you try to plug in a table name that's longer? Truncate the name?
> Raise an exception? Something else?

Oracle has a similar - though not quite so unreasonable -
restriction. I believe it's still in place. All of the above seem to
be reasonable choices for this case. But what does DB-API say happens
if I use such an identifier in code that's written by hand instead of
by other code? I can't find any reference to this. Why should the
DB-API spec have to deal with what are essentially errors in the SQL
for SQL written by code differently than it does code written by hand?

> The sheer size of the problem of database abstraction and the fact that
> there is no one solution that fits all is the reason why there are many
> different solutions such as SQLObject, SQLAlchemy, Dabo, Django, etc
> already in the wild. It's also the reason why I doubt that DB-API is
> going to grow such a query construction toolkit layer any time soon.

I'm not asking for a complete abstraction, or any kind of
abstraction. I'm asking for a portable way to apply one of the most
powerful tools in the programmers toolbox to SQL. Sure, the resulting
SQL might not be portable - that can't be helped. But at least I don't
have to tweak the Python every time I change databases or modules,
just the SQL. That's progress - and that's the goal.

	<mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.