[DB-SIG] pyformat Parameter Style

Fri May 16 14:46:05 EDT 2003

Summary: I think the DB-API implementations should either
stick to the SQL standard (? or :name) or stick to the
Python standard (all %-codes work). I think it's bad to
choose a third path (only %s works). Going from %s only
to full %-code support is trivial. I still think the
escape/quote thingie for strings should be done.

I guess Chris will read the rest as well. :)

At 19:13 2003-05-15 -0700, Chris Cogdon wrote:
>The interface writer is in the best position to know if the backend 
>requires special treatment. But... that doesn't stop them using the normal 
>python % operator if he knows it'll accept anything that python will spit 
>out, or making the appropriate adjustments if there are some exceptions.

Agreed. And I will claim until proven wrong :) that my three
line patch makes this possible for sqlite, and that it's more
or less as simple for other database drivers.

Obviously, a programmer can abuse %f etc in such a way that
it won't work, but I think that is the responsibility of the
programmer.

>>If I am to write an application for a certain RDBMS with the
>>DB-API I have to know how this deviates from the SQL standard
>>on issues like string concatenation, join syntax, what aggregate
>>functions it has, what datatypes I can use, to what extent it
>>supports things like sub-selects or updatable views etc, but I
>>shouldn't be burdened with knowing how to write floats so that
>>they are understood by the database? I don't buy that.
>
>That would imply that ALL SQL coding decisions are best left to the 
>application programmer, including how to quote and embed parameters.

No, I'm not saying that. I'm saying that the programmer doesn't
need to be "protected" from understanding how numbers work in SQL.

He doesn't need to be "protected" from knowing how escaping and
quoting works either, but having that done right automatically
is a convenience. Getting TypeError thrown for no good reason
when you use %f in the SQL is *not* a convenince, it's an
inconvenience.

>The existing system (IMHO) is a compromise between getting the application 
>programmer to do everything, and defining a 'SQL-neutral' language that 
>the interface programmer then has to turn into 'real' SQL for the backend. 
>Ie, the existing interface says "you need to know what kind of SQL will 
>work for your circumstance, but I'll handle the parameter passing for you".

My main problem is that using both the implementation and the
syntax of the %-operator does suggest the %d and %f etc will
work. I want to reduce the cognitive burden on the programmer
and let him enjoy a programming experience where there are as
few exceptions from the general rules as possible.

If we don't want the application programmer to get involved in
type formatting, we should not provide the cognitive cues that
this is possible. It would be much better to use ? or :name
as per the SQL standard. (There are four permitted binding styles
in SQL92, two (module and embedded) use :name syntax, and the other
two (direct and CLI) use ?.)

I'll get back to this in a separate mail.

>I STILL believe the better way to handle something like floats is to 
>create a fixed_point type.

I also want a fixed point type in Python. We agree again! :)

>Nothing stopping us writing our own types/classes. Many interfaces do this 
>already for date/time types.

But it's difficult to write database interfaces to support new
types or classes written by the application programmer...

I think it's a good thing that a new datetime type is added to
2.3, and I hope a good fixed point type will be in 2.4. Until
then, this will be a bit shaky.

>The interface DOES know about application context, based on what types 
>it's being passed. If you want to do something special, then you create 
>your own type like many DBIs currently do.

But the application programmer might have different ideas
about this than the DBI programmer. We can't expect the
DBI programmer to be a mind reader. A way to get around this
regardless of paramstyle would be to have some standard hook
for custom conversion, like pysqlite has:

     elif hasattr(value, '_quote'):
         return value._quote()

I guess a better name for a standard interface might be something
like 'db_format()' or '__db_str__()'.

>But, if we're allowing exceptions for %s (ie, it quotes for you based on 
>what the backend requires) then we're never going to achieve 
>orthogonality, nor transparency. My argument is that the parameter passing 
>mechanism was never INTENDED to be orthogonal to the python '%' operator, 
>but it was just merely a convenient, and available mechanism to use.

The exceptions are not for %s in the SQL, they are for
string objects in the parameter.

But I agree that format and pyformat was there to be convenient,
and I'd really like to see as few supported paramstyles as
possible. The best thing would be is all drivers would support
one of ? and :name so that we could write at least trivial SQL
statements in a portable way.

>The problem is... once you say you can use %i, %f and %s (and I've already 
>agreed that there are good reasons to allow the flexibility in %d), then 
>people are going to assume that you can use %r (which I've already argued 
>doesn't make ANY sense, since repr() does not return values that are 
>acceptable SQL), and %x (many DMBSes will only accept decimal integers).

In general I say: Make things consistent and transparent and
let the application programmer be in charge. Give him the right
tools, try to make life easy for him, but don't try to tie
him up in an effort to possibly avoid problems that we don't
know exist in the real world.

In this particular case I'm starting to lean towards suggesting
that format and pyformat should be deprecated, but if the
programmer can't understand when %r will give the wrong result
in SQL, he won't be able to use %r correctly in a print statement
to the display or to a file either.

>Well, there's another problem. In python %f will raise a TypeError if you 
>try to pass it None :)

Exatly. Just as with a print statement. You can only use %f
if you are certain that you will always feed this with a
float. If you can't handle that in the SQL, you can't handle
that in "print fmt % params". The programmer must understand this.

>As far as DBI-2.0 is concerned, we're a little stuck because of the 
>vagueness. I think it would be unwise to make many of the DBI's out there 
>'incorrect' by saying "oh, we really meant for you to support %f".  For 
>the moment, it might be wise to state that the only formatting code that 
>is predicated by the spec is %s, but some databases will allow you to use 
>%f as you might expect, as well as %i. The interpretation for %r I think 
>is even more varied.

They *are* incorrect since the DB-API 2.0 *is* stating that
they should all be supported. I'm not in favour of changing
the DB-API to restrict format and pyformat to %s. I understand
that people have interpreted the DB-API differently but it seems
to me that the text is actually not ambiguous.

>I argue that we really need to define the documentation better, specifying 
>a 'must support' behaviour that encompasses the existing interpretations 
>that %s should be able to handle all data types, plus the option to allow 
>other formatting options in a 'known' behaviour. The 2.0 spec be updated 
>to state that the only option that has 'guaranteed' support is %s, and 
>then we work on whether we want to add in things like %f and %i for a 
>later specification of the documentation.

I think it would be better if we could try to move away from
format and pyformat, and try to support one of the who
binding styles in the SQL standard.

--
Magnus Lycka (It's really Lyck&aring;), magnus at thinkware.se
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The shortest path from thought to working program