[Python-Dev] parser markers vs. conversion functions (unicode/string asymmetries)

Martin v. Loewis martin@v.loewis.de
Wed, 9 Jan 2002 11:17:30 +0100


> Why do you think that adding the conversion functions to getargs.c
> would be any different from adding new parser markers ? 

For two reasons:

- people who want portability across Python versions can better
  maintain their source code. They just need to provide a definition
  of the conversion function for older Python versions, which they
  can copy literally from the more recent version.
- the code becomes more readable, since function names are more
  self-documenting than single letter codes.

> As I understand "O&", it is meant for user-space conversion functions, 
> not system provided ones. 

It may have been originally defined for that purpose. I believe it
would be useful to provide a standard library of such functions.

> Unless, of course, you want to start shifting from parser markers to
> conversion functions completely (which I doubt).

I would, in fact, prefer that the set of conversion codes be frozen
and extended only for cases that are likely to get wide applicability.
I believe many of the codes invented for Unicode have never been used
in any module; some seem to have been invented just for an abstract
notion of "symmetry".

> Note that "O&" doesn't really buy you anything much: you could
> just as well use "O" and then switch on the returned object
> type or call a converter (with all the extra error handling
> or other extra information needed for your particular case).

People are apparently fond of a single function that simultaneously
checks the validity of all arguments. If it fails, it will completely
clean up.

That makes me wonder about the existing converters and their cleanup
capabilities: Suppose I do

  char *buffer = NULL;
  int i;
  /* "et" consumes an encoding name (NULL = default) and a char ** */
  if (!PyArg_ParseTuple(args, "eti", NULL, &buffer, &i))
    return NULL;

Now suppose I pass a Unicode object for the first argument, and a list
for the second. Is it true that this code will leak? The first
argument has already been converted by the time the second one leads
to an error, so the encoded string has already been produced and
nobody frees it.
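
To make the concern concrete, here is a toy model in plain C. It is not
the getargs.c code, and toy_arg and toy_parse_str_int are names made up
for this sketch. It only shows what "completely clean up" would have to
mean for a parser whose first conversion allocates memory and whose
second conversion can still fail:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Python argument: a string or an integer. */
typedef enum { ARG_STR, ARG_INT } arg_kind;

typedef struct {
    arg_kind kind;
    const char *s;  /* valid when kind == ARG_STR */
    int n;          /* valid when kind == ARG_INT */
} toy_arg;

/* Parse (string, int). The string is copied into a freshly allocated
   buffer, much as an encoding converter would produce one. On any
   failure the buffer is freed and *buffer is reset to NULL, so the
   caller owns nothing. Returns 1 on success, 0 on failure. */
static int toy_parse_str_int(const toy_arg *args, size_t nargs,
                             char **buffer, int *i)
{
    assert(buffer != NULL && i != NULL);
    *buffer = NULL;
    if (nargs != 2 || args[0].kind != ARG_STR)
        return 0;

    /* Step 1: convert the first argument; this allocates. */
    size_t len = strlen(args[0].s);
    *buffer = malloc(len + 1);
    if (*buffer == NULL)
        return 0;
    memcpy(*buffer, args[0].s, len + 1);

    /* Step 2: convert the second argument; this can still fail. */
    if (args[1].kind != ARG_INT) {
        free(*buffer);   /* undo step 1, otherwise the copy leaks */
        *buffer = NULL;
        return 0;
    }
    *i = args[1].n;
    return 1;
}
```

With a parser written this way, the caller frees the buffer only after
a successful parse. Without the free() in step 2, the failing call
would hand back nothing and still lose the copy made in step 1, which
is the situation I am asking about.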

> In the end, I don't believe we gain much from beefing up the
> "O&" interface. I'd rather like to see the Unicode parser
> markers extended to be more useful (I'll checkin a patch for
> "u#" later today).

How will that deal with string objects?

Regards,
Martin