[Python-Dev] Unicode support in getargs.c

Martin v. Loewis martin@v.loewis.de
Wed, 2 Jan 2002 01:02:08 +0100


> True; "u#" does exactly the same as "s#" -- it interprets the
> input as binary buffer.

It doesn't do exactly the same. If s# is applied to a Unicode object,
it transparently encodes the object using the default encoding, which
is sensible.  If u# is applied to a byte string, it does not apply the
default encoding.

Instead, it interprets the string's bytes "as-is", as if they were
already a Unicode buffer. I cannot see an application where this is
useful, but I can see many applications where it is clearly wrong.
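
To make the difference concrete, here is a rough model in present-day
Python, not the C code in getargs.c. It assumes 2-byte little-endian
code units and ASCII as the default encoding; both function names are
hypothetical and only mimic the two behaviours.

```python
import struct


def reinterpret_as_is(data):
    """Model of u# on a byte string: no decoding, bytes become code units.

    Assumption: 2-byte little-endian code units; a trailing odd byte
    is ignored in this sketch.
    """
    n = len(data) // 2
    units = struct.unpack("<%dH" % n, data[:2 * n])
    return "".join(chr(u) for u in units)


def apply_default_encoding(data, encoding="ascii"):
    """Model of the symmetric behaviour: decode with the default encoding.

    Assumption: ASCII stands in for the default encoding.
    """
    return data.decode(encoding)


print(repr(apply_default_encoding(b"abcd")))  # 'abcd'
print(repr(reinterpret_as_is(b"abcd")))       # two unrelated CJK characters
```

Under these assumptions the four bytes of "abcd" turn into two
characters that have nothing to do with the original text.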

IMO, u# cannot and should not be symmetric to s#. Instead, it should
accept only Unicode objects, and raise a TypeError for everything else.
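
As a sketch of the rule I am proposing, again modelled in present-day
Python rather than in getargs.c: parse_u_hash is a hypothetical name,
and str stands in for the Unicode object type.

```python
def parse_u_hash(arg):
    """Hypothetical model of the proposed u# rule.

    Returns (text, length) for a Unicode object and rejects every
    other type instead of reinterpreting its buffer.
    """
    if not isinstance(arg, str):
        raise TypeError(
            "u# requires a Unicode object, not %s" % type(arg).__name__
        )
    return arg, len(arg)


print(parse_u_hash("abc"))  # ('abc', 3)

try:
    parse_u_hash(b"abc")
except TypeError as exc:
    print("rejected:", exc)
```

The point of the strict check is that a caller passing a byte string
gets an error right away, rather than silently corrupted text.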

Regards,
Martin