[Python-Dev] bytes / unicode
P.J. Eby
pje at telecommunity.com
Sun Jun 27 05:49:11 CEST 2010
At 12:43 PM 6/27/2010 +1000, Nick Coghlan wrote:
>While full support for third party strings and
>byte sequence implementations is an interesting idea, I think it's
>overkill for the specific problem of making it easier to write
>str/bytes agnostic functions for tasks like URL parsing.
OTOH, writing your partial implementation is almost as complex: it
still has to take joining and formatting into account, and by that
point you've effectively proposed a new protocol for coercion. So why
not just make the coercion protocol explicit in the first place,
rather than hardwiring a third type's worth of special cases?
Remember, bytes and strings already have to detect mixed-type
operations. If there were an API for that, the hardcoded special
cases could simply be replaced by type slot checks and calls, or
supplemented with them after the special cases have run.
To put it another way, if you already have two types special-casing
their interactions with each other, then rather than add a *third*
type to that mix, maybe it's time to have a protocol instead, so that
the types that care can do the special-casing themselves, and you
generalize to N user types.
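To make that concrete, here is a rough sketch of what such a protocol might look like. Everything in it is made up for illustration: the `__coerce_join__` hook, the `join` helper, and the `Tainted` type do not exist in Python, and a real design would live in type slots rather than in a wrapper function.

```python
class Tainted:
    """Hypothetical user-defined string-like type wrapping a str."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        return Tainted(self.text + _text(other))

    def __radd__(self, other):
        # Called for `"abc" + Tainted(...)` because str defers to us.
        return Tainted(_text(other) + self.text)

    # Hypothetical hook: Python has no such special method.
    def __coerce_join__(self, sep, parts):
        return Tainted(_text(sep).join(_text(p) for p in parts))


def _text(value):
    """Unwrap a Tainted to its underlying str; pass a str through."""
    return value.text if isinstance(value, Tainted) else value


def join(sep, parts):
    """Stand-in for str.join that consults the hypothetical hook first."""
    parts = list(parts)
    for candidate in [sep] + parts:
        hook = getattr(type(candidate), "__coerce_join__", None)
        if hook is not None:
            # The first participating type that cares decides the result.
            return hook(candidate, sep, parts)
    # No user type involved: fall back to the builtin behavior.
    return sep.join(parts)


mixed = join("/", ["a", Tainted("b"), "c"])   # a Tainted
plain = join("/", ["a", "b", "c"])            # an ordinary str
```

The point of the sketch is only that the type that cares does the special-casing itself; the builtin path stays untouched when no user type participates.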
(Btw, those who are saying that the resulting potential for N*N
interaction makes the feature unworkable seem to be overlooking
metaclasses and custom numeric types -- two Python features that in
principle have the exact same problem, when you use them beyond a
certain scope. At least with those features, though, you can
generally mix your user-defined metaclasses or numeric types with the
Python-supplied basic ones and call arbitrary Python functions on
them, without as much heartbreak as you'll get with a from-scratch
stringlike object.)
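A small illustration of the difference, using two hypothetical user types (`Meters` and `Text` are invented for this example). The numeric type mixes with int and survives a trip through the builtin `sum()`, while the from-scratch stringlike object is rejected by `str.join`:

```python
class Meters:
    """Hypothetical user-defined numeric type."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        # Let the other operand try its own reflected method.
        return NotImplemented

    __radd__ = __add__


class Text:
    """Hypothetical from-scratch string-like type (not a str subclass)."""

    def __init__(self, text):
        self.text = text

    def __radd__(self, other):
        return Text(other + self.text)


# sum() starts from the int 0; int defers to Meters.__radd__.
total = sum([Meters(1), Meters(2)])

# Concatenation works, since str defers to Text.__radd__ ...
greeting = "hello, " + Text("world")

# ... but joining does not: str.join only accepts real str instances.
try:
    ", ".join([Text("a"), Text("b")])
    join_failed = False
except TypeError:
    join_failed = True
```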
All that having been said, a new protocol probably falls under the
heading of the language moratorium, unless it can be considered "new
methods on builtins"? (But that seems like a stretch even to me.)
I just hate the idea that functions taking strings should have to be
*rewritten* to be explicitly type-agnostic. It seems *so*
un-Pythonic... like if all the bitmasking functions you'd ever
written using 32-bit int constants had to be rewritten just because
we added longs to the language, and you had to upcast them to be
compatible or something. Sounds too much like C or Java or some
other non-Python language, where dynamism and polymorphism are the
special case, instead of the general rule.
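For comparison, a sketch of both situations (the function names `has_flag` and `strip_scheme` are invented for the example). The bitmasking function needs no changes to handle values wider than 32 bits, whereas a function written against str literals breaks as soon as it is handed bytes:

```python
HIGH_BIT = 0x80000000  # a 32-bit constant


def has_flag(value, flag):
    """Bitmask test written with 32-bit values in mind."""
    return (value & flag) == flag


def strip_scheme(url):
    """URL helper written with str literals in mind."""
    return url.split("://", 1)[-1]


# Works unchanged for values well beyond 32 bits.
wide = (1 << 40) | HIGH_BIT
wide_ok = has_flag(wide, HIGH_BIT)

# Works for str ...
host = strip_scheme("http://example.com/a")

# ... but the same function raises TypeError when given bytes.
try:
    strip_scheme(b"http://example.com/a")
    bytes_failed = False
except TypeError:
    bytes_failed = True
```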