[Python-Dev] bytes type discussion

Stephen J. Turnbull stephen at xemacs.org
Fri Feb 17 07:11:12 CET 2006


>>>>> "Guido" == Guido van Rossum <guido at python.org> writes:

    Guido> I think that the implementation of encoding-guessing or
    Guido> auto-encoding-upgrade techniques should be left out of the
    Guido> standard library design for now.

As far as I can see, little new design is needed.  There's no reason
why an encoding-guesser couldn't be written as a codec that detects
the coding, then dispatches to the appropriate codec.  The only real
issue I know of is that if you ask such a codec "who are you?", there
are two plausible answers: "autoguess" and the codec actually being
used to translate the stream.  If the existing API cannot report both
of those, it may need to be generalized.
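
To make the "two answers" point concrete, here is a minimal sketch in
present-day Python 3.  It is not a real codec registration: the names
guess_decode and _BOMS, the BOM-only detection, and the latin-1 fallback
are all assumptions made for illustration.  The guesser detects the
coding, dispatches to the ordinary codec, and reports both names.

```python
import codecs

# Hypothetical detection table (an assumption for this sketch).
# UTF-32 is checked before UTF-16 because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_decode(data, fallback="latin-1"):
    """Decode data, returning (text, requested_name, actual_name).

    requested_name is always "autoguess"; actual_name is the codec
    that really translated the bytes.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            # Strip the BOM and dispatch to the detected codec.
            text = codecs.decode(data[len(bom):], name)
            return text, "autoguess", name
    # Nothing detected: fall back to an assumed default coding.
    return codecs.decode(data, fallback), "autoguess", fallback
```

A caller asking "who are you?" can then read either the second or the
third element, depending on which answer it wants.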

    Guido> As far as searching bytes objects, that shouldn't be a
    Guido> problem as long as the search 'string' is also specified as
    Guido> a bytes object.

You do need to be a little careful in implementation, as (for example)
"case insensitive" should be meaningless for searching bytes objects:
case is a property of characters, not of the bytes that encode them.
This would be especially important if searching and collation become
more Unicode conformant.
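
A short sketch of the distinction, written against present-day Python 3
rather than anything proposed in this thread.  The helper names
find_bytes and contains_caseless are hypothetical.  The bytes search
compares bytes exactly; caseless matching only happens after the caller
decodes with a known coding.

```python
def find_bytes(haystack: bytes, needle: bytes) -> int:
    """Exact byte-for-byte search; no notion of case is involved."""
    return haystack.find(needle)


def contains_caseless(haystack: bytes, needle: str, encoding: str) -> bool:
    """Caseless match, defined only once the bytes are decoded to text."""
    text = haystack.decode(encoding)
    # str.casefold() applies Unicode case folding (e.g. the German
    # sharp s folds to "ss"), which has no meaning for raw bytes.
    return needle.casefold() in text.casefold()


data = "Stra\u00dfe STRASSE".encode("utf-8")
```

Here find_bytes(data, b"strasse") finds nothing, because no run of bytes
in data equals that needle, while contains_caseless(data, "strasse",
"utf-8") succeeds on the decoded text.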

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

