[Python-Dev] _PyUnicode_CheckConsistency() too strict?

Guido van Rossum guido at python.org
Mon Feb 3 18:52:00 CET 2014


Can we provide a convenience API (or even a few lines of code one could
copy+paste) that determines if a particular 8-bit string should have
max-char equal to 127 or 255? I can easily imagine a number of use cases
where this would come in handy (e.g. a list of strings produced by
translation, or strings returned in Latin-1 by some other non-Python
C-level API) -- and let's not get into a debate about whether UTF-8
wouldn't be better; I can also easily imagine legacy APIs where that isn't
(yet) an option.
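
To make the request concrete, here is a rough sketch of the kind of
copy+paste helper I have in mind. The name latin1_max_char is made up for
illustration and is not part of the CPython API; it only assumes the input
is a buffer of Latin-1 bytes with a known length.

```c
#include <stddef.h>

/* Hypothetical helper, not an existing CPython function.
   Returns 127 if every byte in buf[0..len) is ASCII (high bit clear),
   otherwise 255. An empty buffer counts as ASCII. */
unsigned int
latin1_max_char(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] & 0x80) {
            /* Found a character in the range 128-255. */
            return 255;
        }
    }
    return 127;
}
```

The result could then presumably be passed as the maxchar argument to
PyUnicode_New(size, maxchar) before copying the bytes into the new object,
so the string ends up with the max-char that the consistency check expects.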


On Mon, Feb 3, 2014 at 9:35 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:

> On Mon, 03 Feb 2014 16:10:03 +0000
> Phil Thompson <phil at riverbankcomputing.com> wrote:
> >
> > Why is a Latin-1 string considered inconsistent just because it doesn't
> > happen to contain any characters in the range 128-255?
>
> Because as Victor said, it allows for some optimization shortcuts (e.g.
> a non-ASCII latin1 string cannot be equal to an ASCII string - no need
> for a memcmp).
>
> Regards
>
> Antoine.



-- 
--Guido van Rossum (python.org/~guido)