[Python-ideas] Adding str.isascii() ?

Steven D'Aprano steve at pearwood.info
Fri Jan 26 20:27:18 EST 2018


On Fri, Jan 26, 2018 at 02:37:14PM +0100, Victor Stinner wrote:
> 2018-01-26 13:39 GMT+01:00 Steven D'Aprano <steve at pearwood.info>:
> > I have no objection to isascii, but I don't think it goes far enough.
> > Sometimes I want to know whether a string is compatible with Latin-1 or
> > UCS-2 as well as ASCII. For that, I used a function that exposes the
> > size of code points in bits:
> 
> Really? I never required such check in practice. Would you mind to
> elaborate your use case?

Tcl/Tk and JavaScript only support UCS-2 (16-bit) Unicode strings.
Dealing with the Supplementary Unicode Planes has the same problems
that older "narrow" builds of Python suffered from: a single code point
was counted as length 2 instead of 1, slicing could be wrong, etc.
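
To make the length mismatch concrete, here is a rough sketch. The helper
name utf16_units is made up for illustration; it only counts how many
16-bit units a string would occupy in a UCS-2/UTF-16 consumer, and is
not taken from Tcl/Tk, JavaScript, or CPython itself.

```python
def utf16_units(text: str) -> int:
    """Count the 16-bit code units needed to store text as UTF-16.

    Hypothetical helper: code points above U+FFFF need a surrogate
    pair (two units); everything else fits in a single unit.
    """
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text)


s = "a\U0001F600"        # 'a' plus one supplementary-plane code point
print(len(s))            # 2 code points on Python 3.3+
print(utf16_units(s))    # 3 units, which is what a 16-bit consumer sees
```

Any string where the two numbers differ is one where indexing and
slicing can disagree between Python and the 16-bit application.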

There are still many applications which assume Latin-1 data. For 
instance, I use a media player which displays mojibake when passed 
anything outside of Latin-1.

Sometimes it is useful to know in advance when text you pass to another 
application is going to run into problems because of the other 
application's limitations.
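
A minimal sketch of the kind of check I mean follows. The name
max_codepoint_bits and the returned widths (7, 8, 16, 21) are
assumptions made for illustration; this is not an existing str method
and not necessarily the function I referred to earlier.

```python
def max_codepoint_bits(text: str) -> int:
    """Return the smallest width class that can hold every code point.

    Hypothetical helper, widths chosen for illustration:
      7  -> pure ASCII
      8  -> fits in Latin-1
      16 -> fits in UCS-2 (Basic Multilingual Plane only)
      21 -> needs the full Unicode range
    """
    highest = max(map(ord, text), default=0)  # empty string counts as ASCII
    if highest < 0x80:
        return 7
    if highest < 0x100:
        return 8
    if highest < 0x10000:
        return 16
    return 21


def safe_for_ucs2(text: str) -> bool:
    """True if text can go to a UCS-2-only application unchanged."""
    return max_codepoint_bits(text) <= 16
```

With something like this, the caller can check max_codepoint_bits(s) <= 8
before handing s to a Latin-1-only application, and decide up front
whether to transliterate, escape, or refuse the text.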


-- 
Steve

