[Python-ideas] isascii()/islatin1()/isbmp()

Serhiy Storchaka storchaka at gmail.com
Sat Jun 30 18:03:10 CEST 2012


As shown in issue #15016 [1], there are use cases where it is useful to 
determine whether a string can be encoded in ASCII or Latin-1. When 
working with Tk or Windows console applications, it can also be useful 
to determine whether a string can be encoded in UCS-2. The C API 
provides an interface for this, but it is not available at the Python 
level.

I propose adding new methods to the str class: isascii(), islatin1() and 
isbmp() (alongside existing methods such as isalpha() or isdigit()). The 
implementation will be trivial.
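
To make the intended semantics concrete, here is a rough pure-Python 
sketch written as free functions. The helper _max_ord, the thresholds 
and the choice to return True for an empty string are my assumptions for 
illustration, not a settled specification. This version scans the whole 
string; the real methods would presumably rely on what the C API already 
exposes instead.

```python
def _max_ord(s):
    # Hypothetical helper: highest code point in s, or 0 for an empty string.
    return max(map(ord, s), default=0)


def isascii(s):
    # Assumed meaning: every code point is below U+0080.
    return _max_ord(s) < 0x80


def islatin1(s):
    # Assumed meaning: every code point is below U+0100.
    return _max_ord(s) < 0x100


def isbmp(s):
    # Assumed meaning: every code point is in the Basic Multilingual Plane.
    return _max_ord(s) < 0x10000
```

For example, islatin1("caf\xe9") would be true while isascii("caf\xe9") 
would be false.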

Pro: The current trick of trying to encode the string has O(n) 
complexity and incurs the overhead of raising and catching an exception.
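
For reference, the encode trick usually looks something like the sketch 
below. The function name can_encode is hypothetical; only str.encode() 
and UnicodeEncodeError come from the language itself.

```python
def can_encode(s, encoding):
    """Return True if s can be encoded with the given codec."""
    try:
        # Builds and throws away the full encoded bytes object: O(n) work.
        s.encode(encoding)
    except UnicodeEncodeError:
        # The failure path pays for raising and catching an exception.
        return False
    return True
```

With this, can_encode(s, "ascii") and can_encode(s, "latin-1") stand in 
for the proposed isascii() and islatin1().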

Contra: In most cases, after determining the character range, we still 
need to encode the string with the appropriate encoding. The new methods 
would also complicate the already overloaded str class.

Objections?

[1] http://bugs.python.org/issue15016



