[docs] [issue24896] It is undocumented that re.UNICODE and re.LOCALE affect re.IGNORECASE

Serhiy Storchaka report at bugs.python.org
Wed May 24 01:51:07 EDT 2017


Serhiy Storchaka added the comment:

Actually, the locale does affect case-insensitive matching when the re.LOCALE flag is used. The set of characters that b'[A-Z]' then matches is locale-dependent. For example, in a Turkish locale it can include the letters 'İ' and 'ı'. Only 8-bit locales are supported, not UTF-8 locales.
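
A minimal sketch of how the locale dependence could be checked. The helper name locale_matches and the locale name 'tr_TR.ISO8859-9' are assumptions made for illustration; locale names are platform-specific and the Turkish locale may not be installed, so no result is claimed for that case.

```python
import locale
import re

def locale_matches(byte, locale_name=None):
    """Return True if b'[A-Z]' fully matches `byte` with IGNORECASE | LOCALE.

    `locale_name` is a hypothetical, platform-specific locale name;
    None keeps whatever LC_CTYPE locale is currently active.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        if locale_name is not None:
            # Raises locale.Error if the locale is not available.
            locale.setlocale(locale.LC_CTYPE, locale_name)
        pattern = re.compile(b'[A-Z]', re.IGNORECASE | re.LOCALE)
        return pattern.fullmatch(byte) is not None
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # always restore

print(locale_matches(b'a'))  # an ASCII letter, matched via its uppercase form
try:
    # 0xDD encodes 'İ' in ISO-8859-9 (the assumed 8-bit Turkish encoding).
    print(locale_matches(b'\xdd', 'tr_TR.ISO8859-9'))
except locale.Error:
    print('Turkish 8-bit locale not available on this system')
```

Note that re.LOCALE is only accepted with bytes patterns in Python 3.6 and later, which is why the pattern and the subject are both bytes here.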

In Unicode case-insensitive mode the expression '[A-Z]' matches not only the Latin uppercase and lowercase letters A-Z and a-z, but also the characters 'İ' (U+0130), 'ı' (U+0131), 'ſ' (U+017F), and 'K' (U+212A, the Kelvin sign).
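
The Unicode behaviour can be checked with a short sketch like the following. The names EXTRA and matched_extras are introduced here for illustration only. The characters are written as escapes because the Kelvin sign is visually indistinguishable from the ASCII letter K.

```python
import re

# The four non-ASCII characters named above: dotted capital I, dotless
# small i, long s, and the Kelvin sign.
EXTRA = ['\u0130', '\u0131', '\u017f', '\u212a']

def matched_extras(flags):
    """Return the characters from EXTRA that fully match '[A-Z]' under `flags`."""
    pattern = re.compile('[A-Z]', flags)  # str patterns are Unicode by default
    return [ch for ch in EXTRA if pattern.fullmatch(ch)]

# Unicode case-insensitive mode: all four characters are matched.
print([f'U+{ord(ch):04X}' for ch in matched_extras(re.IGNORECASE)])

# With re.ASCII added, only a-z and A-Z are matched, so the list is empty.
print(matched_extras(re.IGNORECASE | re.ASCII))
```

Adding re.ASCII is one way to restrict a str pattern to the 52 ASCII letters when the extra matches are not wanted.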

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24896>
_______________________________________

