[Python-3000] Regular expressions, py3k and unicode

Terry Reedy tjreedy at udel.edu
Sun Jun 29 02:00:36 CEST 2008



Guido van Rossum wrote:
> On Sat, Jun 28, 2008 at 1:45 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> Several posters (including a certain GvR) in the bug tracker (*) have been
>> baffled by an apparent bug where the re.IGNORECASE flag didn't imply
>> case-insensitivity for non-ASCII characters. It turns out that, although the
>> pattern was a string object and although Py3k is supposed to be
>> unicode-friendly, you still need to supply the re.UNICODE flag if you want the
>> re module to use unicode-aware case-insensitive matching.
>>
>> Wouldn't it be more natural that, at least when the pattern is a str object
>> rather than a bytes object, re.UNICODE be implied by default?
> 
> +1

Would there be any reason (I do not know) to replace re.UNICODE with an
re.ASCII flag that has the reverse effect (assuming there is none now)?
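
To make the question concrete, here is a minimal sketch. It assumes a
Python 3 release in which str patterns are matched Unicode-aware by
default and an re.ASCII flag exists to switch that off; the variable
names are only illustrative.

```python
import re

# Assumption: str patterns are Unicode-aware by default, so IGNORECASE
# alone folds case for non-ASCII letters (U+00E9 vs. U+00C9).
unicode_match = re.match("\u00e9", "\u00c9", re.IGNORECASE)

# Assumption: re.ASCII restricts case folding to ASCII letters, so the
# same pattern no longer matches its upper-case non-ASCII counterpart.
ascii_match = re.match("\u00e9", "\u00c9", re.IGNORECASE | re.ASCII)

# Plain ASCII letters still match case-insensitively under re.ASCII.
ascii_letters = re.match("a", "A", re.IGNORECASE | re.ASCII)

print(unicode_match is not None, ascii_match is None, ascii_letters is not None)
```

With the flag reversed like this, code that wants the old ASCII-only
behaviour has to ask for it explicitly instead of getting it by default.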


