Python re module and POSIX equivalence classes

MRAB python at mrabarnett.plus.com
Mon Jun 1 19:47:11 EDT 2015


On 2015-06-01 23:48, Mark Lawrence wrote:
> On 01/06/2015 21:29, Tim Chase wrote:
>> Is Python supposed to support POSIX "equivalence classes"?  I tried
>> the following in Py2 and Py3:
>>
>>    >>> re.sub('[[=a=]]', 'A', 'aáàãâä', flags=re.U)
>>    'aáàãâä'
>>
>> which suggests that it doesn't (I would have expected "AAAAAA" as the
>> result).
>>
>> Is there a way to get this behavior?
>>
>> I found that perl recognizes them but raises an exception if you
>> use them, for now[1].  Supposedly GNU awk (and other GNU POSIXish
>> tools) recognize equivalence classes, as does vim.
>>
>> Thanks,
>>
>> -tkc
>>
>> [1]
>> http://perldoc.perl.org/perlrecharclass.html
>>
>
> I wouldn't know directly, as I tend to avoid them like the plague, but if
> not, are they in the "new" regex module?  See
> https://pypi.python.org/pypi/regex/2015.05.28 and/or
> http://bugs.python.org/issue2636
>
The regex module has POSIX character classes [[:alpha:]], but not
POSIX equivalence classes.
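
Since neither module supports [[=a=]], one possible workaround is to build
the character class by hand. What follows is only a rough sketch, not
something re or regex provides: the helper name equivalence_class and its
limit parameter are made up for illustration, and it approximates an
equivalence class by Unicode canonical decomposition (NFD) rather than by
the locale's collation rules that POSIX specifies, so the two may disagree
for some characters and locales.

```python
import re
import unicodedata


def equivalence_class(base, limit=0x250):
    """Return a regex character class matching `base` and its accented forms.

    Hypothetical helper: collects every code point below `limit` whose NFD
    decomposition starts with `base`. This approximates, but is not the same
    as, a locale-defined POSIX equivalence class.
    """
    members = [
        chr(cp)
        for cp in range(limit)
        if unicodedata.normalize('NFD', chr(cp))[0] == base
    ]
    # Escape each member so bases such as ']' or '^' stay literal.
    return '[' + ''.join(re.escape(m) for m in members) + ']'


pattern = re.compile(equivalence_class('a'))
result = pattern.sub('A', 'aáàãâä')
print(result)
```

This assumes the input uses precomposed characters; text that stores
accents as separate combining marks would need
unicodedata.normalize('NFC', text) first.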



