Python re module and POSIX equivalence classes

Tim Chase python.list at tim.thechases.com
Mon Jun 1 16:29:30 EDT 2015


Is Python supposed to support POSIX "equivalence classes"?  I tried
the following in Py2 and Py3:

  >>> re.sub('[[=a=]]', 'A', 'aáàãâä', flags=re.U)
  'aáàãâä'

which suggests that it doesn't (I would have expected "AAAAAA" as the
result).
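
As far as I can tell, the reason nothing matches is that Python reads
the pattern as an ordinary character set containing "[", "=" and "a",
followed by a literal "]".  A minimal sketch to check that reading (the
test strings are made up for illustration, and the warning filter is
there only because newer Python 3 releases may emit a FutureWarning
about a possible nested set):

```python
import re
import warnings

# Newer Python 3 releases may warn about "[[" ("Possible nested set"),
# so silence FutureWarning while compiling the pattern.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", FutureWarning)
    pattern = re.compile('[[=a=]]')

# The set [[=a=] holds '[', '=' and 'a'; the trailing ']' is a literal.
# So the pattern matches the two-character strings 'a]', '=]' and '[]'.
parsed_as_set = pattern.sub('A', 'a] =] [] a')
print(parsed_as_set)    # A A A a

# None of those two-character sequences occur here, so nothing changes.
unchanged = pattern.sub('A', 'aáàãâä')
print(unchanged)        # aáàãâä
```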

Is there a way to get this behavior?
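
The closest approximation I can sketch uses Unicode decomposition
rather than the re module.  This is not real POSIX behavior:
equivalence classes are defined by the locale's collation rules, while
this only compares the first code point of each character's NFD form.
The helper name sub_equivalent is made up for illustration.

```python
import unicodedata

def sub_equivalent(base, repl, text):
    """Replace every character whose NFD form starts with `base`.

    Hypothetical helper: approximates [[=base=]] via Unicode
    decomposition, not via locale collation as POSIX specifies.
    """
    out = []
    for ch in text:
        # NFD splits a precomposed letter into base letter + combining marks.
        decomposed = unicodedata.normalize('NFD', ch)
        out.append(repl if decomposed[0] == base else ch)
    return ''.join(out)

result = sub_equivalent('a', 'A', 'aáàãâä')
print(result)    # AAAAAA
```

It assumes precomposed input.  If the text is already decomposed, the
combining marks are separate characters and are left in place after
the replaced base letter.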

I found that Perl recognizes the syntax but does not support it yet:
any attempt to use it raises an exception[1].  Supposedly GNU awk (and
other GNU POSIXish tools) recognize equivalence classes, as does vim.

Thanks,

-tkc



[1]
http://perldoc.perl.org/perlrecharclass.html







