[issue2636] Adding a new regex module (compatible with re)
Matthew Barnett
report at bugs.python.org
Thu Sep 1 19:50:50 CEST 2011
Matthew Barnett <python at mrabarnett.plus.com> added the comment:
The regex module supports nested sets and set operations, e.g. r"[[a-z]--[aeiou]]" (the letters from 'a' to 'z', except the vowels). This means that a literal '[' inside a set needs to be escaped.
For example, the re module sees "[][()]..." as:
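To make the meaning of the set difference concrete, here is a minimal sketch using only the standard re module, which has no set operations. The negative lookahead is a stand-in chosen for illustration, not how the regex module implements "--", and the names consonant and consonants are made up for this example.

```python
import re

# Approximates r"[[a-z]--[aeiou]]" without set operations:
# the lookahead rejects vowels, then [a-z] consumes one lowercase letter.
consonant = re.compile(r'(?![aeiou])[a-z]')

consonants = consonant.findall('regex module')
print(consonants)
```

The space in the input is skipped because it is not in [a-z]; the vowels are skipped because the lookahead fails on them.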
[      start of set
]      literal ']'
[()    literals '[', '(', ')'
]      end of set
...    ...
but the regex module sees it as:
[      start of set
]      literal ']'
[()]   nested set [()]
...    ...
Thus:
>>> s = u'void foo ( type arg1 [, type arg2 ] )'
>>> regex.sub(r'(?<=[][()]) |(?!,) (?!\[,)(?=[][(),])', '', s)
u'void foo ( type arg1 [, type arg2 ] )'
>>> regex.sub(r'(?<=[]\[()]) |(?!,) (?!\[,)(?=[]\[(),])', '', s)
u'void foo(type arg1 [, type arg2])'
If the regex module can't parse a set as a nested set, it tries again as a non-nested set (as re does), but there are bound to be regexes that parse validly either way, with different meanings.
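As a rough check that escaping the '[' costs nothing on the re side, the sketch below runs both spellings of the pattern through the standard re module on Python 3. It does not exercise the regex module at all; the variable names are illustrative only.

```python
import re

s = 'void foo ( type arg1 [, type arg2 ] )'

# Escaped '[' inside each set: unambiguous whether or not nested sets are supported.
escaped = r'(?<=[]\[()]) |(?!,) (?!\[,)(?=[]\[(),])'
# Unescaped '[' inside each set: re reads it as a literal, as described above.
unescaped = r'(?<=[][()]) |(?!,) (?!\[,)(?=[][(),])'

result_escaped = re.sub(escaped, '', s)
result_unescaped = re.sub(unescaped, '', s)
print(result_escaped)
print(result_unescaped == result_escaped)
```

Under re the two patterns behave identically, so the escaped form is the safer one to write when a pattern may be run under either module.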
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue2636>
_______________________________________