[New-bugs-announce] [issue23692] Undocumented feature prevents re module from finding certain matches
Evgeny Kapun
report at bugs.python.org
Tue Mar 17 19:49:44 CET 2015
New submission from Evgeny Kapun:
This pattern matches:
re.match('(?:()|(?(1)()|z)){2}(?(2)a|z)', 'a')
But this doesn't:
re.match('(?:()|(?(1)()|z)){0,2}(?(2)a|z)', 'a')
The only difference is that {2} is replaced by {0,2}. Since {0,2} allows every repetition count that {2} allows, this change shouldn't prevent the pattern from matching anywhere it matched before.
The reason for this misbehavior is a feature designed to protect the re engine from infinite loops, which in fact sometimes prevents patterns from matching where they should. I think this feature should at least be properly documented. By "properly" I mean that it should be possible to reconstruct the exact behavior from the documentation, as the implementation is not particularly easy to understand.
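The two calls above can be wrapped in a small script to compare the patterns side by side. This is only a sketch: the names PATTERNS and check are made up for illustration, and no expected output is shown because the result may differ between Python versions (the report is against Python 3.4).

```python
import re

# The two patterns from the report; they differ only in the quantifier.
# PATTERNS and check() are hypothetical names used for this sketch.
PATTERNS = {
    "{2}": r'(?:()|(?(1)()|z)){2}(?(2)a|z)',
    "{0,2}": r'(?:()|(?(1)()|z)){0,2}(?(2)a|z)',
}


def check(subject="a"):
    """Return a dict mapping each quantifier label to True if re.match succeeds."""
    return {
        label: re.match(pattern, subject) is not None
        for label, pattern in PATTERNS.items()
    }


if __name__ == "__main__":
    # Print one line per pattern so the two results can be compared directly.
    for label, matched in check().items():
        print(label, "matches" if matched else "does not match")
```

If the two lines of output disagree, the interpreter shows the behavior described in this report.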
----------
components: Regular Expressions
messages: 238330
nosy: abacabadabacaba, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: Undocumented feature prevents re module from finding certain matches
type: behavior
versions: Python 3.4
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue23692>
_______________________________________