[New-bugs-announce] [issue23692] Undocumented feature prevents re module from finding certain matches

Evgeny Kapun report at bugs.python.org
Tue Mar 17 19:49:44 CET 2015


New submission from Evgeny Kapun:

This pattern matches:

    re.match('(?:()|(?(1)()|z)){2}(?(2)a|z)', 'a')

But this doesn't:

    re.match('(?:()|(?(1)()|z)){0,2}(?(2)a|z)', 'a')

The only difference is that {2} is replaced by {0,2}. Because {0,2} permits every repetition count that {2} permits, this change shouldn't prevent the pattern from matching anywhere it matched before.
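The comparison can be reproduced with a small script. This is only a sketch: the names FIXED, RANGED and compare are made up for illustration, and the results it prints depend on the interpreter version (the behavior described here was observed on Python 3.4), so no particular output is assumed.

```python
import re

# The two patterns from the report; they differ only in the quantifier.
# FIXED, RANGED and compare() are illustrative names, not part of re.
FIXED = r'(?:()|(?(1)()|z)){2}(?(2)a|z)'
RANGED = r'(?:()|(?(1)()|z)){0,2}(?(2)a|z)'


def compare(subject='a'):
    """Return whether each pattern matches `subject` on this interpreter."""
    return {
        '{2}': re.match(FIXED, subject) is not None,
        '{0,2}': re.match(RANGED, subject) is not None,
    }


if __name__ == '__main__':
    for quantifier, matched in compare().items():
        print(quantifier, 'matches' if matched else 'does not match')
```

If the two lines of output disagree, the interpreter shows the behavior reported here.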

The cause of this misbehavior is a feature designed to protect the re engine from infinite loops; in some cases it also prevents patterns from matching where they should. I think this feature should at least be properly documented. By "properly" I mean that it should be possible to reconstruct the exact behavior from the documentation, since the implementation is not particularly easy to understand.

----------
components: Regular Expressions
messages: 238330
nosy: abacabadabacaba, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: Undocumented feature prevents re module from finding certain matches
type: behavior
versions: Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue23692>
_______________________________________

