[New-bugs-announce] [issue11204] re module: strange behaviour of space inside {m, n}

John Machin report at bugs.python.org
Sun Feb 13 00:19:58 CET 2011


New submission from John Machin <sjmachin at lexicon.net>:

A pattern like r"b{1,3}\Z" matches "b", "bb", and "bbb", as expected. There is no documentation of the behaviour of r"b{1, 3}\Z" -- it matches the LITERAL TEXT "b{1, 3}" in normal mode and "b{1,3}" in verbose mode.

# paste the following at the interactive prompt:
pat = r"b{1, 3}\Z"
bool(re.match(pat, "bb")) # False
bool(re.match(pat, "b{1, 3}")) # True
bool(re.match(pat, "bb", re.VERBOSE)) # False
bool(re.match(pat, "b{1, 3}", re.VERBOSE)) # False
bool(re.match(pat, "b{1,3}", re.VERBOSE)) # True

Suggested change, in decreasing order of preference:
(1) Ignore leading/trailing spaces when parsing the m and n components of {m,n}
(2) Raise an exception if the exact syntax is not followed
(3) Document the existing behaviour

Note: deliberately matching the literal text would be expected to be done by escaping the left brace:

pat2 = r"b\{1, 3}\Z"
bool(re.match(pat2, "b{1, 3}")) # True

and this is not prevented by the suggested changes.

----------
messages: 128472
nosy: sjmachin
priority: normal
severity: normal
status: open
title: re module: strange behaviour of space inside {m, n}
versions: Python 2.7, Python 3.1

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11204>
_______________________________________


More information about the New-bugs-announce mailing list