[issue35859] Capture behavior depends on the order of an alternation
Ma Lin
report at bugs.python.org
Mon Mar 4 04:59:51 EST 2019
Ma Lin <malincns at 163.com> added the comment:
Found another bug in re:
>>> re.match(r'(?:.*?\b(?=(\t)|(x))x)*', 'a\txa\tx').groups()
('\t', 'x')
Expected result: (None, 'x')
PHP 7.3.2        NULL, "x"
Java 11.0.2      "\t", "x"
Perl 5.28.1      "\t", "x"
Ruby 2.6.1       nil, "x"
Go 1.12          doesn't support lookaround
Rust 1.32.0      doesn't support lookaround
Node.js 10.15.1  undefined, "x"
regex 2019.2.21  None, "x"
re               "\t", "x"
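For anyone who wants to check their own interpreter, the report above can be wrapped in a small script. This is only a sketch: the helper name check_capture and the "leaked" flag are made up for illustration, and the result for group 1 depends on the version of re being run.

```python
import re

# Pattern and subject string taken verbatim from the report above.
PATTERN = r'(?:.*?\b(?=(\t)|(x))x)*'
SUBJECT = 'a\txa\tx'


def check_capture(pattern=PATTERN, subject=SUBJECT):
    """Return (groups, leaked); check_capture is a hypothetical helper.

    leaked is True when group 1 still holds a value captured inside the
    lookahead on a path that was later backtracked.
    """
    groups = re.match(pattern, subject).groups()
    leaked = groups[0] is not None
    return groups, leaked


groups, leaked = check_capture()
print(groups, "stale capture in group 1" if leaked else "expected result")
```

Group 2 should be 'x' either way; only group 1 differs between the reported and the expected behavior.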
This is a very rare bug. It can be fixed by adding MARK_PUSH() before JUMP_MIN_REPEAT_ONE, and maybe other JUMPs should MARK_PUSH() as well.
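To show why saving the marks before a jump matters, here is a toy model in Python. It is not the sre code: run_branch, failing_branch, succeeding_branch and simulate are hypothetical names, and the snapshot/restore of the list is only loosely analogous to what MARK_PUSH() enables in the C engine.

```python
def run_branch(marks, branch, save_marks):
    """Try one alternative, optionally snapshotting the capture marks."""
    # Snapshot before the attempt (loosely analogous to MARK_PUSH).
    saved = list(marks) if save_marks else None
    if branch(marks):
        return True
    if save_marks:
        # Restore the snapshot so a failed attempt leaves no capture behind.
        marks[:] = saved
    return False


def failing_branch(marks):
    marks[0] = '\t'   # the lookahead captures group 1 ...
    return False      # ... but the literal 'x' after it does not match


def succeeding_branch(marks):
    marks[1] = 'x'    # the lookahead captures group 2 and 'x' matches
    return True


def simulate(save_marks):
    marks = [None, None]
    if not run_branch(marks, failing_branch, save_marks):
        run_branch(marks, succeeding_branch, save_marks)
    return tuple(marks)


print(simulate(save_marks=False))  # ('\t', 'x'), the stale capture survives
print(simulate(save_marks=True))   # (None, 'x'), the capture is rolled back
```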
I'm impressed with the regex module; it never went wrong.
IMHO, I would like to see a pruned version of it adopted into the stdlib.
~~~~~~~~~~~~~~~~~~~~~~
> Interesting sidelights 1
> Found a Perl bug
I reported it to Perl: it is a bug in perl-5.26, and it was already fixed in perl-5.28.0.
----------
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35859>
_______________________________________