[New-bugs-announce] [issue41080] re.sub treats * incorrectly?

Ryan Westlund report at bugs.python.org
Mon Jun 22 13:28:11 EDT 2020


New submission from Ryan Westlund <rlwestlund at gmail.com>:

```
>>> re.sub('a*', '-', 'a')
'--'
>>> re.sub('a*', '-', 'aa')
'--'
>>> re.sub('a*', '-', 'aaa')
'--'
```

Shouldn't it return one dash, not two, since the greedy quantifier matches all the a's? I understand why substituting on 'b' returns '-b-', but here the whole string should constitute only one match. In Python 2.7, it behaves as I expect:

```
>>> re.sub('a*', '-', 'a')
'-'
>>> re.sub('a*', '-', 'aa')
'-'
>>> re.sub('a*', '-', 'aaa')
'-'
```
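
Listing the individual matches may help show where the second dash comes from. The sketch below assumes Python 3.7 or later and uses only `re.finditer` and `re.sub`. It suggests that the greedy match is followed by a separate empty match at the end of the string, and that each of the two matches gets its own replacement.

```python
import re

# Assumption: Python 3.7 or later, where an empty match adjacent to a
# previous non-empty match is also replaced by re.sub.
spans = [m.span() for m in re.finditer('a*', 'aaa')]
print(spans)  # the greedy match (0, 3), then an empty match (3, 3)

# Each match is replaced separately, hence two dashes.
star_result = re.sub('a*', '-', 'aaa')
print(star_result)

# Requiring at least one 'a' rules out the empty match.
plus_result = re.sub('a+', '-', 'aaa')
print(plus_result)
```

If an empty match is not wanted, `a+` in place of `a*` appears to be the simplest way to avoid it.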

The original case that led me to this was trying to normalize a path to end in exactly one slash. I used `re.sub('/*$', '/', path)`, but any path that already ended in one or more slashes came out ending in two.
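
For the path case, two possible workarounds are sketched below. The function names are hypothetical and not part of the report. The first avoids regular expressions entirely; the second keeps the original pattern but passes `count=1` so that only the first match is replaced.

```python
import re

def normalize_trailing_slash(path):
    # Hypothetical helper: strip every trailing slash, then add exactly one.
    return path.rstrip('/') + '/'

def normalize_trailing_slash_re(path):
    # Hypothetical helper: same pattern as in the report, but limited to a
    # single substitution so the trailing empty match is never replaced.
    return re.sub('/*$', '/', path, count=1)

for p in ['usr', 'usr/', 'usr///']:
    print(normalize_trailing_slash(p), normalize_trailing_slash_re(p))
```

Both variants assume the path contains no trailing newline, since `$` can also match just before one.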

----------
components: Regular Expressions
messages: 372104
nosy: Yujiri, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.sub treats * incorrectly?
type: behavior
versions: Python 3.10, Python 3.7, Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41080>
_______________________________________

