[issue17668] re.split loses characters matching ungrouped parts of a pattern

Tomasz J. Kotarba report at bugs.python.org
Mon Apr 8 21:20:16 CEST 2013


Tomasz J. Kotarba added the comment:

Hi Matthew,

Thanks for such a quick reply.  I know I can get the > by putting it in grouping parentheses.  That's not the issue here.  The documentation you quoted says that it splits the string by the occurrences _OF_PATTERN_ and that the texts of all groups are _ALSO_ returned as _PART_ of the resulting list.  It does not say anywhere (nor does it even suggest) that parts of the pattern not grouped with parentheses are REMOVED.
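To make the behaviour under discussion concrete, here is a minimal sketch.  The sample strings and patterns are made up for illustration and are not the ones from the original report; only re.split() itself comes from the standard library.

```python
import re

# Illustrative inputs only; not the strings from the original report.

# Ungrouped delimiter: the matched ">" is dropped from the result.
ungrouped = re.split(r">", "a>b>c")

# Grouped delimiter: the matched ">" is returned as its own list item.
grouped = re.split(r"(>)", "a>b>c")

# Partially grouped delimiter: only the grouped ">" comes back;
# the ungrouped "-" that also matched is lost.
partial = re.split(r"-(>)", "a->b->c")

print(ungrouped)  # ['a', 'b', 'c']
print(grouped)    # ['a', '>', 'b', '>', 'c']
print(partial)    # ['a', '>', 'b', '>', 'c']
```

The last case is the one at issue: the "-" matched the pattern but appears nowhere in the output.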

That said, I did not report this issue to split hairs (I would rather split strings with regular expressions ;)) or to perform a linguistic analysis of the current documentation (which is not set in stone and has been changed before).  I reported it because I spotted an issue which slightly limits the usefulness of re.split(), and I suggested a potential improvement which would solve the problem and make re.split() even better than it already is.  Whether the powers that be do something with this and improve re.split() is of course not my decision.

Cheers,
T

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17668>
_______________________________________

