Behavior of re.split on empty strings is unexpected

John Nagle nagle at animats.com
Mon Aug 2 15:41:13 EDT 2010


On 8/2/2010 11:02 AM, MRAB wrote:
> John Nagle wrote:
>> The regular expression "split" behaves slightly differently than
>> string split:
>> occurrences of pattern", which is not too helpful.
>>
> It's the plain str.split() which is unusual in that:
>
> 1. it splits on sequences of whitespace instead of one per occurrence;

    That can be emulated with the obvious regular expression:

	re.compile(r'\s+')
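
    A small sketch of that emulation, assuming the whitespace class
\s+ is the intended pattern and the input has no leading or trailing
whitespace. The sample string is made up for illustration:

```python
import re

# Hypothetical sample input: runs of mixed whitespace between fields.
text = "a  b\t c"
ws = re.compile(r'\s+')

# Splitting on runs of whitespace matches the no-argument str.split()
# for this input, which has no leading or trailing whitespace.
print(ws.split(text))    # ['a', 'b', 'c']
print(text.split())      # ['a', 'b', 'c']

# With an explicit separator, str.split() splits once per occurrence.
print(text.split(' '))   # ['a', '', 'b\t', 'c']
```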

> 2. it discards leading and trailing sequences of whitespace.

    But that can't, or at least I can't figure out how to do it.
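
    For illustration, here is what the difference looks like, along
with two possible workarounds. This is only a sketch: the helper name
split_like_str is hypothetical, and it has been checked only against
ordinary ASCII whitespace.

```python
import re

ws = re.compile(r'\s+')
text = "  a  b  "

# str.split() with no argument drops the leading and trailing runs.
print(text.split())      # ['a', 'b']

# re.split() keeps an empty string on each side of the outer matches.
print(ws.split(text))    # ['', 'a', 'b', '']

def split_like_str(s):
    """Hypothetical helper: drop the empty pieces re.split() leaves."""
    return [piece for piece in ws.split(s) if piece]

print(split_like_str(text))        # ['a', 'b']
print(split_like_str(''))          # []

# Matching the fields instead of the separators gives the same result.
print(re.findall(r'\S+', text))    # ['a', 'b']
```

    Neither workaround is a single split pattern, so the point about
the no-argument form being a special case still stands.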

> It just happens that the unusual one is the most commonly used one, if
> you see what I mean! :-)

    The no-argument form of "split" shouldn't be that much of a special
case.

					John Nagle



