[issue28937] str.split(): allow removing empty strings (when sep is not None)

Andrei Kulakov report at bugs.python.org
Wed Jun 2 13:42:54 EDT 2021


Andrei Kulakov <andrei.avk at gmail.com> added the comment:

I'm not sure I understand why the discussion was focused on removing *all* empty values.

Consider this in the context of a csv-like string:

1. 'a,b,c' => ['a','b','c'] # of course
2. ',,'    => ['','','']    # follows naturally from above
3. ''      => []            # arguably most intuitive
4. ''      => ['']          # less intuitive but can be correct

From the point of view of the intent of the initial string, the first two
are clear: 3 values are provided; in 2) they just happen to be empty.
It's up to the later logic to skip empty values if needed.

The empty string is ambiguous because the intent may be no values or a single empty value.

So ideally the new API would let me choose explicitly between 3) and 4). But I don't see why it would affect 2) !!

The processing of 2) is already not ambiguous. That's what I would want any version of split() to do, and later filter or skip empty values.
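
To illustrate, here is a minimal sketch of the behaviour I have in mind. The helper split_fields() and its empty_as_no_values flag are hypothetical names made up for this example, not part of str or of the patch; only the empty-string case is configurable, and 2) is left alone:

```python
def split_fields(text, sep=",", empty_as_no_values=True):
    """Hypothetical helper: only the empty-string case is configurable."""
    if text == "" and empty_as_no_values:
        return []               # case 3): no values at all
    return text.split(sep)      # cases 1), 2) and 4): plain str.split()

print(split_fields("a,b,c"))                       # ['a', 'b', 'c']
print(split_fields(",,"))                          # ['', '', '']
print(split_fields(""))                            # []
print(split_fields("", empty_as_no_values=False))  # ['']
```

Any empty values in 2) can still be filtered out later by the caller.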

The current patch either forces me to accept 4), or lets me choose explicitly
but also breaks the normal, "correct" handling of 2).

It can lead to bugs as follows:

Let's say I have a csv-like string:

col1,col2,col3
1,2,3

a,b,c

I note that row 2 (the blank line) creates an empty col1 value, which is probably not what I want. I look at the split() args and think that keepempty=False is designed for this use case, so I use it in my code. The code will then break the next time someone adds a row like:

a,,c
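
A rough sketch of that failure follows. Since keepempty is only a proposed argument, split_drop_empty() is a hypothetical stand-in that I assume behaves like split(sep, keepempty=False), i.e. it drops every empty string:

```python
def split_drop_empty(line, sep=","):
    # Hypothetical stand-in for split(sep, keepempty=False):
    # assumed to discard every empty string in the result.
    return [field for field in line.split(sep) if field]

header = "col1,col2,col3".split(",")

blank = split_drop_empty("")       # [] : the blank line no longer yields an empty col1
good = split_drop_empty("a,b,c")   # ['a', 'b', 'c']
bad = split_drop_empty("a,,c")     # ['a', 'c'] : the empty col2 is silently lost

print(dict(zip(header, good)))  # {'col1': 'a', 'col2': 'b', 'col3': 'c'}
print(dict(zip(header, bad)))   # {'col1': 'a', 'col2': 'c'}
```

The blank line is handled as intended, but in the new row 'c' ends up under col2 and col3 disappears without any error.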

----------
nosy: +andrei.avk

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue28937>
_______________________________________

