Splitting a string

Sun Apr 4 05:58:25 EDT 2010

Steven D'Aprano wrote:

> On Sat, 03 Apr 2010 11:17:36 +0200, Peter Otten wrote:
> 
>>> That's certainly faster than a list comprehension (at least on long
>>> lists), but it might be a little obscure why the "if not s:" is needed,
>> 
>> The function is small; with a test suite covering the corner cases and
>> perhaps a comment* nothing should go wrong.
>> 
>> (*) you can certainly improve on my attempt
>> 
>>> so unless Thomas has a really long result list, he might want to just
>>> keep the list comprehension, which is (IMO) very readable.
>> 
>> Generally speaking performing tests of which you know they can't fail
>> can confuse the reader just as much as tests with unobvious
>> interdependencies.
> 
> 
> I'm not sure I agree with you.

I'm going to help you make up your mind ;)

> Tests which you know can't fail are called assertions, pre-conditions and
> post-conditions. We test them because if we don't, they will fail :)

Note that I said /can/, not /do/, confuse. Consider the actual context, which
is removing a special value from the start and/or end of a list:

(1)
if parts[0] == "":
    del parts[0]
    if not parts: return parts
if parts[-1] == "":
    del parts[-1]
return parts

(2)
return [item for item in parts if item != ""]

Now assume you have to familiarize yourself with the above code. Variant (1)
clearly expresses that the code is meant to touch only the first and last
item. For all the reader can tell, (2) might just be removing every empty
string from the list.

Another way to look at it: it is much easier to refactor from (1) to (2) 
than from (2) to (1).
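
To see the difference in behaviour, here is a minimal sketch that wraps both
variants in functions. The names strip_ends and drop_all_empty and the sample
string are made up for illustration, and I assume parts comes from something
like s.split("|"), so it is never empty on entry:

```python
def strip_ends(parts):
    """Variant (1): drop an empty string at the start and/or end only.

    Assumes parts is non-empty on entry, as the result of str.split()
    with an explicit separator always is.
    """
    parts = list(parts)  # work on a copy; the original snippet edits in place
    if parts[0] == "":
        del parts[0]
        if not parts:
            return parts
    if parts[-1] == "":
        del parts[-1]
    return parts


def drop_all_empty(parts):
    """Variant (2): drop every empty string, wherever it sits."""
    return [item for item in parts if item != ""]


parts = "|a||b|".split("|")   # ['', 'a', '', 'b', '']
print(strip_ends(parts))      # ['a', '', 'b']
print(drop_all_empty(parts))  # ['a', 'b']
```

The two only disagree when an empty string sits in the middle of the list,
which is exactly the case where the intent matters.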

As to assertions etc.: you don't perform arbitrary tests like assert 42 ==
42. Instead you estimate what can go wrong if you get unexpected input or
make an error in an algorithm that you cannot safely grasp in its
entirety. As an example you could add

assert "" not in parts[1:-1]

as a precondition to the above snippets. The superfluous tests in the list
comprehension are the opposite of that assertion: they could silently eat
empty strings in the middle of the list that are either intended to pass
through or that indicate an error in the code above.
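
A sketch of how that precondition might sit in front of variant (1). The
function name strip_ends_checked and the sample inputs are hypothetical. Note
also that Python skips assert statements when run with -O, so this is a
development aid rather than input validation:

```python
def strip_ends_checked(parts):
    """Variant (1) with the precondition spelled out as an assertion."""
    # Precondition: empty strings may only appear at the very start or end.
    assert "" not in parts[1:-1], "empty item inside %r" % (parts,)
    parts = list(parts)  # work on a copy
    if parts[0] == "":
        del parts[0]
        if not parts:
            return parts
    if parts[-1] == "":
        del parts[-1]
    return parts


print(strip_ends_checked("|a|b|".split("|")))  # ['a', 'b']

try:
    # The empty string between 'a' and 'b' violates the precondition.
    strip_ends_checked("|a||b|".split("|"))
except AssertionError as exc:
    print("caught: %s" % exc)
```

Here the unexpected inner "" is reported instead of being quietly removed.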

Personally, though, I prefer unit tests over assertions.
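
For instance, a small unittest sketch covering the corner cases of variant (1)
might look like this. The function name strip_ends, the test names and the
"|" separator are assumptions for illustration, not part of the original code:

```python
import unittest


def strip_ends(parts):
    # Variant (1) wrapped in a function; works on a copy of the list.
    parts = list(parts)
    if parts[0] == "":
        del parts[0]
        if not parts:
            return parts
    if parts[-1] == "":
        del parts[-1]
    return parts


class StripEndsTest(unittest.TestCase):
    def test_both_ends(self):
        self.assertEqual(strip_ends("|a|b|".split("|")), ["a", "b"])

    def test_no_empty_ends(self):
        self.assertEqual(strip_ends("a|b".split("|")), ["a", "b"])

    def test_inner_empty_item_is_kept(self):
        self.assertEqual(strip_ends("|a||b|".split("|")), ["a", "", "b"])

    def test_empty_string(self):
        # "".split("|") gives [""], which must collapse to [].
        self.assertEqual(strip_ends("".split("|")), [])

    def test_lone_separator(self):
        # "|".split("|") gives ["", ""], which must also collapse to [].
        self.assertEqual(strip_ends("|".split("|")), [])


# Run the tests directly instead of via unittest.main(), which would exit.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StripEndsTest)
result = unittest.TextTestRunner(verbosity=1).run(suite)
```

The empty-string and lone-separator cases are the ones that justify the
early "if not parts" return.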

Peter