Deleting the first element of a list

Alex Martelli aleax at aleax.it
Thu Oct 3 08:42:38 EDT 2002


Peter Hansen wrote:
        ...
>> "del alist[0]" must copy N-1 references; "alist = alist[1:]" must
>> perform the same copies plus one allocation and one freeing.  So,
>> the former's performance "can't" (with reasonable certainty) be
>> worse than the latter's, but you can't predict how much better --
>> surely not across platforms.  Still, if you must choose between
>> the two ways of expressing functionality that is equivalent for
>> your application, the former is clearly better.
> 
> Taking note again of Timothy Delaney's point that in fact the
> functionality is *not* equivalent: del alist[0] will throw
> an exception if the list is empty, while slicing will not.

That's why I qualified my last sentence: when the functionality
is NOT equivalent for your application, then you must let that
lead your choice.  Besides empty lists, there's also the case
of multiple references to the same list: if names a and b refer
to the same list object, del a[0] affects that one object, while
rebinding a = a[1:] doesn't.  When such issues matter, you must
choose the expression that does whatever is right for your app.
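To make the two differences concrete (a minimal sketch; the variable
names are mine, not from the discussion above):

```python
# Aliasing: del mutates the one shared list object ...
a = b = [1, 2, 3]      # two names bound to the SAME list object
del a[0]               # mutates that object in place
assert b == [2, 3]     # ... so b sees the change too

# ... while slicing-and-rebinding leaves the original untouched.
a = b = [1, 2, 3]
a = a[1:]              # builds a NEW list and rebinds only the name a
assert a == [2, 3]
assert b == [1, 2, 3]  # b still refers to the original object

# Empty lists: del raises, slicing quietly yields another empty list.
empty = []
assert empty[1:] == []
try:
    del empty[0]
except IndexError:
    pass               # del empty[0] raises IndexError on an empty list
```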


> I think this all just reinforces the claim that these choices
> should generally be made on the basis of readability rather than
> performance.

Choices between idioms with similar big-O characteristics should
generally be made on the basis of readability.  However, when two
choices have different O(f(N)) characteristics, then [unless you can
put strict bounds on N, or already have bottlenecks elsewhere with
O(f(N)) behavior bad enough to make the given choice immaterial] it's
different.  I surely don't mind paying an extra 10% of runtime, and
may not mind 100%, but the runtime % overhead in such big-O-issues
cases is unbounded, and growing with N.

For example, some people appear to find "Cramer's rule" a "readable"
way to solve systems of linear equations (maybe that's because it's
the one I learned in high school, whatever).  It doesn't matter,
though, unless you KNOW the number N of equations and unknowns is
definitely tiny: Cramer's rule's performance scales up *SO* badly
(O(N!), I think I recall) that you just CAN'T use it unless you
KNOW that N is no worse than 3 or 4.  The price of "readability" in
such cases might be to make your program totally unusable in some
common cases.  And MY order of priority is always Vitruvius'
"firmitas, utilitas, venustas", NOT Alberti's venustas-first one --
solidity, usefulness, and delight, in *THIS* order (Alberti's actual
_practice_, I claim, followed Vitruvius' order which I approve, not
the one he _wrote_ about...:-).
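As a rough illustration of that factorial blow-up (Cramer's rule needs
N+1 determinants, each -- done naively -- by cofactor expansion), here
is a sketch that counts expansion steps; the function and the counting
scheme are mine, just to show the growth, not anything from the post:

```python
def det_cofactor(m, steps):
    """Determinant by cofactor expansion along the first row.

    steps[0] counts expansion steps; it grows factorially with
    the matrix size N, which is the point of the illustration.
    """
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # minor: drop row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        steps[0] += 1
        total += (-1) ** j * m[0][j] * det_cofactor(minor, steps)
    return total

# Step counts explode factorially as N grows:
for n in (2, 3, 4, 5, 6):
    steps = [0]
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    det_cofactor(identity, steps)
    print(n, steps[0])   # 2: 2, 3: 9, 4: 40, 5: 205, 6: 1236
```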

You might claim this is an issue of choice of algorithm and not of
idiom, but to me that doesn't make much difference.  Some people
may find the following idiom more readable (they certainly DO,
because it keeps popping up...):

    bigstring = ''
    while somecondition():
        bigstring += onemoretinypiece()

but it doesn't matter -- it's still the wrong idiom to use in
Python, as its big-O behavior is a disaster.  You can take your 
pick between other idioms with roughly O(N) behavior (you can use
a cStringIO instance, an array, etc, etc), though my favorite is:

    templist = []
    while somecondition():
        templist.append(onemoretinypiece())
    bigstring = ''.join(templist)
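Side by side, the two idioms look like this (a sketch with made-up
helper data; note that some later CPython versions special-case += on
strings, so actual timings vary by implementation, but the join idiom
is the one with guaranteed roughly-O(N) behavior):

```python
def build_concat(pieces):
    # The quadratic idiom: each += may copy the whole string so far.
    bigstring = ''
    for piece in pieces:
        bigstring += piece
    return bigstring

def build_join(pieces):
    # The recommended idiom: accumulate in a list, join once at the end.
    templist = []
    for piece in pieces:
        templist.append(piece)
    return ''.join(templist)

pieces = ['tiny'] * 1000
assert build_concat(pieces) == build_join(pieces) == 'tiny' * 1000
```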


Choice between idioms with similar O() behavior is quite properly
driven by 'venustas' -- you don't really mind paying for that
with a slowdown of, say, 20%, except in rather rare cases -- but
that doesn't generalize -- *exactly* as it doesn't for choices
of data structures and algorithms.  If you can train yourself to
find more readable the version that performs better, so much
the better, of course -- and human ability for rationalization
being almost unbounded, you generally CAN, if you wish:-).


Alex




More information about the Python-list mailing list