Complexity of standard Python data structures (Re: Speed of string += string)

Alex Martelli aleax at aleax.it
Mon Apr 14 10:15:12 EDT 2003


Marcus Alanen wrote:

> On Mon, 14 Apr 2003 10:53:34 GMT, Alex Martelli <aleax at aleax.it> wrote:
>>Marcus Alanen wrote:
>>> Indeed, that was the first thing I wondered about when I started
>>> learning Python.
>>Interesting!  What other languages (besides C++ for a subset of its
>>standard libraries) give O() specifications for their operations?
>>In other words, what other languages did you learn, before Python,
>>that didn't leave you wondering about such performance issues?

Maybe I should clarify that I'm not _attacking_ your statement (nor
you for making it, Marcus) -- I'm trying to _understand_ exactly
what's involved, to help me teach Python better, and/or write better
introductory material about it.  So, I'm thankful for the opportunity
this thread may pose to refine my understanding of such learning
and teaching issues.


> It must've been a combination of issues:
> 
> 1) Coming from a C/C++ background, I want to use data structures
> efficiently.  This belongs to the "premature optimization" group. So
> sue me.

While performing optimizations may be premature, _understanding_
what's going on can hardly be (if the person trying to understand
has sufficient cultural background -- and somebody who programs in
C++ is quite likely to have it, anyway).

However, I still don't understand why this would be a problem
specific to Python rather than any other language -- including
C++ for those operations that are not part of that subset of C++'s
standard library for which O()-behavior is specified in the standard.

What's the O()-behavior of C++'s dynamic_cast<>, for example, and
on what parameters does that performance behavior depend?  I do
not know, and I don't think I *CAN* know -- every implementation is
free to do its best (or its worst...) to optimize dynamic_cast's
performance, time-vs-space tradeoffs, whatever.

So why did you perceive Python as being different in this respect?
C doesn't specify the O()-behavior of, e.g., qsort (indeed it's
often pretty horrible...!) -- C++ doesn't specify performance for
such things as RTTI, exception handling, or even for that matter
for the dispatch of virtual methods (you COULD have a conforming
C++ implementation that dispatches in time depending on the depth
of the inheritance graph or the number of virtual methods, though
in practice I know of no such implementation) -- so why did Python
leave you wondering, where C and C++ didn't, when the (lack of)
standard constraints on operations' performance seems so similar?


> 2) The project I was doing could with time lead to handling quite big
> sets of data, so algorithmic efficiency was a concern.

This helps explain why you care to understand such issues (though
I believe that a desire for such understanding is quite normal even
if one is involved in rather different things at a given time!),
but still doesn't help me understand why Python was different for
you, in this respect, from other languages.


> 3) Python's "list" had an insert() routine, but also a __getitem__
> routine. Both probably couldn't be O(1), so I got a bit confused.
> I didn't want to choose a "slow" method on purpose.

Yes, I understand this point, I think.
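
A tiny sketch may make it concrete (sizes are made up, and the exact
timings will of course vary with platform and Python version):

import time

def time_front_inserts(n):
    # insert(0, ...) must shift every existing element, so each call
    # costs O(len(lst)) and the whole loop is O(n**2)
    lst = []
    start = time.time()
    for i in range(n):
        lst.insert(0, i)
    return time.time() - start

def time_appends(n):
    # append is amortized O(1), so the whole loop is O(n)
    lst = []
    start = time.time()
    for i in range(n):
        lst.append(i)
    return time.time() - start

for n in (10000, 20000, 40000):
    print "n=%6d   insert(0,...): %.3fs   append: %.3fs" % (
        n, time_front_inserts(n), time_appends(n))

Doubling n roughly quadruples the insert(0, ...) time while the append
time merely doubles -- and lst[i], being a plain vector index, stays
O(1) regardless of n.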

> 4) And hey, I did want to learn the language properly.

And definitely this one.

> Number 3 was quite irritating, since C++ "list" is a double-linked
> list, while Python "list" is what is called a "vector" in C++.

Yes, the terminology clash is nasty.  Indeed I remember being peeved
back when C++ started using "list" to mean a *doubly-linked* list
(rather than a singly linked one)...


> Probably not very good excuses (except number 4), but I'm just telling
> what I felt at the time. On the other hand, I don't understand why the
> Python documentation simply can't say "speed O(N)", "speed amortized
> O(1)" for the various methods.

I suspect the motivations are akin to those the C++ Standard (and just
about every other language standardization document) has for not giving
constraints for the performance of the various operations it's
describing -- keeping future changes open.  The C++ Standard is quite
exceptional in giving O()-constraints for SOME small subset of the
operations it describes (I think it's to Stepanov's credit that there
is even that exception) -- so you end up with no guarantees about the
O() behavior of the most fundamental operations BUT with such guarantees
about conceptually higher-level stuff in the library (something that
always amused me quite a bit).

But -- DO Python beginners typically start learning the language with
a *REFERENCE* manual?  Probably not -- yet you explain that the time
when you were left wondering was just when you STARTED learning.  So,
perhaps, the document to retouch with a few small mentions of typical
performance behavior (perhaps in a footnote or two) could be the
tutorial, rather than either of the reference manuals (language and
library).  Clearly, the tutorial is being descriptive (of one specific
implementation it comes with) and not prescriptive (it does not give
any constraints to what future implementations might choose), which
may make it more acceptable to give such indications there.

So what do you think -- is it the tutorial that such changes should
be targeting...?


In "Python in a Nutshell", I took a more traditional approach --
*assuming* that the target reader would want to understand the
*functionality* of operations first, the *performance
characteristics* of operations only quite a bit later.  So, I
have descriptions of fundamental functionality quite early, in
chapter 4, with almost no mention of performance issues (except
a very few ones, such as the fact that pow(a,b,c) returns the
same result as (a**b)%c "but faster" -- or the fact that
passing a comparison function to the sort method of a list
"slows down the sort").  Only much later (in Chapter 17,
"Testing, Debugging, and Optimizing") do I cover the big-O
behavior of operations on lists, strings and dicts -- including
the key facts, from which almost all others follow, that lists
are internally implemented with vectors, dicts with hash tables.
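
For instance, the pow remark boils down to something like this (a
rough sketch, with made-up operands):

import time

a, b, c = 3, 100000, 101

start = time.time()
x = pow(a, b, c)      # modular exponentiation: reduces mod c at every step
t_pow = time.time() - start

start = time.time()
y = (a ** b) % c      # first builds the full ~48000-digit integer a**b
t_naive = time.time() - start

assert x == y
print "pow(a,b,c): %.6fs    (a**b)%%c: %.6fs" % (t_pow, t_naive)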

Do you think that this traditional approach (teach functionality
first, performance issues only later) is fundamentally flawed?

That's basically what I see the "first thing I wondered about"
comment as potentially implying, and what I'm trying to get a
deeper understanding of, particularly when you seem to single
out Python wrt other languages in this respect.  Of course, in
the Nutshell I was constrained by other considerations too (it's
primarily a *reference* work, only secondarily intended to be
read from the start onwards -- having decided to devote a part
of a chapter to optimization, it seemed to me that optimization
related information was best grouped there, rather than spread
around -- and _repeating_ information would go against the grain
of what's meant above all to be a *concise* reference...).  But,
that book may well not be the last time I have occasion to write
about Python (or anyway teach it...) -- so, I think I need to
understand these issues more deeply.


>>> Others answered this very thoroughly, I'd just like to emphasize the
>>> linearity of keys() --- most often you can manage with .iterkeys() or
>>> .itervalues() just fine, and it's a lot faster.
> 
>>As for the performance issues, I wouldn't be so sure.  E.g.:
> 
> Sorry, I really didn't explain that very well. In several cases I have
> had, I don't need to traverse the whole array of dict(), so iterkeys()
> does feel like saving some memory, and possibly speed. Naturally, both
> functions in general visit all elements, and are linear in speed.

Interesting -- since both keys and iterkeys (or direct iteration on
the dict) walk the keys in "random" order, I'm not too sure what a
typical use case of interrupting the loop prematurely would be --
"find any key k that satisfies predicate(k)" for example?  That does
happen some of the time, of course, but I don't think of such cases
as typical.  In any case, it's entirely and always true that keys()
takes O(N) _memory_ while iterkeys() most definitely doesn't, so, _if
memory constraints are involved_, iterkeys, or direct iteration on
the dict object, are going to be preferable.
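
Something like this, I imagine (a minimal sketch, with made-up data
and a made-up predicate):

def find_any(d, predicate):
    # "find any key k that satisfies predicate(k)": may stop early,
    # and never builds the O(N) list that d.keys() would build
    for k in d.iterkeys():
        if predicate(k):
            return k
    return None

d = dict([(i, i * i) for i in xrange(100000)])
print find_any(d, lambda k: k > 99990)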

> By the way, I tested a similar program with 10 million elements and saw a
> speed gain of >15% with iterkeys():

That may be a case where "memory constraints are involved" -- allocating
and filling and walking a list with 40 MB of data is going to bust (at
least) any cache your CPU might have, while, if you're lucky, iterkeys
might have better locality and _some_ of the data involved might remain
in the cache.  (Incidentally, cache behavior and other such issues of
memory hierarchies are by far the most complicated performance aspects
to understand, analyze or predict -- and yet far too often they can at
the same time be _dominating_ the performance behavior of your program...
and no language spec is going to be very helpful there;-).
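
If you want to see the effect yourself, something along these lines
should do -- a rough sketch, with N scaled down from the 10 million of
your test, and the size of the gap will of course depend on exactly
those memory-hierarchy issues:

import time

N = 1000000
d = dict([(i, None) for i in xrange(N)])

start = time.time()
count = 0
for k in d.keys():        # materializes a list of N key references first
    count += 1
print "keys():     %.3fs" % (time.time() - start)

start = time.time()
count = 0
for k in d.iterkeys():    # walks the hash table in place, no extra list
    count += 1
print "iterkeys(): %.3fs" % (time.time() - start)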


Alex




