[Python-ideas] Length hinting and preallocation for container types

Wed Mar 6 03:23:26 CET 2013

tl;dr - I'm glad that PyPy exists.  But IMHO Python shouldn't grow
APIs to manage memory.  Rather, compilers should take advantage of
existing internal APIs to do a better job of automatic management, and
compiler writers should suggest internal, not public, APIs where they
would help the compiler writers.

Eli Bendersky writes:
 > On Tue, Mar 5, 2013 at 9:05 AM, Christian Heimes <christian at python.org>wrote:
 > 
 > > Hello,
 > >
 > > today I came across this slides
 > > https://speakerdeck.com/alex/why-python-ruby-and-javascript-are-slow by
 > > Alex Gaynor. The slides aren't some random rants on Python. Alex makes
 > > some valid points. Coincidentally  he is a PyPy developer,
 > > too. ;)

Not at all coincidentally.  As a compiler writer (which he refers to
in the slides several times) he is offended by poor performance, as
measured in CPU/invocation.  Good for him!  I'm glad he's working on
PyPy!  But when compiler writers start talking language design for
performance, we inevitably end up with C<0.5 wink/>.

 > > One of his assertions is about memory (re)allocation. C is faster
 > > in some cases because C code usually does fewer allocations and
 > > reallocations. Python has no API to easily reallocate a list with
 > > 100 items.

And it shouldn't.<wink/>

But why do you think allocation is slow in the general case?  Sure, it
involves system calls which indeed do slow things down.  But there are
ways to arrange that those calls are amortized (eg, preallocation of
arenas, exponential reallocation for fast-growing objects).  Yes, the
few instructions it takes to grab a slab of memory off the heap add
up.  But is it worth programmer time to actually estimate correctly
the size of the list in advance?  What happens if the code is reused
in an application where the lists are two orders of magnitude larger?
Won't performance go in the tank?  (According to your benchmark, I
don't think anybody would ever notice.)

 > > Code like
 > >
 > >   lst = []
 > >   for i in range(100):
 > >       lst.append(i*i)

shouldn't be written.<wink/>  As Eli writes:

 > The real problem is that this code is not idiomatic Python[...].

+as-many-as-I've-got-in-my-pocket.  It's "for s in strings: s0 += s"
in another guise.  "We gots idioms fo' dat!"

 > > Internally CPython already distinguishes between the length of
 > > object and the allocation size of an object for some types like
 > > list, dict and bytearray. For example PyListObject has `ob_size`
 > > for __len__ and `allocated` for the amount of available `ob_item`
 > > slots.

Good.  Are we done now?<wink/>  Python has the necessary features, let
the compiler writers take advantage of them.  But exposing them to the
user and requiring that the user do ugly and fragile things to get a
minor speedup isn't Pythonic.

 > > with LengthHint(list, 200) as lst:
 > >     # lst has 200 ob_item slots but len(lst) == 0
 > >     for i in range(100):
 > >         lst.append(i*i)
 > > # __exit__ shrinks ob_item to 100

But that's ugly, especially the literal "200".  The fact that Python
doesn't splat explicit memory management in our T-shirts is one of the
reasons why we use Python.  And Python does grow extensions which
optimize such patterns, beautiful features at that.  Back to Eli:

 > [If] you want it to be reasonably fast [...] why not:
 > 
 > lst = [i*i for i in range(100)]

Again:

Alex complains that "objects may be dicts, but dicts aren't objects".
The performance point is that dicts are slow.  True.  But again Python
(not all implementations?) has an optimization (though not pretty):
objects can have slots.  It also has a (potentially horrible!) 
pessimization: properties.  The point is that object member access is
often the pretty way to code.  If it needs to be fast, we can do
that.  If it needs to be small or obey DRY, we can do that, too.