list.pop(0) vs. collections.dequeue

Steve Howell showell30 at yahoo.com
Sun Jan 24 14:26:32 EST 2010


On Jan 23, 3:04 pm, Terry Reedy <tjre... at udel.edu> wrote:
> On 1/23/2010 12:17 PM, Steve Howell wrote:
>
> > Terry Reedy said:
>
> > '''
> > If you try writing a full patch, as I believe someone did, or at least
> > a
> > prototype thereof, when the idea was discussed, you might have a
> > better
> > idea of what the tradeoffs are and why it was rejected.
> > '''
>
> > I have to run, but tomorrow I will try to dig through python-dev
> > archives and find the patch.  If anybody has hints on where to look
> > for it (anybody remember the author, for example?), it would be much
> > appreciated.
>
> The approach you outlined in your other response to me is, I believe,
> what was considered, investigated, and then rejected (by Guido, with
> agreement). The discussion may have been on the now-closed and
> (misspelled) pyk3 (?), or maybe on python-ideas, but my memory is more
> likely the former. I am sure that Raymond H. was involved also.
>
> > If the patch looks simple, I will try to pitch the idea that its time
> > has come.  Now that the specification of the language itself is
> > frozen, I think there might be more room for improving
> > implementations.  Also, I might be able to make the argument that
> > tradeoffs of memory vs. CPU vs. code complexity have different forces
> > in the 2010s.
>
> I am not opposed to a possible change, just hasty, ill-informed
> criticism. If there is not a PEP on this issue, it would be good to have
> one that recorded the proposal and the pros and cons, regardless of the
> outcome, so there would be something to refer people to. If that had
> been already done, it would have shortened this thread considerably.
>

I think it's a good idea to write a PEP on this issue, and I will
attempt a first draft.  I think I should submit the first draft to
python-ideas, correct?

I expect the PEP to be at least initially, if not permanently,
rejected, but it would not be an exercise in futility, as I agree it's
good to record pros and cons of the proposal in one place.  The PEP
probably would not include a proposed patch until there was a little
bit of consensus behind it, but it would not take me a lot of time to
present the basic argument.

Here is my sketch of what the PEP would look like.

Proposal: Improve list's implementation so that deleting elements from
the front of the list does not require an O(N) memmove operation.

Rationale: Some Python programs that process lists have multiple
methods that consume the first element of the list and pop it off.
The pattern comes up with parsers in particular, but there are other
examples.  It is possible now, of course, to use a data structure in
Python that has O(1) for deleting off the top of the list, but none of
the alternatives fully replicate the benefits of list itself.

Specification: Improving CPython's performance does not affect the
language itself, so there are no bikeshed arguments to be had with
respect to syntax, etc.  Any patch would, of course, affect the
performance of nearly every Python program in existence, so any patch
would have to, at a bare minimum:

  1) Not increase the time or memory complexity of any other list
operation.
  2) Not affect list access at all.
  3) Minimally affect list operations that mutate the list.
  4) Be reasonably simple within CPython itself.
  5) Not be grossly wasteful of memory.

Backwards Compatibility:

See above.  An implementation of this PEP would not change the
definition of the language in any way, but it would have to minimally
impact the performance of lists for the normal use cases.

Implementation:

There are two ways to make deleting the first item of the list run
more efficiently.

The most ambitious proposal is to fix the memory manager itself to
allow the release of memory from the start of the chunk.  The
advantage of this proposal is that it would simplify the changes to
list itself, and possibly have collateral benefits for other CPython
internal data structures.  The disadvantage of the proposal is that
there is a strong tradition in CPython to use native memory
management, particularly with respect to the fact that it runs on many
platforms.

The less ambitious proposal is to change the memory management scheme
within list itself.  There is already precedent in list_resize() to
optimistically allocate memory, so it is not a great break from
tradition to optimistically defer the release of memory.  But it would
complicate things.

References:

I would refer to this thread on comp.lang.python for discussion, and I
would also try to dig up older threads on python-dev or elsewhere.






More information about the Python-list mailing list