list.pop(0) vs. collections.dequeue

Steve Howell showell30 at yahoo.com
Mon Jan 25 18:30:16 EST 2010


On Jan 25, 1:32 pm, Arnaud Delobelle <arno... at googlemail.com> wrote:
> Steve Howell <showel... at yahoo.com> writes:
>
> [...]
>
> > My algorithm does exactly N pops and roughly N list accesses, so I
> > would be going from N*N + N to N + N log N if switched to blist.
>
> Can you post your algorithm?  It would be interesting to have a concrete
> use case to base this discussion on.
>

These are the profile results for an admittedly very large file
(430,000 lines), which shows that pop() consumes more time than any
other low level method.  So pop() is not a total red herring.  But I
have to be honest and admit that I grossly overestimated the penalty
for smaller files.  Typical files are a couple hundred lines, and for
that use case, pop()'s expense gets totally drowned out by regex
handling.  In other words, it's a lot cheaper to move a couple hundred
pointers per list element pop than it is to apply a series of regexes
to them, which shouldn't be surprising.

   ncalls  tottime  percall  cumtime  percall filename:lineno
(function)
 230001/1  149.508    0.001  222.432  222.432 /home/showell/workspace/
shpaml_website/shpaml.py:192(recurse)
   429999   17.667    0.000   17.667    0.000 {method 'pop' of 'list'
objects}
   530000    8.428    0.000   14.125    0.000 /home/showell/workspace/
shpaml_website/shpaml.py:143(get_indented_block)
  3700008    7.877    0.000    7.877    0.000 {built-in method match}
5410125/5410121    5.697    0.000    5.697    0.000 {len}
   300000    3.938    0.000   22.286    0.000 /home/showell/workspace/
shpaml_website/shpaml.py:96(convert_line)
   959999    3.847    0.000    6.759    0.000 /home/showell/workspace/
shpaml_website/shpaml.py:29(INDENT)
   959999    3.717    0.000   12.547    0.000 /home/showell/workspace/
shpaml_website/shpaml.py:138(find_indentation)
   370000    3.495    0.000   20.204    0.000 /home/showell/workspace/
shpaml_website/shpaml.py:109(apply_jquery)
   370000    3.322    0.000    6.528    0.000 {built-in method sub}
  1469999    2.575    0.000    2.575    0.000 {built-in method groups}

As an aside, I am a little surprised by how often I call len() and
that it also takes a large chunk of time, but that's my problem to
fix.




More information about the Python-list mailing list