_siftup and _siftdown implementation

Sven R. Kunze srkunze at mail.de
Fri Feb 5 09:42:29 EST 2016


On 05.02.2016 02:26, srinivas devaki wrote:
> as I come to think of it again, it is not subheap, it actually heap cut at
> some level hope you get the idea from the usage of _siftup. so even though
> the `pos` children are valid the _siftup brings down the new element (i.e
> the element which is at first at `pos`) upto its leaf level and then again
> it is brought up by using _siftdown. why do the redundant work when it can
> simply breakout?

The heapq module itself has a very extensive documentation inside. This 
is what it says for _siftup. I think this is sort of an optimization 
that works pretty well (cf. the numbers) for popping off the FIRST item:

"""

# The child indices of heap index pos are already heaps, and we want to 
make # a heap at index pos too. We do this by bubbling the smaller child 
of # pos up (and so on with that child's children, etc) until hitting a 
leaf, # then using _siftdown to move the oddball originally at index pos 
into place. # # We *could* break out of the loop as soon as we find a 
pos where newitem <= # both its children, but turns out that's not a 
good idea, and despite that # many books write the algorithm that way. 
During a heap pop, the last array # element is sifted in, and that tends 
to be large, so that comparing it # against values starting from the 
root usually doesn't pay (= usually doesn't # get us out of the loop 
early). See Knuth, Volume 3, where this is # explained and quantified in 
an exercise. # # Cutting the # of comparisons is important, since these 
routines have no # way to extract "the priority" from an array element, 
so that intelligence # is likely to be hiding in custom comparison 
methods, or in array elements # storing (priority, record) tuples. 
Comparisons are thus potentially # expensive. # # On random arrays of 
length 1000, making this change cut the number of # comparisons made by 
heapify() a little, and those made by exhaustive # heappop() a lot, in 
accord with theory. Here are typical results from 3 # runs (3 just to 
demonstrate how small the variance is): # # Compares needed by heapify 
Compares needed by 1000 heappops # -------------------------- 
-------------------------------- # 1837 cut to 1663 14996 cut to 8680 # 
1855 cut to 1659 14966 cut to 8678 # 1847 cut to 1660 15024 cut to 8703 
# # Building the heap by using heappush() 1000 times instead required # 
2198, 2148, and 2219 compares: heapify() is more efficient, when # you 
can use it. # # The total compares needed by list.sort() on the same 
lists were 8627, # 8627, and 8632 (this should be compared to the sum of 
heapify() and # heappop() compares): list.sort() is (unsurprisingly!) 
more efficient # for sorting.

"""

What do you think about our use-case?

> _siftup and _siftdown are functions from python standard heapq module.
>
> PS: I do competitive programming, I use these modules every couple of days
> when compared to other modules. so didn't give much thought when posting to
> the mailing list. sorry for that.

Competitive programming? That sounds interesting. :)

Best,
Sven



More information about the Python-list mailing list