even faster heaps

srinivas devaki mr.eightnoteight at gmail.com
Tue Mar 8 02:12:41 EST 2016


Hi,
sorry i didn't get to your last mail as i'm having exams at that time.

Everything is good so far.

but if at all the purpose is to add features to the stdlib heap and
keeping the speed offered by the stdlib, why aren't you allowing the
duplicate elements which is possible with stdlib heap.

to be able to add duplicate elements you just have to make the set as
a dict with counter as the value. afterall it is highly common to have
priority task scheduling with equal priorities to the tasks.

as far as the benchmarks are concerned it didn't change much with the
current dataset. but it is not valid to compare them as the current
dataset is intended for unique elements. but with the dict the dataset
can contain elements which have equal keys.

on my machine with 10 repetitions in the test_xheap_time.py
** results for the code which supports duplicate elements
HEAD at github.com/eightnoteight/xheap **

(u'init', u'heapq      ', u' 0.45 ( 1.00x)  4.59 ( 1.00x) 52.24 (
1.00x) 553.80 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.48 ( 1.07x)  5.41 ( 1.18x) 75.89 (
1.45x) 786.46 ( 1.42x)')
(u'    ', u'RemovalHeap', u' 0.86 ( 1.93x)  9.98 ( 2.18x) 131.28 (
2.51x) 1364.14 ( 2.46x)')
--------------------------------------------------------------------
(u'pop ', u'heapq      ', u' 0.09 ( 1.00x)  1.27 ( 1.00x) 15.64 (
1.00x) 184.76 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.10 ( 1.07x)  1.33 ( 1.05x) 16.30 (
1.04x) 191.43 ( 1.04x)')
(u'    ', u'RemovalHeap', u' 0.15 ( 1.64x)  1.87 ( 1.47x) 21.33 (
1.36x) 242.31 ( 1.31x)')
--------------------------------------------------------------------
(u'push', u'heapq      ', u' 0.04 ( 1.00x)  0.33 ( 1.00x)  3.53 (
1.00x) 32.99 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.05 ( 1.20x)  0.41 ( 1.25x)  4.37 (
1.24x) 44.14 ( 1.34x)')
(u'    ', u'RemovalHeap', u' 0.06 ( 1.58x)  0.56 ( 1.70x)  5.85 (
1.66x) 59.59 ( 1.81x)')
--------------------------------------------------------------------
--------------------------------------------------------------------
(u'init', u'heapq    ', u' 0.45 ( 1.00x)  8.00 ( 1.00x) 109.50 (
1.00x) 898.50 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.50 ( 1.10x)  8.75 ( 1.09x) 112.63 (
1.03x) 1108.26 ( 1.23x)')
(u'    ', u'XHeap    ', u' 0.93 ( 2.06x) 13.88 ( 1.74x) 170.97 (
1.56x) 1709.94 ( 1.90x)')
--------------------------------------------------------------------
(u'pop ', u'heapq    ', u' 0.04 ( 1.00x)  0.54 ( 1.00x)  6.50 ( 1.00x)
76.64 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.06 ( 1.73x)  0.80 ( 1.49x)  9.16 ( 1.41x)
101.68 ( 1.33x)')
(u'    ', u'XHeap    ', u' 0.10 ( 2.65x)  1.14 ( 2.13x) 12.60 ( 1.94x)
137.73 ( 1.80x)')
--------------------------------------------------------------------
(u'push', u'heapq    ', u' 0.01 ( 1.00x)  0.17 ( 1.00x)  1.86 ( 1.00x)
14.85 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.04 ( 3.88x)  0.47 ( 2.86x)  4.80 ( 2.58x)
42.98 ( 2.89x)')
(u'    ', u'XHeap    ', u' 0.04 ( 3.79x)  0.47 ( 2.85x)  4.85 ( 2.61x)
43.94 ( 2.96x)')
--------------------------------------------------------------------
--------------------------------------------------------------------
(u'remove', u'RemovalHeap', u' 0.07 ( 1.00x)  0.73 ( 1.00x)  7.38 (
1.00x) 75.18 ( 1.00x)')
(u'      ', u'XHeap      ', u' 0.05 ( 0.79x)  0.55 ( 0.75x)  5.52 (
0.75x) 55.08 ( 0.73x)')
--------------------------------------------------------------------
--------------------------------------------------------------------


** benchmark for current git HEAD d0707fba2401a3cef8aed54028fe6d7e9497faa5 **


(u'init', u'heapq      ', u' 0.49 ( 1.00x)  5.03 ( 1.00x) 58.71 (
1.00x) 600.91 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.51 ( 1.05x)  5.83 ( 1.16x) 82.88 (
1.41x) 886.22 ( 1.47x)')
(u'    ', u'RemovalHeap', u' 0.58 ( 1.19x)  7.19 ( 1.43x) 106.80 (
1.82x) 1108.52 ( 1.84x)')
--------------------------------------------------------------------
(u'pop ', u'heapq      ', u' 0.10 ( 1.00x)  1.41 ( 1.00x) 17.64 (
1.00x) 209.82 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.11 ( 1.07x)  1.47 ( 1.04x) 18.27 (
1.04x) 215.14 ( 1.03x)')
(u'    ', u'RemovalHeap', u' 0.15 ( 1.52x)  1.91 ( 1.35x) 22.64 (
1.28x) 258.68 ( 1.23x)')
--------------------------------------------------------------------
(u'push', u'heapq      ', u' 0.04 ( 1.00x)  0.32 ( 1.00x)  3.49 (
1.00x) 33.92 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.05 ( 1.18x)  0.39 ( 1.22x)  4.21 (
1.20x) 42.03 ( 1.24x)')
(u'    ', u'RemovalHeap', u' 0.06 ( 1.52x)  0.52 ( 1.62x)  5.60 (
1.60x) 56.54 ( 1.67x)')
--------------------------------------------------------------------
--------------------------------------------------------------------
(u'init', u'heapq    ', u' 0.44 ( 1.00x)  7.92 ( 1.00x) 106.52 (
1.00x) 915.20 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.50 ( 1.14x)  8.67 ( 1.10x) 111.99 (
1.05x) 1129.89 ( 1.23x)')
(u'    ', u'XHeap    ', u' 0.61 ( 1.38x) 10.75 ( 1.36x) 140.86 (
1.32x) 1417.84 ( 1.55x)')
--------------------------------------------------------------------
(u'pop ', u'heapq    ', u' 0.04 ( 1.00x)  0.55 ( 1.00x)  6.59 ( 1.00x)
76.81 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.06 ( 1.68x)  0.79 ( 1.43x)  9.04 ( 1.37x)
101.72 ( 1.32x)')
(u'    ', u'XHeap    ', u' 0.09 ( 2.43x)  1.03 ( 1.85x) 11.48 ( 1.74x)
125.94 ( 1.64x)')
--------------------------------------------------------------------
(u'push', u'heapq    ', u' 0.01 ( 1.00x)  0.16 ( 1.00x)  1.85 ( 1.00x)
14.65 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.04 ( 4.32x)  0.46 ( 2.81x)  4.74 ( 2.56x)
42.25 ( 2.88x)')
(u'    ', u'XHeap    ', u' 0.03 ( 3.73x)  0.42 ( 2.55x)  4.34 ( 2.35x)
38.37 ( 2.62x)')
--------------------------------------------------------------------
--------------------------------------------------------------------
(u'remove', u'RemovalHeap', u' 0.05 ( 1.00x)  0.54 ( 1.00x)  5.62 (
1.00x) 57.04 ( 1.00x)')
(u'      ', u'XHeap      ', u' 0.04 ( 0.88x)  0.44 ( 0.80x)  4.41 (
0.78x) 43.86 ( 0.77x)')
--------------------------------------------------------------------
--------------------------------------------------------------------


So as the results are not much effected apart of __init__, i think you
should consider this.

Note: i'm not using collections.Counter because it is written in
python, and from my previous experience it is slower than using
defaultdict for this kind of purposes.

ps: there are two error's when i ran tests with test_xheap. OMG why
did you keep repititions = 10000, at first i though my pentium laptop
is too slow that it is not printing a single line even after 20
minutes. then i saw the number of computations are in the order of
10**10, how many days did it took to completely run the tests?


On Sun, Mar 6, 2016 at 7:29 PM, Sven R. Kunze <srkunze at mail.de> wrote:
> Hi python-list, hi Srinivas,
>
> I managed to implement the mark&sweep approach for fast removal from heaps.
> This way, I got three pleasant results:
>
> 1) a substantial speed up!
> 2) an improved testsuite
> 3) discovery and fixing of several bugs
>
> @Srinivas I would be honored if you could have a look at the implementation:
> https://github.com/srkunze/xheap . After all, it was your idea. I only
> perform the sweeping step during pop and remove with the condition of yours.
> :)
>
> Using the original xheap benchmark, I could see huge speedups: from 50x/25x
> down to 3x/2x compared to heapq. That's a massive improvement. I will
> publish an update soon.
>
> Best,
> Sven



-- 
Regards
Srinivas Devaki
Junior (3rd yr) student at Indian School of Mines,(IIT Dhanbad)
Computer Science and Engineering Department
ph: +91 9491 383 249
telegram_id: @eightnoteight



More information about the Python-list mailing list