Heap Implementation

Cem Karan cfkaran2 at gmail.com
Thu Feb 11 06:06:00 EST 2016


On Feb 10, 2016, at 1:23 PM, "Sven R. Kunze" <srkunze at mail.de> wrote:

> Hi Cem,
> 
> On 08.02.2016 02:37, Cem Karan wrote:
>> My apologies for not writing sooner, but work has been quite busy lately (and likely will be for some time to come).
> 
> no problem here. :)
> 
>> I read your approach, and it looks pretty good, but there may be one issue with it; how do you handle the same item being pushed into the heap more than once?  In my simple simulator, I'll push the same object into my event queue multiple times in a row.  The priority is the moment in the future when the object will be called.  As a result, items don't have unique priorities.  I know that there are methods of handling this from the client-side (tuples with unique counters come to mind), but if your library can handle it directly, then that could be useful to others as well.
> 
> I've pondered about that in the early design phase. I considered it a slowdown for my use-case without benefit.
> 
> Why? Because I always push a fresh object ALTHOUGH it might be equal comparing attributes (priority, deadline, etc.).
> 
> 
> That's the reason why I need to ask again: why pushing the same item on a heap?
> 
> 
> Are we talking about function objects? If so, then your concern is valid. Would you accept a solution that would involve wrapping the function in another object carrying the priority? Would you prefer a wrapper that's defined by xheap itself so you can just use it?

Yes.  I use priority queues for event loops.  The items I push in are callables (sometimes callbacks, sometimes objects with __call__()) and the priority is the simulation date that they should be called.  I push the same item multiple times in a row because it will modify itself by the call (e.g., the location of an actor is calculated by its velocity and the date). There are certain calls that I tend to push in all at once because the math for calculating when the event should occur is somewhat expensive to calculate, and always returns multiple dates at once.  

That is also why deleting or changing events can be useful; I know that at least some of those events will be canceled in the future, which makes deleting useful.  Note that it is also possible to cancel an event by marking it as cancelled, and then simply not executing it when you pop it off the queue, but I've found that there are a few cases in my simulations where the number of dead events that are in the queue exceeds the number of live events, which does have an impact on memory and operational speed (maintaining the heap invariant).  There isn't much difference though, but I need FAST code to deal with size of my simulations (thousands to tens of thousands of actors, over hundreds of millions of simulations, which is why I finally had to give up on python and switch to pure C).

Having a wrapper defined by xheap would be ideal; I suspect that I won't be the only one that needs to deal with this, so having it centrally located would be best.  It may also make it possible for you to optimize xheap's behavior in some way.

Thanks,
Cem Karan


More information about the Python-list mailing list