Merging sorted lists/iterators/generators into one stream of values...

Mike C. Fletcher mcfletch at rogers.com
Mon Oct 10 09:28:36 EDT 2005


Alex Martelli wrote:

>George Sakkis <gsakkis at rutgers.edu> wrote:
>   ...
>  
>
>>>manipulation of a heap to place an item in the right spot, but with 4-5
>>>or a few more sources might not make an impact at all.
>>>      
>>>
>>Unless you're talking about hundreds or thousands sources, it probably
>>won't. I would still go for the heap solution since IMO the resulting
>>code it's more readable and easier to understand.
>>    
>>
...

>i.e., a heap solution may be over 4 times faster than a sort-based one
>(in the following implementations).  Readability seems quite comparable
>(skipping the rest of the infrastructure, which generates random sorted
>streams and ensures a stream is exhausted and verifies it etc etc):
>  
>
One thing to keep in mind (if you care about performance) is that you 
one could use bisect, instead of sort, as the sorted list of streams is 
already in order save for the one element you are processing.  Btw, nice 
trick with reverse to reduce memory copying, when did that get 
introduced?  Wonder if bisect can deal with reverse-sorted elements.  
Anyway, should reduce the O-complexity of that part of the operation, 
though you'll still have to do a memcpy to shift the rest of the source 
list's array, and if it can't deal with reverse-sorted lists it would 
move you back to front-of-list popping.

Oh, we're still missing use of a comparison function in both versions. 
I'd think you'd want that functionality *available* if you're going to 
make this a general tool.  You'll also need to check for StopIteration 
on creation of sources for null sequences.  Finally, why  the 'i' 
element?  It's never used AFAICS.

>def merge_by_sort(streams):
>  sources = [[s.next(), i, s.next] for i, s in enumerate(streams)]
>  while sources:
>    sources.sort(reverse=True)
>    best_source = sources[-1]
>    yield best_source[0]
>    try: best_source[0] = best_source[-1]()
>    except StopIteration: sources.pop()
>  
>
...

Have fun,
Mike

-- 
________________________________________________
  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://www.vrplumber.com
  http://blog.vrplumber.com




More information about the Python-list mailing list