Merging multiple sorted sequences.

Ian Kelly ian.g.kelly at gmail.com
Wed Apr 12 18:44:07 EDT 2017


On Wed, Apr 12, 2017 at 4:15 PM, Erik <python at lucidity.plus.com> wrote:
> Hi.
>
> I need to be able to lazily merge a variable number of already-sorted(*)
> variable-length sequences into a single sorted sequence. The merge should
> continue until the longest sequence has been exhausted.
>
> (*) They may in practice be a lazy source of data known to only ever be
> generated in an order that's "sorted" from the POV of this function.
>
> I have the following - can anyone think of any improvements (other than
> indentation style on the try/excepts ;) )? This seems like the sort of thing
> for which there might be a built-in or recipe that I've missed.

There probably should be. I know I've implemented something similar in the past.

> -----------------------------------------
> def merge(*sources):
>     items = []
>     for source in (iter(s) for s in sources):
>        try: items.append([next(source), source])
>        except StopIteration: pass
>
>     while len(items) > 1:
>         items.sort(key=lambda item: item[0])

This might be okay since Timsort on an already-sorted list should be
O(n). But there's not really any need to keep them sorted and I would
just use "lowest = min(items, key=itemgetter(0))".

>         lowest = items[0]
>         yield lowest[0]
>
>         try: lowest[0] = next(lowest[1])
>         except StopIteration: del items[0]
>
>     if len(items) != 0:

"if items:"

>        yield items[0][0]
>        yield from items[0][1]
> -----------------------------------------



More information about the Python-list mailing list