Merging sorted lists/iterators/generators into one stream of values...
Lasse Vågsæther Karlsen
lasse at vkarlsen.no
Tue Oct 11 07:12:06 EDT 2005
<snip>
Another idea for this method would be that in some cases I noticed that
it was useful to know which source each element would come from as well,
as well as removing duplicates from the results.
For instance
s1 = [1, 3, 5, 7]
s2 = [2, 3, 4]
for k, s in merge_by_sort(s1, s2):
print k, "from source", s
this would print:
1 from source 0
2 from source 1
3 from source 1
3 from source 0
4 from source 1
5 from source 0
7 from source 0
and the above list has 3 twice, so possibly:
1 from sources [0]
2 from sources [1]
3 from sources [0, 1]
4 from sources [1]
5 from sources [0]
7 from sources [0]
This latter one would be a slightly more heavy method as it would have
to compare the N first elements of the list or heap to figure out what
indices to yield as well.
However, the previous solution could be:
def merge_by_sort(*sources, **options):
if "cmp" in options:
comparison = options["cmp"]
else:
comparison = cmp
iterables = []
for index, source in enumerate(sources):
try:
source = iter(source)
iterables.append([source.next(), index, source])
except StopIteration:
pass
iterables.sort(cmp=comparison, key=lambda x: x[0], reverse=True)
while iterables:
yield iterables[-1][0], iterables[-1][1]
try:
iterables[-1][0] = iterables[-1][2].next()
if len(iterables) > 1 and comparison(iterables[-1][0],
iterables[-2][0]) > 0:
iterables.sort(comparison, key=lambda x: x[0],
reverse=True)
except StopIteration:
iterables.pop(-1)
--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:lasse at vkarlsen.no
PGP KeyID: 0x2A42A1C2
More information about the Python-list
mailing list