Merging sorted lists/iterators/generators into one stream of values...
Mike C. Fletcher
mcfletch at rogers.com
Thu Oct 6 20:05:01 EDT 2005
Lasse Vågsæther Karlsen wrote:
>I need to merge several sources of values into one stream of values. All
>of the sources are sorted already and I need to retrieve the values from
>them all in sorted order.
>
>In other words:
>s1 = [10, 20, 30, 40, 50]
>s2 = [15, 25]
>s3 = [17, 27, 37]
>
>for value in ???(s1, s2, s3):
> print value
>
>will print out 10, 15, 17, 20, 25, 27, 30, 37, 40, 50 in that order.
>
>The sources are cursors retrieving data from several databases, not from
>the same server, and there's a potential for a large number of rows from
>several of the sources. As such, any method that would load it all into
>memory and sort it is no good as it would too much memory.
>
>Is there a function or whatnot in Python that will do what I want? I
>have come up with my own method but since it uses generators I'm not
>sure how much overhead there is. Additionally, since the values
>retrieved from the cursors will be dictionaries of "fieldname":value
>pairs, the method would either need to handle that construct (and be
>told what fieldname to sort on), or be able to take a function object to
>use for the comparison operation.
>
>Basically, I'll use my own function for this unless someone can point me
>to something that is already available. Couldn't seem to find anything
>in the builtin modules but I didn't find glob.glob when I was looking
>for how to find filenames either so who knows what it's called :)
>
>Since I need to deal with a variable list of sources (that is, 2 in one
>application, 3 in another and 4-5 in a third), a simple 2-source method
>isn't enough but if it's better than what I do for 2 sources then I can
>make a wrapper for it, since that's what I do anyway.
>
>
I doubt you'll find a prebuilt one, it's fairly easy to create your own,
after all. Generators are fairly fast constructs in Python, btw.
Here's what I whipped up in a few minutes:
def firstIter( value ):
it = iter( value )
try:
return it.next(), it
except StopIteration, err:
return None, None
def inorder( comparision, *args ):
iterators = [
[value,it]
for (value,it) in [ firstIter( arg ) for arg in args ]
if it is not None
]
iterators.sort()
while iterators:
yield iterators[0][0]
try:
value = iterators[0][0] = iterators[0][1].next()
except StopIteration, err:
iterators.pop(0)
else:
if len(iterators) > 1 and comparision( value,
iterators[1][0]) == 1:
iterators.sort()
continue
if __name__ == "__main__":
s1 = [10, 20, 30, 40, 50]
s2 = [15, 25]
s3 = [17, 27, 37]
s4 = []
for value in inorder(cmp, s1, s2, s3, s4):
print value
Anyway, have fun,
Mike
--
________________________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://www.vrplumber.com
http://blog.vrplumber.com
More information about the Python-list
mailing list