Classical FP problem in python : Hamming problem

Fri Jan 28 02:53:56 EST 2005

Paul Rubin wrote:
> Francis Girard <francis.girard at free.fr> writes:
> 
>>Thank you Nick and Steven for the idea of a more generic imerge.
> 
> 
> If you want to get fancy, the merge should use a priority queue (like
> in the heapsort algorithm) instead of a linear scan through the
> incoming iters, to find the next item to output.  That lowers the
> running time to O(n log k) instead of O(n*k), where k is the number of
> iterators and n is the length.

I looked at a heapq solution but didn't see any clean way of dealing with 
multiple iterators having equal values.  The dict solution below deals cleanly 
with that, since one key can be shared by any number of iterators.  Extracting 
the minimum, and the associated iterables is fast, but the overall solution is 
still slower than the brute force approach for the 3 hamming iterators.

  >>> def imerge(*iterables):
  ...     cache = {}
  ...     iterators = map(iter,iterables)
  ...     number = len(iterables)
  ...     exhausted = 0
  ...     while 1:
              # First time through, looked at all of them
	     # Subsequently, update only the minimum iterators
  ...         for it in iterators:
  ...             try:
		     # Key each iterator by its next() value
		     # Multiple iterators may share the same key
  ...                 cache.setdefault(it.next(),[]).append(it)
  ...             except StopIteration:
  ...                 exhausted += 1
  ...                 if exhausted == number:
  ...                     raise StopIteration
	     # Get the lowest value
  ...         val = min(cache)
	     # and all the iterators that have that value
  ...         iterators = cache.pop(val)	
  ...         yield val

  >>> list(imerge([1, 2, 3, 6], [1, 2, 3, 7], [1, 2, 3, 4, 5]))
[1, 2, 3, 4, 5, 6, 7]
  >>>

Michael