Classical FP problem in python : Hamming problem

Francis Girard francis.girard at free.fr
Mon Jan 24 10:24:41 EST 2005


The following implementation is even more speaking as it makes self-evident 
and almost mechanical how to translate algorithms that run after their tail 
from recursion to "tee" usage :

*** BEGIN SNAP
from itertools import tee, imap
import sys

def imerge(xs, ys):
  x = xs.next()
  y = ys.next()
  while True:
    if x == y:
      yield x
      x = xs.next()
      y = ys.next()
    elif x < y:
      yield x
      x = xs.next()
    else:
      yield y
      y = ys.next()

      
def hamming():
  def _hamming():
    yield 1
    hamming2 = hammingGenerators[0]
    hamming3 = hammingGenerators[1]
    hamming5 = hammingGenerators[2]
    for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                    imerge(imap(lambda h: 3*h, iter(hamming3)),
                           imap(lambda h: 5*h, iter(hamming5)))):
      yield n
  hammingGenerators = tee(_hamming(), 4)
  return hammingGenerators[3]
      
for i in hamming():
  print i,
  sys.stdout.flush()
*** END SNAP

Here's an implementation of the fibonacci sequence that uses "tee" : 

*** BEGIN SNAP
from itertools import tee
import sys

def fib():
  def _fib():
    yield 1
    yield 1
    curGen = fibGenerators[0]
    curAheadGen = fibGenerators[1]
    curAheadGen.next()
    while True:
      yield curGen.next() + curAheadGen.next()
  fibGenerators = tee(_fib(), 3)
  return fibGenerators[2]
  
for n in fib():
  print n,
  sys.stdout.flush()
*** END SNAP

Francis Girard
FRANCE


Le lundi 24 Janvier 2005 14:09, Francis Girard a écrit :
> Ok, I think that the bottom line is this :
>
> For all the algorithms that run after their tail in an FP way, like the
> Hamming problem, or the Fibonacci sequence, (but unlike Sieve of
> Eratosthene -- there's a subtle difference), i.e. all those algorithms that
> typically rely upon recursion to get the beginning of the generated
> elements in order to generate the next one ...
>
> ... so for this family of algorithms, we have, somehow, to abandon
> recursion, and to explicitely maintain one or many lists of what had been
> generated.
>
> One solution for this is suggested in "test_generators.py". But I think
> that this solution is evil as it looks like recursion is used but it is not
> and it depends heavily on how the m235 function (for example) is invoked.
> Furthermore, it is _NOT_ memory efficient as pretended : it leaks ! It
> internally maintains a lists of generated results that grows forever while
> it is useless to maintain results that had been "consumed". Just for a
> gross estimate, on my system, percentage of memory usage for this program
> grows rapidly, reaching 21.6 % after 5 minutes. CPU usage varies between
> 31%-36%. Ugly and inefficient.
>
> The solution of Jeff Epler is far more elegant. The "itertools.tee"
> function does just what we want. It internally maintain a list of what had
> been generated and deletes the "consumed" elements as the algo unrolls. To
> follow with my gross estimate, memory usage grows from 1.2% to 1.9% after 5
> minutes (probably only affected by the growing size of Huge Integer). CPU
> usage varies between 27%-29%.
> Beautiful and effecient.
>
> You might think that we shouldn't be that fussy about memory usage on
> generating 100 digits numbers but we're talking about a whole family of
> useful FP algorithms ; and the best way to implement them should be
> established.
>
> For this family of algorithms, itertools.tee is the way to go.
>
> I think that the semi-tutorial in "test_generators.py" should be updated to
> use "tee". Or, at least, a severe warning comment should be written.
>
> Thank you,
>
> Francis Girard
> FRANCE
>
> Le dimanche 23 Janvier 2005 23:27, Jeff Epler a écrit :
> > Your formulation in Python is recursive (hamming calls hamming()) and I
> > think that's why your program gives up fairly early.
> >
> > Instead, use itertools.tee() [new in Python 2.4, or search the internet
> > for an implementation that works in 2.3] to give a copy of the output
> > sequence to each "multiply by N" function as well as one to be the
> > return value.
> >
> > Here's my implementation, which matched your list early on but
> > effortlessly reached larger values.  One large value it printed was
> > 6412351813189632 (a Python long) which indeed has only the distinct
> > prime factors 2 and 3. (2**43 * 3**6)
> >
> > Jeff
> >
> > from itertools import tee
> > import sys
> >
> > def imerge(xs, ys):
> >     x = xs.next()
> >     y = ys.next()
> >     while True:
> >         if x == y:
> >             yield x
> >             x = xs.next()
> >             y = ys.next()
> >         elif x < y:
> >             yield x
> >             x = xs.next()
> >         else:
> >             yield y
> >             y = ys.next()
> >
> > def hamming():
> >     def _hamming(j, k):
> >         yield 1
> >         hamming = generators[j]
> >         for i in hamming:
> >             yield i * k
> >     generators = []
> >     generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)),
> > _hamming(2, 5)) generators[:] = tee(generator, 4)
> >     return generators[3]
> >
> > for i in hamming():
> >     print i,
> >     sys.stdout.flush()




More information about the Python-list mailing list