Classical FP problem in python : Hamming problem
Francis Girard
francis.girard at free.fr
Mon Jan 24 10:24:41 EST 2005
The following implementation is even more speaking as it makes self-evident
and almost mechanical how to translate algorithms that run after their tail
from recursion to "tee" usage :
*** BEGIN SNAP
from itertools import tee, imap
import sys
def imerge(xs, ys):
x = xs.next()
y = ys.next()
while True:
if x == y:
yield x
x = xs.next()
y = ys.next()
elif x < y:
yield x
x = xs.next()
else:
yield y
y = ys.next()
def hamming():
def _hamming():
yield 1
hamming2 = hammingGenerators[0]
hamming3 = hammingGenerators[1]
hamming5 = hammingGenerators[2]
for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
imerge(imap(lambda h: 3*h, iter(hamming3)),
imap(lambda h: 5*h, iter(hamming5)))):
yield n
hammingGenerators = tee(_hamming(), 4)
return hammingGenerators[3]
for i in hamming():
print i,
sys.stdout.flush()
*** END SNAP
Here's an implementation of the fibonacci sequence that uses "tee" :
*** BEGIN SNAP
from itertools import tee
import sys
def fib():
def _fib():
yield 1
yield 1
curGen = fibGenerators[0]
curAheadGen = fibGenerators[1]
curAheadGen.next()
while True:
yield curGen.next() + curAheadGen.next()
fibGenerators = tee(_fib(), 3)
return fibGenerators[2]
for n in fib():
print n,
sys.stdout.flush()
*** END SNAP
Francis Girard
FRANCE
Le lundi 24 Janvier 2005 14:09, Francis Girard a écrit :
> Ok, I think that the bottom line is this :
>
> For all the algorithms that run after their tail in an FP way, like the
> Hamming problem, or the Fibonacci sequence, (but unlike Sieve of
> Eratosthene -- there's a subtle difference), i.e. all those algorithms that
> typically rely upon recursion to get the beginning of the generated
> elements in order to generate the next one ...
>
> ... so for this family of algorithms, we have, somehow, to abandon
> recursion, and to explicitely maintain one or many lists of what had been
> generated.
>
> One solution for this is suggested in "test_generators.py". But I think
> that this solution is evil as it looks like recursion is used but it is not
> and it depends heavily on how the m235 function (for example) is invoked.
> Furthermore, it is _NOT_ memory efficient as pretended : it leaks ! It
> internally maintains a lists of generated results that grows forever while
> it is useless to maintain results that had been "consumed". Just for a
> gross estimate, on my system, percentage of memory usage for this program
> grows rapidly, reaching 21.6 % after 5 minutes. CPU usage varies between
> 31%-36%. Ugly and inefficient.
>
> The solution of Jeff Epler is far more elegant. The "itertools.tee"
> function does just what we want. It internally maintain a list of what had
> been generated and deletes the "consumed" elements as the algo unrolls. To
> follow with my gross estimate, memory usage grows from 1.2% to 1.9% after 5
> minutes (probably only affected by the growing size of Huge Integer). CPU
> usage varies between 27%-29%.
> Beautiful and effecient.
>
> You might think that we shouldn't be that fussy about memory usage on
> generating 100 digits numbers but we're talking about a whole family of
> useful FP algorithms ; and the best way to implement them should be
> established.
>
> For this family of algorithms, itertools.tee is the way to go.
>
> I think that the semi-tutorial in "test_generators.py" should be updated to
> use "tee". Or, at least, a severe warning comment should be written.
>
> Thank you,
>
> Francis Girard
> FRANCE
>
> Le dimanche 23 Janvier 2005 23:27, Jeff Epler a écrit :
> > Your formulation in Python is recursive (hamming calls hamming()) and I
> > think that's why your program gives up fairly early.
> >
> > Instead, use itertools.tee() [new in Python 2.4, or search the internet
> > for an implementation that works in 2.3] to give a copy of the output
> > sequence to each "multiply by N" function as well as one to be the
> > return value.
> >
> > Here's my implementation, which matched your list early on but
> > effortlessly reached larger values. One large value it printed was
> > 6412351813189632 (a Python long) which indeed has only the distinct
> > prime factors 2 and 3. (2**43 * 3**6)
> >
> > Jeff
> >
> > from itertools import tee
> > import sys
> >
> > def imerge(xs, ys):
> > x = xs.next()
> > y = ys.next()
> > while True:
> > if x == y:
> > yield x
> > x = xs.next()
> > y = ys.next()
> > elif x < y:
> > yield x
> > x = xs.next()
> > else:
> > yield y
> > y = ys.next()
> >
> > def hamming():
> > def _hamming(j, k):
> > yield 1
> > hamming = generators[j]
> > for i in hamming:
> > yield i * k
> > generators = []
> > generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)),
> > _hamming(2, 5)) generators[:] = tee(generator, 4)
> > return generators[3]
> >
> > for i in hamming():
> > print i,
> > sys.stdout.flush()
More information about the Python-list
mailing list