multirember&co

James Stroud jstroud at mbi.ucla.edu
Tue Apr 17 00:11:57 EDT 2007


bearophileHUGS at lycos.com wrote:
> Once in a while I too have something to ask. This is a little problem
> that comes from a Scheme Book (I have left this thread because this
> post contains too much Python code for a Scheme newsgroup):
> http://groups.google.com/group/comp.lang.scheme/browse_thread/thread/a059f78eb4457d08/
> 
> The function multiremberandco is hard (for me still) if done in that
> little Scheme subset, but it's very easy with Python. It collects two
> lists from a given one, at the end it applies the generic given fun
> function to the two collected lists and returns its result:
> 
> def multiremberandco1(el, seq, fun):
>     l1, l2 = [], []
>     for x in seq:
>         if x == el:
>             l2.append(el)
>         else:
>             l1.append(x)
>     return fun(l1, l2)
> 
> data = [1, 'a', 3, 'a', 4, 5, 6, 'a']
> print multiremberandco1('a', data, lambda l1,l2: (len(l1), len(l2)))
> 
> More compact:
> 
> def multiremberandco2(el, seq, fun):
>     l1, l2 = [], []
>     for x in seq:
>         [l1, l2][x == el].append(x)
>     return fun(l1, l2)
> 
> 
> A bit cleaner (but I don't like it much):
> 
> def multiremberandco3(el, seq, fun):
>     l1, l2 = [], []
>     for x in seq:
>         (l2 if x == el else l1).append(x)
>     return fun(l1, l2)
> 
> 
> For fun I have tried to make it lazy; it may be useful if seq is a
> very long iterable. So I've used tee:
> 
> from itertools import ifilter, tee
> 
> def multiremberandco4(el, iseq, fun):
>     iseq1, iseq2 = tee(iseq)
>     iterable1 = ifilter(lambda x: x != el, iseq1)
>     iterable2 = ifilter(lambda x: x == el, iseq2)
>     return fun(iterable1, iterable2)
> 
> def leniter(seq):
>     count = 0
>     for el in seq:
>         count += 1
>     return count
> 
> idata = iter(data)
> print multiremberandco4('a', idata, lambda l1,l2: (leniter(l1),
> leniter(l2)))
> 
> 
> But from the docs: "In general, if one iterator is going to use most
> or all of the data before the other iterator, it is faster to use
> list() instead of tee()."
> 
> So I have tried to create two iterables for the fun function while
> scanning seq only once, but I haven't succeeded so far. (I tried to
> use two coroutines with the enhanced generators, sending each item to
> one or to the other according to the value of x == el; this time I
> don't show my failed versions.) Do you have suggestions?
> 
> (Note: in some situations it may be useful to create a "splitting"
> function that, given an iterable and a filtering predicate, returns two
> lazy generators: one of the items that satisfy the predicate and one of
> the items that don't, so this exercise isn't totally useless).
> 
> Bye,
> bearophile
> 


I think it might be provable that two iterators are required for this
exercise. In this example, leniter() will be called on one list and
then on the other. How can it get an accurate count of one list
without iterating through the other?

I.e.:

     data = [1, 'a', 3, 'a', 4, 5, 6, 'a']
                                       ^
                   To count this 'a', it must have gone through the
                   numbers, so the leniter() can not act independently
                   on the lists using a single iterator.
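
To make that concrete, here is a minimal sketch (not from the original
post) of a single-pass "splitting" function. The name split_by and the
deque-based buffering are assumptions chosen for illustration, and the
code is written to run on Python 3 as well as Python 2. It scans the
source only once, but whichever side is drained first forces the items
of the other side to be buffered, which is roughly the cost tee() pays
too:

```python
from collections import deque


def split_by(pred, iterable):
    """Split one iterable into two lazy generators in a single pass.

    Returns (matching, non_matching). Items pulled while serving one
    generator but belonging to the other are parked in a deque.
    """
    source = iter(iterable)
    # buffers[True] holds pending matches, buffers[False] the rest.
    buffers = {True: deque(), False: deque()}

    def side(wanted):
        mine = buffers[wanted]
        while True:
            if mine:
                # Serve items already parked for this side first.
                yield mine.popleft()
                continue
            # Nothing pending: advance the shared source.
            for item in source:
                key = bool(pred(item))
                if key == wanted:
                    yield item
                    break
                buffers[key].append(item)
            else:
                # Source exhausted and nothing pending for this side.
                return

    return side(True), side(False)


data = [1, 'a', 3, 'a', 4, 5, 6, 'a']
source = iter(data)
matches, others = split_by(lambda x: x == 'a', source)

# Counting the 'a' items alone already walks the whole source...
print(sum(1 for _ in matches))      # prints 3
print(next(source, 'exhausted'))    # prints exhausted
# ...so the numbers now come out of the buffer, not the source.
print(list(others))                 # prints [1, 3, 4, 5, 6]
```

So the scan happens once, but the two results are not independent:
in the worst case one side's buffer holds nearly the whole sequence.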

James


