itertools.flatten()? and copying generators/iterators.

Francis Avila francisgavila at yahoo.com
Tue Oct 28 02:35:51 EST 2003


Below is an implementation of a 'flattening' recursive generator (it takes a
nested iterable and removes all of its nesting). Is this possibly general and
useful enough to be included in itertools? (I know *I* wanted something like
it...)

Very basic examples:

>>> rl = [1, [2, 3, [4, 5], '678', 9]]
>>> list(flatten(rl))
[1, 2, 3, 4, 5, '6', '7', '8', 9]
>>> notstring = lambda obj: not isinstance(obj, type(''))
>>> list(flatten(rl, notstring))
[1, 2, 3, 4, 5, '678', 9]
>>> isstring = lambda obj: not notstring(obj)
>>> list(flatten(rl, isstring))
[1, [2, 3, [4, 5], '678', 9]]
>>> #The string is within a list, so we never descend that far.
>>> car_is_2 = lambda obj: isinstance(obj, type([])) and obj[0] == 2
>>> list(flatten(rl, car_is_2))
[1, 2, 3, [4, 5], '678', 9]
>>> rls = ['Here', 'are', ['some', ['nested'], 'strings']]
>>> list(flatten(rls))
['H', 'e', 'r', 'e', 'a', 'r', 'e', 's', 'o', 'm', 'e', 'n', 'e', 's', 't',
'e', 'd', 's', 't', 'r', 'i', 'n', 'g', 's']
>>> list(flatten(rls, notstring))
['Here', 'are', 'some', 'nested', 'strings']
>>> rli = iter([1, 2, iter(['abc', iter('ABC')]), 4])
>>> list(flatten(rli))
[1, 2, 'a', 'b', 'c', 'A', 'B', 'C', 4]
>>> list(flatten(rli, notstring))
[]
>>> #rli is an iterator, remember!
>>> rli = iter([1, 2, iter(['abc', iter('ABC')]), 4])
>>> list(flatten(rli, notstring))
[1, 2, 'abc', 'A', 'B', 'C', 4]
>>> # The following I'm not sure what to do about...
>>> empty = [1, [], 3]
>>> emptyiter = [1, iter([]), 3]
>>> list(flatten(empty))
[1, [], 3]
>>> list(flatten(emptyiter))
[1, 3]
>>>

I tried hard to get it to work with iterator and generator objects, too, and
it mostly does. However, I'm having some problems determining whether
descending into a given object will recurse infinitely when that object is
already a generator/iterator: testing it consumes a value, and I'm unable to
copy an iterator first (why?).  See isemptyorself() below for details.  Aside
from that, I'm just generally unsure what the proper behavior should be when
an iterator/generator is encountered.
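
One possible way around the discarded-value problem, sketched under the
assumption of a modern Python 3 (the built-in next() and this helper are not
part of the code in this post; the name peek is hypothetical): read the first
value, then chain it back in front of the rest, so nothing is lost even though
the iterator itself cannot be copied.

```python
from itertools import chain

def peek(iterable):
    """Return (is_empty, iterator); the iterator still yields every item.

    Hypothetical helper for illustration, not part of the posted code.
    """
    it = iter(iterable)
    try:
        first = next(it)              # this consumes one value...
    except StopIteration:
        return True, iter(())         # nothing was consumed, nothing to restore
    return False, chain([first], it)  # ...which is pushed back in front

is_empty, items = peek(iter([]))
print(is_empty, list(items))

is_empty, items = peek(x for x in 'ABC')
print(is_empty, list(items))
```

The catch is that the caller has to keep using the returned iterator in place
of the original one, since the original has already been advanced.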

Also, why is the iterator type not included in the types module or described
in the language reference (Python 2.2)?

--- Code ---


def isiterable(obj):
    try: iter(obj)
    except TypeError: return False
    else: return True

def isemptyorself(iterable):
    """True if iterable yields nothing or itself."""
    it = iter(iterable)

    # KLUDGE! This test must advance the iterator in order to inspect
    # it. If iterable is itself an iterator or generator, then
    # iter(iterable) is iterable, so calling next() would discard
    # one of its values. Unfortunately, we can't copy generators and
    # iterators using the copy module, so we must just assume that
    # such an object yields neither nothing nor itself....

    if it is iterable:
        return False

    try: res = it.next()
    except StopIteration: #Yields nothing
        return True
    else:
        if res == iterable: #Yields itself
            return True
    return False

def flatten(iterable, isnested=None):
    """Iterate items in iterable, descending into nested items.

    isnested is a function that returns true if the element of
    iterable should be descended into. The default is to descend
    into anything that iter() accepts as iterable (unless doing so
    would cause an infinite recursion).

    """
    if isnested is None:
        isnested = lambda obj: True #Always descend

    for item in iterable:
        if isiterable(item) and not isemptyorself(item) \
           and isnested(item):
            for subitem in flatten(item, isnested):
                yield subitem
        else:
            yield item
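
For readers on a current interpreter, here is a rough Python 3 rendering of
the three functions above. It is a sketch, not a replacement: the only
intended changes are next(it) in place of it.next() and "yield from" in place
of the inner for loop, both of which assume Python 3.3 or later. The behavior,
including the kludge for iterators, is meant to be the same.

```python
def isiterable(obj):
    """True if iter() accepts obj."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True

def isemptyorself(iterable):
    """True if iterable yields nothing or itself."""
    it = iter(iterable)
    if it is iterable:
        # Already an iterator/generator: testing it would consume a
        # value, so assume it is neither empty nor self-yielding.
        return False
    try:
        res = next(it)
    except StopIteration:   # Yields nothing
        return True
    return res == iterable  # Yields itself (e.g. a one-character string)

def flatten(iterable, isnested=None):
    """Iterate items in iterable, descending into nested items."""
    if isnested is None:
        isnested = lambda obj: True  # Always descend
    for item in iterable:
        if isiterable(item) and not isemptyorself(item) and isnested(item):
            yield from flatten(item, isnested)
        else:
            yield item

rl = [1, [2, 3, [4, 5], '678', 9]]
notstring = lambda obj: not isinstance(obj, str)
print(list(flatten(rl)))
print(list(flatten(rl, notstring)))
```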


--
Francis Avila




