flattening a dict

Boris Borcic bborcic at gmail.com
Mon Feb 18 08:50:29 EST 2008


Arnaud Delobelle wrote:
> On Feb 17, 4:03 pm, Boris Borcic <bbor... at gmail.com> wrote:
>> George Sakkis wrote:
>>> On Feb 17, 7:51 am, Arnaud Delobelle <arno... at googlemail.com> wrote:
>>>> BTW, I keep using the idiom itertools.chain(*iterable).  I guess that
>>>> during function calls *iterable gets expanded to a tuple.  Wouldn't it
>>>> be nice to have an equivalent one-argument function that takes an
>>>> iterable of iterables and return the 'flattened' iterable?
>>> Indeed; I don't have any exact numbers but I roughly use this idiom as
>>> often or more as the case where chain() takes a known fixed number of
>>> arguments. The equivalent function you describe is trivial:
>>> def chain2(iter_of_iters):
>>>   for iterable in iter_of_iters:
>>>      for i in iterable:
>>>         yield i
>> or fwiw
>>
>> chainstar = lambda iters : (x for it in iters for x in it)
>>
>> - a form that better suggests how to inline it in the calling expression, if
>> applicable.
> 
> Indeed:
> 
> def flattendict(d):
>     def gen(d, pfx=()):
>         return (x for k, v in d.iteritems()
>                 for x in (gen(v, pfx+(k,)) if isinstance(v, dict)
>                           else ((pfx+(k,), v),)))
>     return dict(gen(d))
> 
> I don't know, I find the chain(*...) version more readable,

In this case I could concede that it is, although I find both forms overly
convoluted for easy reading.

> although
> this one is probably better.

It is more elementary in the mathematician's sense, and therefore preferable all
other things being equal, imo. I've tried to split 'gen' but I can't say the
result is so much better.

def flattendict(d) :
       gen = lambda L : (x for M in exp(L) for x in rec(M))
       exp = lambda L : (L+list(kv) for kv in L.pop().iteritems())
       rec = lambda M : gen(M) if isinstance(M[-1],dict) else [M]
       return dict((tuple(L[:-1]),L[-1]) for L in gen([d]))

For fun I've also written a (vaguely prologish) non-recursive generator-based 
version that exploits undocumented (but logical) behavior of list.extend

class iterstack(list) :
     __iter__ = lambda self : self
     def next(self) :
         while self :
             try :
                 return self[-1].next()
             except StopIteration :
                 self.pop()
         raise StopIteration

def flattendict(d,stk=[]) :
     res={}
     its=iterstack()
     its.extend(stk[-1].iteritems()
                for stk[len(its)-1:] in chain([[d]],its)
                if isinstance(stk[-1],dict)
                or res.update([(stk.pop(),tuple(stk))[::-1]]))
     return res

(testing was minimal)

Challenge for who really has time to waste : replace the iterstack list subclass 
definition with a simple list + a generator expression inside flattendict.

Cheers, BB




More information about the Python-list mailing list