flatten(), [was Re: map/filter/reduce/lambda opinions ...]

Ron Adam rrr at ronadam.com
Wed Jul 6 23:39:52 EDT 2005


Stian Søiland wrote:

> Or what about a recursive generator?
> 
>     a = [1,2,[[3,4],5,6],7,8,[9],[],]
> 
>     def flatten(item):
>         try:
>             iterable = iter(item)
>         except TypeError:
>             yield item  # inner/final clause
>         else:
>             for elem in iterable:
>                 # yield_return flatten(elem)
>                 for x in flatten(elem):
>                     yield x
> 
>     print list(flatten(a))


Ok,  lets see...  I found a few problems with the testing (and corrected 
it)  so the scores have changed.   My sort in place routines were 
cheating because the lists weren't reset between runs so it had the 
advantage after the first time though of sorting already sorted list. 
And Tom's recursive no copy had a bug which kept a reference to one of 
it's arguments so it's output was doubling the list.  And the hasattr 
function was slowing everyone down, so I inlined it for everyone who 
used it.

Using a 1000 item list and starting with a flat list and increasing the 
depth (unflatten) to shallow, medium, and deep.  (but not so deep to 
cause recursion errors.)

And the winners are...



Python 2.3.5 (#62, Feb  8 2005, 16:23:02)
[MSC v.1200 32 bit (Intel)] on win32
IDLE 1.0.5

 >>>

Size: 1000   Depth: 0
georges_recursive_flatten:           0.00212513042834
rons_nonrecursive_flatten:           0.00394323859609
toms_recursive_zerocopy_flatten:     0.00254557492644
toms_iterative_zerocopy_flatten:     0.0024332701505
devans_smallest_recursive_flatten:   0.011406198274
rons_nonrecursive_inplace_flatten:   0.00149963193644
stians_flatten_generator:            0.00798257879114

Size: 1000   Depth: 25
georges_recursive_flatten:           0.0639824335217
rons_nonrecursive_flatten:           0.0853463219487
toms_recursive_zerocopy_flatten:     0.0471856059917
toms_iterative_zerocopy_flatten:     0.188437915992
devans_smallest_recursive_flatten:   0.0844073757976
rons_nonrecursive_inplace_flatten:   0.0847048996452
stians_flatten_generator:            0.0495694285169

Size: 1000   Depth: 50
georges_recursive_flatten:           0.134300309118
rons_nonrecursive_flatten:           0.183646245542
toms_recursive_zerocopy_flatten:     0.0886252303017
toms_iterative_zerocopy_flatten:     0.371141304272
devans_smallest_recursive_flatten:   0.185467985456
rons_nonrecursive_inplace_flatten:   0.188668392212
stians_flatten_generator:            0.090114246364

Size: 1000   Depth: 100
georges_recursive_flatten:           0.248168133101
rons_nonrecursive_flatten:           0.380992276951
toms_recursive_zerocopy_flatten:     0.177362486014
toms_iterative_zerocopy_flatten:     0.741958265645
devans_smallest_recursive_flatten:   0.306604051632
rons_nonrecursive_inplace_flatten:   0.393641091256
stians_flatten_generator:            0.177185368532
 >>>


Stians flatten generator is nearly tied with Tom's recursive zerocopy. 
My nonrecursive inplace is faster for very shallow lists, but Tom's 
quickly over takes it.  I was able to improve my nonrecursive copy 
flatten a bit, but not enough to matter.  So Tom's recursive zerocopy is 
the overall winner with Stian's flatten generator close behind and just 
barely winning out in the very deep catagory.  But they're all 
respectable times so everyone wins. ;-)



And here's the source code.

Cheers,  :-)
Ron



# -------------------------------

import sys
import time

TIMERS = {"win32": time.clock}
timer = TIMERS.get(sys.platform, time.time)

def timeit(fn,*arg):
     t0 = timer()
     r = fn(*arg)
     t1 = timer()
     return float(t1-t0), r

# --------------------------------

def georges_recursive_flatten(seq):
     return reduce(_accum, seq, [])

def _accum(a, item):
     if hasattr(item, "__iter__"):
         a.extend(georges_recursive_flatten(item))
     else:
         a.append(item)
     return a


def rons_nonrecursive_flatten(seq):
     a = []
     while seq:
         if hasattr(seq[0], "__iter__"):
             seq[0:1] = seq[0]
         else:
             a.append(seq.pop(0))
     return a


def toms_recursive_zerocopy_flatten(seq, a=None):  #a=[] kept a 
reference to a?
     if a==None:
         a = []
     if hasattr(seq,"__iter__"):
         for item in seq:
             toms_recursive_zerocopy_flatten(item, a)
     else:
         a.append(seq)
     return a


def toms_iterative_zerocopy_flatten(seq):
     stack = [None]
     cur = iter(seq)
     a = []
     while (cur != None):
         try:
             item = cur.next()
             if hasattr(item,"__iter__"):
                 stack.append(cur)
                 cur = iter(item)
             else:
                 a.append(item)
         except StopIteration:
             cur = stack.pop()
     return a


def devans_smallest_recursive_flatten(seq):
     if hasattr(seq,"__iter__"):
         return sum([devans_smallest_recursive_flatten(item) for item in 
seq], [])
     else:
         return [seq]


def rons_nonrecursive_inplace_flatten(seq):
     i = 0
     while i != len(seq):
         if hasattr(seq[i],"__iter__"):
             seq[i:(i + 1)] = seq[i] # setslice takes iterators!
         else:
             i = i + 1
     return seq


def flatten_generator(item):
     try:
         iterable = iter(item)
     except TypeError:
         yield item  # inner/final clause
     else:
         for elem in iterable:
             # yield_return flatten(elem)
             for x in flatten_generator(elem):
                 yield x

def stians_flatten_generator(seq):
     return list(flatten_generator(seq))

# ------------------------------------------

flattens = [
     georges_recursive_flatten,
     rons_nonrecursive_flatten,
     toms_recursive_zerocopy_flatten,
     toms_iterative_zerocopy_flatten,
     devans_smallest_recursive_flatten,
     rons_nonrecursive_inplace_flatten,
     stians_flatten_generator
]


import random
import time
random.seed(time.time())
def rand_depth_sequence(seq,depth):
     n = len(seq)*depth
     nn = 0
     while nn<n:
         try:
             step = random.randint(1,3)
             start = random.randint(0,(len(seq)-step))
             end = random.randint(start,start+step)
             seq[start:end]=[seq[start:end]]
             nn += 1
         except:
             pass
     return seq
#print sequence

size = 1000
original = range(size)
for depth in [0,25,50,100]:
     sequence = rand_depth_sequence(original[:],depth)
     print "\nSize:",size,"  Depth:",depth
     for flatten in flattens:
         s = sequence[:]  # make new copy!
         tm, result = timeit(flatten,s)
         if result != original:  # check for valid output!
             print "Result Error from", flatten
             print original
             print result
             print sequence
             print s
             break
         f_name = flatten.func_name
         print "%s: %s%s" % (f_name, ' '*(35-len(f_name)),tm)
         # clear values for next test
         del result
         del s


# ----










More information about the Python-list mailing list