Real-world use cases for map's None fill-in feature?

Raymond Hettinger python at rcn.com
Sun Jan 8 23:28:01 EST 2006


Proposal
--------
I am gathering data to evaluate a request for an alternate version of
itertools.izip() with a None fill-in feature like that for the built-in
map() function:

>>> map(None, 'abc', '12345')   # demonstrate map's None fill-in feature
[('a', '1'), ('b', '2'), ('c', '3'), (None, '4'), (None, '5')]

The motivation is to provide a means for looping over all data elements
when the input lengths are unequal.  The question of the day is whether
that is both a common need and a good approach to real-world problems.
The answer can likely be found in results from other programming
languages and from surveying real-world Python code.
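
To make the requested behavior concrete, here is a minimal pure-Python sketch of such a function.  The name izip_longest and the fillin_value keyword are assumptions used for illustration only, not an existing itertools API, and the body is one possible approach rather than a reference implementation:

```python
def izip_longest(*iterables, **kwds):
    """Hypothetical sketch: like izip(), but run until the longest input ends."""
    # fillin_value is an assumed keyword name; it defaults to None like map().
    fillin_value = kwds.pop('fillin_value', None)
    if kwds:
        raise TypeError('unexpected keyword arguments: %r' % sorted(kwds))

    iterators = [iter(it) for it in iterables]
    exhausted = object()          # private sentinel marking a finished input

    while iterators:              # no inputs at all: yield nothing
        row = [next(it, exhausted) for it in iterators]
        if all(item is exhausted for item in row):
            return                # every input has run out
        # Substitute the fill-in value wherever an input has run out.
        yield tuple(fillin_value if item is exhausted else item
                    for item in row)
```

With the default fill-in value, list(izip_longest('abc', '12345')) gives the same pairs as the map(None, ...) call shown earlier.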

Other languages
---------------
I scanned the docs for Haskell, SML, and Perl6's yen operator and found
that the norm for map() and zip() is to truncate to the shortest input
or raise an exception for unequal input lengths.  Ruby takes the
opposite approach and fills in nil values -- the reasoning behind the
design choice is somewhat inscrutable:
  http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-dev/18651

Real-world code
---------------
I scanned the standard library, my own code, and a few third-party
tools.  I found no instances where map's fill-in feature was used.

History of zip()
----------------
PEP 201 (lock-step iteration) documents that a fill-in feature was
contemplated and rejected for the zip() built-in introduced in Py2.0.
In the years before and after, SourceForge logs show no requests for a
fill-in feature.

Request for more information
----------------------------
My request for readers of comp.lang.python is to search your own code
to see if map's None fill-in feature was ever used in real-world code
(not toy examples).  I'm curious about the context, how it was used,
and what alternatives were rejected (i.e., did the fill-in feature
improve the code?).  Likewise, I'm curious as to whether anyone has seen
a zip-style fill-in feature employed to good effect in some other
programming language.

Parallel to SQL?
----------------
If an iterator element's ordinal position were considered as a record
key, then the proposal equates to a database-style full outer join
operation (one which includes unmatched keys in the result) where record
order is significant.  Does an outer join have anything to do with
lock-step iteration?  Is this a fundamental looping construct or just a
theoretical wish-list item?  Does Python need itertools.izip_longest()
or would it just become a distracting piece of cruft?
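
The analogy can be sketched directly: treat each element's position as its key, take the union of the keys, and fill in a placeholder where one side has no match.  The helper name positional_outer_join and its missing parameter are hypothetical, chosen only for this illustration, and the sketch assumes finite inputs:

```python
def positional_outer_join(left, right, missing=None):
    """Hypothetical sketch: full outer join keyed on ordinal position."""
    # Turn each input into a {position: element} table.
    left_rows = dict(enumerate(left))
    right_rows = dict(enumerate(right))

    # A full outer join keeps every key found on either side;
    # sorting the keys preserves the original record order.
    keys = sorted(set(left_rows) | set(right_rows))

    # Unmatched keys get the placeholder on the side that lacks them.
    return [(left_rows.get(k, missing), right_rows.get(k, missing))
            for k in keys]
```

For 'abc' and '12345' this yields the same pairs as map's None fill-in, which is the sense in which the two operations coincide.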



Raymond Hettinger


FWIW, the OP's use case involved printing files in multiple
columns:

    for f, g in itertools.izip_longest(file1, file2, fillin_value=''):
        print '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())

The alternative was straightforward but less terse:

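    while True:
        # readline() returns '' at end of file, so a shorter file
        # simply contributes empty strings until the longer one ends.
        f = file1.readline()
        g = file2.readline()
        if not f and not g:
            break
        print '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())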


