Real-world use cases for map's None fill-in feature?

Mon Jan 9 20:43:56 EST 2006

"Raymond Hettinger" <python at rcn.com> wrote:
> Duncan Booth wrote:
> > One example of padding out iterators (although I didn't use map's fill-in
> > to implement it) is turning a single column of items into a multi-column
> > table with the items laid out across the rows first. The last row may have
> > to be padded with some empty cells.
>
> ANALYSIS
> --------
>
> This case relies on the side-effects of zip's implementation details --
> the trick of windowing or data grouping with code like:  zip(it(),
> it(), it()).  The remaining challenge is handling missing values when
> the reshape operation produces a rectangular matrix with more elements
> than provided by the iterable input.
>
> The proposed function directly meets the challenge:
>
>     it = iter(iterable)
>     result = izip_longest(*[it]*group_size, pad='')
>
> Alternately, the need can be met with existing tools by pre-padding the
> iterator with enough extra values to fill any holes:
>
>     it = chain(iterable, repeat('', group_size-1))
>     result = izip_longest(*[it]*group_size)

I assumed you meant izip() here (and saw your followup)
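
For reference, the pre-padding approach with plain izip() can be sketched as below. This assumes Python 3, where the built-in zip() plays the role of itertools.izip(); the helper name grouped_prepadded is hypothetical, not something from the thread.

```python
from itertools import chain, repeat

def grouped_prepadded(iterable, group_size, pad=''):
    # Hypothetical helper: append group_size - 1 pad values so that the
    # last partial row can always be completed.
    it = chain(iterable, repeat(pad, group_size - 1))
    # zip() stops at the first row it cannot fill, which discards any
    # leftover pad values.
    return zip(*[it] * group_size)

rows = list(grouped_prepadded('abcdefg', 3))
# rows == [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', '', '')]
```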

> Both approaches require a certain measure of inventiveness, rely on
> advanced tricks, and forgo readability to gain the raw speed and
> conciseness afforded by a clever use of itertools.  They are also a
> challenge to review, test, modify, read, or explain to others.

The inventiveness is in the "(*[it]*group_size, " part.  The
rest is straightforward (assuming, of course, that itertools
has good documentation and that it was read first).
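
The "*[it]*group_size" part can be unpacked step by step. A minimal sketch, assuming Python 3's itertools.zip_longest (with its fillvalue keyword) as a stand-in for the proposed izip_longest(..., pad=''):

```python
from itertools import zip_longest

it = iter('abcdefg')
group_size = 3

# [it] * group_size is a list holding the SAME iterator object three times.
args = [it] * group_size
same_object = all(a is it for a in args)

# Unpacking with * passes that one iterator as three arguments, so each
# output row is built from three consecutive next() calls on it.
rows = list(zip_longest(*args, fillvalue=''))
# rows == [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', '', '')]
```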

> In contrast, a simple generator is trivially easy to create and read,
> albeit less concise and not as speedy:
>
>     it = iter(iterable)
>     while 1:
>         row = tuple(islice(it, group_size))
>         if len(row) == group_size:
>             yield row
>         else:
>             yield row + ('',) * (group_size - len(row))
>             break

Yes, with 4 times the amount of code.  (Yes, I am
one of those who believe production and maintenance
cost is, under many circumstances, roughly correlated
with LOC.)
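
For comparison, the quoted loop has to be wrapped in a function before it can run. A sketch under that assumption follows (Python 3; the name padded_rows is hypothetical). Traced by hand, the loop as quoted appears to emit one extra row consisting only of padding when the input length is an exact multiple of group_size; the "if not row" guard below is an addition, not part of the quoted code.

```python
from itertools import islice

def padded_rows(iterable, group_size, pad=''):
    it = iter(iterable)
    while True:
        row = tuple(islice(it, group_size))
        if not row:
            # Added guard (not in the quoted code): an empty slice means
            # the input ended exactly on a row boundary.
            break
        if len(row) == group_size:
            yield row
        else:
            # Short final row: pad it out to group_size and stop.
            yield row + (pad,) * (group_size - len(row))
            break

rows = list(padded_rows('abcdefg', 3))
# rows == [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', '', '')]
```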

And frankly, I don't find the above any more
comprehensible than:
>     result = izip_longest(*[it]*group_size, pad='')
once a little thought is given to the *[it]*group_size
part.  I see much more opaque code every time
I look at source code in the standard library.

> The generator version is plain, simple, boring, and uninspirational.
> But it took only seconds to write and did not require a knowledge of
> advanced itertool combinations.

"advanced itertool combinations"??  Even I, newbie
that I am, found the concepts of repeat() and chain()
pretty straightforward.  Of course, having to
understand/use 3 itertools tools is more difficult
than understanding one (izip_longest).  Better
documentation could mitigate that a lot.
But the solution using "advanced itertool combinations"
was yours, and is avoided altogether with izip_longest().

Also this same argument (uses of x can be easily
coded without x by using a generator) is equally
applicable to itertools.izip() itself, yes?
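
To make that point concrete, izip() itself can be written as a short generator. A rough sketch, assuming Python 3 (where the built-in zip() behaves like izip()); my_izip is a hypothetical name, not a library function:

```python
def my_izip(*iterables):
    # Hypothetical generator equivalent of izip(): stop as soon as any
    # input runs out.
    iterators = [iter(i) for i in iterables]
    if not iterators:
        return
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                return
        yield tuple(row)

pairs = list(my_izip('abc', [1, 2]))
# pairs == [('a', 1), ('b', 2)]
```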

> It is more easily explained than the versions with zip tricks.

Calling this a "trick" is unfair.  The (current, pre-2.5)
documentation still mentions no requirement that
izip() arguments be independent (despite the fact
that this issue was discussed here a couple of months
ago, as I remember).  If I remember correctly, it was not
clear whether that should be a requirement or not, since it
would prevent any use of the same iterable more than
once in izip's arg list, it has not been documented
for 3(?) Python versions, and clearly people are
using the current behavior.
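
The behavior in question can be seen by comparing independent iterators with a shared one. A small sketch, assuming Python 3's built-in zip() in place of izip():

```python
data = [1, 2, 3, 4, 5, 6]

# Independent iterators: each argument walks the whole list on its own.
independent = list(zip(iter(data), iter(data)))
# independent == [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)]

# One shared iterator passed twice: each row consumes two consecutive items.
it = iter(data)
shared = list(zip(it, it))
# shared == [(1, 2), (3, 4), (5, 6)]
```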