Simple exercise

Oscar Benjamin oscar.j.benjamin at gmail.com
Tue Mar 15 07:09:15 EDT 2016


On 14 March 2016 at 23:59, Steven D'Aprano <steve at pearwood.info> wrote:
> On Tue, 15 Mar 2016 02:06 am, Oscar Benjamin wrote:
>
>> On 14 March 2016 at 14:35, Rick Johnson <rantingrickjohnson at gmail.com>
>> wrote:
>>>
>>> I would strongly warn anyone against using the zip function
>>> unless
>> ...
>>> I meant to say: absolutely, one hundred percent *SURE*, that
>>> both sequences are of the same length, or, absolutely one
>>> hundred percent *SURE*, that dropping values is not going to
>>> matter. For that reason, i avoid the zip function like the
>>> plague. I would much rather get an index error, than let an
>>> error pass silently.
>>
>> I also think it's unfortunate that zip silently discards items.
>
> Are you aware of itertools.zip_longest?

I am.

> That makes it easy to build a zip_strict:
>
> import itertools
>
> def zip_strict(*iterables):
>     pad = object()
>     for t in itertools.zip_longest(*iterables, fillvalue=pad):
>         if pad in t:
>             raise ValueError("iterables of different length")
>         yield t

There are many ways to build a zip_strict. As I said, in my own usage of
zip I would almost always want it to raise an error, because I almost
always give zip iterables of the same length. However, the situation
where zip_strict would help is often the kind of situation where
you're not really thinking about the fact that zip truncates. Also, if
you only have one zip call in a script it'd be easier (and clearer) to
write:

    if len(x) != len(y):
        raise ValueError
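
To make the difference concrete, here is a minimal sketch of the silent
truncation next to the explicit length check. The sample data and the
helper name checked_pairs are made up for illustration:

```python
x = [1, 2, 3]
y = ["a", "b"]  # one item short, e.g. because of a bug upstream

# zip stops at the shorter input, so the 3 is dropped without any error.
pairs = list(zip(x, y))


def checked_pairs(x, y):
    """Pair up two sized sequences, refusing to truncate."""
    if len(x) != len(y):
        raise ValueError("x has %d items but y has %d" % (len(x), len(y)))
    return list(zip(x, y))
```

Note that the len() check only works for sized sequences such as lists and
tuples. For generators and other one-shot iterators something along the
lines of the quoted zip_strict is still needed.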

> Unfortunate or not, it seems to be quite common that "zip" (convolution)
> discards items when sequences are of different lengths. I think the usual
> intent is so that you can zip an infinite (or near infinite) sequence of
> counters 1, 2, 3, 4, ... with the sequence you actually want, to get the
> equivalent of Python's enumerate().
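
For reference, the counter idiom described there looks roughly like this
(the list words is made up for illustration):

```python
import itertools

words = ["spam", "eggs", "ham"]

# itertools.count(1) never ends, so this relies on zip stopping
# as soon as words runs out.
numbered = list(zip(itertools.count(1), words))

# enumerate with a start value of 1 produces the same pairs.
same_as_enumerate = numbered == list(enumerate(words, 1))
```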

That's fine, but in the Python code I see, zip is much more often used
with equal-length finite iterables than with infinite ones. One of the
things I like about Python (especially since I'm learning JS right
now) is that you can write your code in the obvious way and then most of
your error checking comes for free: I want loud error messages instead
of corrupted data. I would rather have the potentially bug-prone
zip_shortest be an opt-in itertools feature and zip_strict the default
behaviour for zip. I realise it's not going to change but I think it
would be better.
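
As one more of the many possible ways to get the loud failure, here is a
rough sketch of a strict zip that does not go through zip_longest. It is
only an illustration, not a tuned implementation. It compares against the
sentinel by identity, so it never calls == on the items themselves:

```python
def zip_strict(*iterables):
    """Like zip, but raise ValueError if the inputs differ in length."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return  # zip() with no arguments yields nothing
    sentinel = object()
    while True:
        # Pull one item from every input; exhausted inputs give the sentinel.
        items = [next(it, sentinel) for it in iterators]
        exhausted = [item is sentinel for item in items]
        if all(exhausted):
            return  # every input ended on the same step
        if any(exhausted):
            raise ValueError("iterables of different length")
        yield tuple(items)
```

Because it is a generator, the error only appears once the shorter input
runs out, so any pairs yielded before that point have already been
consumed by the caller.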

--
Oscar


