itertools comments [Was: Re: RELEASED: Python 2.3a2]

Alexander Schmolck a.schmolck at gmx.net
Fri Feb 21 10:07:14 EST 2003


"Raymond Hettinger" <vze4rx4y at verizon.net> writes:

> > Wouldn't `xfilter`, `xmap` etc. be a better naming convention than
> > `ifilter`, `imap` etc?
> 
> There were a number of possible naming styles and, in the end,
> it is the module writer's prerogative.

Point taken, but your itertools module is somewhat special in that it
effectively creates a naming convention for the *whole* language. It is
sometimes necessary to explicitly mark something as a generator or equivalent
(e.g. if there is also an equivalent sequence version) so there should be a
common way to do so and the precedent is set by your itertools module as the
first sanctified (general purpose) generator module.

So if you adhere to your naming scheme then I (and I guess others) will also
have to adopt it (and change code). This in itself is unavoidable and I
wouldn't complain if I didn't have the impression that a) there already is a
convention (if maybe not a fully established one) for 'x' to denote 'lazy' and
'i' to denote 'inplace', b) there should be as few such strange prefixes as
possible, and c) just using 'i' for both leads to ambiguity.

For example I've got a (reasonably big) matrix library. Obviously, it is
necessary to provide facilities to do operations inplace for efficiency
reasons. So there is an `add` as well as an `iadd`, analogous to `__add__` and
`__iadd__` and similarly for other functions such as `sqrt` (which BTW
demonstrates how overused 'i' already is, as `isqrt` might as well mean
'complex square root'). I can't really see what else I could call the inplace
versions.

As an example of how just prefixing 'i' for both 'inplace' and 'iterator' is
confusing: I've currently got a function called `iconcat` to destructively add
further sequences to the first argument, but it might as well denote what you
consider including as `chain`.
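
To make the ambiguity concrete, here is a minimal sketch of the two readings of
the name. The body of `iconcat` is an assumption (the real function is not
shown in this thread); `chain` is written in the form itertools provides it.

```python
from itertools import chain

def iconcat(first, *others):
    """'i' read as 'inplace': destructively extend `first` and return it.

    Hypothetical body, written only to illustrate the naming clash.
    """
    for seq in others:
        first.extend(seq)
    return first

a = [1, 2]
result = iconcat(a, [3], [4, 5])      # `a` itself is modified

b = [1, 2]
lazy = chain(b, [3], [4, 5])          # 'i' read as 'iterator': `b` is untouched
lazy_items = list(lazy)
```

Both calls produce the same items, but only the first one changes its
argument, which is exactly what a shared 'i' prefix would hide.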

> X-men are mutants.  Ex-girlfriends are not positive either.

If you really don't like 'x', how about 'g' -- at least that isn't as
overloaded as 'i' (iterator, inplace, imaginary, interactive ...)?

> 
> This one would have a much better chance if some common
> use cases could be shown and if it can be demonstrated that
> it is a useful building block in combination with the other
> itertools.

Sure, I'm all for including conservatively and I really like the selection
you've made so far and, apart from the i prefix, the descriptive names you've
chosen (the only quibble is `ifilterfalse` -- it is not difficult to express
with `ifilter` and I find the name a bit opaque; how about
`(x/i)reject`?). The thing I'd most like to see included would be an `i/xcat`
or `chain` function, so add my vote for that. Anyway, before I give use cases
for xwindow below, how about collecting potential candidates (with Python
implementations) on a wiki page?
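
The remark that `ifilterfalse` is easy to express with `ifilter` can be
sketched as below. This assumes current Python, where `ifilter` has become the
built-in `filter` and `ifilterfalse` is spelled `itertools.filterfalse`; the
name `reject` is only the suggestion made above, not an existing function.

```python
from itertools import filterfalse

def reject(predicate, iterable):
    """Hypothetical `reject`: lazily yield items for which predicate is false."""
    return filter(lambda item: not predicate(item), iterable)

def is_even(n):
    return n % 2 == 0

numbers = range(10)
odd_via_reject = list(reject(is_even, numbers))
odd_via_filterfalse = list(filterfalse(is_even, numbers))
```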

> 
> 
> > def xwindow(iter, n=2, s=1):
> >     r"""Move an `n`-item (default 2) window `s` steps (default 1) at a time
> >     over `iter`.
> 
> I've needed this one only one time in twenty-five years of programming
> (for the markov algorithm).  Again, solid use cases are needed.

I think iterating over direct neighbours is not uncommon. One case already
mentioned would be to create edges out of vertices, but there are many
others. For example, I've used the xwindow construct for backward-forward
feature selection (which is analogous to the vertex case) and to create a
CamelCaseString out of a sequence of strings (which is, I guess, similar to the
use of context information you made).

I can go into more detail, but maybe this gives the idea:

# last endpoint is next starting point
for (start, end) in xwindow(vertices): 
    f(start,end)
    ...

# context information
for (this, next) in xwindow(l): 
    if relationship(this, next): ...
    else: ...

Admittedly, you could already express this common case:

  xwindow(l)

using only 2 of your itertools:

  izip(l, islice(l, 1, None))

but it is more verbose, a bit less efficient and I guess not as intuitive as
moving a window over a sequence.
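
For reference, a minimal sketch of what such an `xwindow` could look like,
written for current Python. Only the signature and the docstring are taken
from the quoted text; the body, the use of `collections.deque`, and the choice
to drop an incomplete final window are assumptions.

```python
from collections import deque
from itertools import islice

def xwindow(iterable, n=2, s=1):
    """Move an n-item window s steps at a time over iterable (sketch)."""
    if n < 1 or s < 1:
        raise ValueError("n and s must be positive")
    it = iter(iterable)
    # The deque keeps only the n most recent items.
    window = deque(islice(it, n), maxlen=n)
    while len(window) == n:
        yield tuple(window)
        step = list(islice(it, s))
        if len(step) < s:
            return  # assumption: stop rather than yield a partial window
        window.extend(step)

vertices = ["a", "b", "c", "d"]
edges = list(xwindow(vertices))
# The two-itertools spelling of the default case (izip is now plain zip);
# it needs a sequence that can be traversed twice, not a one-shot iterator.
pairs = list(zip(vertices, islice(vertices, 1, None)))
```

Unlike the `zip`/`islice` form, the generator also works on one-shot
iterators, since it reads each item only once.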
 
 

> 
> > A last thing, I think a `cycle` generator would be a really useful idiom
> > to have,
> 
> It has proved useful in the SML world.
> I'm not sure it makes sense to implement it in C though.

Well, I don't know how much faster it would be, so I can't say. Even if there
isn't a compelling reason to implement it in C for speed, the fact that it is
quite useful and logically belongs with the rest of itertools would be reason
enough, IMHO.

Even simple cases such as

   for line, color in zip(lines, cycle('red', 'green', 'blue')):
       draw(line, color)

seem much nicer to me than:

   colors = 'red', 'green', 'blue'
   for i, line in enumerate(lines):
       draw(line, colors[i % len(colors)])
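
A pure-Python sketch of such a `cycle`, assuming the varargs calling style
used in the example above (a version taking a single iterable would work just
as well). The sample `lines` list is made up for illustration.

```python
def cycle(*items):
    """Yield the given items over and over (hypothetical varargs form)."""
    if not items:
        return  # nothing to repeat; avoid spinning forever
    while True:
        for item in items:
            yield item

lines = ["l1", "l2", "l3", "l4"]  # stand-in data
# zip stops at the shorter input, so the endless generator is safe here.
colored = list(zip(lines, cycle("red", "green", "blue")))
```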

alex




