split encloser

Arm armchairmillions at hotmail.com
Tue Apr 22 00:04:48 EDT 2003


> > I had some small exposure to "iterators" in C++, and I remember them
> > being used in loops - iterating over the items in a container. The
> > example you have given appears to create a list from an "iterator"
> > with no loop. Something called an "iterator" creating a list with no
> > loop is very perplexing to me. But maybe my problem is I don't
> > recognize the syntax [ mo.group(0) for ... ]. It looks like a method
> > call on an object, followed by an incomplete for loop.
> 
> the syntax:
> 
>     [ <expression> for <target> in <iterable> ]
> 
> is known in Python as a "List Comprehension".  There is no issue of
> "with no loop" -- the "for" keyword inside the list comprehension
> is indicating exactly the fact that the loop is taking place.  (List
> Comprehensions may optionally have other clauses too -- for clauses
> to indicate nested loops, if clauses to use only SOME of the items
> in an iterable -- but I'm using the simplest form in the above).
> 
> If you're familiar with Python 1.5.2 but none of the Python changes
> in the last 4 years or so, then one way to explain list comprehensions
> is to say that
> 
>     return [ <expression> for <target> in <iterable> ]
> 
> is just the same as:
> 
>     templist = []
>     for <target> in <iterable> :
>         templist.append( <expression> )
>     return templist
> 
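
To make the equivalence above concrete, here is a small sketch in
present-day Python. The function names and the squaring expression are
made up for illustration and are not from the quoted post; the last two
functions show the optional "if" clause and a second "for" clause.

```python
def squares_comprehension(items):
    # [ <expression> for <target> in <iterable> ]
    return [x * x for x in items]


def squares_loop(items):
    # The spelled-out loop that the comprehension above stands for.
    templist = []
    for x in items:
        templist.append(x * x)
    return templist


def even_squares(items):
    # Optional "if" clause: use only SOME of the items.
    return [x * x for x in items if x % 2 == 0]


def pairs(xs, ys):
    # A second "for" clause behaves like a nested loop (ys is the inner one).
    return [(x, y) for x in xs for y in ys]
```

Both squares_comprehension and squares_loop should return the same
list for the same input, which is the whole point of the rewrite rule.
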
> The list comprehension notation is a Pythonization of Haskell's, which
> uses punctuation rather than the for and in keywords -- it would say:
>     [ <expression> | <target> <- <iterable> ]
> which reads just fine in Haskell but definitely not in Python; in turn,
> Haskell's list comprehension notation is a simple adaptation of the
> widespread "set comprehension" notation used in maths, e.g.
>     { x*x | x <- S }
> to mean "the set of the squares of the elements of set S" (where
> the <- set-membership indicator is often indicated by some glyph
> that's more reminiscent of the Greek letter epsilon).
> 
> 
> Python iterators are rather different beasts from C++'s, though
> the analogy between a Python iterator and a C++ "input iterator"
> is reasonably good... where in C++ you might have *it++ to
> indicate "fetch the current value of the iterator then advance
> the iterator", in Python you'd have it.next() for just the same
> purpose [the for loop makes that call intrinsically on your
> behalf!]; where in C++ you need to find out whether an iterator
> is done by comparing it with some "end marker", in Python an
> iterator lets you know it's done by raising StopIteration when
> you call its .next method [again, the for loop handles that for
> you intrinsically, catching StopIteration and just taking it as
> the indicator for normal termination of the iteration/loop].
> 
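
As a rough sketch of what the for loop is described as doing on your
behalf, the function below drives an iterator by hand. The name
collect is hypothetical. Note that the it.next() call of the quoted
post is spelled next(it) in Python 3, which is what this sketch
assumes.

```python
def collect(iterable):
    # Roughly what "for item in iterable: result.append(item)" does.
    it = iter(iterable)          # ask the iterable for an iterator
    result = []
    while True:
        try:
            item = next(it)      # fetch the current value and advance
        except StopIteration:    # the iterator's way of saying "done"
            break
        result.append(item)
    return result
```

There is no "end marker" to compare against, as there would be in
C++; the StopIteration exception is the only termination signal.
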
> 
> So, back to our muttons: are.finditer(instring) returns an iterable
> (which is actually already an iterator, but that's not very
> important here) whose items are the match-objects for each of
> the nonoverlapping matches of compiled regular expression object
> are inside string instring.  So, a loop of the form:
>     for mo in are.finditer(instring):
>         ...
> (whether written out like this, or inside a list comprehension,
> makes no difference) makes mo assume, one after the other, the
> values of RE matchobjects for each of those non-overlapping
> matches, left to right inside instring.  So for example a call
> to mo.group(0) gives the substring of instring that corresponds
> to the specific nonoverlapping match we're currently at, inside
> a loop such as the above.
> 
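
A small runnable sketch of the finditer loop just described. The name
"are" is taken from the quoted post; the pattern, the input string and
the helper matched_substrings are invented for the example.

```python
import re

# "are" is a compiled regular expression object, as in the quoted post.
# The pattern (runs of digits) and instring are assumptions for this sketch.
are = re.compile(r"\d+")
instring = "ab12cd345e6"


def matched_substrings(compiled_re, text):
    # mo takes, left to right, the match object of each non-overlapping
    # match; mo.group(0) is the substring of text that was matched.
    return [mo.group(0) for mo in compiled_re.finditer(text)]


matches = matched_substrings(are, instring)
print(matches)  # ['12', '345', '6']
```
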
> 
Alex,
Thanks for the great explanation. It seems like both List
Comprehension and Iterators could use more documentation on
python.org. I think it would have been a lot more straightforward and
easier to comprehend (at least for beginners), if the object of a
"list comprehension" had been limited to a "sequence" or an
"iterable". (I would even prefer a different keyword when using
iterables). It doesn't make sense to be iterating through an iterator.
What is the gain from allowing the iterator to be placed in the
position a container should be - given that iterable
containers/objects must return the iterator anyway? And it seems like
in order to implement the "iterating through an iterator" syntax, an
equally confusing call was added to iterators - "return self".
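
For reference, here is a minimal sketch of the "return self"
convention mentioned above, written for Python 3 (where the next
method is named __next__). The Countdown class and first_then_rest
are hypothetical names for illustration only. The sketch shows one
thing the convention permits: a partly consumed iterator can still be
handed to a for loop or a list comprehension.

```python
class Countdown:
    """Hypothetical iterator counting from start down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator is its own iterator: this is the "return self".
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def first_then_rest(iterator):
    # Take one item by hand, then let a list comprehension loop over
    # whatever the same iterator has left.
    first = next(iterator)
    rest = [x for x in iterator]
    return first, rest
```
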



