[Edu-sig] about iterators, range, and summing consecutive integers

kirby urner kirby.urner at gmail.com
Sat Mar 27 06:38:18 CET 2010


Here are some looks at summing consecutive integers, ways
to sum 1 + 2 + ... + n in Python:

We start with the closed form, which doesn't need to iterate
through the intermediate terms.  It works directly with n,
the last term in the sequence:

>>> def trinumbs(n):
	return n * (n + 1) // 2

These next two make use of all the intermediate values:

>>> def trinumbs2(n):
	return sum(range(1,n+1))

>>> def trinumbs3(n):
	i = 1
	thesum = 0
	while i <= n:
		thesum = thesum + i
		i += 1
	return thesum
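
As a quick check (not part of the original session), the three
functions can be compared over a span of inputs.  The definitions
are repeated so the sketch runs on its own; the names results and
all_agree are just for illustration.

```python
def trinumbs(n):
    # closed form: no iteration needed
    return n * (n + 1) // 2

def trinumbs2(n):
    # sum() consumes the range object directly
    return sum(range(1, n + 1))

def trinumbs3(n):
    # explicit while loop with manual incrementing
    i = 1
    thesum = 0
    while i <= n:
        thesum = thesum + i
        i += 1
    return thesum

# compare the three versions for n = 0 through 49
results = [(trinumbs(n), trinumbs2(n), trinumbs3(n)) for n in range(50)]
all_agree = all(a == b == c for a, b, c in results)
print(all_agree)
```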

<Lore>

Although all of these give the same answers, of the last two,
the very last one is the more "classic" in terms of what an old-time
imperative programmer would expect.  Having a range() function
built right into the language is more APL-like.  Most "classic"
imperative languages such as FORTRAN and PL/I required the
programmer to do all the incrementing "by hand" (explicitly).

You will find more about the range() function below.

Notice the two ways of incrementing and rebinding.  When
writing thesum = thesum + i, the name 'thesum' is being rebound
to the result of thesum plus whatever value i names.

The equal sign (=) is not used to assert equality but as an
"assignment operator".  Objects have names and the assignment
operator is what binds a name to an object.

i += 1 is a shorter way of saying i = i + 1.  It is borrowed from
C / C++ and is a much used convention.
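
A small sketch of rebinding, assuming plain integers (the name
alias is mine, not from the session above).  A second name keeps
pointing at the old object after the first name is rebound:

```python
thesum = 10
alias = thesum            # two names bound to the same int object
thesum = thesum + 5       # thesum is rebound to a new object, 15
still_ten = alias         # alias was never rebound, so it still names 10

i = 1
i += 1                    # for integers, same effect as i = i + 1
```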

</Lore>

Python has very few control statements.  Another way to write the
above would be:

>>> def trinumbs4(n):
	thesum = 0
	for i in range(1, n+1):
		thesum = thesum + i
	return thesum

The keywords 'while' and 'for' are the two main looping constructs.

trinumbs2 is making use of another built-in that directly consumes
the output of range( ), and that is the sum( ) function.

Since range( ) starts at 0 by default and stops just short of its
ending parameter, and since our tacit understanding is we're working
with "natural" or "counting" numbers starting with 1, both uses of
range (in trinumbs2 and trinumbs4) start at 1 and bump the ending
parameter up to n+1.

If you feed only one argument to range, this is mapped to its
ending parameter with the start parameter defaulting to 0.

range ( 10 ) is equivalent to range (start = 0, stop = 10), except
range doesn't take keyword arguments, so you can't actually
get away with the 2nd expression.
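
A minimal sketch of what happens if you try the keyword form
anyway, at least in the CPython versions I'm aware of (the name
outcome is only for illustration):

```python
try:
    r = range(start=0, stop=10)    # keyword arguments are rejected
except TypeError:
    outcome = "TypeError"
else:
    outcome = "accepted"

# the positional spellings are interchangeable
same = list(range(10)) == list(range(0, 10))
print(outcome, same)
```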

Examples:

Only one parameter, so an ending parameter:

>>> list ( range( 10 ))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Two parameters, so a starting and ending parameter:

>>> list ( range(1, 11 ))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Three parameters, so a starting, ending and "increment by"
or "step" parameter:

>>> range(2,31,2)
range(2, 31, 2)
>>> list ( range(2,31,2) )
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30]

Yes, you may increment negatively, but then remember not to
make your starting value less than your stopping value:

>>> list ( range(1, 10, -1) )
[]

>>> list ( range(10, 0, -1) )
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
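
To make the start / stop / step rules concrete, here is a rough
stand-in written as a generator.  The name my_range is hypothetical,
and this is only a sketch of the counting logic: the real range is
a re-usable sequence object, whereas a generator like this one can
be walked through only once.

```python
def my_range(start, stop, step=1):
    """Hypothetical stand-in for range(), yielding one value at a time."""
    if step == 0:
        raise ValueError("step must not be zero")
    value = start
    # a positive step counts up toward stop, a negative step counts down;
    # in both directions stop itself is excluded
    while (step > 0 and value < stop) or (step < 0 and value > stop):
        yield value
        value += step

evens = list(my_range(2, 31, 2))
empty = list(my_range(1, 10, -1))     # start below stop with a negative step
countdown = list(my_range(10, 0, -1))
```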

range(5) returns a "range object" (of type range) which may in turn
be interacted with in various ways:

>>> d = range(5)

>>> d
range(0, 5)

>>> type(d)
<class 'range'>

>>> list(d)
[0, 1, 2, 3, 4]

>>> tuple(d)
(0, 1, 2, 3, 4)

>>> sum(d)
10

How else might we address a Python range object?

With indexing notation:

>>> d[1]
1
>>> d[4]
4
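
A few more ways to poke at a range object without converting it
to a list first, sketched here on the assumption of a reasonably
current Python 3 (the variable names are mine):

```python
d = range(5)

length = len(d)         # number of terms
last = d[-1]            # negative indexes count from the end
has_three = 3 in d      # membership test
has_nine = 9 in d

try:
    d[5]                # one past the end
except IndexError:
    out_of_bounds = True
else:
    out_of_bounds = False
```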

Although it's an iterable (you may iterate over it, using 'for'), d is
not an iterator, meaning it doesn't support 'next' (it has no __next__
rib internally).

The built-in iter function will trigger an object's internal __iter__ method,
if there is one, and return an iterator prepared to be eaten by next().

>>> next(d)
Traceback (most recent call last):
  File "<pyshell#44>", line 1, in <module>
    next(d)
TypeError: range object is not an iterator

>>> k = iter(d)

>>> next(k)
0
>>> next(k)
1
>>> next(k)
2

>>> d
range(0, 5)

>>> j = iter(d)

>>> next(j)
0

>>> next(k)
3

Notice that k and j above are two separate iterators and
they go through to exhaustion independently of one
another.  Once an iterator has been exhausted, it has no
more contents to return.  Note also that the print function
never shows an iterator's contents, only a description of
the object.

>>> theiterator = iter(d)
>>> next(theiterator)
0
>>> next(theiterator)
1
>>> print(theiterator)
<range_iterator object at 0x0114AA28>
>>> print(list(theiterator))
[2, 3, 4]
>>> next(theiterator)
Traceback (most recent call last):
  File "<pyshell#68>", line 1, in <module>
    next(theiterator)
StopIteration

In the above example, d (which is our range object) is
the source of a fresh iterator, theiterator.  After nexting
through two terms, the print function is used.

The first use simply displays the type of the object and its
memory address, not its contents.  However, by feeding
theiterator to the list function, we cause it to squirt out the
rest of its contents, meaning it is now exhausted.

When theiterator is then fed to the next function, it
raises an exception, a StopIteration.

>>> def test():
	try:
		next(theiterator)
	except StopIteration:
		print("I'm exhausted")

>>> test()
I'm exhausted
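
A 'for' loop does this catching for you.  Here is a rough sketch
of what goes on behind the scenes, spelled out with 'while' (the
names collected and leftover are mine).  The last line shows that
next will also accept a default to return in place of raising
StopIteration:

```python
d = range(5)
theiterator = iter(d)
collected = []

while True:
    try:
        item = next(theiterator)   # ask for the next term
    except StopIteration:          # the iterator is exhausted
        break
    collected.append(item)

# with a second argument, next returns it rather than raising
leftover = next(theiterator, "nothing left")
```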

Note that the range object is happy to return a "reversed"
iterator, so instead of stepping by -1, you might do something
like:

>>> theiterator = reversed(range(10))
>>> next(theiterator)
9
>>> next(theiterator)
8
>>> for item in theiterator:  print (item, end=" ")

7 6 5 4 3 2 1 0
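
For comparison, a short sketch showing that the reversed iterator
and the explicit negative step give the same terms (the names
backwards and stepped are only for illustration):

```python
backwards = list(reversed(range(10)))
stepped = list(range(9, -1, -1))     # start at 9, stop just before -1

print(backwards == stepped)
```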

Here's an author unhappy with Python's iterators.  His
commentators debate him.  You may learn more about
a language by reading the arguments people have
about it.  What's frustrating.  What's considered "a wart".

http://sandersn.com/blog//index.php/2009/06/29/python_s_iterators_are_a_bad_implementat

You might imagine how iterators could contribute to a
programmer's confusion, when trying to debug code.
They're not like collections of static data; they're more like
"mortal coils" or one-time-through sequences.  An
iterator passed to a function and partially advanced
will have been modified "off stage".

>>> theiterator = reversed(range(10))

>>> next(theiterator)
9

The function below, squeeze_me, returns nothing, but has
the effect of advancing the passed in iterator.  The object
being passed in is bound to the name theinput for local
processing, which name goes out of scope when
the function is finished.  However, the object still
persists "on the outside" and has been affected by
what went on in this function.

>>> def squeeze_me( theinput, howmany):
	for i in range( howmany ):
		next( theinput )

		
>>> squeeze_me ( theiterator, 3 )

>>> next(theiterator)
5
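
One way to avoid the surprise, sketched here on the assumption
that the caller still holds the underlying range object (called
source below, a name not in the session above), is to hand the
function a fresh iterator of its own:

```python
def squeeze_me(theinput, howmany):
    # advances whatever iterator it is given; returns None
    for i in range(howmany):
        next(theinput)

source = range(10)
theiterator = reversed(source)
first = next(theiterator)          # 9

# pass a brand new iterator, not the one we are still using
squeeze_me(reversed(source), 3)

second = next(theiterator)         # our own iterator was not advanced
```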

<Arcane>

Here's an instructive example of changes to Python's
behavior resulting in new wrinkles:

http://lists.fedoraproject.org/pipermail/devel/2009-September/037966.html

The __length_hint__ method (mentioned in the above report)
is one you might use to get a sense of how close your iterator
might be to exhaustion.  When converting an iterator to a list,
this method is apparently consulted, which is what was causing
problems in a user-defined class (__getattr__ had been redefined).
This problem has apparently been resolved in Python 3.x as
I'm getting the expected behavior.
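
A sketch of peeking at the hint, using operator.length_hint, which
arrived in Python 3.4 (so later than this post).  The value is only
an estimate in general, though a range iterator reports it exactly
in my experience.  The names before and after are mine.

```python
import operator

theiterator = iter(range(5))
before = operator.length_hint(theiterator)   # nothing consumed yet

next(theiterator)
next(theiterator)
after = operator.length_hint(theiterator)    # two terms used up

print(before, after)
```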

</Arcane>

More dialog with "the snake":

>>> thelist = ['a',1,2,['another','list']]

>>> next(thelist)
Traceback (most recent call last):
  File "<pyshell#55>", line 1, in <module>
    next(thelist)
TypeError: list object is not an iterator

>>> '__iter__' in dir(thelist)
True

>>> theiterator = iter(thelist)

>>> next(theiterator)
'a'

>>> next(theiterator)
1

>>> next(theiterator)
2

>>> next(theiterator)
['another', 'list']

>>> next(theiterator)
Traceback (most recent call last):
  File "<pyshell#62>", line 1, in <module>
    next(theiterator)
StopIteration

>>>
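
To see both ribs in one place, here is a minimal sketch of a
user-defined iterator.  The class name Countdown is hypothetical,
made up for this illustration.  It has __iter__ (returning itself)
and __next__ (raising StopIteration when done), which is all that
'for', list and next require:

```python
class Countdown:
    """Hypothetical iterator counting down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # an iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration    # signals exhaustion
        value = self.current
        self.current -= 1
        return value

c = Countdown(3)
first = next(c)      # consumes the first term
rest = list(c)       # consumes whatever remains
again = list(c)      # already exhausted, so nothing comes out
```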

For further reading:
http://worldgame.blogspot.com/2009/11/kicking-can-down-road.html

Kirby

