[Python-checkins] CVS: python/dist/src/Lib/test test_generators.py,1.11,1.12

Tim Peters tim_one@users.sourceforge.net
Wed, 27 Jun 2001 00:17:59 -0700


Update of /cvsroot/python/python/dist/src/Lib/test
In directory usw-pr-cvs1:/tmp/cvs-serv27500/python/dist/src/Lib/test

Modified Files:
	test_generators.py 
Log Message:
This no longer leaks memory when run in an infinite loop.  However,
that required explicitly calling LazyList.clear() in the two tests that
use LazyList (I added a LazyList Fibonacci generator too).

A real bitch:  the extremely inefficient first version of the 2-3-5 test
*looked* like a slow leak on Win98SE, but it wasn't "really":  it generated
so many results that the heap grew over 4Mb (tons of frames!  the number
of frames grows exponentially in that test).  Then Win98SE malloc() starts
fragmenting address space allocating more and more heaps, and the visible
memory use grew very slowly while the disk was thrashing like mad.
Printing fewer results (i.e., keeping the heap burden under 4Mb) made
that illusion vanish.

Looks like there's no hope for plugging the LazyList leaks automatically
short of adding frameobjects and genobjects to gc.  OTOH, they're very
easy to break by hand, and they're the only *kind* of plausibly realistic
leaks I've been able to provoke.

Dilemma.


Index: test_generators.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_generators.py,v
retrieving revision 1.11
retrieving revision 1.12
diff -C2 -r1.11 -r1.12
*** test_generators.py	2001/06/26 22:24:51	1.11
--- test_generators.py	2001/06/27 07:17:57	1.12
***************
*** 446,449 ****
--- 446,450 ----
  [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
  
+ 
  Another famous problem:  generate all integers of the form
      2**i * 3**j  * 5**k
***************
*** 452,458 ****
  3 internal results for each result output.
  
- XXX Suspect there's memory leaks in this one; definitely in the next
- XXX version.
- 
  >>> def times(n, g):
  ...     for i in g:
--- 453,456 ----
***************
*** 476,484 ****
  ...             nh = h.next()
  
! This works, but is doing a whale of a lot or redundant work -- it's not
! clear how to get the internal uses of m235 to share a single generator.
! Note that me_times2 (etc) each need to see every element in the result
! sequence.  So this is an example where lazy lists are more natural (you
! can look at the head of a lazy list any number of times).
  
  >>> def m235():
--- 474,482 ----
  ...             nh = h.next()
  
! The following works, but is doing a whale of a lot of redundant work --
! it's not clear how to get the internal uses of m235 to share a single
! generator.  Note that me_times2 (etc) each need to see every element in the
! result sequence.  So this is an example where lazy lists are more natural
! (you can look at the head of a lazy list any number of times).
  
  >>> def m235():
***************
*** 492,503 ****
  ...         yield i
  
  >>> result = m235()
! >>> for i in range(5):
  ...     print firstn(result, 15)
  [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24]
  [25, 27, 30, 32, 36, 40, 45, 48, 50, 54, 60, 64, 72, 75, 80]
  [81, 90, 96, 100, 108, 120, 125, 128, 135, 144, 150, 160, 162, 180, 192]
- [200, 216, 225, 240, 243, 250, 256, 270, 288, 300, 320, 324, 360, 375, 384]
- [400, 405, 432, 450, 480, 486, 500, 512, 540, 576, 600, 625, 640, 648, 675]
  
  Heh.  Here's one way to get a shared list, complete with an excruciating
--- 490,507 ----
  ...         yield i
  
+ Don't print "too many" of these -- the implementation above is extremely
+ inefficient:  each call of m235() leads to 3 recursive calls, and in
+ turn each of those 3 more, and so on, and so on, until we've descended
+ enough levels to satisfy the print stmts.  Very odd:  when I printed 5
+ lines of results below, this managed to screw up Win98's malloc in "the
+ usual" way, i.e. the heap grew over 4Mb so Win98 started fragmenting
+ address space, and it *looked* like a very slow leak.
+ 
  >>> result = m235()
! >>> for i in range(3):
  ...     print firstn(result, 15)
  [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24]
  [25, 27, 30, 32, 36, 40, 45, 48, 50, 54, 60, 64, 72, 75, 80]
  [81, 90, 96, 100, 108, 120, 125, 128, 135, 144, 150, 160, 162, 180, 192]
  
  Heh.  Here's one way to get a shared list, complete with an excruciating
***************
*** 506,511 ****
  arguments are iterable -- a LazyList is the same as a generator to times().
  
- XXX Massive memory leaks in this; see Python-Iterators.
- 
  >>> class LazyList:
  ...     def __init__(self, g):
--- 510,513 ----
***************
*** 518,521 ****
--- 520,526 ----
  ...             sofar.append(fetch())
  ...         return sofar[i]
+ ...
+ ...     def clear(self):
+ ...         self.__dict__.clear()
  
  >>> def m235():
***************
*** 530,533 ****
--- 535,541 ----
  ...         yield i
  
+ Print as many of these as you like -- *this* implementation is memory-
+ efficient.  XXX Except that it leaks unless you clear the dict!
+ 
  >>> m235 = LazyList(m235())
  >>> for i in range(5):
***************
*** 538,541 ****
--- 546,577 ----
  [200, 216, 225, 240, 243, 250, 256, 270, 288, 300, 320, 324, 360, 375, 384]
  [400, 405, 432, 450, 480, 486, 500, 512, 540, 576, 600, 625, 640, 648, 675]
+ 
+ >>> m235.clear()  # XXX memory leak without this
+ 
+ 
+ Ye olde Fibonacci generator, LazyList style.
+ 
+ >>> def fibgen(a, b):
+ ...
+ ...     def sum(g, h):
+ ...         while 1:
+ ...             yield g.next() + h.next()
+ ...
+ ...     def tail(g):
+ ...         g.next()    # throw first away
+ ...         for x in g:
+ ...             yield x
+ ...
+ ...     yield a
+ ...     yield b
+ ...     for s in sum(iter(fib),
+ ...                  tail(iter(fib))):
+ ...         yield s
+ 
+ >>> fib = LazyList(fibgen(1, 2))
+ >>> firstn(iter(fib), 17)
+ [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584]
+ 
+ >>> fib.clear()  # XXX memory leak without this
  """
  
***************
*** 670,740 ****
  <type 'None'>
  """
- 
- 
- x_tests = """
- 
- >>> def firstn(g, n):
- ...     return [g.next() for i in range(n)]
- 
- >>> def times(n, g):
- ...     for i in g:
- ...         yield n * i
- 
- >>> def merge(g, h):
- ...     ng = g.next()
- ...     nh = h.next()
- ...     while 1:
- ...         if ng < nh:
- ...             yield ng
- ...             ng = g.next()
- ...         elif ng > nh:
- ...             yield nh
- ...             nh = h.next()
- ...         else:
- ...             yield ng
- ...             ng = g.next()
- ...             nh = h.next()
- 
- >>> class LazyList:
- ...     def __init__(self, g):
- ...         self.sofar = []
- ...         self.fetch = g.next
- ...
- ...     def __getitem__(self, i):
- ...         sofar, fetch = self.sofar, self.fetch
- ...         while i >= len(sofar):
- ...             sofar.append(fetch())
- ...         return sofar[i]
- 
- >>> def m235():
- ...     yield 1
- ...     # Gack:  m235 below actually refers to a LazyList.
- ...     me_times2 = times(2, m235)
- ...     me_times3 = times(3, m235)
- ...     me_times5 = times(5, m235)
- ...     for i in merge(merge(me_times2,
- ...                          me_times3),
- ...                    me_times5):
- ...         yield i
- 
- >>> m235 = LazyList(m235())
- >>> for i in range(5):
- ...     x = [m235[j] for j in range(15*i, 15*(i+1))]
- 
- 
- [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24]
- [25, 27, 30, 32, 36, 40, 45, 48, 50, 54, 60, 64, 72, 75, 80]
- [81, 90, 96, 100, 108, 120, 125, 128, 135, 144, 150, 160, 162, 180, 192]
- [200, 216, 225, 240, 243, 250, 256, 270, 288, 300, 320, 324, 360, 375, 384]
- [400, 405, 432, 450, 480, 486, 500, 512, 540, 576, 600, 625, 640, 648, 675]
- """
  
! __test__ = {"tut":      tutorial_tests,  # clean
!             "pep":      pep_tests,     # clean
!             "email":    email_tests,     # clean
!             "fun":      fun_tests,       # leaks
!             "syntax":   syntax_tests   # clean
!            #"x": x_tests
! }
  
  # Magic test name that regrtest.py invokes *after* importing this module.
--- 706,715 ----
  <type 'None'>
  """
  
! __test__ = {"tut":      tutorial_tests,
!             "pep":      pep_tests,
!             "email":    email_tests,
!             "fun":      fun_tests,
!             "syntax":   syntax_tests}
  
  # Magic test name that regrtest.py invokes *after* importing this module.
***************
*** 746,751 ****
      if 0:
          # Temporary block to help track down leaks.  So far, the blame
!         # has fallen mostly on doctest.
!         for i in range(5000):
              doctest.master = None
              doctest.testmod(test_generators)
--- 721,728 ----
      if 0:
          # Temporary block to help track down leaks.  So far, the blame
!         # fell mostly on doctest.  Later:  the only leaks remaining are
!         # in fun_tests, and only if you comment out the two LazyList.clear()
!         # calls.
!         for i in range(10000):
              doctest.master = None
              doctest.testmod(test_generators)