[Python-Dev] Memory woes under Windows

Tim Peters tim_one@email.msn.com
Sun, 14 May 2000 19:51:41 -0400


[Noah, I'm wondering whether this is related to our W98 NatSpeak woes --
Python grows its lists much like a certain product we both work on <ahem>
grows its arrays ...]

Here's a simple test case:

from time import clock   # on Windows, clock() is a high-resolution wall-clock timer

def run():
    # Build power-of-2 sized lists (1, 2, 4, ..., ~2M elements), each
    # by appending one element at a time, and time each build.
    n = 1
    while n < 4000000:
        a = []
        push = a.append   # hoist the bound-method lookup out of the loop
        start = clock()
        for i in xrange(n):
            push(1)
        finish = clock()
        print "%10d push  %10.3f" % (n, round(finish - start, 3))
        n = n + n

# Drive it 3 times, reporting any MemoryError that pops up.
for i in (1, 2, 3):
    try:
        run()
    except MemoryError:
        print "Got a memory error"

So run() builds a number of power-of-2 sized lists, each by appending one
element at a time.  It prints the list length and elapsed time to build each
one (on Windows, this is basically wall-clock time, and is derived from the
Pentium's high-resolution cycle timer).  The driver simply runs this 3
times, reporting any MemoryError that pops up.

The largest array constructed has 2M elements, so consumes about 8Mb -- no
big deal on most machines these days.
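
Why 8Mb:  a list stores just a 4-byte PyObject* per element on 32-bit
Windows, and the appended 1's are all the same shared int object, so the
pointer vector is essentially the whole cost.  A quick check:

n = 2 * 1024 * 1024                 # 2M elements in the largest list
print "%.1f Mb" % (n * 4 / (1024.0 * 1024.0))   # -> 8.0 Mb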

Here's what happens on my new laptop (damn, this thing is fast! -- usually):

Win98 (Second Edition)
600MHz Pentium III
160Mb RAM
Python 1.6a2 from python.org, via the Windows installer

         1 push       0.000
         2 push       0.000
         4 push       0.000
         8 push       0.000
        16 push       0.000
        32 push       0.000
        64 push       0.000
       128 push       0.000
       256 push       0.001
       512 push       0.001
      1024 push       0.003
      2048 push       0.011
      4096 push       0.020
      8192 push       0.053
     16384 push       0.074
     32768 push       0.163
     65536 push       0.262
    131072 push       0.514
    262144 push       0.713
    524288 push       1.440
   1048576 push       2.961
Got a memory error
         1 push       0.000
         2 push       0.000
         4 push       0.000
         8 push       0.000
        16 push       0.000
        32 push       0.000
        64 push       0.000
       128 push       0.000
       256 push       0.001
       512 push       0.001
      1024 push       0.003
      2048 push       0.007
      4096 push       0.014
      8192 push       0.029
     16384 push       0.057
     32768 push       0.116
     65536 push       0.231
    131072 push       0.474
    262144 push       2.361
    524288 push      24.059
   1048576 push      67.492
Got a memory error
         1 push       0.000
         2 push       0.000
         4 push       0.000
         8 push       0.000
        16 push       0.000
        32 push       0.000
        64 push       0.000
       128 push       0.000
       256 push       0.001
       512 push       0.001
      1024 push       0.003
      2048 push       0.007
      4096 push       0.014
      8192 push       0.028
     16384 push       0.057
     32768 push       0.115
     65536 push       0.232
    131072 push       0.462
    262144 push       2.349
    524288 push      23.982
   1048576 push      67.257
Got a memory error

Commentary:  The first time it runs, the timing behavior is
indistinguishable from O(N).  But realloc returns NULL at some point when
growing the 2M array!  There "should be" huge gobs of memory available.

The 2nd and 3rd runs are very similar to each other:  both blow up at about
the same point, and both run *very* much slower than the 1st run once the
list size gets non-trivial -- and, while the output doesn't show this, the
disk starts thrashing too.
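
For scale, if I'm remembering listobject.c right (roundupsize rounds the
allocated size up to a multiple of 10 below 500 elements, else to a
multiple of 100 -- take this as a sketch from memory, not gospel), building
the 2M-element list costs on the order of 21,000 realloc calls, each of
which may copy the whole multi-Mb block if it can't grow in place:

def roundupsize(n):
    # 1.5.2-era growth (from memory):  round up to a multiple of 10
    # under 500 elements, else to a multiple of 100
    if n < 500:
        return (n + 9) / 10 * 10
    return (n + 99) / 100 * 100

grows = 0
size = 0
for i in xrange(1, 2097153):        # appending up to 2M elements
    newsize = roundupsize(i)
    if newsize != size:             # capacity changed -> a realloc call
        size = newsize
        grows = grows + 1
print grows                         # on the order of 21000

Twenty-odd thousand chances to fragment the heap, or to copy an
ever-larger block, would go a long way toward explaining both the slowdown
and the eventual NULL.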

It's *not* the case that Win98 won't give Python more than 8Mb of memory.
For example,

>>> a = [1]*30000000  # that's 30M
>>>

works fine and fast on this machine, with no visible disk traffic [Noah,
that line sucks up about 120Mb from malloc in one shot].

So, somehow or other, masses of allocations are confusing the system memory
manager nearly to death (implying we should use Vladimir's PyMalloc under
Windows after grabbing every byte the machine has <0.6 wink>).
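
In the meantime, that suggests a dodge for code that knows its final size
up front:  make one big allocation (like the [1]*30000000 line above) and
fill it in place, instead of letting append re-grow the vector thousands
of times.  A sketch, not a cure:

def build_prealloc(n):
    a = [None] * n          # one malloc for the whole vector, as above
    for i in xrange(n):
        a[i] = 1
    return a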

I believe the Windows 1.6a2 from python.org was compiled with VC6, yes?
Scream if that's wrong.

This particular test case doesn't run any better under my Win95 (original)
P5-166 with 32Mb RAM using Python 1.5.2.  But at work, we've got an
(unfortunately huge, and C++) program that runs much slower on a
large-memory W98 machine than on a small-memory W95 one, due to disk
thrashing.
It's a mystery!  If anyone has a clue about any of this, spit it out <wink>.

[Noah, I watched the disk cache size while running the above, and it's not
the problem -- while W98 had allocated about 100Mb for disk cache at the
start, it gracefully gave that up as the program's memory demands increased]

just-another-day-with-windows-ly y'rs  - tim