gobbling (was Re: What is the best way to merge two lists?)

Alex Martelli aleax at aleax.it
Mon Nov 18 06:04:36 EST 2002


Gonçalo Rodrigues wrote:
   ...
> I believe it has also been made stable-sort. The con is that it gobbles
> up more memory.

Here's a first attempt to quantify that "gobbling":

import sys
import random

# memsize -- theoretically-portable version
import resource
pagesize = float(resource.getpagesize())
mega = 1024 * 1024.0
def megs(x):
    return '%.2f' % (x*pagesize / mega)
def vmegs():
    res = resource.getrusage(resource.RUSAGE_SELF)
    return res[4]
# Linux-only hack
pagesize = 1
def vmegs():
    mstr = open('/proc/self/stat').read().split()[22]
    return int(mstr)

maxmem = 0
def memsnap():
    global maxmem
    mem = vmegs()
    if mem>maxmem: maxmem=mem

def cmp_and_snap(x, y):
    memsnap()
    return cmp(x, y)

random.seed(234876)
N = 5000
if len(sys.argv)>1:
    N = int(sys.argv[1])
data = range(N)
random.shuffle(data)

startmem = megs(vmegs())

data.sort(cmp_and_snap)

print "%s -- %d: %s -> %s" % (sys.version, N, startmem, megs(maxmem))



[alex at lancelot alex]$ python2.2 some.py 20000
2.2.2 (#1, Oct 24 2002, 11:43:01)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)] -- 20000: 3.10 -> 3.10
[alex at lancelot alex]$ python2.3 some.py 20000
2.3a0 (#4, Nov  6 2002, 09:18:17)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)] -- 20000: 3.55 -> 3.61
[alex at lancelot alex]$ python2.2 some.py 100000
2.2.2 (#1, Oct 24 2002, 11:43:01)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)] -- 100000: 4.33 -> 
4.33
[alex at lancelot alex]$ python2.3 some.py 100000
2.3a0 (#4, Nov  6 2002, 09:18:17)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)] -- 100000: 4.78 -> 
4.97
[alex at lancelot alex]$

A first assessment: 2.3 takes up more memory than 2.2  -- about 0.45 MB 
more independently of the tasks involved.  2.3 also takes up extra memory 
for its sort operation -- here, about 0.06 MB more when sorting a 20,000 
items list, about 0.19 MB more when sorting a 100,000 items list.

There MAY be limit-cases where this extra memory consumption does hit
performance -- cases where 2.2 was just on the verge of VM thrashing
and the small extra memory requested by 2.3 pushes the process just
above the edge -- but only if the working set is also increasing, which
I do not yet _know_ (the above is only meeasuring the overall memory
footprint, not the working set); and they should be rare cases.

Tentative conclusion: for now, I wouldn't particularly worry about
the extra memory consumption, pending of course more investigation.


Alex




More information about the Python-list mailing list