Memory ?

Alex Martelli aleax at aleax.it
Fri Jul 5 10:45:05 EDT 2002


Shagshag13 wrote:

> 
> Hello,
> 
> i'm still looking for the best ways to save memory, because i work with
> many GB of floats, etc.  So is there a way to check how many bytes an
> object needs -- a tuple, a list, a dict of objects, and so on?
> (something like itemsize for array -- in fact arrays would be great if
> they weren't boxed)

I'm not sure what you mean by "boxed" in this context.  If you mean that
arrays can only hold items of one type, well, that's how they get most
of their memory savings!  An array of N floats, for example, takes 8N+x
bytes for a rather small x; a list of N Python float objects MUST take
at least twice as much -- each item in the list is a pointer (there go
4N bytes at least) to an object which, besides holding the 8-byte float
value, takes at least another 4 bytes for a pointer to the float type --
so, at least 16N + y for some small y.
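
For instance, a rough way to see the per-item cost with the standard
array module (the 16N figure below just restates the lower bound above
for a 32-bit machine; exact numbers will vary with your platform):

import array

N = 1000000
a = array.array('d', [0.0] * N)   # N unboxed C doubles
print a.itemsize                  # typically 8 bytes per item
print a.itemsize * N              # ~8N bytes of payload, plus a small header
# The equivalent list of N Python floats needs at least a 4-byte pointer
# per list slot, plus an 8-byte value and a 4-byte type pointer per float
# object: 16N bytes as a lower bound, ignoring refcounts and allocator
# overhead.
print 16 * N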

The Numeric package offers powerful arrays that (if I get your meaning
right) can be "unboxed", i.e., hold arbitrary objects (just use
typecode 'O'), but then it's unlikely that this will save you any really
significant amount of memory.

How many bytes an object takes is a rather fuzzy concept, and there's
no single way to answer this at the Python level.  Studying the C
sources on the specific machine you're interested in will start to
give you an answer, but in most cases that only gets you halfway.  Since
no object "owns" another, but rather each object is held alive by any
number of references to it, as soon as there's more than one reference
you have to make an arbitrary decision about how to account for the bytes
the object takes up.  Plus, there's always "waste" of some sort or
other -- e.g. depending on your machine's malloc, or on pymalloc if your
Python implementation uses that; moreover, dicts are hash tables
that are never more than some fraction full, lists almost always have
some extra 'free' slots allocated at the end, etc, etc.  It's not
clear which heuristics will be of practical use to you.
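
A tiny illustration of that accounting problem, using only
sys.getrefcount (whose result includes the call's own temporary
reference, so treat the exact count as indicative):

import sys

big = [0.0] * 100000      # one largish object (~400KB of pointer slots, 32-bit)
a = [big, big]            # reachable through two list slots...
b = {'x': big}            # ...and through a dict value as well
print sys.getrefcount(big)    # typically 5: the name, a[0], a[1], b['x'], the call
# Whose "size" are those 400KB: a[0]'s, a[1]'s, b['x']'s, or the name big's?
# Any per-container byte count has to pick some arbitrary convention.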


> i also have troubles like :
> 
> - which way is the best to handle a sequential dictionary : having 2

This reminds me of Alice asking about which way she should go - that
very much depends, goes the answer, on where you'd like to arrive.

> dicts, one for the order and the other for my data ? but i think it's better
> to use a list to handle order and a dict for data (like in seqdict)

What ARE your purposes?  What operations do you want to optimize along
which axes?  If you have a list and need to insert or remove at arbitrary
points you get O(N) performance, which can be a killer -- dictionaries
don't suffer from that.  But if you don't need such insertions and
removals, a mapping from the compact set [0..N) to any multiset of
N values is indeed faster and less memory-hungry as a sequence, typically
a list, than as a dictionary -- and that's easy to benchmark, too.
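
For example, a minimal sketch of the list-plus-dict approach (the class
name and methods here are just illustrative, not any particular library's
seqdict):

class SeqDict:
    """Toy ordered mapping: a list for key order, a dict for the data."""
    def __init__(self):
        self._order = []     # keys in insertion order: O(1) append
        self._data = {}      # key -> value: O(1) lookup
    def __setitem__(self, key, value):
        if key not in self._data:
            self._order.append(key)
        self._data[key] = value
    def __getitem__(self, key):
        return self._data[key]
    def keys(self):
        return list(self._order)   # a copy, so callers can't scramble the order

Appending and looking up stay O(1), but deleting a key from the middle
would still cost O(N) in the list -- exactly the trade-off sketched above.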

> - how much subclassing can i use ? (is this time or memory consuming in
> any way ?)

There's no limit to subclassing in either breadth or depth -- any class
may have any number of direct/immediate bases (breadth), and the inheritance
tree can be of arbitrary depth.  Memory costs are per-class and thus most
often irrelevant (generally it's only per-instance costs that you need to
worry about -- most programs tend to have many more instances than classes).
Time costs are indeed high (using ordinary classes of either the classic or
the 2.2 new-style object model), since inherited lookups are performed at
runtime.  If your class has N bases, direct or indirect (depth or breadth,
it doesn't really matter here), and all are equally likely to be the location
of any given attribute, a typical lookup has O(N) cost.  It wouldn't be hard
to write a custom metaclass to 'flatten' this (at some, still per-class,
memory cost -- and at a substantial cost in 'dynamism', since behavior would
then be fixed at class-creation time rather than looked up dynamically),
were profiling to show that this is the program's bottleneck (unlikely, but,
I guess, not impossible).
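
A rough sketch of that metaclass idea, in the 2.2 new-style spelling
(the name, the skip-the-dunders rule and so on are arbitrary choices
here, and the flattening deliberately freezes behavior at class-creation
time):

class FlattenMeta(type):
    """Copy inherited attributes into the class's own __dict__ at creation."""
    def __new__(mcl, name, bases, namespace):
        cls = type.__new__(mcl, name, bases, namespace)
        for base in cls.__mro__[1:]:              # walk the bases, nearest first
            for attr, value in base.__dict__.items():
                if not attr.startswith('__') and attr not in cls.__dict__:
                    setattr(cls, attr, value)     # flatten: lookups now hit cls directly
        return cls

class Base(object):
    def greet(self):
        return "hello"

class Derived(Base):
    __metaclass__ = FlattenMeta                   # 2.2 spelling of "use this metaclass"

print 'greet' in Derived.__dict__.keys()          # True: no inherited lookup needed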


> and so on...
> 
> i didn't find such answers in PEP, etc. or maybe i missed them,

That's probably because typical Python users don't focus on such
micro-optimization issues all that much.  Make it work, then make
it right, then make it fast.  After a program is developed and fully
functional, and after it's refactored to the best architecture
possible, THEN (if performance is not acceptable at this point)
one starts profiling to determine where optimization would be of
use (and even then, it's more likely that big-O issues, or at any rate
issues best addressed by wholesale rearchitecting/refactoring rather than
micro-level tweaking, will yield the highest gains for the first few
iterations of any such optimization effort).

Let's take a simple example.  Optionally in Python 2.1 and 2.2, and
by default in 2.3, Python allocates small objects via pymalloc.  It's
thus easy to make a little patch to learn how many bytes are
actually allocated for each given object itself -- quite apart from
its items and attributes.  I've done so in my copy of the Python 2.3
CVS tree and exposed the function as sys.bytes for demonstration purposes.

>>> for i in range(5): print i, sys.bytes(range(i))
...
0 32
1 32
2 32
3 32
4 32

As you see, any list object, per se, takes exactly 32 bytes on my
platform (Linux on a 32-bit Athlon chip) -- clearly the slots for
the items are being kept elsewhere, so the information "a list object
takes exactly 32 bytes" is absolutely of no possible use.

>>> for i in range(5): print i, sys.bytes(tuple(range(i)))
...
0 24
1 32
2 32
3 40
4 40

Tuples, on the other hand, as we see, keep their item slots
as part of "the object itself" -- taking 4 bytes per slot (on this
machine -- 32-bit architecture, so 4 bytes per pointer), but
pymalloc rounds up to a multiple of 8 bytes (most system malloc
implementations round like that, or even more coarsely).  So, a
tuple of length N takes 24+4*N (rounded up to a multiple of 8)
bytes -- plus, of course, whatever its *items* take up (and that
depends entirely on how you debit the space for items that would
be in memory anyway even if this tuple disappeared).
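
Written out as arithmetic (assuming the 24-byte header and 4-byte slots
observed on this 32-bit box, and pymalloc's 8-byte rounding):

def tuple_bytes(n, header=24, slot=4, align=8):
    # 24 + 4*n, rounded up to the next multiple of 8
    raw = header + slot * n
    return ((raw + align - 1) // align) * align

for i in range(5):
    print i, tuple_bytes(i)   # 0 24, 1 32, 2 32, 3 40, 4 40 -- as in the session above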

>>> sys.bytes(1)
-2

We can't learn this way how many bytes the integer 1 takes up, because
it's not being handled by pymalloc (that's what -2 means in my
rapidly kludged-up sys.bytes function).  And indeed, studying the
Python sources will show you how ints are special-cased -- and
the "freelist" strategy they use for their own allocation.

Similarly, you'll see any dictionary, per se, takes 136 bytes;
an instance of any class whatsoever, 32 (...which *of course*
ignores its __dict__ or __slots__...!); strings of length 0,
24 bytes -- length 1 to 8, 32 bytes -- length 9 to 16, 40 bytes --
and so on (24+N rounded up to a multiple of 8, in general); &c.
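
The same back-of-the-envelope arithmetic reproduces those string figures
(again assuming the 24-byte header and 8-byte rounding seen above):

def str_bytes(n, header=24, align=8):
    # 24 + n bytes of characters, rounded up to a multiple of 8
    return ((header + n) + align - 1) // align * align

for n in (0, 1, 8, 9, 16):
    print n, str_bytes(n)     # 24, 32, 32, 40, 40 -- matching the list above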


I hope it's pretty clear by now why this is of so little use
(which is why I'm not going to check in my "sys.bytes" patch, and
no doubt why, AFAIK, nobody has even bothered doing so previously).


Alex



