determining available space for Float32, for instance

Robert Kern rkern at enthought.com
Thu May 25 03:02:48 EDT 2006


David Socha wrote:
> Robert Kern wrote: 

>>However, keeping track of the sizes of your arrays and the 
>>size of your datatypes may be a bit much to ask.
> 
> Exactly.  Building a duplicate mechanism for tracking this information
> would be a sad solution.  Surely Python has access to the amount of
> memory being used by the different data types.  How can I get to that
> information?

I meant that you shouldn't bother doing any of this manually at all. Even
*using* such a mechanism is going to be a sad solution, let alone building a
duplicate one. Instead, use a persistent data store like PyTables or possibly an
SQL database (but I do recommend PyTables for your use-case).

For numpy arrays, I showed you how to calculate the memory footprint (modulo the
bytes for the actual Python structure itself that contains the metadata, but
that's tiny compared to the actual array). I think there's a more general
function that tries to guess the number of bytes used, but it's not terribly
reliable, and I don't recommend its use. For example, how does one measure the
memory footprint of a Python list? Do you count the memory footprint of each of
the items? What if the items are repeated or shared between other objects?
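As a rough illustration of both points, here is a minimal sketch. It assumes the
array calculation is simply the element count times the bytes per element, which
numpy also exposes as `nbytes`. The list part uses `sys.getsizeof`, which was
added to Python after this message was written and is not necessarily the
"more general function" mentioned above. The variable names are made up for the
example.

```python
import sys
import numpy as np

# Hypothetical array of one million Float32 values.
a = np.zeros(1_000_000, dtype=np.float32)

# Data buffer size: number of elements times bytes per element.
data_bytes = a.size * a.itemsize
print(data_bytes)  # 4000000
print(a.nbytes)    # the same number, computed by numpy

# For a list, sys.getsizeof counts only the list object and its
# pointer slots, not the objects the list refers to.
items = [float(i) for i in range(1000)]
shallow = sys.getsizeof(items)
deep = shallow + sum(sys.getsizeof(x) for x in items)

# A list holding 1000 references to ONE float: summing per-item sizes
# counts that single object 1000 times, so the naive total overstates.
shared = [items[0]] * 1000
naive_shared = sys.getsizeof(shared) + sum(sys.getsizeof(x) for x in shared)
actual_shared = sys.getsizeof(shared) + sys.getsizeof(items[0])
```

The gap between `naive_shared` and `actual_shared` is the ambiguity described
above: there is no single right answer once objects are shared.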

>>[snip]
>>numpy (definitely not Numeric) does have a feature called 
>>record arrays which will allow you to deal with your agents 
>>much more conveniently:
>>
>>  http://www.scipy.org/RecordArrays
>>
>>Also, you will certainly want to look at using PyTables to 
>>store and access your data. With PyTables you can leave all 
>>of your data on disk and access arbitrary parts of it in a 
>>relatively clean fashion without doing the fiddly work of 
>>swapping chunks of memory from disk and back again:
>>
>>  http://www.pytables.org/moin
> 
> Do RecordArrays and PyTables work well together?  

Yes. Currently, PyTables uses numarray's implementation of record arrays to
represent HDF5 tables (which are essentially equivalent in structure to record
arrays). It can interact with numpy record arrays just fine. Eventually,
PyTables will be using numpy and numpy only.
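For reference, a small sketch of what a numpy record array for agents might look
like. The field names (`agent_id`, `x`, `y`) are invented for illustration and
are not taken from your code. The PyTables side is left out here, since its API
depends on the version you install.

```python
import numpy as np

# Hypothetical per-agent record: an integer id and a 2-D position.
agent_dtype = np.dtype([
    ("agent_id", np.int32),
    ("x", np.float32),
    ("y", np.float32),
])

# Three zero-initialised agents, viewed as a record array so that
# fields can be read as attributes.
agents = np.zeros(3, dtype=agent_dtype).view(np.recarray)

agents["agent_id"] = [10, 11, 12]
agents["x"] = [1.0, 2.0, 3.0]

print(agents.x)         # the whole "x" column as a float32 array
print(agents[1])        # one agent's complete record
print(agents.itemsize)  # bytes per record: 4 + 4 + 4
print(agents.nbytes)    # bytes for the whole table
```

Each record has a fixed size, so the footprint of the whole table is again just
the number of records times the record size.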

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



