optimizing memory utilization

Anon anon at ymous.com
Thu Sep 16 21:51:00 EDT 2004


On Wed, 15 Sep 2004 14:59:45 +0000, John Lenton wrote:

> On Tue, Sep 14, 2004 at 04:39:49AM +0000, Anon wrote:
>> Hello all,
>> 
>> I'm hoping for some guidance here...  I am a C/C++ "expert", but a
>> complete Python virgin.  I'm trying to create a program that loads in
>> the entire FreeDB database (excluding the CDDBID itself) and uses this
>> "database" for other subsequent processing.  The problem is, I'm
>> running out of memory on a Linux RH8 box with 512MB.  The FreeDB
>> database that I'm trying to load currently consists of two "CSV"
>> files.  The first file contains a "list" of albums with artist name,
>> an arbitrary sequential album ID, and the CDDBID (an ASCII-hex
>> representation of a 32-bit value). The second file contains a list of
>> all of the tracks on each of the albums, cross-referenced via the album
>> ID.  When I load into memory, I create a Python list where each entry
>> in the list is itself a list representing the data for a given album. 
>> The album data list consists of a small handful of text items like the
>> album title, author, genre, and year, as well as a list which itself
>> contains a list for each of the tracks on the album.
>> 
>> [[<Alb1ID#>, '<Alb1Artist>', '<Alb1Title>', '<Alb1Genre>','<Alb1Year>',
>>   [["Track1", 1], ["Track2", 2], ["Track3", 3], ..., ["TrackN",N]]],
>>  [<Alb2ID#>, '<Alb2Artist>', '<Alb2Title>', '<Alb2Genre>','<Alb2Year>',
>>   [["Track1", 1], ["Track2", 2], ["Track3", 3], ..., ["TrackN",N]]],
>>     ...
>>  [<AlbNID#>, '<AlbNArtist>', '<AlbNTitle>', '<AlbNGenre>','<AlbNYear>',
>>   [["Track1", 1], ["Track2", 2], ["Track3", 3], ..., ["TrackN",N]]]]
> 
> silly question: have you looked into using tuples instead of lists for
> the inner objects? They're supposed to be more memory efficient,
> although I have found that that isn't necessarily the case.
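
For anyone wanting to check the tuple-versus-list suggestion on their own
interpreter, here is a rough sketch. It assumes a present-day CPython 3
(sys.getsizeof did not exist when this thread was written), and the album
record, the 12-track count, and the deep_size helper are all made up for
illustration. The byte counts depend on the interpreter version and
platform, so treat them as a relative comparison only.

```python
import sys

def deep_size(obj, seen=None):
    """Rough recursive size in bytes of obj and everything it contains.

    Hypothetical helper: only descends into lists and tuples, and counts
    each distinct object once.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, (list, tuple)):
        size += sum(deep_size(item, seen) for item in obj)
    return size

# Made-up album record in the layout described in the original post.
names = ["Track%d" % n for n in range(1, 13)]

album_as_lists = [1, "Artist", "Title", "Genre", "1999",
                  [[name, n] for n, name in enumerate(names, 1)]]
album_as_tuples = (1, "Artist", "Title", "Genre", "1999",
                   tuple((name, n) for n, name in enumerate(names, 1)))

list_bytes = deep_size(album_as_lists)
tuple_bytes = deep_size(album_as_tuples)

print("nested lists: ", list_bytes, "bytes")
print("nested tuples:", tuple_bytes, "bytes")
print("saved:        ", list_bytes - tuple_bytes, "bytes per album")
```

Both records share the same strings and integers, so the difference
printed is container overhead alone. Whether that saving matters for the
whole database depends on how much of the total is string data.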

That's exactly the kind of feedback that I was looking for when I asked
the question.  However, the suggestion that MusicBrainz already does what I
want in a possibly more accurate way looks like it might be an even better
suggestion, leaving me to concentrate my efforts on some other
as-yet-unsolved problem instead!

Thanks!
Wes



More information about the Python-list mailing list