how to build a dict including a large number of data

Chris cwitts at gmail.com
Fri Jan 4 09:07:08 EST 2008


On Jan 4, 3:57 pm, wanzathe <wanza... at gmail.com> wrote:
> hi everyone
> i'm a newbie to python :)
> i have a binary file named test.dat including 9600000 records.
> the record format is int a + int b + int c + int d
> i want to build a dict like this: key=int a,int b  values=int c,int d
> i choose using bsddb and it takes about 140 seconds to build the dict.
> what can i do if i want to make my program run faster?
> or is there another way i can choose?
> Thanks in advance.
>
> My Code:
> -----------------------------------------------------------------------------------
> my_file = file('test.dat','rb')
> content = my_file.read()
> record_number = len(content) / 16
>
> db  = bsddb.btopen('test.dat.db','n',cachesize=500000000)
> for i in range(0,record_number):
>     a = struct.unpack("IIII",content[i*16:i*16+16])
>     db['%d_%d' % (a[0],a[1])] = '%d_%d' % (a[2],a[3])
>
> db.close()
> my_file.close()

import struct
import bsddb  # Python 2 module; removed from the stdlib in Python 3

my_file = file('test.dat','rb')
db = bsddb.btopen('test.dat.db','n',cachesize=500000000)
content = my_file.read(16)   # read one 16-byte record at a time
while content:
    a = struct.unpack('IIII',content)
    db['%d_%d' % (a[0],a[1])] = '%d_%d' % (a[2],a[3])
    content = my_file.read(16)

db.close()
my_file.close()

That would be more memory-efficient, since it avoids reading the whole
150 MB file into memory at once. As for speed, you would need to time it
on your side.
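For what it's worth, here is a sketch of the same idea in modern Python 3,
skipping bsddb entirely and keeping the mapping in an ordinary dict with
tuple keys (assuming the same 16-byte `IIII` record layout and that the
9.6M entries fit in RAM). Reading in larger chunks and unpacking with
struct.iter_unpack avoids both the string-key formatting and the
per-record read() call:

```python
import struct

RECORD = struct.Struct('IIII')   # int a, int b, int c, int d
CHUNK = RECORD.size * 4096       # read 4096 records per read() call

def build_dict(path):
    """Build an in-memory dict mapping (a, b) -> (c, d) from fixed-size records."""
    result = {}
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            # iter_unpack walks the buffer in RECORD.size-byte steps
            for a, b, c, d in RECORD.iter_unpack(chunk):
                result[(a, b)] = (c, d)
    return result
```

Whether this beats the on-disk btree depends on available memory; the
dict lookup itself will be much faster than bsddb once built.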
