Efficient checksum calculating on large files

Robin Becker robin at reportlab.com
Tue Feb 8 11:13:43 EST 2005


Ola Natvig wrote:
> Hi all
> 
> Does anyone know of a fast way to calculate checksums for a large file? 
> I need a way to generate ETag keys for a webserver. ETags for large 
> files are not really necessary, but it would be nice if I could do it. I'm 
> using the python hash function on the dynamically generated strings (like 
> page content), but for things like images I use shutil's 
> copyfileobj function, and the hash of a file object is just its 
> handler's memory address.
> 
> Does anyone know of a python utility I could use, perhaps 
> something like the md5sum utility on *nix systems?
> 
> 
Well, md5sum is usable on many systems; I run it on win32 and darwin.

I tried this in Python 2.4 with the new subprocess module:

def md5sum(fn):
	import subprocess
	return subprocess.Popen(["md5sum.exe", fn],
		stdout=subprocess.PIPE).communicate()[0]

import time
t0 = time.time()
print md5sum('test.rml')
t1 = time.time()
print t1-t0

and got

C:\Tmp>md5sum.py
b68e4efa5e5dbca37718414f6020f6ff *test.rml

0.0160000324249


Then I tried the original executable directly, timed with timethis:
C:\Tmp>timethis md5sum.exe test.rml

TimeThis :  Command Line :  md5sum.exe test.rml
TimeThis :    Start Time :  Tue Feb 08 16:12:26 2005

b68e4efa5e5dbca37718414f6020f6ff *test.rml

TimeThis :  Command Line :  md5sum.exe test.rml
TimeThis :    Start Time :  Tue Feb 08 16:12:26 2005
TimeThis :      End Time :  Tue Feb 08 16:12:26 2005
TimeThis :  Elapsed Time :  00:00:00.437

C:\Tmp>ls -l test.rml
-rw-rw-rw-   1 user     group      996688 Dec 31 09:57 test.rml

C:\Tmp>
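
For comparison, the same kind of digest can also be computed in-process, without an external md5sum binary. The following is only a minimal sketch, written for a current Python 3 with the standard hashlib module (not the 2.4 setup timed above); the names file_md5 and chunk_size are made up here for illustration, and no timing is claimed for it.

```python
import hashlib

def file_md5(path, chunk_size=64 * 1024):
    """Return the hex MD5 digest of the file at path.

    file_md5 and chunk_size are illustrative names, not from the
    original post.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size blocks so memory use stays constant,
        # however large the file is.
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Something like file_md5('test.rml') should give the same hex string that md5sum prints before the file name, and that string could then serve as the ETag value.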

-- 
Robin Becker



