Efficient checksum calculation on large files
Michael Hoffman
cam.ac.uk at mh391.invalid
Tue Feb 8 11:27:20 EST 2005
Ola Natvig wrote:
> Does anyone know of a fast way to calculate checksums for large files?
> I need a way to generate ETag keys for a webserver. ETags for large
> files are not really necessary, but it would be nice if I could do it. I'm
> using the Python hash function on dynamically generated strings (like
> page content), but for things like images I use shutil's
> copyfileobj function, and the hash of a file object is just its
> handle's memory address.
>
> Does anyone know of a Python utility I could use, perhaps
> something like the md5sum utility on *nix systems?
Is there a reason you can't use the sha module? Using a random large file I had
lying around:
sha.new(file("jdk-1_5_0-linux-i586.rpm").read()).hexdigest() # loads all into memory first
If you don't want to load the whole file into memory at once, you can always call out to the sha1sum utility yourself:
>>> subprocess.Popen(["sha1sum", ".bashrc"], stdout=subprocess.PIPE).communicate()[0].split()[0]
'5c59906733bf780c446ea290646709a14750eaad'
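Or, as a sketch of doing the same thing in pure Python without shelling out (using hashlib, which in later Python versions supersedes the sha module; the function name and chunk size here are illustrative):

```python
import hashlib

def file_sha1(path, chunk_size=65536):
    """Compute the SHA-1 digest of a file by reading it in chunks,
    so the whole file never needs to fit in memory."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # end of file
                break
            h.update(chunk)
    return h.hexdigest()
```

This gives the same digest as sha1sum on the same file, since update() can be called repeatedly to feed the hash incrementally.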
--
Michael Hoffman