Efficient checksum calculating on large files
Robin Becker
robin at reportlab.com
Tue Feb 8 11:13:43 EST 2005
Ola Natvig wrote:
> Hi all
>
> Does anyone know of a fast way to calculate checksums for a large file?
> I need a way to generate ETag keys for a webserver. ETags for large
> files are not really necessary, but it would be nice if I could do it. I'm
> using the Python hash function on the dynamically generated strings (like
> page content), but for things like images I use shutil's
> copyfileobj function, and the hash of a file object is just its
> handle's memory address.
>
> Does anyone know of a Python utility I could use, perhaps
> something like the md5sum utility on *nix systems?
>
>
Well, md5sum is usable on many systems; I run it on win32 and darwin.
I tried this in 2.4 with the new subprocess module:
def md5sum(fn):
    import subprocess
    return subprocess.Popen(["md5sum.exe", fn],
                            stdout=subprocess.PIPE).communicate()[0]

import time
t0 = time.time()
print md5sum('test.rml')
t1 = time.time()
print t1-t0
and got
C:\Tmp>md5sum.py
b68e4efa5e5dbca37718414f6020f6ff *test.rml
0.0160000324249
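The line md5sum prints carries both the digest and the file name, and for an ETag only the digest is needed. Here is a minimal sketch of splitting that line. It assumes the usual md5sum layout: hex digest, a space, then a '*' (binary mode) or a second space (text mode) before the name. parse_md5sum_line is a hypothetical helper name, not part of the script above, and it does not handle the escaped form md5sum uses for unusual file names.

```python
def parse_md5sum_line(line):
    """Split one line of md5sum output into (digest, filename).

    Assumed layout: '<hex digest> *<name>' or '<hex digest>  <name>'.
    """
    # communicate() returns bytes when the pipe is not in text mode
    if isinstance(line, bytes):
        line = line.decode("utf-8", "replace")
    # strip() also removes the trailing '\r\n' seen on win32
    digest, _, rest = line.strip().partition(" ")
    # Drop the one-character mode marker ('*' or ' ') before the name
    filename = rest[1:] if rest[:1] in ("*", " ") else rest
    return digest.lower(), filename
```

With the output shown above, this would give the digest b68e4efa5e5dbca37718414f6020f6ff and the name test.rml.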
I then timed the original binary directly:
C:\Tmp>timethis md5sum.exe test.rml
TimeThis : Command Line : md5sum.exe test.rml
TimeThis : Start Time : Tue Feb 08 16:12:26 2005
b68e4efa5e5dbca37718414f6020f6ff *test.rml
TimeThis : Command Line : md5sum.exe test.rml
TimeThis : Start Time : Tue Feb 08 16:12:26 2005
TimeThis : End Time : Tue Feb 08 16:12:26 2005
TimeThis : Elapsed Time : 00:00:00.437
C:\Tmp>ls -l test.rml
-rw-rw-rw- 1 user group 996688 Dec 31 09:57 test.rml
C:\Tmp>
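If spawning an external process is not wanted, the same digest can be computed in-process by reading the file in fixed-size chunks, so that a large file never has to be held in memory at once. The following is a minimal sketch, not something I have timed against the numbers above. It assumes the hashlib module, which arrived in Python versions later than the 2.4 used here (the older md5 module has the same update/hexdigest interface). The name md5_of_file and the 1 MiB chunk size are illustrative choices.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    # Binary mode matters on win32, where text mode would alter line endings
    f = open(path, "rb")
    try:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            digest.update(chunk)
    finally:
        f.close()
    return digest.hexdigest()
```

The returned string should match the first field of what md5sum prints for the same file, so it could be used directly as an ETag value.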
--
Robin Becker