Efficient checksum calculating on large files
Thomas Heller
theller at python.net
Tue Feb 8 13:12:45 EST 2005
Nick Craig-Wood <nick at craig-wood.com> writes:
> Ola Natvig <ola.natvig at infosense.no> wrote:
>> Hi all
>>
>> Does anyone know of a fast way to calculate checksums for a large file?
>> I need a way to generate ETag keys for a webserver. ETags for large
>> files are not really necessary, but it would be nice if I could do it. I'm
>> using the Python hash function on the dynamically generated strings (like
>> page content), but for things like images I use shutil's
>> copyfileobj function, and the hash of a file object is just its
>> handle's memory address.
>>
>> Does anyone know of a Python utility I could use, perhaps
>> something like the md5sum utility on *nix systems?
>
> Here is an implementation of md5sum in Python. It's the same speed,
> give or take, as md5sum itself. This isn't surprising, since md5sum is
> dominated by CPU usage of the MD5 routine (in C in both cases) and/or
> I/O (also in C).
Your code won't work correctly on Windows, since there you have to open
files in binary mode ('rb').
But there's already a perfectly working version in the Python distribution:
Tools/scripts/md5sum.py
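
For illustration, a minimal sketch of the chunked, binary-mode approach
discussed above. This is not the code from the distribution's script; it
assumes a modern Python with the hashlib module, and the function name
file_md5 and the 64 KB chunk size are made up for this example.

```python
import hashlib


def file_md5(path, chunk_size=64 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    Hypothetical helper: reads the file in fixed-size chunks so that
    large files never have to fit in memory at once.
    """
    digest = hashlib.md5()
    # Open in binary mode ('rb'): text mode may translate line endings
    # (notably on Windows) and so change the data being hashed.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            digest.update(chunk)
    return digest.hexdigest()
```

The returned hex string could then be used as the ETag value for the file,
for example file_md5("image.png") for a hypothetical image.png.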
Thomas