Efficient MD5 (or similar) hashes

Bengt Richter bokr at oz.net
Sun Dec 7 22:36:57 EST 2003


On Sun, 07 Dec 2003 17:21:04 -0800, Erik Max Francis <max at alcyone.com> wrote:

>Kamus of Kadizhar wrote:
>
>> I want to check the integrity of the files after transfer.  I can check
>> the obvious - date, file size - quickly, but what if I want an MD5 hash?
>>
>> From reading the python docs, md5 reads the entire file as a string.
>> That's not practical on a 1 GB file that's network mounted.
>
>Python's md5 module just accepts updating strings; the driving code
>certainly doesn't have to read the file all in at once.  Just read it in
>a chunk at a time:
>
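PMJI, but don't forget to open the file in binary mode,
e.g., theFile = file(thePath, 'rb'), if you're on Windows.
Text mode there translates line endings, so the hash would not
reflect the file's actual bytes.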

>	hasher = md5.new()
>	while True:
>	    chunk = theFile.read(CHUNK_SIZE)
>	    if not chunk:
>	        break
>	    hasher.update(chunk)
>	theHash = hasher.hexdigest()
>
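
For anyone on a later Python, where hashlib took over from the md5 module,
the same chunked loop might look roughly like the sketch below. The function
name md5_of_file and the 1 MiB default for chunk_size are my own placeholders,
not anything from the thread; tune the chunk size to taste.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    Hypothetical helper: the name and the 1 MiB default chunk size
    are illustrative assumptions, not part of the original post.
    """
    hasher = hashlib.md5()
    # Open in binary mode so no newline translation alters the bytes hashed.
    with open(path, 'rb') as the_file:
        while True:
            chunk = the_file.read(chunk_size)
            if not chunk:
                # An empty bytes object means end of file.
                break
            hasher.update(chunk)
    return hasher.hexdigest()
```

Only one chunk is held in memory at a time, so a 1 GB file should not be a
problem; the network mount will still set the pace of the reads.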

Regards,
Bengt Richter



