Calculate sha1 hash of a binary file

Tim Golden mail at timgolden.me.uk
Wed Aug 6 15:50:19 EDT 2008


LaundroMat wrote:
> Hi -
> 
> I'm trying to calculate unique hash values for binary files,
> independent of their location and filename, and I was wondering
> whether I'm going in the right direction.
> 
> Basically, the hash values are calculated thusly:
> 
> f = open('binaryfile.bin')
> import hashlib
> h = hashlib.sha1()
> h.update(f.read())
> hash = h.hexdigest()
> f.close()
> 
> A quick try-out shows that effectively, after renaming a file, its
> hash remains the same as it was before.
> 
> I have my doubts however as to the usefulness of this. As f.read()
> does not seem to read until the end of the file (for a 3.3MB file only
> a string of 639 bytes is being returned, perhaps a 00-byte counts as
> EOF?), is there a high danger of collision?

Guess: you're running on Windows?

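You need to open binary files by using open ("filename", "rb")
to indicate that Windows shouldn't treat certain characters --
specifically character 26 (Ctrl-Z) -- as special. In text mode
on Windows that character is taken as end-of-file, which would
explain why your read stops after only 639 bytes.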
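
A minimal sketch of the corrected approach, assuming you also want to
avoid holding a large file in memory all at once. The function name
sha1_of_file and the chunk_size default are made up for illustration;
only open(), read() and the hashlib calls come from your original code.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 hex digest of the file's contents.

    sha1_of_file and chunk_size are illustrative names, not part of
    the original post.
    """
    h = hashlib.sha1()
    # "rb" opens the file in binary mode, so every byte is read as-is
    # and byte 26 (Ctrl-Z) is not treated as end-of-file on Windows.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # An empty result means the real end of the file.
                break
            h.update(chunk)
    return h.hexdigest()
```

Calling sha1_of_file('binaryfile.bin') should then give the same digest
regardless of the file's name or location, since only the contents are
fed to the hash.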

TJG


