md5 and large files

Brad Tilley rtilley at vt.edu
Sun Oct 17 12:34:07 EDT 2004


I have some large files (between 2 and 4 GB) that I want to do a few 
things with. Here's how I've been using the md5 module in Python:

    import md5

    # Hash only the first 4096 bytes of each file, not the whole file.
    original = file(path + f, 'rb')
    data = original.read(4096)
    original.close()
    verify = md5.new(data)
    print verify.hexdigest(), f

Is reading the first 4096 bytes of each file and calculating the MD5 sum 
from that sufficient to uniquely identify the files, or am I 
going about this totally wrong? Any advice or ideas appreciated.
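
For comparison, here is a minimal sketch of hashing an entire file in
fixed-size chunks, so that a 2 to 4 GB file never has to fit in memory.
It is not the code from the snippet above: it assumes Python 3 and the
standard hashlib module rather than the older md5 module, and the names
md5_of_file and chunk_size are made up for illustration. The point of
reading everything is that a hash of only the first 4096 bytes cannot
tell apart two files that share those bytes and differ later on.

```python
import hashlib


def md5_of_file(filename, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the whole file, read in chunks.

    md5_of_file and chunk_size are illustrative names, not part of
    any library.
    """
    digest = hashlib.md5()
    with open(filename, 'rb') as handle:
        while True:
            # Read at most chunk_size bytes; an empty result means EOF.
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feeding chunks one by one gives the same digest as
            # hashing the full contents in a single call.
            digest.update(chunk)
    return digest.hexdigest()
```

With the variables from the snippet above, usage would look something
like print(md5_of_file(path + f), f). The chunk size only affects memory
use and speed, not the resulting digest.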


