md5 and large files

Paul Rubin
Sun Oct 17 22:40:48 EDT 2004


Brad Tilley <rtilley at vt.edu> writes:
> I would like to verify that the files are not corrupt so what's the
> most efficient way to calculate md5 sums on 4GB files?

What kind of corruption are you talking about?  The best way is just
to run md5 over the old file.  You could use either the external
md5sum command, or use Python's md5 module.  You'd call the update
method of an md5 object to feed the file through a few kbytes at a
time, and then digest or hexdigest at the end to get the checksum.  You
don't need to read the whole file into memory at once or anything like
that.
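
A rough sketch of the chunked approach described above.  This assumes a
current Python where hashlib.md5 stands in for the old md5 module (the
update/hexdigest calls work the same way); the function name md5_of_file
and the 64 KB chunk size are made up for illustration, not anything from
the original question.

```python
import hashlib


def md5_of_file(path, chunk_size=64 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in fixed-size chunks, so memory use stays small
    no matter how large the file is. `chunk_size` is an illustrative
    default, not a tuned value.
    """
    digest = hashlib.md5()  # assumption: hashlib in place of the old md5 module
    with open(path, "rb") as f:  # binary mode so bytes are hashed unchanged
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            digest.update(chunk)  # feed the next piece into the running hash
    return digest.hexdigest()
```

Calling md5_of_file on a 4GB file should give the same hex string that
md5sum prints for it, so either one can be compared against a checksum
recorded earlier.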


