md5 and large files

Josiah Carlson jcarlson at uci.edu
Sun Oct 17 21:25:07 EDT 2004


> If all you want to do is verify that a file is not corrupt, MD5 is the
> wrong algorithm to use. Use something fast like crc32.

CRC32 is only useful for detecting line transmission errors on
relatively small blocks of data, and even then, it does poorly.

MD5 is also fairly quick.  I can compute MD5 checksums at roughly
10 MB/second on a 400 MHz processor.
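
If you want to check the numbers on your own hardware, something along
these lines should do. It is only a rough sketch: it assumes a modern
Python where MD5 lives in hashlib (rather than the old md5 module), and
the helper name throughput_mb_per_s is made up for illustration. The
figures depend entirely on the machine, so none are quoted here.

```python
import hashlib
import time
import zlib


def throughput_mb_per_s(func, data, repeats=5):
    """Return the approximate MB/second that func processes data at."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(data)
    # Guard against a zero interval on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (len(data) * repeats) / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # 16 MB of zero bytes; MD5 and CRC32 speed does not depend on content.
    data = bytes(16 * 1024 * 1024)
    md5_rate = throughput_mb_per_s(lambda d: hashlib.md5(d).digest(), data)
    crc_rate = throughput_mb_per_s(zlib.crc32, data)
    print("md5:   %.1f MB/s" % md5_rate)
    print("crc32: %.1f MB/s" % crc_rate)
```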


> If you really need it to be efficient, don't use Python. Use a native
> program like md5sum or sum or something.

Since the md5 module is implemented in C, the only slow parts are the
few lines of Python and perhaps the IO, though Python's IO speed is
comparable to C's.
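
For large files, the "few lines of Python" amount to reading the file in
chunks and feeding each chunk to the hash object, so the whole file
never has to sit in memory. A minimal sketch, assuming hashlib (the
modern home of the md5 module's functionality); the function name
md5_of_file and the 1 MB chunk size are my own choices, not anything
prescribed:

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)  # the hashing itself runs in C
    return digest.hexdigest()
```

Calling md5_of_file("some_large_file.iso") (a placeholder name) returns
the same hex string md5sum would print for that file.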


 - Josiah



