os.stat - time format conversion + md5

Jeff Epler jepler at unpythonic.net
Tue Dec 17 11:33:13 EST 2002


On Tue, Dec 17, 2002 at 06:47:13AM -0800, DP wrote:
> The problem is, the files are large, so I'd prefer to not read the
> file, and then pass the contents of this file to md5 as a string. What
> are my options? Is there any other function (other than CRC) that will
> allow me to verify signatures on files?

If you're worried about holding the whole file in memory, use this:

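    import md5
    def file_md5(filename, chunksize=4096):
        # Binary mode, so the bytes are hashed exactly as stored on disk
        # (text mode would translate line endings on some platforms).
        f = open(filename, 'rb')
        m = md5.new()
        try:
            while 1:
                chunk = f.read(chunksize)
                if not chunk:
                    break    # empty read means end of file
                m.update(chunk)
        finally:
            f.close()
        return m
        # or return m.hexdigest() or return m.digest() or ..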

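It'll read the file chunksize bytes at a time, so memory use stays the
same however large the file is, and give you the md5 object for the
whole file.  Call its hexdigest() method to get the same string that
md5sum prints.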
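
For anyone reading this on a newer Python: the standalone md5 module was
later replaced by hashlib.  Here is a minimal sketch of the same chunked
loop under that assumption.  The name file_md5_hex is made up for this
example, and it returns the hex string rather than the hash object.

```python
import hashlib

def file_md5_hex(filename, chunksize=4096):
    """Return the hex MD5 digest of a file, reading chunksize bytes at a time."""
    m = hashlib.md5()
    # Binary mode: hash the bytes exactly as they are stored on disk.
    with open(filename, "rb") as f:
        while True:
            chunk = f.read(chunksize)
            if not chunk:
                break  # empty read means end of file
            m.update(chunk)
    return m.hexdigest()
```

Only one chunk is held in memory at a time, so the file size doesn't
matter.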

You could also use the "sha" module.  Some people believe that md5 has
(theoretical) weaknesses that sha doesn't, which may make it
significantly easier to deliberately create two files with the same
md5sum than the number of bits in the digest suggests.  The code would
be the same, except you'd import 'sha' and use 'sha.new' instead of
'md5.new'.
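
The swap described above can be sketched with the newer hashlib module,
where the algorithm is just a name passed to hashlib.new().  The
function names hash_file and verify_file are hypothetical, chosen for
this example, and "sha1" is only a default: practical collisions have
since been demonstrated for both MD5 and SHA-1, so something like
"sha256" is probably the safer choice when you're guarding against
deliberate tampering.

```python
import hashlib

def hash_file(filename, algorithm="sha1", chunksize=4096):
    """Return the hex digest of a file using the named hashlib algorithm."""
    h = hashlib.new(algorithm)
    with open(filename, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunksize), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_file(filename, expected_hex, algorithm="sha1"):
    """Return True if the file's digest matches the expected hex string."""
    return hash_file(filename, algorithm) == expected_hex.strip().lower()
```

To check a download you'd call something like
verify_file(path, published_digest, "sha256"), where published_digest is
whatever hex string the file's publisher gave you.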

Jeff



