check if files are the same on Windows

kyosohma at gmail.com
Mon Mar 19 13:35:26 EDT 2007


On Mar 19, 11:55 am, Shane Geiger <sgei... at ncee.net> wrote:
> In the unix world, 'fc' would be like diff.
>
> """
> Python example of checksumming files with the MD5 module.
>
> In Python 2.5, the hashlib module would be preferable/more elegant.
> """
>
> import md5
>
> import os
> def readfile(f, strip=False):
>     data = open(f, "rb").read()  # binary: no newline translation on Windows
>     if strip:
>         data = data.strip()
>     return data
> def writefile(f, data, perms=0750):  # 0750 is octal (Python 2 literal)
>     open(f, "wb").write(data)
>     os.chmod(f, perms)
>
> def get_md5(fname):
>     hash = md5.new()
>     contents = readfile(fname)
>     hash.update(contents)
>     value = hash.digest()
>     return (fname, hash.hexdigest())
>
> import glob
>
> for f in glob.glob('*'):
>     print get_md5(f)
>
> > A crude way to check if two files are the same on Windows is to look
> > at the output of the "fc" function of cmd.exe, for example
>
> > def files_same(f1,f2):
> >     cmnd    = "fc " + f1 + " " + f2
> >     return ("no differences" in popen(cmnd).read())
>
> > This is needlessly slow, because one can stop comparing two files
> > after the first difference is detected. How should one check that
> > files are the same in Python? The files are plain text.
>
> --
> Shane Geiger
> IT Director
> National Council on Economic Education
> sgei... at ncee.net  |  402-438-8958  |  http://www.ncee.net
>
> Leading the Campaign for Economic and Financial Literacy
>
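For what it's worth, the hashlib version mentioned in the quoted docstring might look roughly like the sketch below. It's written for a current Python 3 rather than the 2.x code quoted above, and the names file_md5, same_md5 and the blocksize default are just my own placeholders, not anything from Shane's post. It reads in binary chunks so a big file never has to fit in memory.

```python
import hashlib

def file_md5(path, blocksize=65536):
    """Return the hex MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    # Binary mode, so newline translation on Windows cannot change the bytes.
    with open(path, "rb") as f:
        while True:
            block = f.read(blocksize)
            if not block:
                break  # end of file
            digest.update(block)
    return digest.hexdigest()

def same_md5(path1, path2):
    """Hypothetical helper: True if both files have the same MD5 digest."""
    return file_md5(path1) == file_md5(path2)
```

Note that hashing always reads both files to the end, so it can't stop early at the first difference.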

You can also open both files in binary mode and use the file object's
"read" method to pull one block from each file per pass through a loop.

Something like:

file1 = open(path1, 'rb')
file2 = open(path2, 'rb')

bytes1 = file1.read(blocksize)
bytes2 = file2.read(blocksize)

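Then compare bytes1 and bytes2 on each pass. If they differ, the files
are different and you can break out of the loop right away. If both come
back empty, you've reached the end of both files without finding a
difference, so they're the same. I saw this concept in the book
Programming Python, 3rd Ed., by Mark Lutz.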
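
Put together, the loop could look something like the sketch below. It's written for a current Python 3. The function name is borrowed from the original question and the blocksize default is an arbitrary choice of mine. The size check up front is also my own shortcut, not something from the book.

```python
import os

def files_same(path1, path2, blocksize=65536):
    """Return True if the two files contain exactly the same bytes."""
    # Shortcut (an assumption of this sketch): files of different
    # sizes can never match, so there is no need to read them.
    if os.path.getsize(path1) != os.path.getsize(path2):
        return False
    with open(path1, "rb") as file1, open(path2, "rb") as file2:
        while True:
            bytes1 = file1.read(blocksize)
            bytes2 = file2.read(blocksize)
            if bytes1 != bytes2:
                return False  # first difference found, stop reading
            if not bytes1:
                return True   # both files ended with no difference
```

The standard library's filecmp.cmp(path1, path2, shallow=False) does much the same job, if you'd rather not roll your own.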

Have fun!

Mike
