binary file comparison with the md5 module

Christian Reyes christian at rocketnetwork.com
Wed Jun 13 14:22:01 EDT 2001


eureka!
after some more research i have discovered the very handy "filecmp" module.
problem solved.
cheers,
christian
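
for the archives, a minimal sketch of the filecmp approach, assuming a current Python 3 interpreter. files_match is a made-up helper name for illustration, not something filecmp provides. the one detail worth knowing is shallow=False: by default filecmp.cmp may treat two files as equal when their os.stat() signatures (type, size, modification time) match, so passing shallow=False asks for an actual comparison of the contents.

```python
import filecmp


def files_match(path_a, path_b):
    """Return True if the two files have identical contents.

    files_match is a hypothetical helper name used for illustration.
    """
    # shallow=False compares the file contents instead of trusting
    # matching os.stat() signatures (type, size, modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)
```

called as files_match('d:\\binary.wav', 'd:\\other.wav') (the second path is invented for the example), it should return True or False without the script ever having to read the files into a string itself.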

"Christian Reyes" <christian at rocketnetwork.com> wrote in message
news:9g8ahr$s6t$1 at bob.news.rcn.net...
> I'm trying to write a script that takes two binary files and returns
> whether or not their data is completely matching.
>
> One of my peers suggested that an efficient way to do this would be to run
> the md5 algorithm on each file and then compare the resultant output.
> Since md5 returns a practically unique 128-bit checksum of its input, this
> should theoretically work.
>
> The problem I'm having is with reading the binary file in as a string.
>
> I tried opening the file with the built-in Python open command, and then
> reading the contents of the file into a buffer.  But I think my problem is
> that when I read the binary file into a buffer, the contents get tweaked
> somehow.  I would expect the print statement to give me some huge string
> of gibberish, but instead what I get is 'RIFFnap', regardless of what size
> the file is.  I'll try to read in a 5 meg file and all I get when I try to
> print the buffer is some variation of 'RIFFxxx' (where xxx is any arbitrary
> set of 3 characters).
>
> >>> x = open('d:\\binary.wav')
> >>> buf = x.read()
> >>> print buf
> 'RIFFnap'
>
> Anyway, if any of you have a better suggestion for me, I'd really
> appreciate it.
>
> Basically all I'm looking for is an efficient method of comparing binary
> data files.
>
> Thanks for your time,
> christian
>
>
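
for anyone who still wants the md5 route described in the quoted message, here is a rough sketch, again assuming Python 3, where the old md5 module has been replaced by hashlib. md5_of_file, same_md5 and the 64 KB chunk size are my own choices for illustration. the file is opened with 'rb' so that no text-mode translation happens on Windows, and it is read in chunks so a large file never has to fit in memory at once. checking len(buf) is also a more reliable way than print to see how much data was actually read.

```python
import hashlib


def md5_of_file(path, chunk_size=64 * 1024):
    """Return the hex MD5 digest of the file at path.

    md5_of_file and chunk_size are hypothetical names for this sketch.
    """
    digest = hashlib.md5()
    # "rb" reads raw bytes: no newline translation, no text decoding.
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it
        # returns b"", which signals end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_md5(path_a, path_b):
    """Return True if both files hash to the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

matching digests make identical contents overwhelmingly likely rather than strictly guaranteed, and both files still have to be read in full, so a direct comparison such as filecmp is usually the simpler tool when both files are on hand.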




