binary file compare...

SpreadTooThin bjobrien62 at gmail.com
Mon Apr 13 16:46:18 EDT 2009


On Apr 13, 2:37 pm, Grant Edwards <invalid at invalid> wrote:
> On 2009-04-13, Grant Edwards <invalid at invalid> wrote:
>
> > On 2009-04-13, SpreadTooThin <bjobrie... at gmail.com> wrote:
>
> >> I want to compare two binary files and see if they are the same.
> >> I see the filecmp.cmp function but I don't get a warm fuzzy feeling
> >> that it is doing a byte-by-byte comparison of two files to see if they
> >> are the same.
>
> > Perhaps I'm being dim, but how else are you going to decide if
> > two files are the same unless you compare the bytes in the
> > files?
>
> > You could hash them and compare the hashes, but that's a lot
> > more work than just comparing the two byte streams.
>
> >> What should I be using if not filecmp.cmp?
>
> > I don't understand what you've got against comparing the files
> > when you stated that what you wanted to do was compare the files.
>
> Doh!  I misread your post and thought you weren't getting a
> warm fuzzy feeling _because_ it was doing a byte-by-byte
> compare. Now I'm a bit confused.  Are you under the impression
> it's _not_ doing a byte-by-byte compare?  Here's the code:
>
> def _do_cmp(f1, f2):
>     bufsize = BUFSIZE
>     fp1 = open(f1, 'rb')
>     fp2 = open(f2, 'rb')
>     while True:
>         b1 = fp1.read(bufsize)
>         b2 = fp2.read(bufsize)
>         if b1 != b2:
>             return False
>         if not b1:
>             return True
>
> It looks like a byte-by-byte comparison to me.  Note that when
> this function is called the file lengths have already been
> compared and found to be equal.
>
> --
> Grant Edwards                   grante             Yow! Alright, you!!
>                                   at               Imitate a WOUNDED SEAL
>                                visi.com            pleading for a PARKING
>                                                    SPACE!!

I am indeed under the impression that it is not always doing a
byte-by-byte comparison. The documentation also states:
Compare the files named f1 and f2, returning True if they seem equal,
False otherwise.

That word... Seeeeem... makes me wonder.

Thanks for the code! :)
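
For anyone reading this later, here is a minimal sketch of what the
"seem" probably refers to. By default filecmp.cmp runs in shallow mode:
if the os.stat() signatures of the two files (file type, size and
modification time) are identical, the files are taken to be equal
without their contents being read. Passing shallow=False forces the
content comparison. The names files_identical and make_lookalike_pair
are hypothetical helpers made up for this illustration, and
filecmp.clear_cache() assumes Python 3.4 or later.

```python
import filecmp
import os


def files_identical(path_a, path_b):
    """Return True only if both files have the same contents."""
    # Drop any cached results so a stale comparison is never reused
    # (filecmp.clear_cache is an assumption: it needs Python 3.4+).
    filecmp.clear_cache()
    # shallow=False makes filecmp read and compare the file contents
    # instead of trusting matching os.stat() signatures.
    return filecmp.cmp(path_a, path_b, shallow=False)


def make_lookalike_pair(directory):
    """Hypothetical helper: write two files that differ in content
    but share the same size and modification time."""
    path_a = os.path.join(directory, "a.bin")
    path_b = os.path.join(directory, "b.bin")
    with open(path_a, "wb") as f:
        f.write(b"\x00\x01\x02\x03")
    with open(path_b, "wb") as f:
        f.write(b"\x00\x01\x02\xff")  # same length, last byte differs
    # Force identical access and modification times on both files.
    same_time = (1000000000, 1000000000)
    os.utime(path_a, same_time)
    os.utime(path_b, same_time)
    return path_a, path_b
```

With a pair built by make_lookalike_pair, a plain filecmp.cmp(a, b)
reports the files as equal because their signatures match, while
files_identical(a, b) reports them as different. So if the goal is to
know that the bytes are the same, shallow=False looks like the safer
call.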