binary file compare...

Grant Edwards invalid at invalid
Mon Apr 13 16:37:19 EDT 2009


On 2009-04-13, Grant Edwards <invalid at invalid> wrote:
> On 2009-04-13, SpreadTooThin <bjobrien62 at gmail.com> wrote:
>
>> I want to compare two binary files and see if they are the same.
>> I see the filecmp.cmp function but I don't get a warm fuzzy feeling
>> that it is doing a byte by byte comparison of two files to see if they
>> are the same.
>
> Perhaps I'm being dim, but how else are you going to decide if
> two files are the same unless you compare the bytes in the
> files?
>
> You could hash them and compare the hashes, but that's a lot
> more work than just comparing the two byte streams.
>
>> What should I be using if not filecmp.cmp?
>
> I don't understand what you've got against comparing the files
> when you stated that what you wanted to do was compare the files.

Doh!  I misread your post and thought you weren't getting a
warm fuzzy feeling _because_ it was doing a byte-by-byte
compare. Now I'm a bit confused.  Are you under the impression
it's _not_ doing a byte-by-byte compare?  Here's the code from
filecmp:

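def _do_cmp(f1, f2):
    # BUFSIZE is a module-level constant defined in filecmp
    bufsize = BUFSIZE
    fp1 = open(f1, 'rb')
    fp2 = open(f2, 'rb')
    while True:
        # Read the next chunk from each file
        b1 = fp1.read(bufsize)
        b2 = fp2.read(bufsize)
        # Any differing chunk means the files differ
        if b1 != b2:
            return False
        # Both chunks empty: end of file reached with no difference
        if not b1:
            return True
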
It looks like a byte-by-byte comparison to me.  Note that when
this function is called, the file lengths have already been
compared and found to be equal.
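
One caveat: filecmp.cmp only reaches the loop above when it can't
settle the question from the os.stat() signatures.  With the default
shallow setting, two files whose type, size and modification time all
match are reported equal without being read.  If you want the contents
read every time, pass shallow=False.  A minimal sketch (the wrapper
name files_identical is just something I made up for illustration):

```python
import filecmp


def files_identical(path1, path2):
    """Return True if the two files have the same contents."""
    # shallow=False stops filecmp.cmp from returning True on matching
    # os.stat() signatures alone; the file contents get compared.
    return filecmp.cmp(path1, path2, shallow=False)
```

Calling files_identical('a.bin', 'b.bin') with two paths of your own
then returns True or False based on the bytes in the files.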

-- 
Grant Edwards                   grante             Yow! Alright, you!!
                                  at               Imitate a WOUNDED SEAL
                               visi.com            pleading for a PARKING
                                                   SPACE!!


