binary file compare...

Martin martin at marcher.name
Wed Apr 15 01:54:20 EDT 2009


Hi,

On Mon, Apr 13, 2009 at 10:03 PM, Grant Edwards <invalid at invalid> wrote:
> On 2009-04-13, SpreadTooThin <bjobrien62 at gmail.com> wrote:
>
>> I want to compare two binary files and see if they are the same.
>> I see the filecmp.cmp function but I don't get a warm fuzzy feeling
>> that it is doing a byte by byte comparison of two files to see if they
>> are the same.
>
> Perhaps I'm being dim, but how else are you going to decide if
> two files are the same unless you compare the bytes in the
> files?

I'd say checksums; just about every download relies on checksums to
verify that you do indeed have the same file.
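
For the plain byte-by-byte route, here is a minimal sketch. As far as I
know, filecmp.cmp only trusts the os.stat() signatures (type, size,
mtime) when shallow is true, which is the default; passing shallow=False
makes it read and compare the contents. The helper name same_contents is
mine, not part of the library.

```python
import filecmp


def same_contents(path_a, path_b):
    """Return True if both files hold identical bytes (hypothetical helper)."""
    # shallow=False asks filecmp to compare the file contents rather than
    # accept two files as equal because their os.stat() signatures match.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

With the test files created below, same_contents("testfile.data",
"testfile3.data") should be true and the comparison against
testfile2.data should be false.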

>
> You could hash them and compare the hashes, but that's a lot
> more work than just comparing the two byte streams.

Hashing is not exactly much work; in its simplest form it's 2 lines per file.

$ dd if=/dev/urandom of=testfile.data bs=1M count=5
5+0 records in
5+0 records out
5242880 bytes (5.2 MB) copied, 1.4491 s, 3.6 MB/s
$ dd if=/dev/urandom of=testfile2.data bs=1M count=5
5+0 records in
5+0 records out
5242880 bytes (5.2 MB) copied, 1.92479 s, 2.7 MB/s
$ cp testfile.data testfile3.data
$ python
Python 2.5.4 (r254:67916, Feb 17 2009, 20:16:45)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> sha = hashlib.sha256()
>>> sha.update(open("testfile.data", "rb").read())
>>> sha.hexdigest()
'a0a8b5d1fd7b8181e0131fff8fd6acce39917e4498c86704354221fd96815797'
>>> sha2=hashlib.sha256()
>>> sha2.update(open("testfile2.data", "rb").read())
>>> sha2.hexdigest()
'25597380f833f287e8dad936b15ddb616669102c38f54dbd60ce57998d99ad3b'
>>> sha3=hashlib.sha256()
>>> sha3.update(open("testfile3.data", "rb").read())
>>> sha3.hexdigest()
'a0a8b5d1fd7b8181e0131fff8fd6acce39917e4498c86704354221fd96815797'
>>> sha.hexdigest() == sha2.hexdigest()
False
>>> sha.hexdigest() == sha3.hexdigest()
True
>>> sha2.hexdigest() == sha3.hexdigest()
False
>>>
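
The session above reads each file into memory in one go, which is fine
for 5 MB but not for really large files. A sketch of the same idea that
feeds the hash in chunks follows. The names sha256_of_file, same_digest
and the chunk_size default are my own choices, and the code assumes a
Python version where the "with" statement is available without a
__future__ import (2.6 or later).

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    # Binary mode, so no newline translation touches the data.
    with open(path, "rb") as handle:
        # Feed the hash one chunk at a time so the whole file never has
        # to fit in memory; chunk_size is an arbitrary assumption.
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def same_digest(path_a, path_b):
    """Return True if both files produce the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Equal digests make identical contents overwhelmingly likely rather than
strictly proven, and hashing has to read both files to the end, whereas
a direct comparison can stop at the first differing block.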



-- 
http://soup.alt.delete.co.at
http://www.xing.com/profile/Martin_Marcher
http://www.linkedin.com/in/martinmarcher

You are not free to read this message,
by doing so, you have violated my licence
and are required to urinate publicly. Thank you.

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html


