BUG? sha-module returns same crc for different files

Sat Sep 16 23:07:59 EDT 2000

Thomas Weholt wrote:
> 
> In article <3d7l8drbd7.fsf at kronos.cnri.reston.va.us>, Andrew Kuchling
> <akuchlin at mems-exchange.org> wrote:
> > Fascinating.  I'll bet that the problem is that you're not opening the
> > files in binary mode, so the .read() is hitting an EOF (byte 26) early
> > in both files, and this prefix is the same.  You can check this by doing
> > 'data1=open(filename1).read() ; data2=...' and then comparing data1 and
> > data2.
> >
> > In that case, the fix is to use open(filename1, 'rb').
> >
> > --amk
> >
> 
> Well, that didn't change much. :-<
> 
> d1 = open(filename,'r').read()
> d2 = open(filename,'rb').read()
> 
> Doing a len(d1) == len(d2) returns true, so to me it looks like both methods
> read equal amounts of data, and d1 == d2 is true too.

It seems very likely the file you are reading above does *not* contain
the ^Z/EOF/0x1A byte Andrew referred to.  In that case, the two methods
of reading the file *will* return the same thing.  It must be the other
file you were comparing that contains character 26.

The above will show different lengths or content only for a file on
which the following program prints 'Contains ^Z!'.  (Save it as test.py
and type "python test.py filename" at the DOS prompt.)

------------------
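import sys

# Read the whole file in binary mode so a ^Z byte cannot end the read early.
data = open(sys.argv[1], 'rb').read()

# Byte 26 (0x1A) is ^Z, the DOS end-of-file marker.
if b'\x1a' in data:
    print('Contains ^Z!')
else:
    print('No dice, mate...')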
------------------
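
To see why two different files can end up with the same digest, here is
a minimal sketch.  It assumes a current Python, with hashlib.sha1
standing in for the old sha module, and it only simulates the text-mode
truncation rather than relying on DOS behaviour.  The helper
simulate_text_mode_read and the two sample byte strings are made up for
illustration.

```python
import hashlib

CTRL_Z = b"\x1a"  # byte 26, treated as end-of-file by DOS text-mode reads


def simulate_text_mode_read(data):
    """Hypothetical helper: mimic a DOS text-mode read by cutting at the first ^Z."""
    head, _sep, _tail = data.partition(CTRL_Z)
    return head


def sha1_hex(data):
    """Return the SHA-1 digest of a byte string as hex."""
    return hashlib.sha1(data).hexdigest()


# Two made-up file contents that share a prefix and differ only after ^Z.
file_a = b"HEADER" + CTRL_Z + b"payload one"
file_b = b"HEADER" + CTRL_Z + b"payload two, longer"

# Both truncated reads yield b"HEADER", so their digests match.
same_when_truncated = (
    sha1_hex(simulate_text_mode_read(file_a))
    == sha1_hex(simulate_text_mode_read(file_b))
)

# Hashing the full binary contents tells the files apart.
same_in_binary = sha1_hex(file_a) == sha1_hex(file_b)

print("truncated digests equal:", same_when_truncated)
print("binary digests equal:", same_in_binary)
```

Feeding the digest from data read with open(filename, 'rb') avoids the
problem, because nothing after the ^Z is dropped.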

-- 
Peter Hansen