Reading the first MB of a binary file

MRAB google at mrabarnett.plus.com
Sun Jan 25 12:05:00 EST 2009


Max Leason wrote:
 > Hi,
 >
 > I'm attempting to read the first MB of a binary file and then compute
 > an MD5 hash of it so that I can find the file later even if it has
 > been moved or renamed. These files are large (350-1400 MB) video files
 > that are often located on a different computer, and I figure that
 > there is a low risk of two files generating the same hash. The problem
 > occurs in the read call, which returns all \x00s. Any ideas why this is
 > happening?
 >
 > Code:
 > >>> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
 > b'\x00\x00\x00\x00\x00\x00....\x00'
 >
You're reading only the first 1024 bytes (1 KB), not the first MB. Perhaps
the first 1024 bytes of the file _are_ all zero!

Try reading a full MB and checking whether that is all zero too, e.g.:

>>> SIZE = 1024 ** 2
>>> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(SIZE) == b'\x00' * SIZE
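
If the first MB turns out not to be all zero, the hashing step you describe could look roughly like the sketch below. The function name head_md5 and the all-zero flag are my own inventions for illustration, not anything from your code. It reads at most the first MB, closes the file afterwards, and also reports whether everything it read was zero bytes.

```python
import hashlib

SIZE = 1024 ** 2  # 1 MB, same as in the check above


def head_md5(path, size=SIZE):
    """Return (md5 hex digest, all_zero) for the first `size` bytes of `path`.

    `head_md5` is a hypothetical helper name used only for this sketch.
    """
    # "rb" opens the file in binary mode so the bytes are read unmodified.
    with open(path, "rb") as f:
        data = f.read(size)  # may return fewer bytes if the file is shorter
    # any() over a bytes object iterates over ints; a zero byte is falsy.
    all_zero = not any(data)
    return hashlib.md5(data).hexdigest(), all_zero
```

You would call it as head_md5("Chuck.S01E01.HDTV.XViD-YesTV.avi"). If the flag comes back True, the digest is of no use for telling files apart, since every file that starts with a MB of zeros will give the same one.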


