[Tutor] Reading Entire File

Michael P. Reilly arcege@shore.net
Tue, 13 Mar 2001 17:01:30 -0500 (EST)


> I want to read the entire contents of a file into a string and then compute
> the CRC using crc32() in module zlib. I am using the file method read() to
> read the file. It works as expected for text files, but for Microsoft Excel
> and Word files, it only reads a few characters. 
> How can I read the entire file? 
> 
> I have Windogs 95 and Python 2.0.

You can read the file a block at a time instead of all at once.  If,
for example, the file is 10 MB, reading it in one call needs at least
that much memory to hold it; it is more economical to read it in pieces
and feed each piece to crc32(), passing the running CRC as the second
argument.

>>> import zlib
>>> f = open('foo.dat', 'rb')
>>> whole_contents = f.read()
>>> crc_on_whole = zlib.crc32(whole_contents)
>>> f.seek(0)  # rewind to the beginning of the file

>>> block = f.read(8192)
>>> crc_by_block = 0
>>> while block:
...   crc_by_block = zlib.crc32(block, crc_by_block)
...   block = f.read(8192)
...
>>> crc_on_whole == crc_by_block
1
>>> crc_on_whole, crc_by_block
(395051047, 395051047)

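For CRC checks, always open the file in binary mode ('rb', as above) so
that you get every byte (everything that would be transmitted, if that
is the purpose).  On Windows, text mode translates line endings and
treats a Ctrl-Z byte (0x1A) as end of file, which is most likely why
read() stopped after a few characters of your Excel and Word files.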
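
The two points above (block reads and binary mode) could be wrapped up
in a small helper.  This is only a sketch, written for a current
Python 3 rather than Python 2.0; the name crc32_of_file and the
block_size parameter are my own inventions, not anything from zlib.

```python
import zlib


def crc32_of_file(path, block_size=8192):
    """Return the CRC-32 of the file at `path`, reading it in blocks.

    Hypothetical helper: the name and the block_size default are
    illustrative assumptions, not part of the zlib module.
    """
    crc = 0
    # Binary mode: no newline translation and no Ctrl-Z end-of-file
    # handling, so every byte in the file reaches crc32().
    with open(path, 'rb') as f:
        while True:
            block = f.read(block_size)
            if not block:  # empty bytes object means end of file
                break
            # Feed the running CRC back in as the starting value.
            crc = zlib.crc32(block, crc)
    # Mask to an unsigned 32-bit value.
    return crc & 0xffffffff
```

Calling crc32_of_file('foo.dat') should give the same value as running
crc32() over the whole contents at once, whatever block size is used.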

Good luck,
  -Arcege

-- 
------------------------------------------------------------------------
| Michael P. Reilly, Release Manager  | Email: arcege@shore.net        |
| Salem, Mass. USA  01970             |                                |
------------------------------------------------------------------------