Looping through a file a block of text at a time not by line

Wed Jun 14 03:25:57 EDT 2006

Rosario Morgan wrote:
> Hello
>
> Help is greatly appreciated in advance.
>
> I need to loop through a file 6000 bytes at a time.  I was going to
> use the following but do not know how to advance through the file 6000
> bytes at a time.
>
> file = open('hotels.xml')
> block = file.read(6000)
> newblock = re.sub(re.compile(r'<Rate.*?></Rate>'),'',block)
> print newblock
>
> I cannot use readlines because the file is 138MB all on one line.
>
> Suggestions?
>
> -Rosario

There is probably a terser way to do this, but this seems to work:

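import os

offset = 0
grab_size = 6000
file_size = os.path.getsize('hotels.xml')  # total file size in bytes
f = open('hotels.xml', 'r')

# Walk through the file one grab_size block at a time.
while offset < file_size:
    f.seek(offset)  # read() already advances the position; kept to be explicit
    data_block = f.read(grab_size)  # the final block may be shorter
    offset += grab_size
    print(data_block)
f.close()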
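
One thing the loop above does not cover is the substitution from the original question: a <Rate ...></Rate> element can start in one block and end in the next, so running re.sub on each block separately would miss it. Below is a minimal sketch of one way to deal with that, written for a current Python 3. The names read_blocks, strip_rates, RATE_RE and OPEN are made up for this illustration; the pattern is the one from the question. The idea is to hold back any unfinished '<Rate' at the end of a block and prepend it to the next one. It assumes reasonably well-formed input; an element that is never closed is simply buffered until the end of the file and emitted unchanged.

```python
import re

RATE_RE = re.compile(r'<Rate.*?></Rate>')  # pattern taken from the question
OPEN = '<Rate'                             # literal start of that pattern


def read_blocks(path, block_size=6000):
    """Yield successive blocks of at most block_size characters."""
    with open(path, 'r') as f:
        while True:
            block = f.read(block_size)
            if not block:  # an empty string means end of file
                break
            yield block


def strip_rates(blocks):
    """Yield the text from blocks with every RATE_RE match removed."""
    pending = ''  # unfinished tail carried over from the previous block
    for block in blocks:
        text = RATE_RE.sub('', pending + block)
        # Any '<Rate' still present has no closing '></Rate>' yet.
        cut = text.find(OPEN)
        if cut == -1:
            cut = len(text)
            # The block may also end in a partial '<Rate', e.g. '<Ra'.
            for k in range(len(OPEN) - 1, 0, -1):
                if text.endswith(OPEN[:k]):
                    cut = len(text) - k
                    break
        pending = text[cut:]
        if cut:
            yield text[:cut]
    if pending:  # whatever never got closed is emitted as-is
        yield pending
```

Usage would look something like this, writing the cleaned text to a second file instead of printing 138MB to the terminal (the output filename is only an example):

    with open('hotels_clean.xml', 'w') as out:
        for piece in strip_rates(read_blocks('hotels.xml')):
            out.write(piece)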