Reading variable length records...
Brian Quinlan
BrianQ at ActiveState.com
Wed Sep 12 18:36:16 EDT 2001
> I'm trying to read records from a 2 GB datafile, but my brain has
> stopped working, so I was wondering if someone has already
> solved this problem. The records are variable length and are
> separated by a five character delimiter. I was trying to use
> file.read(n) with a blocksize of ~1 MB, but got a serious
> brainfart when trying to think of how to handle the case where
> only part of the delimiter was read in the current block.
Here is some pseudo-code to get you started:
data = ''
records = []
while 1:
    readData = datafile.read(size)
    if not readData:
        break
    data += readData
    partialRecords = data.split('12345')  # '12345' stands in for the delimiter
    records += partialRecords[:-1]  # Last record is incomplete
    data = partialRecords[-1]       # Keep the remainder for the next round
if data:
    # Hmmm, there is still data left over, probably bad
    pass
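The basic idea is that you use split to collect as many records as
possible and just keep the left-over partial record for the next
round. A delimiter that is cut in half by a block boundary stays in
that left-over piece, so it is matched once the next block has been
appended. Let me know if you need clarification.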
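
For a 2 GB file you may not want every record in one list. Here is a rough sketch of the same split-and-keep-the-tail idea written as a generator. The name read_records, its parameters, and the decision to treat trailing data as a final record are my own assumptions for illustration, not anything from your program:

```python
import io

def read_records(datafile, delimiter, blocksize=1024 * 1024):
    """Yield delimiter-separated records from datafile one at a time."""
    data = ''
    while True:
        block = datafile.read(blocksize)
        if not block:
            break
        data += block
        pieces = data.split(delimiter)
        # Every piece except the last was followed by a full delimiter,
        # so it is a complete record.
        for record in pieces[:-1]:
            yield record
        # The last piece may be a partial record, possibly ending in
        # part of a delimiter; carry it over to the next block.
        data = pieces[-1]
    if data:
        # Assumption: trailing data with no delimiter after it is the
        # final record. Raise an error here instead if that is invalid.
        yield data

# Tiny demo: a blocksize of 4 forces the delimiter to straddle blocks.
demo = io.StringIO('alpha12345beta12345gamma')
for record in read_records(demo, '12345', blocksize=4):
    print(record)
```

Only the current block plus one partial record is held in memory at a time, as long as individual records are not huge themselves.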
Cheers,
Brian