[Tutor] multiline regular expressions on endless data stream
Duncan Gibson
duncan at thermal.esa.int
Wed Apr 28 12:32:23 EDT 2004
I have a Perl utility which needs to be rewritten in something readable :-)
What I'm currently doing is:
    read a line into a string
    if line contains start of regular expression marker
        read more lines until end of regular expression marker
        join lines into string
        extract data from string
    repeat
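
For illustration, the loop above might look roughly like this in Python. It is only a minimal sketch, not a translation of the Perl utility: the BEGIN and END markers, the RECORD pattern and the extract_records name are all hypothetical, chosen just to make the example runnable.

```python
import re

# Hypothetical markers and pattern, invented for this illustration only.
START = "BEGIN"
END = "END"
RECORD = re.compile(r"BEGIN\s+(\w+)\s+(\d+)\s+END")


def extract_records(lines):
    """Yield (name, value) pairs from an iterable of lines."""
    lines = iter(lines)
    for line in lines:
        if START not in line:
            continue                      # discard lines outside a record
        parts = [line]
        # read more lines until one of them contains the end marker
        while END not in parts[-1]:
            nxt = next(lines, None)
            if nxt is None:               # input ended in the middle of a record
                return
            parts.append(nxt)
        # join the collected lines and extract the data from the string
        match = RECORD.search("".join(parts))
        if match:
            yield match.group(1), int(match.group(2))
```

Even in this small form, the marker checks and the regular expression describe the same record twice, which is where the messiness comes from.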
Unfortunately this piecemeal approach to matching the regular expression
makes the logic very messy, which in Perl means that it's very, very messy.
Is there a simpler way of doing this in Python without just duplicating
the messy logic? I've been searching through the books and archives but
haven't quite seen what I want, which is more along the lines of:
    discard lines until multiline regular expression matches
    extract data from regular expression
    repeat
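
One possible shape for that loop is a generator that keeps a bounded buffer of recent lines and searches it with the compiled pattern. This is a sketch under stated assumptions rather than a known idiom from the standard library: the scan name, the max_lines bound (an assumed upper limit on how many lines one match can span) and the RECORD pattern are all hypothetical.

```python
import re

# Hypothetical pattern, invented for this illustration only.
# \s already matches newlines, so no extra flags are needed here.
RECORD = re.compile(r"BEGIN\s+(\w+)\s+(\d+)\s+END")


def scan(lines, pattern, max_lines=50):
    """Yield match objects for a multiline pattern over an iterable of lines.

    Only the most recent lines are kept, so memory use stays bounded
    even if the input never ends.
    """
    buf = []
    for line in lines:
        buf.append(line)
        text = "".join(buf)
        pos = 0
        for match in pattern.finditer(text):
            yield match
            pos = match.end()
        if pos:
            # something matched: keep only the unmatched tail
            tail = text[pos:]
            buf = [tail] if tail else []
        elif len(buf) > max_lines:
            # nothing matched: discard the oldest buffered line
            del buf[0]
```

Because it only iterates over lines, it could be driven by an open file or by sys.stdin, for example "for m in scan(sys.stdin, RECORD): print(m.groups())". Two caveats: the buffer is searched again after every line, which is simple but not fast, and a match is reported as soon as the pattern succeeds, so this suits patterns with a clear end marker rather than ones that could keep growing as more lines arrive.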
All the examples either read a line and try to match within the line, or
read the whole file and ignore end of line when matching. I can't slurp
all the data into memory using read() because it might be a data stream
rather than a fixed-length file.
The regular expression is relatively simple. It's just the buffering
around it and writing a full parser/scanner that I'm trying to avoid.
Does anyone have any hints or tips?
Will I kick myself for overlooking the obvious?
Cheers
Duncan