[Tutor] multiline regular expressions on endless data stream

Duncan Gibson duncan at thermal.esa.int
Wed Apr 28 12:32:23 EDT 2004


I have a Perl utility which needs to be rewritten in something readable :-)

What I'm currently doing is:
	read a line into a string
	if line contains start of regular expression marker
		read more lines until end of regular expression marker
		join lines into string
		extract data from string
	repeat
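
As a rough Python sketch of that loop (the BEGIN/END markers, the record
layout and the name records() are all made up for illustration, not the
real data format):

```python
import re

# Hypothetical markers and record layout, invented for this illustration.
START = "BEGIN"
END = "END"
RECORD = re.compile(r"BEGIN\s+(\w+)\s+(\d+)\s+END")

def records(lines):
    """Yield (name, value) tuples from any iterable of lines."""
    lines = iter(lines)
    for line in lines:
        if START not in line:
            continue                      # discard lines before a record
        parts = [line]
        # Read more lines until the end marker shows up (or input runs out).
        while END not in parts[-1]:
            nxt = next(lines, None)
            if nxt is None:
                break
            parts.append(nxt)
        # Join the collected lines and extract the data in one go.
        match = RECORD.search("".join(parts))
        if match:
            yield match.group(1), int(match.group(2))
```

Even in this small form the marker tests duplicate knowledge that is
already in the regular expression, which is where the mess comes from.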

Unfortunately this piecemeal approach to matching the regular expression
makes the logic very messy, which in Perl means that it's very, very messy.

Is there a simpler way of doing this in Python without just duplicating
the messy logic? I've been searching through the books and archives but
haven't quite seen what I want, which is more along the lines of:

	discard lines until multiline regular expression matches
	extract data from regular expression
	repeat
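
One possible shape for this, as a sketch only: keep a bounded buffer of
recent lines and search the joined buffer after each new line. The
pattern, the name scan() and the max_lines limit are assumptions for
illustration. It also assumes the pattern ends in an explicit marker, so
that a match cannot be found too early on a half-arrived record.

```python
import re

# Hypothetical multiline pattern, invented for this illustration.
RECORD = re.compile(r"BEGIN\s+(\w+)\s+(\d+)\s+END")

def scan(lines, pattern=RECORD, max_lines=50):
    """Yield match objects for a pattern that may span several lines.

    max_lines is an assumed upper bound on how many lines one record
    can cover; older unmatched lines are thrown away so the buffer
    stays small even on an endless stream.
    """
    pending = []                          # lines not yet consumed by a match
    for line in lines:
        pending.append(line)
        text = "".join(pending)
        pos = 0
        for match in pattern.finditer(text):
            yield match
            pos = match.end()
        if pos:
            pending = [text[pos:]]        # keep only what follows the last match
        elif len(pending) > max_lines:
            del pending[:-max_lines]      # discard the oldest unmatched lines
```

Because a file object is itself an iterable of lines, something like
"for m in scan(sys.stdin): print(m.groups())" would run against a pipe
without ever holding more than max_lines lines in memory.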

All the examples either read a line and try to match within the line, or
read the whole file and ignore end of line when matching. I can't slurp
the whole data into memory using read() because it might be a data stream
rather than a fixed length file.

The regular expression is relatively simple. It's the buffering around
it, and having to write a full parser/scanner, that I'm trying to avoid.

Does anyone have any hints or tips?
Will I kick myself for overlooking the obvious?
		
Cheers
Duncan




