Two dimensional regexp matching?

Paddy paddy3118 at tiscali.co.uk
Sat Jul 27 07:36:03 EDT 2002


We already have the re module for regular expression matching on a string.

I am looking for pointers to references/algorithms for regular expression matching for
files of tabular data, i.e.

     Table definition
     ================
     1) Samples from one point in the system appear in one column of the table.
     2) Samples are encoded as characters.
     3) All points in the system are sampled at the same time to produce successive
        rows of the table.

So a system sampled successively at two points may produce the following file:

     GH
     DF
     AS
     QW
     FF
     SD
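
As a rough sketch of how such a table could be handed to the existing re module: if the
rows are transposed, each point becomes an ordinary string indexed by sample time. The
io.StringIO object below is only a stand-in for the real file, and the names `rows`,
`columns` and `hits` are illustrative, not part of any library.

```python
import io
import re

# Stand-in for the real file: the sample table, one row per line.
sample_file = io.StringIO("GH\nDF\nAS\nQW\nFF\nSD\n")

# Read the rows, dropping line endings and any blank lines.
rows = [line.rstrip("\n") for line in sample_file if line.strip()]

# Transpose: columns[i] holds every sample taken at point i+1, in time order.
columns = ["".join(chars) for chars in zip(*rows)]

# Sample times (row indices) at which point1 is D or G.
hits = [m.start() for m in re.finditer(r"[DG]", columns[0])]

print(columns)
print(hits)
```

This only covers searches confined to a single point; conditions that relate one point
to another need something more.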

I want to be able to do regular-expression-style searches within the file. Things like:
  Where can I find point1 == (D or G), then point2 == W within three samples, and where the
next sample of point2 != the earlier sample of point1?

That was a small example; in reality there are usually hundreds of points and tens of
thousands of samples in multi-megabyte files, but I'd first like to see if anyone else has
considered this kind of 'two dimensional regexp matching'.

Note: I DO NOT have queries in the date on sample points. The queries will always be "Find
the range of sample times in which 'this' occurs".
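
One possible way to express the example query with the plain re module, offered only as
a sketch: because every row has the same width, the rows can be joined into one string
and `.{n}` used to step across columns and rows. The helpers `row` and `find_ranges` are
hypothetical names, and "within three samples" is assumed here to mean one to three rows
later.

```python
import re

def row(col, cls, width):
    """Regex for one whole row in which column `col` matches `cls`."""
    return ".{%d}%s.{%d}" % (col, cls, width - col - 1)

def find_ranges(rows):
    """Return (first_row, last_row) for each match of the example query.

    Assumes every row has the same width, point1 is column 0 and
    point2 is column 1.
    """
    width = len(rows[0])
    flat = "".join(rows)  # fixed-width rows, no separators
    pattern = re.compile(
        "(?="                                # lookahead, so matches may overlap
        + row(0, "([DG])", width)            # row t: point1 is D or G
        + "((?:.{%d}){0,2})" % width         # zero to two rows in between
        + row(1, "W", width)                 # row u: point2 is W
        + row(1, r"(?!\1).", width)          # row u+1: point2 != point1 at row t
        + ")"
    )
    ranges = []
    for m in pattern.finditer(flat):
        if m.start() % width:                # ignore matches not aligned to a row
            continue
        first = m.start() // width
        gap_rows = len(m.group(2)) // width
        ranges.append((first, first + gap_rows + 2))
    return ranges

print(find_ranges(["GH", "DF", "AS", "QW", "FF", "SD"]))
```

Each tuple gives the sample time at which point1 matched and the sample time of the last
row the condition looks at. Whether this scales to hundreds of points and tens of
thousands of samples has not been measured here.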

I have tried Google but without success - I don't know enough to think of a suitable
search phrase, or (much less likely) Google doesn't have it ;-)


Thanks in advance, Paddy.




More information about the Python-list mailing list