Pattern Matching

Eddie Corns eddie at holyrood.ed.ac.uk
Tue Jul 20 07:21:02 EDT 2004


"Greg Lindstrom" <greg.lindstrom at novasyshealth.com> writes:

>Hello-

>I'm running Python 2.2.3 on Windows XP "Professional" and am reading a file
>with 1 very long line of text (the line consists of multiple records with no
>cr/lf).  What I would like to do is scan for the occurrence of a specific
>pattern of characters which I expect to repeat many times in the file.
>Suppose I want to search for "Start: mm/dd/yyyy" and capture the mm/dd/yyyy
>data for processing each time I find it.  This is the type of problem I used
>to solve with <duck>Perl<\duck> in a former lifetime using regular
>expressions.  The following does not work, but is the flavor of what I want
>to do:

>long_line_of_text = 'Start: 1/1/2004 and some stuff.~Start: 2/3/2004 stuff.~Start 5/1/2004 morestuff.~'
>while re.match('Start:\ (\D?/\D?/\D+)', long_line_of_text):
>    # process the date string here which I hoped to catch in the parenthesis above.

>I'd like this to keep matching and processing the string as long as it keeps
>matching the pattern, bopping down the string as it goes.

>Another way to handle this is to replace all of the tildes with linefeeds
>(tildes are the end of segment marker), or split the records on the tilde
>and go from there.  I'd just like to know how I could do it with the regular
>expressions.

In addition to previous answers, a useful resource might be:
  http://gnosis.cx/TPiP/
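
For the regular-expression approach asked about in the question, here is a
minimal sketch. It is an illustration rather than anything taken from the
thread, and it rests on a few assumptions: it is written for a current Python,
the third sample record is given the colon after "Start" that the quoted data
lacks, and the names date_pattern and dates are made up for the example. Two
things differ from the quoted attempt: \d matches a digit (\D matches a
non-digit), and re.match only tries the pattern at the very start of the
string, so it cannot step along the line by itself. finditer scans the whole
string and yields one match object per occurrence.

```python
import re

# Sample data adapted from the question. Assumption: the third record
# gets a colon after "Start" so that all three records look alike.
long_line_of_text = ('Start: 1/1/2004 and some stuff.~'
                     'Start: 2/3/2004 stuff.~'
                     'Start: 5/1/2004 morestuff.~')

# One or two digits for month and day, four digits for the year.
# The parentheses capture just the date part of each occurrence.
date_pattern = re.compile(r'Start: (\d{1,2}/\d{1,2}/\d{4})')

# finditer walks the string left to right, returning every
# non-overlapping match; group(1) is the captured date.
dates = [m.group(1) for m in date_pattern.finditer(long_line_of_text)]

for date in dates:
    # Process each captured date string here.
    print(date)
```

If only the captured text is needed, date_pattern.findall(long_line_of_text)
gives the same list of strings directly.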


