regular expression

gardsted gardsted at yahoo.com
Mon Nov 19 05:26:01 EST 2007


The retarded cousin - that's me!

I keep getting confused by the caret - sometimes it works - sometimes it's better with backslash-n
Yes - retarded cousin, I guess.

The file format is a config-track for a multitrack recording software, which I need to automate a bit.
I can start it from the command line and have it create a remix (using various vst and other effects)
Sometimes, however, we may have deleted the 'guitar.wav' and thus have to leave
out that track from the config-file or the rendering won't work.

Since it seems 'whitespace matters' in the file, I have the following code (below) to get me a tag.
It cost me a broken cup and coffee all over the kitchen tiles - temper!

I still don't understand why I have to use \n instead of ^ at the start of TAGCONTENTS and TAGEND.
But I can live with it!
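
On the \n versus ^ question, here is a minimal sketch of what is probably going on. The sample string and the variable names are made up for illustration. In multiline mode both ^ and $ are zero-width: $ matches just in front of the newline without consuming it, so a ^ placed right after it is still sitting in front of that newline and cannot match. Something has to consume the newline first, which is what \n does.

```python
import re

text = "a\nb\nc"  # made-up three-line sample

# $ stops in front of the newline; ^ only matches right after one,
# so "$^" can never bridge two non-empty lines.
with_caret = re.compile(r'(?m)^a$^b$')

# Consuming the newline explicitly moves the match onto the next line.
with_newline = re.compile(r'(?m)^a$\nb$')

caret_match = with_caret.search(text)
newline_match = with_newline.search(text)

print(caret_match)                   # no match object
print(repr(newline_match.group(0)))  # the two lines joined by the newline
```

So inside one pattern that spans several lines, each line after the first needs the \n in front of it; ^ is only useful where the match starts.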

Thank you for your kind and humorous help!
kind retards
jorgen / de mente
www.myspace.com/dementedk
------------------------------------------------------------

import re

with open('003autoreaper.rpp') as f:
     TESTTXT = f.read() # whole file now

def getLevel(levl):
     rex = re.compile(
         r'(?m)'                                            # multiline
         r'(?P<TAGSTART>^ {%d}[<])'                         # the < character
         r'(?P<TAGNAME>[a-zA-Z0-9_]*)'                      # the tagname
         r'(?P<TAGDATA>[\S \t]*?$)'                         # the rest of the tagstart line
         r'(?P<TAGCONTENTS>(\n {%d}[^>][\S \t]*$){0,})'     # all the data coming before the >
         r'(?P<TAGEND>\n {%d}>[\S \t]*$)' %(levl,levl,levl) # the > character
         )
     return rex

# Walk the tags indented by 2 spaces, then look inside each TRACK
# for tags indented by 6 spaces (print() form, for Python 3).
for i in getLevel(2).finditer(TESTTXT):
     myMatch = i.groupdict()
     print(i.group('TAGNAME'), i.start('TAGSTART'), i.end('TAGEND'))
     #print(i.groups())
     if myMatch['TAGNAME'] == 'TRACK':
         #print(i.groups())
         # restrict the inner search to the span of this TRACK tag
         for j in getLevel(6).finditer(TESTTXT, i.start('TAGSTART'), i.end('TAGEND')):
             myMatch2 = j.groupdict()
             #print(j.groups())
             print(j.group('TAGNAME'), j.start('TAGSTART'), j.end('TAGEND'))
             if myMatch2['TAGNAME'] == 'SOURCE':
                 for m in myMatch2:
                     print(m, myMatch2[m])
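
To try the pattern without the real project file, here is a small sketch on a made-up sample. The sample layout (2 spaces per nesting level), the name get_level, and the step that drops a track are assumptions for illustration only. They are not the actual .rpp layout, and the real file may nest SOURCE deeper than this.

```python
import re

def get_level(levl):
    # Same pattern as above: a tag opened by '<' and closed by '>',
    # both indented by exactly `levl` spaces.
    return re.compile(
        r'(?m)'
        r'(?P<TAGSTART>^ {%d}[<])'
        r'(?P<TAGNAME>[a-zA-Z0-9_]*)'
        r'(?P<TAGDATA>[\S \t]*?$)'
        r'(?P<TAGCONTENTS>(\n {%d}[^>][\S \t]*$){0,})'
        r'(?P<TAGEND>\n {%d}>[\S \t]*$)' % (levl, levl, levl)
    )

# Hypothetical sample, not a real project file.
sample = (
    '<PROJECT\n'
    '  <TRACK\n'
    '    NAME guitar\n'
    '    <SOURCE WAVE\n'
    '      FILE "guitar.wav"\n'
    '    >\n'
    '  >\n'
    '  <TRACK\n'
    '    NAME drums\n'
    '  >\n'
    '>\n'
)

# Names of all tags indented by 2 spaces.
names = [m.group('TAGNAME') for m in get_level(2).finditer(sample)]

# Tags indented by 4 spaces inside the first level-2 tag only.
first = next(get_level(2).finditer(sample))
inner = [(m.group('TAGNAME'), m.group('TAGDATA'))
         for m in get_level(4).finditer(sample, first.start(), first.end())]

# Drop every level-2 tag whose text mentions the missing file.
# Note: this leaves an empty line where the tag used to be.
missing = 'guitar.wav'
kept = get_level(2).sub(
    lambda m: '' if missing in m.group(0) else m.group(0), sample)

print(names)
print(inner)
print(kept)
```

Whether the recording software tolerates the empty line left behind is untested here; if it does not, the newline in front of the removed tag would have to go as well.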



