[Tutor] regex help

ish_ling ish_ling at yahoo.com
Mon Feb 23 04:49:09 CET 2009


I have a string:

    'a b c<H d e f gH> h<H i j kH>'

I would like a regex that repeatedly matches every alphabetic letter lying between <H and [a-z]H>, that is, every letter inside the brackets except the last one before H>. I would like the following list of matches:

    ['d', 'e', 'f', 'i', 'j']

I do not want the 'g' or the 'k' matched. 

I have figured out how to do this in a multiple-step process, but I would like to do it in one step using only one regex (if possible). My multiple-step process first uses the regex

    '(?<=H )[a-z][^H]+(?!H)'

with re.findall() in order to find two strings 

    ['d e f ', 'i j ']

I can then use another regex to extract the letters out of the strings. But, as I said above, I would prefer to do this in one fell swoop.
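To make the multiple-step process concrete, here is a minimal sketch of it. The first pattern is the one quoted above; the second pattern ('[a-z]') and the function name two_step are only illustrative choices, since the post does not say which regex is used for the second step.

```python
import re

def two_step(s):
    """Two-pass extraction: find the bracketed runs, then pull out the letters."""
    # Step 1: the regex from the post. It grabs each run that follows 'H '
    # and stops before the last letter preceding the closing 'H'.
    chunks = re.findall(r'(?<=H )[a-z][^H]+(?!H)', s)
    # Step 2 (assumed): collect the individual lowercase letters in each run.
    return [letter for chunk in chunks for letter in re.findall(r'[a-z]', chunk)]

print(two_step('a b c<H d e f gH> h<H i j kH>'))
```

This relies on the syllables being single lowercase letters separated by spaces, as in the examples.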

 

Another example:

    'a b c<H dH>'

There should be no matches.

 

Last example:

    'a b c<H d eH>'

There should be one match:

    ['d'] 
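The three examples above can double as a quick check for any candidate pattern. Below is a rough sketch of such a check, together with one possible single-regex candidate based on a lookahead. The names CANDIDATE and find_inner are made up here, and the pattern assumes well-formed, non-nested <H ... H> groups with single lowercase letters separated by single spaces; it is a guess to test against, not a confirmed answer.

```python
import re

# Hypothetical candidate: a lowercase letter that is followed by a space and,
# further ahead, by a closing 'H>' that can be reached without crossing
# another '<' or '>'. It does not verify that an opening '<H' exists, so it
# assumes the brackets in the input are well formed.
CANDIDATE = re.compile(r'[a-z](?= [^<>]*H>)')

def find_inner(s):
    """Return the letters matched by the candidate pattern."""
    return CANDIDATE.findall(s)

# The three examples from the post, with the matches they should produce.
cases = [
    ('a b c<H d e f gH> h<H i j kH>', ['d', 'e', 'f', 'i', 'j']),
    ('a b c<H dH>', []),
    ('a b c<H d eH>', ['d']),
]

for text, expected in cases:
    result = find_inner(text)
    print(repr(text), result, 'ok' if result == expected else 'MISMATCH')
```

The letter just before H> is skipped because it is followed by 'H' rather than a space, and letters outside the brackets are skipped because a '<' or the end of the string comes before any 'H>'.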


(For background, although it's probably irrelevant, the string is a possible representation of a syllable (a, b, c, etc.) to tone (H) mapping in tonal languages.)

 
If anyone has ideas, I would greatly appreciate them.





      


