[Tutor] regex help
ish_ling
ish_ling at yahoo.com
Mon Feb 23 04:49:09 CET 2009
I have a string:
'a b c<H d e f gH> h<H i j kH>'
I would like a regex that repeatedly matches every letter lying between <H and the last letter before H> (that is, before [a-z]H>). In other words, I would like the following list of matches:
['d', 'e', 'f', 'i', 'j']
I do not want the 'g' or the 'k' matched.
I have figured out how to do this in a multi-step process, but I would like to do it in one step using only one regex (if possible). My multi-step process is first to use the regex
'(?<=H )[a-z][^H]+(?!H)'
with re.findall() in order to find two strings
['d e f ', 'i j ']
I can then use another regex to extract the letters out of the strings. But, as I said above, I would prefer to do this in one fell swoop.
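For reference, the two-step process described above can be sketched roughly as follows. The names SPAN, LETTER and letters_before_tone are placeholders I made up for illustration, and the second regex ('[a-z]') is just one plausible choice for the extraction step.

```python
import re

# Step 1: find each run that starts right after "H " and stops
# before the last letter preceding the closing "H".
SPAN = re.compile(r'(?<=H )[a-z][^H]+(?!H)')

# Step 2 (assumed): pull the individual letters out of each run.
LETTER = re.compile(r'[a-z]')

def letters_before_tone(s):
    """Return the letters inside <H ...H>, excluding the last one in each group."""
    result = []
    for span in SPAN.findall(s):
        result.extend(LETTER.findall(span))
    return result

print(letters_before_tone('a b c<H d e f gH> h<H i j kH>'))
# ['d', 'e', 'f', 'i', 'j']
```

This gives the list I want, but it takes two passes over the data rather than one.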
Another example:
'a b c<H dH>'
There should be no matches.
Last example:
'a b c<H d eH>'
There should be one match:
['d']
(For background, although it's probably irrelevant, the string is a possible representation of a syllable (a, b, c, etc.) to tone (H) mapping in tonal languages.)
If anyone has ideas, then I would greatly appreciate it.