Parsing of nested tags
Stefan Schwarzer
s.schwarzer at ndh.net
Fri Mar 3 18:03:02 EST 2000
Hello :-)
Some time ago I wrote one of the zillion programs that read some
kind of format file(s) and make HTML from them. The program is able to
convert <<I italic text>> to <I>italic text</I> or <<LINK link;text>> to
<A HREF="link">text</A>.
However, currently I can't convert <<LINK link;<<I italic text>>>> to
<A HREF="link"><I>italic text</I></A>. The relevant code is
-----8<---------------------------------------------------------------
######################################################################
# perform substitutions

import re
import string

# html_format is defined elsewhere in the program (not shown)

# <<link url;url_text>>, url_text defaults to url
link_pattern = re.compile( '(?si)<<link (.+?)(?:;(.*?))?>>' )

def make_link( matchobj ):
    url, url_text = matchobj.groups()
    if not url_text: # use url as url_text by default
        url_text = url
    url, url_text = map( string.strip, [ url, url_text ] )
    return string.join( (
        html_format.link_format[ 0 ],
        url,
        html_format.link_format[ 1 ],
        url_text,
        html_format.link_format[ 2 ] ), '' )

# convert some formatting in the text to legal HTML code
def make_html( text ):
    # order matters - conversion to links has to come first
    text = re.sub( link_pattern, make_link, text )
    text = re.sub( r'<<(\S+)\s(.*?)>>', r'<\1>\2</\1>', text )
    text = re.sub( r'(?i)<PROG>(.*?)</PROG>', r'<EM>\1</EM>', text )
    text = re.sub( r'(?i)<FILE>(.*?)</FILE>', r'<EM>\1</EM>', text )
    text = re.sub( r'(?i)<OPT>(.*?)</OPT>', r'<STRONG>\1</STRONG>', text )
    return text
-----8<---------------------------------------------------------------
Now the question: What is the best way to enable parsing of nested
tags, as in the example above?
So far I have thought of two ways. One may be to extend the regular
expression(s), but this is already cumbersome to read. The other
possibility would be to scan the string and replace <<...>> occurrences
which don't contain <<, perhaps multiple times, until all patterns are
substituted.
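A minimal sketch of the second idea, written for Python 3 rather than
the string-module style of the code above. It is only an illustration
under stated assumptions: the names LINK_FORMAT, ALIASES, render and
make_html_nested are made up here, and LINK_FORMAT merely stands in for
html_format.link_format, with values guessed from the
<A HREF="link">text</A> example. Instead of repeating a substitution
until nothing changes, it splits the text at every << and >> once and
keeps a stack of open tags, so each tag is rendered only after
everything nested inside it. This avoids one trap of repeated
substitution: once the inner tag has become <I>italic text</I>, the
outer tag ends in >>>, and a regular expression can no longer tell
which two > characters close it.

```python
import re

# Assumed stand-in for html_format.link_format (not shown in the post).
LINK_FORMAT = ('<A HREF="', '">', '</A>')

# Tag renames taken from the PROG/FILE/OPT substitutions in make_html.
ALIASES = {'PROG': 'EM', 'FILE': 'EM', 'OPT': 'STRONG'}


def render(body):
    """Convert the inside of one <<...>> to HTML.

    Nested tags inside body have already been rendered by the caller.
    """
    parts = body.split(None, 1)
    if len(parts) < 2:
        # No "TAG text" structure: put the markup back untouched.
        return '<<' + body + '>>'
    tag, rest = parts
    if tag.lower() == 'link':
        url, _, url_text = rest.partition(';')
        url = url.strip()
        url_text = url_text.strip() or url  # url_text defaults to url
        return (LINK_FORMAT[0] + url + LINK_FORMAT[1]
                + url_text + LINK_FORMAT[2])
    tag = ALIASES.get(tag.upper(), tag)
    return '<%s>%s</%s>' % (tag, rest, tag)


def make_html_nested(text):
    """Render <<...>> markup, innermost tags first, in a single pass."""
    stack = [[]]  # stack[-1] collects the pieces of the innermost open tag
    # The capturing group makes re.split keep the delimiters as tokens.
    for token in re.split(r'(<<|>>)', text):
        if token == '<<':
            stack.append([])
        elif token == '>>' and len(stack) > 1:
            body = ''.join(stack.pop())
            stack[-1].append(render(body))
        else:
            stack[-1].append(token)  # plain text, or a stray ">>"
    # Any "<<" that was never closed is put back literally.
    while len(stack) > 1:
        body = ''.join(stack.pop())
        stack[-1].append('<<' + body)
    return ''.join(stack[0])
```

With these assumptions, make_html_nested('<<LINK link;<<I italic text>>>>')
should give <A HREF="link"><I>italic text</I></A>, and unbalanced << or >>
are passed through unchanged. A literal > directly in front of a closing
>> in the source text remains ambiguous with this approach.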
I hope there is an easy way that I simply have overlooked 8-) .
Any suggestions are appreciated. Thank you in advance :) .
Stefan