SGML to Python memory tree
François Pinard
pinard at iro.umontreal.ca
Wed May 17 08:03:21 EDT 2000
Hi, gang. Now that I have started sharing little pieces of code, here is another one :-).
For the Translation Project, I have some Python code that reads `nsgmls'
output into a memory tree. Each element becomes a tuple holding the
lower-cased element name, followed by its text strings and child tuples.
It does not process attributes, as I did not have any in my little
application. This code is surprisingly short, given what it does. It had
to work with Python 1.5.1, which is why it works around the missing
`LIST.pop()'.
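
import os, string, sys

def _(text):
    # Placeholder for a gettext-style translation function.
    return text

def read_sgml_file(name):
    stack = []
    current = []
    # Avoid docbk30, which raises some unanalysed interference.
    for line in os.popen('SGML_CATALOG_FILES= nsgmls %s' % name).readlines():
        if line[0] == '(':
            # Element start: save the parent and begin a new node.
            stack.append(current)
            current = [string.lower(line[1:-1])]
            continue
        if line[0] == ')':
            # Element end: freeze the node and attach it to its parent.
            element = tuple(current)
            current = stack[-1]
            del stack[-1]
            current.append(element)
            continue
        if line[0] == '-':
            # Character data: decode the newline and tab escapes.
            line = line[1:-1]
            line = string.replace(line, '\\n', '\n')
            line = string.replace(line, '\\011', '\t')
            line = string.rstrip(line)
            current.append(line)
            continue
        if line[0] == 'C':
            # Conformance marker: the document parsed cleanly.
            return current[0]
    # No `C' line was seen (implicitly returns None).
    sys.stderr.write(_("SGML in `%s' is not conformant.\n") % name)
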
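
For readers on a current Python, here is a minimal sketch of the same stack technique, not the code above. It makes several assumptions: it targets Python 3, it takes already-read output lines instead of running `nsgmls' through a pipe, and it raises ValueError where the original writes to stderr. The names parse_esis and sample are hypothetical, and the sample lines are hand-written for illustration rather than captured from a real run.

```python
def parse_esis(lines):
    """Build a nested-tuple tree from nsgmls-style output lines.

    Like the original, this ignores attributes and any other line types.
    """
    stack = []
    current = []
    for line in lines:
        line = line.rstrip('\n')
        if not line:
            continue
        kind, rest = line[0], line[1:]
        if kind == '(':
            # Element start: save the parent and begin a new node.
            stack.append(current)
            current = [rest.lower()]
        elif kind == ')':
            # Element end: freeze the node and attach it to its parent.
            element = tuple(current)
            current = stack.pop()
            current.append(element)
        elif kind == '-':
            # Character data: decode the newline and tab escapes.
            text = rest.replace('\\n', '\n').replace('\\011', '\t')
            current.append(text.rstrip())
        elif kind == 'C':
            # Conformance marker: return the root element.
            return current[0]
    # Assumption: raise instead of writing a message to stderr.
    raise ValueError('input is not conformant')


# Hand-written sample input (an assumption, for illustration only).
sample = [
    '(DOC\n',
    '(TITLE\n',
    '-Hello\n',
    ')TITLE\n',
    '(PARA\n',
    '-First line\\nsecond line\n',
    ')PARA\n',
    ')DOC\n',
    'C\n',
]
tree = parse_esis(sample)
print(tree)
# → ('doc', ('title', 'Hello'), ('para', 'First line\nsecond line'))
```

The first item of each tuple is the element name, so tree[0] gives 'doc' and tree[1:] gives its children in document order.
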
--
François Pinard http://www.iro.umontreal.ca/~pinard