DOM text

Robert Kern rkern at ucsd.edu
Fri Aug 26 06:10:25 EDT 2005


Richard Lewis wrote:
> Hello Pythoners,
> 
> I'm currently writing some Python to manipulate a semi-structured XML
> document. I'm using DOM (minidom) and I've got working code for
> transforming the document to HTML files and for adding the 'structured'
> elements which populate the higher regions of the tree (i.e. near the
> root).
> 
> What I have to do next is write some code for working with the 'less
> structured' elements towards the 'leaf ends' of the tree. These are
> rather like little sub-documents and contain a mixture of text with
> inline formatting (for links, font styles, headings, paragraphs etc.)
> and objects (images, media files etc.).

You might find that the more Pythonic XML modules are better suited to
handling mixed content. I've been using lxml and ElementTree quite
successfully. Amara should also be particularly well suited to handling
mixed content, but I haven't used it in anger yet.
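
To give a rough idea of what mixed content looks like in ElementTree, here
is a minimal sketch. It assumes the ElementTree API as shipped in the
standard library (xml.etree.ElementTree). The sample fragment, its element
names, and the helper iter_mixed are made up for illustration and are not
taken from your document.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element and attribute names are illustrative only.
SAMPLE = (
    '<para>See <link href="a.html">this page</link> '
    'for <em>more</em> details.</para>'
)


def iter_mixed(elem):
    """Yield (kind, value) pairs for elem's direct content, in document order.

    ElementTree stores the text before the first child in elem.text and the
    text following each child in that child's .tail attribute.
    """
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        yield ("element", child)
        if child.tail:
            yield ("text", child.tail)


para = ET.fromstring(SAMPLE)

# Replace each child element by its tag name so the result is easy to inspect.
parts = [
    (kind, value if kind == "text" else value.tag)
    for kind, value in iter_mixed(para)
]

# itertext() walks the whole subtree and yields every text chunk in order.
flat = "".join(para.itertext())

print(parts)
print(flat)
```

The point is that there are no separate text nodes to juggle as in minidom:
inline text hangs off the elements themselves via .text and .tail, which
tends to keep code for this kind of sub-document short.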

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter




More information about the Python-list mailing list