DOM text to xml aarrgghhh!!!!

Bengt Richter bokr at oz.net
Thu Jun 12 01:42:48 EDT 2003


On Wed, 11 Jun 2003 19:59:50 -0600, Steven Taschuk <staschuk at telusplanet.net> wrote:

>Quoth huntermorgan:
>  [...]
>> can anybody help???? below is the code and below that is a small
>> snippet of the course outline
>
>Excellent problem report!
>
>I was only able to duplicate your problem if I munged the tab
>characters in the text into spaces (which I did quite
>accidentally).  Without tabs,
>
>>     tagSequence = re.compile("(^\d+)\t+")
>
>never matches, so of course the document is empty.
>
>If tabs are present as the text of your note indicates, the result
>on my machine is not what you report -- an XML document with just
>a root node -- but an exception.  (A NameError, to be precise.)
>If that's fixed, there's an exception for having too many root
>elements in the XML document.  Since you don't report any of this,
>I assume you're seeing a tab-munging problem or some such.
>
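
To make the tab-munging point concrete, here is a minimal sketch. The sample line is made up (the OP's actual outline text isn't reproduced here); only the pattern comes from the quoted code, written as a raw string so the backslashes reach the regex engine untouched.

```python
import re

# Same pattern as in the OP's code.
tagSequence = re.compile(r"(^\d+)\t+")

# Hypothetical outline line: a number, a tab, then a title.
with_tabs = "101\tIntroduction"
# What the line looks like after an editor or mail client expands tabs.
with_spaces = with_tabs.expandtabs(8)

print(repr(tagSequence.search(with_tabs)))    # a match object
print(repr(tagSequence.search(with_spaces)))  # None: no tab left to match
```

If the second result is what shows up for every line, nothing ever gets appended and the document stays empty.
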
>A quick and dirty way to start figuring out what's wrong: add
>
>        s = line
>        print 'processing line:', repr(s)      # this
>        target = tagSequence.search(s)
>        print 'target is', repr(target)        # and this
>
>to the code and run it again.
>
>(I'm a bit surprised, btw, that
>    rootElement = newdocument.createElement("2003 Course Outline")
>works, since that's not a legal element name in XML.  You'll have
>trouble trying to parse this file.)
>
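A small sketch of why that works and then bites later, assuming xml.dom.minidom is the DOM in use (the variable names here are mine, not the OP's): createElement does not check the name, so the bad name goes straight into the serialized output, and the parser rejects it on the way back in.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

doc = minidom.Document()
# minidom accepts this name without complaint, although a digit-led,
# space-containing name is not legal XML.
root = doc.createElement("2003 Course Outline")
doc.appendChild(root)

serialized = doc.toxml()
print(serialized)

try:
    minidom.parseString(serialized)
    reparsed = True
except ExpatError as err:
    # The parser is stricter than the DOM builder.
    reparsed = False
    print("parse failed: %s" % err)
```
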
The OP might want to use legal element names for the document parts and put
the space-containing text (e.g. "2003 Course Outline"), stripped, into
attribute values under an appropriate attribute name. Is that too much
hinting beyond yours? ;-)
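
Something along these lines, as a rough sketch only. The element names "outline" and "course", the attribute names, and the course data are all invented for illustration; the OP would pick names that fit the real outline. Note that everything hangs off a single root element, which also sidesteps the too-many-roots exception.

```python
from xml.dom import minidom

doc = minidom.Document()

# Legal element name; the human-readable title moves into an attribute.
root = doc.createElement("outline")            # hypothetical name
root.setAttribute("title", "2003 Course Outline".strip())
doc.appendChild(root)                          # exactly one root element

# Each document part becomes a child of the root, not another root.
course = doc.createElement("course")           # hypothetical name
course.setAttribute("number", "101")           # made-up sample data
course.setAttribute("name", "  Introduction to Widgets ".strip())
root.appendChild(course)

xml_text = doc.toxml()
print(xml_text)

# Unlike the space-containing element name, this round-trips.
reparsed = minidom.parseString(xml_text)
print(reparsed.documentElement.getAttribute("title"))
```
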

Regards,
Bengt Richter



