[XML-SIG] DOCTYPE problem loading XML file.

Brendon Costa brendon at christian.net
Mon Apr 16 00:27:16 CEST 2007


Thanks, that worked great (with a few minor modifications). For reference,
the resulting script was:

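import sys
import amara
import commands

# Parse the DocBook file named on the command line.
doc = amara.parse(sys.argv[1])
for pl in doc.xml_xpath(u'//programlisting[@id]'):
   # An id of the form "script_<command>" names the command to run.
   if pl.id[:7] == 'script_':
      value = commands.getoutput(unicode(pl.id[7:]))
      # Replace the listing's content with the command's output.
      pl.xml_clear()
      pl.xml_append(unicode(value))

print doc.xml()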
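
For anyone without Amara, here is a minimal sketch of the same replacement using only the Python 3 standard library. It is not the script I ran. The names `fill_listings` and `run` are made up for illustration, and it assumes `xml.dom.minidom` (which, as far as I know, does not fetch the external DTD and writes the DOCTYPE back out) and `subprocess.getoutput` as the stand-in for `commands.getoutput`. Entity references defined only in the DTD may not survive the round trip.

```python
import subprocess
from xml.dom import minidom

PREFIX = "script_"  # ids of the form "script_<command>" name the command to run


def fill_listings(xml_text, run=subprocess.getoutput):
    """Return xml_text with each script_* programlisting filled with command output.

    `run` is a hypothetical hook: any callable taking the command string
    and returning its output as text.
    """
    doc = minidom.parseString(xml_text)
    for pl in doc.getElementsByTagName("programlisting"):
        ident = pl.getAttribute("id")
        if not ident.startswith(PREFIX):
            continue
        # Remove the existing children of the listing.
        while pl.firstChild is not None:
            pl.removeChild(pl.firstChild).unlink()
        # Insert the command output as a single text node.
        pl.appendChild(doc.createTextNode(run(ident[len(PREFIX):])))
    return doc.toxml()
```

Something like `print(fill_listings(open(sys.argv[1]).read()))` would then mirror the script above.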








Luis Miguel Morillas wrote:
> 2007/4/14, Brendon Costa <brendon at christian.net>:
>> Hi all,
>>
>> I am writing a manual in DocBook format for a project I have been
>> developing. The manual contains "programlisting" nodes that show
>> output generated by some scripts.
>>
>> I want to write a small application using Python XML libraries that will
>> load this DocBook file and, for each programlisting node whose id
>> starts with script_..., execute the script ... and replace the
>> programlisting node's value with the resulting output.
>>
>>
> Try this quick example (using the Amara library):
> 
> {{{
> import sys
> import cStringIO
> import amara
> doc = amara.parse('doc.xml')
> 
> fout_old = sys.stdout
> sys.stdout = cStringIO.StringIO()
> for pl in doc.xml_xpath(u'//programlisting[@id]'):
>    if pl.id[:7]=='script_':
>        exec(unicode(pl))
>        pl.xml_clear()
>        pl.xml_append_fragment(sys.stdout.getvalue())
> sys.stdout = fout_old
> 
> print doc.xml()
> }}}
> 
> 
> 
>>
>> Firstly does anyone know of an existing tool that could do this for me
>> (I haven't been successful in finding one)?
>>
>>
>>
>>
>> Otherwise, I have been trying to create my own tool in Python. The first
>> stage is loading the DocBook XML file into Python using the DOM
>> parser. This is my first time dealing with Python and XML.
>>
>> So far the code is VERY simple:
>>
>> import sys
>> from xml.dom.ext.reader import Sax2
>> reader = Sax2.Reader()
>> doc = reader.fromStream(sys.argv[1])
>>
>> Running that using:
>> python update_docbook.py manual.xml
>>
>> fails to load the manual.xml file. The XML file has a DOCTYPE. For my
>> purposes in modifying the document I don't care about the DOCTYPE; I
>> just want to keep it intact as it is. Is there any way to tell the DOM
>> parser that I don't care about the DOCTYPE?
>>
>>
>> If this is not possible, here are the errors I get when trying to load
>> the DocBook XML file.
>>
>> First, without a DTD available at all:
>> ValueError: unknown url type: docbookx.dtd
>>
>>
>> If I then copy my DTD files into the current directory (the DOCTYPE
>> currently references a file in the current directory, to avoid going
>> to the internet every time), the parser seems to find the DTD as I
>> would expect, but there are still other errors:
>> xml.Sax._exceptions.SAXParseException: dbnotnx.mod:60:80: error in
>> processing external entity reference
>>
>> If I change the DOCTYPE back to the correct URL, I get the same
>> error, this time reported against the URL:
>> xml.Sax._exceptions.SAXParseException:
>> http://www.oasis-open.org/docbook/xml/4.5/dbnotnx.mod:60:80: error in
>> processing external entity reference
>>
>>
>> So how would I go about loading this DocBook XML file in Python using
>> DOM so that I can then manipulate it? Would you recommend switching to
>> a SAX parser, and if so, can it be used to ignore the DOCTYPE?
>>
>>
>> Thanks for any info.
>> Brendon.
>>
>>
>>
>> _______________________________________________
>> XML-SIG maillist  -  XML-SIG at python.org
>> http://mail.python.org/mailman/listinfo/xml-sig
>>
> 
> 
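
On the SAX question in the quoted message: a minimal sketch, assuming the Python 3 standard library's expat-based SAX driver, of parsing a document whose DOCTYPE points at a DTD that is not available. The handler class `ListingIds` and the function `listing_ids` are hypothetical names. The two `setFeature` calls ask the parser not to load external entities; recent Python versions already default to that for general entities, so the calls mainly make the intent explicit.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class ListingIds(ContentHandler):
    """Collect the id attribute of every programlisting element that has one."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        if name == "programlisting" and "id" in attrs:
            self.ids.append(attrs["id"])


def listing_ids(xml_bytes):
    """Parse xml_bytes without fetching the external DTD; return the ids found."""
    parser = xml.sax.make_parser()
    # Do not load external general or parameter entities (this covers the DTD).
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    handler = ListingIds()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.ids
```

This only reads the document. Writing it back out with the DOCTYPE intact would need extra work on the SAX side, which is one reason a DOM-style approach was more convenient here.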


