Parsing Rdf (Rewrite)

Jerry Hill malaclypse2 at gmail.com
Thu May 31 09:34:51 EDT 2007


On 5/31/07, Brandon McGinty <brandon.mcginty at gmail.com> wrote:
> I would think that I could do:
> etexts=tree.findall('pgterms:etext')
> (or something like that), Which would pull out each etext record in the
> file.
> I could then do:
> for book in etexts:
>  print book.get('id')
> This isn't yielding anything for me, no matter how I write it.
> Any thoughts on this?

I know very little about ElementTree, but a bit of experimentation
shows that the following seems to work:

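import xml.etree.ElementTree as et

# ElementTree spells qualified names as {namespace-uri}localname
tree = et.parse("C:/temp/catalog.rdf")
root = tree.getroot()
# each pgterms:etext record is a direct child of the rdf:RDF root
etexts = root.findall("{http://www.gutenberg.org/rdfterms/}etext")
for book in etexts:
    # the record's identifier is its rdf:ID attribute
    print(book.get("{http://www.w3.org/1999/02/22-rdf-syntax-ns#}ID"))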

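As far as I can tell, 'pgterms:etext' finds nothing because ElementTree
stores each name with its full namespace URI, as {uri}localname, rather
than with the prefix used in the file. The same goes for attributes, so
'id' has to be spelled out as the rdf:ID name above. There are some
comments on namespace issues here, if that helps:
http://effbot.org/zone/element.htm#xml-namespaces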
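
If you want to try the lookup without the full catalog, here is a small
sketch. It assumes a reasonably recent Python 3 and the plain
xml.etree.ElementTree module. The two-record SAMPLE document is made up
for illustration (the real catalog.rdf is much larger), and the
etext_ids helper is just a name I picked, not part of any library.

```python
import xml.etree.ElementTree as ET

# Made-up miniature of catalog.rdf, for illustration only.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:pgterms="http://www.gutenberg.org/rdfterms/">
  <pgterms:etext rdf:ID="etext1"/>
  <pgterms:etext rdf:ID="etext2"/>
</rdf:RDF>"""

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PGTERMS = "http://www.gutenberg.org/rdfterms/"


def etext_ids(root):
    """Return the rdf:ID of every pgterms:etext child of root (hypothetical helper)."""
    # Both the tag and the attribute are looked up by {namespace-uri}localname.
    return [e.get("{%s}ID" % RDF) for e in root.findall("{%s}etext" % PGTERMS)]


root = ET.fromstring(SAMPLE)
ids = etext_ids(root)
print(ids)

# The stored tag carries the full URI, not the pgterms: prefix.
print(root[0].tag)

# Alternative spelling: keep the prefix and pass a prefix-to-URI mapping.
prefixed = root.findall("pgterms:etext", {"pgterms": PGTERMS})
print(len(prefixed))
```

Printing root[0].tag is a quick way to see the exact name ElementTree
expects when a findall() comes back empty.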

-- 
Jerry


