python screen scraping/parsing

Paul Boddie paul at boddie.org.uk
Fri Jun 13 17:31:54 EDT 2008


On 13 Jun, 23:09, "bruce" <bedoug... at earthlink.net> wrote:
>
> Thanks for the reply. Came to the same conclusion a few minutes before I saw
> your email.
>
> Another question:
>
> tr=d.xpath(foo)
>
> gets me an array of nodes.
>
> is there a way for me to then iterate through the node tr[x] to see if a
> child node exists???

You can either walk the DOM or run another XPath query on the node:

  for node in tr[x].childNodes:
    pass  # do something with node

  for node in tr[x].xpath(some_other_query_inside_tr):
    pass  # do something with node
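
To make the "does a child node exist" check concrete, here is a minimal
sketch. It uses the standard library's xml.dom.minidom as a stand-in for
libxml2dom, purely so that it runs without extra packages; it relies only
on the standard DOM childNodes attribute. The sample markup and the helper
name has_child_element are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical sample markup, standing in for the scraped page.
SAMPLE = (
    "<table>"
    "<tr><td>one</td><td><a href='x'>two</a></td></tr>"
    "<tr></tr>"
    "</table>"
)

def has_child_element(node, name):
    """Return True if node has a direct child element with the given tag name."""
    for child in node.childNodes:
        # Skip text nodes and the like; only element nodes have tag names.
        if child.nodeType == child.ELEMENT_NODE and child.nodeName == name:
            return True
    return False

doc = minidom.parseString(SAMPLE)
# Plays the role of "tr = d.xpath(foo)" in the question.
rows = doc.getElementsByTagName("tr")

for i, row in enumerate(rows):
    print(i, has_child_element(row, "td"))
```

The loop never converts a row back to a string: each row is already a
node, so its children can be inspected directly.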

> "d" is a document object, while "tr" would be a node object?, or would i
> convert the "tr[x]" to a string, and then feed that into the
> libxml2dom.parseString()...

There's no need to parse anything again: just use the methods on the
object that tr[x] produces, including the xpath method, of course.
Remember that the document object is just a special node object, so
most of the methods are available on both. If in doubt, run your
program using Python's -i option (python -i yourscript.py) and then
inspect the objects at the interactive prompt, for example with
dir(tr[0]).
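
As a rough illustration of querying from a node rather than reparsing,
here is a sketch using the standard library's xml.etree.ElementTree
instead of libxml2dom (a substitution made only so the example is
self-contained; ElementTree's findall supports just a subset of XPath,
and its API differs from libxml2dom's xpath method). The sample markup
is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, standing in for the scraped page.
SAMPLE = (
    "<table>"
    "<tr><td>one</td><td><a href='x'>two</a></td></tr>"
    "<tr><td>three</td></tr>"
    "</table>"
)

root = ET.fromstring(SAMPLE)

# First query: collect the row elements, like "tr = d.xpath(foo)".
rows = root.findall(".//tr")

# Second query: run from each row element itself. The row is already
# an element object, so nothing is serialised or parsed again.
link_counts = []
for row in rows:
    links = row.findall(".//a")
    link_counts.append(len(links))

print(link_counts)
```

The same pattern applies with libxml2dom: the result of the first query
is a list of nodes, and each node accepts further queries of its own.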

Paul


