[Tutor] Extracting xml text

Sun Jun 20 16:04:33 CEST 2010

Thanks all for your help.

I decided to go with iterparse, but trying the simple example in the Python
interpreter led to an error (see below). When I tried it with a much larger
XML sample, it seemed to print the full elements rather than the specific
values of each element. For example, given what I entered in the Python
interpreter, the result would have been the full XML example, and not
"Reminder" "Don't forget me this weekend".

Did I do something wrong in the sample below? Thanks again.

>>> from xml.etree.cElementTree import iterparse
>>> sample = '''\
... <note>
...     <to>Tove</to>
...     <from>Jani</from>
...     <heading>Reminder</heading>
...     <body>Don't forget me this weekend!</body>
... </note>
... '''
>>> print sample
<note>
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
</note>

>>> for event, elem in iterparse(sample):
...     if elem.tag == 'note':
...             print elem.findtext('heading'), elem.findtext('body')
...             elem.clear()
...
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 52, in __init__
IOError: [Errno 2] No such file or directory:
"<note>\n\t<to>Tove</to>\n\t<from>Jani</from>\n\t<heading>Reminder</heading>\n\t<body>Don't
forget me this weekend!</body>\n</note>\n"
>>>
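
A note on the traceback above: iterparse expects a filename or an open
file-like object as its first argument, so passing the XML text itself makes
it try to open a file with that name, hence the IOError. A minimal sketch of
one way around this, assuming the XML is already in memory, is to wrap it in
io.BytesIO. The helper name extract_note_fields is made up for illustration,
and the sketch imports from xml.etree.ElementTree (cElementTree, used above,
is the faster drop-in on Python 2.5 through 3.8, but it was removed in
Python 3.9).

```python
from io import BytesIO
from xml.etree.ElementTree import iterparse

# Same sample document as in the session above, held as bytes in memory.
sample = b"""\
<note>
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
</note>
"""


def extract_note_fields(xml_bytes):
    """Return a list of (heading, body) pairs, one per <note> element.

    Hypothetical helper for illustration only.
    """
    results = []
    # BytesIO gives iterparse the file-like object it expects.
    for event, elem in iterparse(BytesIO(xml_bytes)):
        # Only "end" events are reported by default, so the children of
        # <note> are already parsed when we see it here.
        if elem.tag == 'note':
            results.append((elem.findtext('heading'), elem.findtext('body')))
            elem.clear()  # free the subtree once it has been read
    return results


for heading, body in extract_note_fields(sample):
    print(heading)
    print(body)
```

For XML stored on disk, passing the filename or an open file object straight
to iterparse should work without the BytesIO wrapper.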

On Sun, Jun 20, 2010 at 4:32 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> Hi,
>
> please don't top-post, it makes your replies hard to read in context.
>
> Karim, 20.06.2010 10:24:
>
>> On 06/20/2010 10:14 AM, Stefan Behnel wrote:
>>
>>> Use ElementTree's iterparse:
>>>
>>> from xml.etree.cElementTree import iterparse
>>>
>>> [...]
>>
>> I know you are promoting Etree and I am very interested in it.
>> Is there any chance of having it integrated into a future standard
>> Python version?
>>
>
> The import above comes directly from the standard library (Python 2.5 and
> later). You may be referring to lxml.etree, which will most likely never
> make it into the stdlib.
>
>
> Stefan