[Tutor] HTML Parsing

Andreas Kostyrka andreas at kostyrka.org
Mon Apr 21 20:10:25 CEST 2008


That works only if you have a well-formed XML document. In practice that is rather a big IF.
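
To illustrate the caveat, here is a minimal sketch that stays within the standard library. It assumes Python 3, where the tolerant parser lives in html.parser. The sample markup and the DtCollector helper are made up for illustration and are not from the thread. ElementTree rejects HTML with unclosed tags, while HTMLParser just reports the tags as it sees them:

```python
from html.parser import HTMLParser
from xml.etree import ElementTree

# Made-up sample of typical real-world HTML: <br> and <dt> are never closed,
# so this is valid-looking HTML but not well-formed XML.
messy = "<html><body><dl><dt>foo<br><dt>bar</dl></body></html>"

# ElementTree uses a strict XML parser and gives up on the mismatched tags.
try:
    ElementTree.XML(messy)
    parsed_as_xml = True
except ElementTree.ParseError:
    parsed_as_xml = False


class DtCollector(HTMLParser):
    """Hypothetical helper: gathers the text that follows each <dt> start tag."""

    def __init__(self):
        super().__init__()
        self.in_dt = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "dt":
            # A new <dt> implicitly ends the previous one.
            self.in_dt = True
            self.items.append("")
        elif tag == "dd":
            self.in_dt = False

    def handle_endtag(self, tag):
        if tag in ("dt", "dl"):
            self.in_dt = False

    def handle_data(self, data):
        if self.in_dt:
            self.items[-1] += data


collector = DtCollector()
collector.feed(messy)
collector.close()
dt_texts = [text.strip() for text in collector.items]
print(parsed_as_xml, dt_texts)
```

This only collects the text of each <dt>; anything fancier (nested markup, attributes) would need more state in the handler methods.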

Andreas

On Monday, 21.04.2008, at 10:35 -0700, Jeff Younker wrote:
> On Apr 21, 2008, at 6:40 AM, Stephen Nelson-Smith wrote:
> 
> > On 4/21/08, Andreas Kostyrka <andreas at kostyrka.org> wrote:
> > I want to stick with the standard library.
> >
> > How do you capture <dt> elements?
> 
> 
> from xml.etree import ElementTree
> 
> document = """
> <html>
>     <head>
>        <title>foo and bar</title>
>      </head>
>      <body>
>         <dt>foo</dt>
>         <dt>bar</dt>
>      </body>
> </html>
> """
> 
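> # './/dt' searches the whole tree; a bare 'dt' would match only direct
> # children of the root <html> element and return an empty list here.
> dt_elements = ElementTree.XML(document).findall('.//dt')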
> 
> -jeff