[Tutor] HTML Parsing

Jeff Younker jeff at drinktomi.com
Mon Apr 21 19:35:09 CEST 2008


On Apr 21, 2008, at 6:40 AM, Stephen Nelson-Smith wrote:

> On 4/21/08, Andreas Kostyrka <andreas at kostyrka.org> wrote:
> I want to stick with standard library.
>
> How do you capture <dt> elements?


from xml.etree import ElementTree

document = """
<html>
    <head>
       <title>foo and bar</title>
     </head>
     <body>
        <dt>foo</dt>
        <dt>bar</dt>
     </body>
</html>
"""

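# './/dt' searches the whole tree; a bare 'dt' would match only direct
# children of <html> and return an empty list here.
dt_elements = ElementTree.XML(document).findall('.//dt')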
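
To get at the text inside each <dt>, something like the following should work. This is only a sketch and it assumes the page is well-formed XML; real-world HTML often is not, in which case ElementTree will raise a parse error. The sample document and the names root, dt_texts and direct_children are just for illustration.

```python
from xml.etree import ElementTree

# Illustrative sample document; it must be well-formed XML for this parser.
document = """
<html>
    <head>
        <title>foo and bar</title>
    </head>
    <body>
        <dt>foo</dt>
        <dt>bar</dt>
    </body>
</html>
"""

root = ElementTree.XML(document)

# './/dt' matches <dt> elements at any depth below the root element.
dt_elements = root.findall('.//dt')
dt_texts = [element.text for element in dt_elements]

# A bare 'dt' only looks at direct children of <html>, so it finds
# nothing here because the <dt> elements live under <body>.
direct_children = root.findall('dt')

print(dt_texts)
print(direct_children)
```

If you know exactly where the elements live, an explicit path such as 'body/dt' is another option; './/dt' is the more forgiving choice when the nesting varies.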

-jeff


