SimplePrograms challenge

Steven Bethard steven.bethard at gmail.com
Tue Jun 12 16:34:41 EDT 2007


Rob Wolfe wrote:
> Steve Howell wrote:
>> Hi, I'm offering a challenge to extend the following
>> page by one good example:
>>
>> http://wiki.python.org/moin/SimplePrograms
> 
> What about simple HTML parsing? Strictly speaking, this is not a
> language concept, but it shows the power of the Python standard library.
> Besides, it's a very popular problem among newbies. This program,
> for example, shows all the linked URLs in an HTML document:
> 
> <code>
> from HTMLParser import HTMLParser

[Sorry if this comes twice, it didn't seem to be showing up]

I'd hate to steer a potential new Python developer to a clumsier library 
when Python 2.5 includes ElementTree::

     import xml.etree.ElementTree as etree

     page = '''
     <html><head><title>URLs</title></head>
     <body>
     <ul>
     <li><a href="http://domain1/page1">some page1</a></li>
     <li><a href="http://domain2/page2">some page2</a></li>
     </ul>
     </body></html>
     '''

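     # parse the page (it must be well-formed XML) into an element tree
     tree = etree.fromstring(page)
     # visit every <a> element and print its href attribute, if any
     for a_node in tree.getiterator('a'):
         url = a_node.get('href')
         if url is not None:
             print url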

I know that the wiki page is supposed to be Python 2.4 only, but I'd 
rather have no example than an outdated one.
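
On Python 3 the snippet above needs two small changes: ``print`` is a 
function, and ``getiterator()`` was removed in Python 3.9 in favour of 
``iter()``. A minimal sketch of the same idea, assuming the same sample 
page; the names ``PAGE`` and ``extract_links`` are hypothetical and only 
introduced here for illustration. Like the original, it assumes the input 
is well-formed XML, which much real-world HTML is not:

```python
import xml.etree.ElementTree as etree

# Same sample document as above; it must be well-formed XML.
PAGE = '''
<html><head><title>URLs</title></head>
<body>
<ul>
<li><a href="http://domain1/page1">some page1</a></li>
<li><a href="http://domain2/page2">some page2</a></li>
</ul>
</body></html>
'''


def extract_links(markup):
    """Return the href of every <a> element that has one.

    Hypothetical helper, not part of ElementTree itself.
    """
    tree = etree.fromstring(markup)
    # Element.iter() is the Python 3 replacement for getiterator().
    return [a.get('href') for a in tree.iter('a')
            if a.get('href') is not None]


if __name__ == '__main__':
    for url in extract_links(PAGE):
        print(url)
```

Returning a list instead of printing inside the loop keeps the parsing 
step separate from the output, so the result can be reused or checked.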

STeVe


