SimplePrograms challenge

Steven Bethard steven.bethard at gmail.com
Thu Jun 14 00:00:43 EDT 2007


Rob Wolfe wrote:
> # HTML page
> dinner_recipe = '''
> <html><head><title>Recipe</title></head><body>
> <table>
> <tr><th>amt</th><th>unit</th><th>item</th></tr>
> <tr><td>24</td><td>slices</td><td>baguette</td></tr>
> <tr><td>2+</td><td>tbsp</td><td>olive_oil</td></tr>
> <tr><td>1</td><td>cup</td><td>tomatoes</td></tr>
> <tr><td>1-2</td><td>tbsp</td><td>garlic</td></tr>
> <tr><td>1/2</td><td>cup</td><td>Parmesan</td></tr>
> <tr><td>1</td><td>jar</td><td>pesto</td></tr>
> </table>
> </body></html>'''
> 
> # program
> import xml.etree.ElementTree as etree
> tree = etree.fromstring(dinner_recipe)
> 
> #import ElementSoup as etree                 # for invalid HTML
> #from cStringIO import StringIO              # use this
> #tree = etree.parse(StringIO(dinner_recipe)) # wrapper for BeautifulSoup
> 
> pantry = set(['olive_oil', 'pesto'])
> 
> for ingredient in tree.getiterator('tr'):
>     amt, unit, item = ingredient.getchildren()
>     if item.tag == "td" and item.text not in pantry:
>         print "%s: %s %s" % (item.text, amt.text, unit.text)

I posted a slight variant of this, trimmed down a bit to 21 lines.
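
For anyone running this on Python 3, here is a minimal sketch of the same idea. It is not the 21-line variant mentioned above, just an illustration under a few assumptions: the function name shopping_list is made up for this example, Element.iter() and list(row) stand in for getiterator() and getchildren() (both removed in Python 3.9), and the pantry entry is spelled 'olive_oil' so that it matches the item text in the table.

```python
import xml.etree.ElementTree as etree

# Same recipe table as in the quoted program.
dinner_recipe = '''
<html><head><title>Recipe</title></head><body>
<table>
<tr><th>amt</th><th>unit</th><th>item</th></tr>
<tr><td>24</td><td>slices</td><td>baguette</td></tr>
<tr><td>2+</td><td>tbsp</td><td>olive_oil</td></tr>
<tr><td>1</td><td>cup</td><td>tomatoes</td></tr>
<tr><td>1-2</td><td>tbsp</td><td>garlic</td></tr>
<tr><td>1/2</td><td>cup</td><td>Parmesan</td></tr>
<tr><td>1</td><td>jar</td><td>pesto</td></tr>
</table>
</body></html>'''


def shopping_list(html, pantry):
    """Return 'item: amt unit' strings for items not already in the pantry.

    Hypothetical helper for illustration; assumes every <tr> has exactly
    three cells, as in the table above.
    """
    tree = etree.fromstring(html)
    lines = []
    for row in tree.iter('tr'):          # iter() replaces getiterator()
        amt, unit, item = list(row)      # list(row) replaces getchildren()
        # The header row uses <th>, so checking for <td> skips it.
        if item.tag == 'td' and item.text not in pantry:
            lines.append("%s: %s %s" % (item.text, amt.text, unit.text))
    return lines


if __name__ == '__main__':
    # Pantry spelled to match the table's 'olive_oil' (an assumption).
    for line in shopping_list(dinner_recipe, {'olive_oil', 'pesto'}):
        print(line)
```

Returning a list instead of printing inside the loop is only a convenience so the result can be checked or reused; the parsing logic is otherwise the same as in the quoted program.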

STeVe


