Beautiful Soup Looping Extraction Question

Stefan Behnel stefan_ml at behnel.de
Tue Mar 25 14:50:12 EDT 2008


Hi,

Again, this is not a Beautiful Soup answer, but it is still a solution.

Tess wrote:
> Let's say I have a file that looks like file.html pasted below.
> 
> My goal is to extract all elements where the following is true: <p
> align="left"> and <div align="center">.

Using lxml:

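  from lxml import html

  # Parse the file with lxml's HTML parser.
  tree = html.parse("file.html")

  # Walk every element in document order and report the matching ones.
  for el in tree.iter():
      if el.tag == 'p' and el.get('align') == 'left':
          print(el.tag)
      elif el.tag == 'div' and el.get('align') == 'center':
          print(el.tag)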
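
The same selection can probably be written more compactly as one XPath expression. This is only a sketch: the SAMPLE markup and the find_aligned() helper are made up for illustration (SAMPLE stands in for the real file.html), and it uses html.fromstring() on a string so that it runs without any file on disk.

```python
from lxml import html

# Hypothetical sample markup standing in for file.html.
SAMPLE = """
<html><body>
  <p align="left">first</p>
  <p align="right">skipped</p>
  <div align="center">second</div>
  <div>skipped too</div>
</body></html>
"""

def find_aligned(root):
    """Return <p align="left"> and <div align="center"> elements.

    The XPath union (|) yields the matches in document order.
    """
    return root.xpath('//p[@align="left"] | //div[@align="center"]')

root = html.fromstring(SAMPLE)
matches = find_aligned(root)
for el in matches:
    print(el.tag, el.text_content())
```

For a real file, html.parse("file.html").getroot() should give an element that can be passed to find_aligned() in the same way.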

I assume that BS can do something similar, though.
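
For comparison, here is a rough sketch of what that might look like in Beautiful Soup. It assumes the newer bs4 package and its find_all() method rather than whatever version you have installed, and SAMPLE_HTML and wanted() are hypothetical names invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for file.html.
SAMPLE_HTML = """
<html><body>
  <p align="left">first</p>
  <p align="right">skipped</p>
  <div align="center">second</div>
  <div>skipped too</div>
</body></html>
"""

def wanted(tag):
    """True for <p align="left"> and <div align="center"> tags."""
    return ((tag.name == 'p' and tag.get('align') == 'left') or
            (tag.name == 'div' and tag.get('align') == 'center'))

# "html.parser" is the parser from the standard library.
soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

# find_all() accepts a function and keeps the tags for which it returns True.
found = soup.find_all(wanted)
for tag in found:
    print(tag.name, tag.get_text())
```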

Stefan


