Extract information from HTML table

Dotan Cohen dotancohen at gmail.com
Sun Apr 1 15:54:35 EDT 2007


On 1 Apr 2007 07:56:04 -0700, Ulysse <maxime.p at gmail.com> wrote:
> I have seen the Beautiful Soup online help and tried to apply that to
> my problem. But it seems to be a little bit hard. I will rather try to
> do this with regular expressions...
>

If you think that Beautiful Soup is difficult, then wait till you try
to do this with regexes. Granted, knowing the exact format of the HTML
you are scraping will help, but if you ever need to parse HTML from an
unknown source, then Beautiful Soup is the way to go. Not all HTML
authors close their td and tr tags, and sometimes those tags carry
attributes. If you plan on ever reusing the code, or if the format of
the HTML may change, then you are best off sticking with Beautiful
Soup.
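
To make the point about unclosed tags and attributes concrete, here is a
rough sketch of parser-based extraction. Note that it uses the standard
library's html.parser module as a stand-in for Beautiful Soup, so that it
runs without installing anything. The names TableExtractor, extract_rows
and the sample table are made up for illustration, and nested tables are
not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect table cell text row by row, tolerating unclosed td/tr tags.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def _end_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _end_row(self):
        self._end_cell()
        if self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        # Attributes arrive already parsed in attrs, so <td class="x">
        # needs no special handling here.
        if tag == "tr":
            self._end_row()        # a new <tr> implicitly closes the old row
            self._row = []
        elif tag in ("td", "th"):
            if self._row is None:  # cell with no enclosing <tr>
                self._row = []
            self._end_cell()       # a new cell implicitly closes the old one
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._end_cell()
        elif tag in ("tr", "table"):
            self._end_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def close(self):
        super().close()
        self._end_row()            # flush a row left open at end of input


def extract_rows(html):
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# First row: unclosed td and tr tags, plus attributes. Second row: tidy.
sample = """
<table>
<tr><td class="name">Alice<td align="right">30
<tr><td>Bob</td><td>25</td></tr>
</table>
"""

print(extract_rows(sample))  # [['Alice', '30'], ['Bob', '25']]
```

A regex written against the tidy second row would most likely miss the
first one, whereas the parser treats both the same way. Beautiful Soup
gives you this kind of tolerance behind a much friendlier interface.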

Dotan Cohen

http://lyricslist.com/
http://what-is-what.com/


