Having trouble with some lists in BeautifulSoup

John Nagle nagle at animats.com
Fri Jul 18 19:29:43 EDT 2008


Alexnb wrote:
> Okay, what I want to do with this code is to go to thesaurus.reference.com
> and then search for a word and get the syns for it. Now, I can get the syns,
> but they are still in html form and some are hyperlinks. But I can't get the
> contents out. I am not that familiar with BeautifulSoup. So if anyone wants
> to look over this code (if you run it, it will make a lot more sense) and
> maybe help me out.

    The thesaurus site may become annoyed if you send it too many
automated queries like this.

    However, it's not hard to do.  Search the output for
an "a" tag with class "noline", then extract the text content
of the "a" tag.  The BeautifulSoup manual will tell you how.
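
    A minimal sketch of that approach follows. It assumes the modern
"bs4" package (BeautifulSoup 4) rather than the BeautifulSoup 3 module
current when this was posted. The HTML snippet and the function name
extract_synonyms are made up for illustration; the real page markup
may differ.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Hypothetical stand-in for a results page; not the site's actual markup.
SAMPLE_HTML = """
<td>
  <a class="noline" href="/browse/glad">glad</a>,
  <a class="noline" href="/browse/cheerful">cheerful</a>,
  <a href="/help">help</a>
</td>
"""


def extract_synonyms(html):
    """Return the text of every <a> tag whose class is "noline"."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ (with the trailing underscore) filters on the CSS class.
    links = soup.find_all("a", class_="noline")
    # get_text() drops the markup and keeps only the text content.
    return [link.get_text(strip=True) for link in links]


if __name__ == "__main__":
    print(extract_synonyms(SAMPLE_HTML))
```

    The link without the "noline" class is skipped, so only the
synonym entries come back as plain strings.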

    If you want raw thesaurus data you can use freely, see 
"http://wordnet.princeton.edu".

					John Nagle


