Having trouble with some lists in BeautifulSoup

Alexnb alexnbryan at gmail.com
Wed Jul 16 17:50:05 EDT 2008


Okay, what I want to do with this code is go to thesaurus.reference.com,
search for a word, and get the synonyms for it. I can get the synonyms,
but they are still in HTML form and some are hyperlinks, and I can't get the
contents out. I am not that familiar with BeautifulSoup, so I would
appreciate it if anyone could look over this code (if you run it, it will
make a lot more sense) and help me out.

Side note: if you run it, a list object will print, and what I am after is
the part that starts:

<td colspan="2" widht="100%">american...

Here's the code:

import urllib
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 (Python 2)

class defSyn:
    def __init__(self, word):
        self.word = word

        def get_syn(term):
            # Fetch and parse the thesaurus results page for the term.
            url = ('http://thesaurus.reference.com/search?q=%s'
                   % urllib.quote(term))
            soup = BeautifulSoup(urllib.urlopen(url))
            print soup.prettify()  # debug: dump the whole parsed page

            # Yield the colspan="2" cells of each full-width table.
            for tabs in soup.findAll('table', {'width': '100%'}):
                yield tabs.findAll('td', {'colspan': '2'})

        # One entry per table; each entry is a list of <td> tags.
        self.mainList = list(get_syn(self.word))
        print self.mainList[2]
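
To make clearer what I mean by "getting the contents out", here is a rough
sketch of the result I am after. It does not use BeautifulSoup at all: it
uses the standard library's html.parser (Python 3) as a stand-in, and it runs
on a made-up HTML snippet, not the real page. The names SAMPLE_HTML, CellText
and synonyms_from are invented for illustration. In BeautifulSoup 3 I believe
the matching step would be something like ''.join(td.findAll(text=True)) on
each cell, but I have not got that working yet.

```python
from html.parser import HTMLParser

# Made-up sample markup (an assumption, not the real thesaurus page).
SAMPLE_HTML = (
    '<table width="100%"><tr>'
    '<td colspan="2">happy, <a href="/search?q=glad">glad</a>, '
    '<a href="/search?q=cheerful">cheerful</a></td>'
    '</tr></table>'
)


class CellText(HTMLParser):
    """Collect the plain text of every <td colspan="2"> cell."""

    def __init__(self):
        super().__init__()
        self.cells = []      # one text string per matching cell
        self._parts = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        # Start collecting when a matching cell opens.
        if tag == 'td' and dict(attrs).get('colspan') == '2':
            self._parts = []

    def handle_data(self, data):
        # Keep text whether it sits directly in the cell or inside a link.
        if self._parts is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        # Close the cell and store its text with the tags stripped.
        if tag == 'td' and self._parts is not None:
            self.cells.append(''.join(self._parts))
            self._parts = None


def synonyms_from(html):
    """Return the comma-separated words found in the matching cells."""
    parser = CellText()
    parser.feed(html)
    parser.close()
    words = []
    for cell in parser.cells:
        words.extend(w.strip() for w in cell.split(',') if w.strip())
    return words


print(synonyms_from(SAMPLE_HTML))  # prints ['happy', 'glad', 'cheerful']
```

So the hyperlinks and the plain words should both end up as plain strings in
one flat list. This sketch does not handle a matching cell nested inside
another one.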


If you have any questions, I would be happy to answer.
-- 
View this message in context: http://www.nabble.com/Having-trouble-with-some-lists-in-BeautifulSoup-tp18497409p18497409.html
Sent from the Python - python-list mailing list archive at Nabble.com.