how to scrape url out of href
Kent Johnson
kent at kentsjohnson.com
Mon Jan 2 08:59:38 EST 2006
homepricemaps at gmail.com wrote:
> mike's code worked like a charm. i have one more question. i have an
> href which looks like this:
>
> <td class="all">
> <a class="btn" name="D1" href="http://www.cnn.com">
> </a>
>
> i thought i would use this code to get the href out but it fails, gives
> me a keyerror:
>
> for incident in row('td', {'class':'all'}):
>     n = incident.findNextSibling('a', {'class': 'btn'})
>     link = incident.findNextSibling['href'] + "','"
>
>
> any idea what i'm doing wrong here with the syntax? thanks in advance
>
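ISTM that <a class="btn"> is a child of <td>, not a sibling, and
findNextSibling is a method, not an indexable element, so it has to be
called rather than subscripted. Calling a tag returns a list of matches,
so take the first one before asking for its href. Try

    n = incident('a', {'class': 'btn'})[0]
    link = n['href'] + "','"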
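
The parent/child point can also be checked without BeautifulSoup. The sketch below is not the BeautifulSoup API; it swaps in the standard library's html.parser to show the same idea, namely that the href is only found by looking inside the <td>. The names BtnLinkParser and extract_btn_links are made up for illustration, the closing </td> is added to the sample markup as an assumption, and the class match is a simple equality test that ignores multi-class attributes.

```python
from html.parser import HTMLParser

# Markup from the question; the closing </td> is an assumption added here.
HTML = ('<td class="all">'
        '<a class="btn" name="D1" href="http://www.cnn.com"></a>'
        '</td>')


class BtnLinkParser(HTMLParser):
    """Collect href values of <a class="btn"> nested in <td class="all">."""

    def __init__(self):
        super().__init__()
        self.td_stack = []  # one bool per open <td>: True if class="all"
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "td":
            self.td_stack.append(attrs.get("class") == "all")
        elif tag == "a" and any(self.td_stack) and attrs.get("class") == "btn":
            # .get() returns None instead of raising KeyError when href is missing
            href = attrs.get("href")
            if href is not None:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "td" and self.td_stack:
            self.td_stack.pop()


def extract_btn_links(html):
    """Return the hrefs of all matching anchors found inside matching cells."""
    parser = BtnLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


links = extract_btn_links(HTML)
print(links)  # ['http://www.cnn.com']
```

An anchor that sits after the cell instead of inside it is deliberately skipped, which mirrors the difference between searching a tag's children and searching its siblings.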
Kent