[Tutor] Problem using lxml

Anthony Papillion papillion at gmail.com
Sat Aug 22 23:05:53 CEST 2015


Hello Everyone,

I'm pretty new to lxml but I pretty much thought I'd understood the basics.
However, for some reason, my first attempt at using it is failing miserably.

Here's the deal:

I'm parsing specific page on Craigslist (
http://joplin.craigslist.org/search/rea) and trying to retreive the text of
each link on that page. When I do an "inspect element" in Firefox, a sample
anchor link looks like this:

<a href="/reb/5185592209.html" data-id="5185592209" class="hdrlnk">FIRST
OPEN HOUSE TOMORROW 2:00pm-4:00pm!!! (8-23-15)</a>

The code I'm using to try to get the link text is this:

from lxml import html
import requests

page = requests.get("http://joplin.craigslist.org/search/rea")
titles = tree.xpath('//a[@title="hdrlnk"]/text()')
print titles

The last line, where it supposedly will print the text of each anchor
returns [].

I can't seem to figure out what I'm doing wrong. lmxml seems pretty
straightforward but I can't seem to get this down.

Can anyone make any suggestions?

Thanks!
Anthony


More information about the Tutor mailing list