simple spider in python
Michael Bentley
michael at jedimindworks.com
Thu Aug 23 16:48:50 EDT 2007
On Aug 23, 2007, at 6:33 AM, gmcalendar at gmail.com wrote:
> Hi everybody, I'm new to the forum so: hello everybody (should I say
> "world"?) ^_^
> I'm trying to do a simple spider in Python which:
>
> 1) asks Google a query
> 2) parses the data
>
> I'm a Python newbie so *any* help would be very, very welcome.
> Thanks in advance!
The first thing to know is that Google doesn't like the User-Agent header
urllib2 sends by default, so you'll have to masquerade as a browser.
Google returns a 403 error if you connect as
'User-Agent: Python-urllib/2.5'; look into urllib2.build_opener() to send
a different header. The second thing to know is that the interesting
results have their class attribute set to "l".
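
Here is a minimal sketch of both points, not a finished spider. It is
written for Python 3, where urllib2.build_opener() lives at
urllib.request.build_opener(). The helper names (make_opener, fetch,
ResultLinkParser, extract_result_links), the User-Agent string, and the
sample HTML are made up for illustration. It also assumes the class="l"
attribute sits on <a> tags, and Google's markup may well differ from
this.

```python
from html.parser import HTMLParser
from urllib.request import build_opener


def make_opener(user_agent="Mozilla/5.0 (compatible; example-spider)"):
    """Build an opener that sends a browser-like User-Agent header.

    The default header (Python-urllib/x.y) is replaced, since that is
    the one reported to trigger the 403.
    """
    opener = build_opener()
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


def fetch(url):
    """Download a page and return it as text (needs network access)."""
    with make_opener().open(url) as response:
        return response.read().decode("utf-8", "replace")


class ResultLinkParser(HTMLParser):
    """Collect the href of every <a> tag whose class attribute is "l"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: result links are anchors with class="l".
        if tag == "a" and attrs.get("class") == "l" and "href" in attrs:
            self.links.append(attrs["href"])


def extract_result_links(html):
    """Return the result links found in an HTML string."""
    parser = ResultLinkParser()
    parser.feed(html)
    return parser.links


# Hypothetical sample standing in for a downloaded results page.
SAMPLE = (
    '<a class="l" href="http://example.com/a">A</a>'
    '<a href="http://example.com/skip">skip</a>'
)
print(extract_result_links(SAMPLE))
```

To try it against a live page, pass the text returned by fetch() to
extract_result_links() in place of SAMPLE.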
Hope this helps,
Michael
---
Asking a person who he *is* ... is not Pythonic! --Anton Vredegoor