Web-crawling
John J. Lee
jjl at pobox.com
Sat Oct 4 12:26:31 EDT 2003
"John Bradbury" <john_bradbury at ___cableinet.co.uk> writes:
> "Rene Pijlman" <reply.in.the.newsgroup at my.address.is.invalid> wrote in
> message news:bretnvcng69nqpoeug71jon4obs0moe63f at 4ax.com...
> > John Bradbury:
> > >I am trying to develop a special purpose crawler using htmllib & urllib.
> > >How do you tell the server application that you are a modern browser
> > >and can handle frames?
[...]
> > server would care, but you could mimic the User-agent header sent by a
[...]
> I don't know what is causing the problem, but the site I am accessing is
> sending out forms for a browser that has a low resolution and does not
> support frames. Excuse my ignorance, but where do you set up the
> User-agent header you suggested?
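For urllib2 (well, almost: the link is to the ClientCookie docs, but
ClientCookie follows the urllib2 interface), see the section on headers: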
http://wwwsearch.sourceforge.net/ClientCookie/doc.html#headers
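In case the link goes stale, here is a rough sketch of the idea rather than anything definitive. It assumes Python 3, where urllib2 lives on as urllib.request. The URL, the browser string, and the helper names make_request and make_opener are made up for illustration; substitute whatever User-agent the browser you want to mimic really sends.

```python
import urllib.request

# Hypothetical values, for illustration only.
URL = "http://www.example.com/"
USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def make_request(url, user_agent):
    """Build a single request that carries a browser-like User-agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def make_opener(user_agent):
    """Build an opener that sends the same User-agent on every request."""
    opener = urllib.request.build_opener()
    # Replace the default ("User-agent", "Python-urllib/x.y") entry.
    opener.addheaders = [("User-agent", user_agent)]
    return opener


if __name__ == "__main__":
    request = make_request(URL, USER_AGENT)
    # Inspect the header that would be sent, without touching the network.
    print(request.get_header("User-agent"))
    # To actually fetch the page (needs network access):
    # html = make_opener(USER_AGENT).open(URL).read()
```

The per-request form is handy for a one-off fetch; the opener form suits a crawler, since every page it opens then goes out with the same header. Whether the server really switches its output on User-agent is something you would have to check against the site in question.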
John
More information about the Python-list mailing list