Passing environment variable to HTTP server

William Park parkw at better.net
Thu May 10 01:18:54 EDT 2001


On Tue, May 08, 2001 at 10:18:45PM -0400, Steve Holden wrote:
> "Doug Fort" <dougfort at downright.com> wrote in ...
> > William Park wrote:
> >
> > > When I send a request to a CGI script, the browser sends environment
> > > variables, such as HTTP_REFERER, HTTP_USER_AGENT, etc.  How can I
> > > modify the environment variables that are sent to the CGI script?
> >
> > To impersonate a browser we send the 'User-agent' header.  However
> > there's usually more to it than that: other custom headers or
> > specialized cookies.  I recommend capturing a browser session with
> > Ethereal http://www.ethereal.com/ and duplicating the headers
> > exactly.
> >
> > Note that even if you are impersonating a browser, you should adhere
> > to the site's robots.txt file and <meta> tags.  There is an excellent
> > Python module for checking robots.txt.
> 
> Since you already have Python, Sam Rushing's proxy server (found on
> www.nightmare.com) would be an excellent tool for capturing the dialog
> between client and server. Here's a quick example:

Thanks Steve.  I solved it using 'wget --header=...'.  The CGI script in
question checks the HTTP_USER_AGENT and HTTP_REFERER environment
variables, and responds differently depending on their values (courtesy of
Microsoft).  Python will be used to parse the HTML file, but there are
other tasks which still beg for a shell solution.
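
For anyone who would rather stay in Python, here is a rough sketch of the
same idea.  The web server builds HTTP_USER_AGENT and HTTP_REFERER from
the User-Agent and Referer request headers, so setting those headers on
the client side is enough.  This assumes a current Python 3 with
urllib.request; the function names, the URL and the header values are
made up for illustration and are not what I actually ran.

```python
import urllib.request


def build_request(url, user_agent, referer):
    """Return a Request carrying custom User-Agent and Referer headers.

    A CGI script on the server side sees these two headers as the
    HTTP_USER_AGENT and HTTP_REFERER environment variables.
    """
    headers = {
        "User-Agent": user_agent,
        "Referer": referer,
    }
    return urllib.request.Request(url, headers=headers)


def fetch(request):
    """Send the request and return the response body as bytes."""
    with urllib.request.urlopen(request) as response:
        return response.read()


# Hypothetical usage (placeholder URL and header values):
#   req = build_request("http://www.example.com/cgi-bin/script.cgi",
#                       "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)",
#                       "http://www.example.com/index.html")
#   html = fetch(req)
```

Keeping the request construction separate from the network call makes it
easy to inspect the headers before anything is sent.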

--William Park, Open Geometry Consulting, Mississauga, Ontario, Canada.
  8 CPU cluster, (Slackware) Linux, Python, LaTeX, vim, mutt



