Monitoring outgoing web requests

Alan Kennedy alanmk at hotmail.com
Sat May 31 09:10:12 EDT 2003


Chris Rennert wrote:

> I am a network administrator and would like to get into using Python for
> scripting programs that I could use on the job.  One project I would
> like to do is monitor all outgoing web requests (port 80) and record
> their destination address in a text file. I don't want someone to feel
> they have to write it for me I just would like to be pointed in the right
> direction.

It sounds to me like you're really looking for full working utilities, with
source code, so that you can both use them for the task at hand and understand
them at a programming level?

What I would recommend is that you download the source for one or more of the
many open-source web proxies that are available for Python, and read through
it. My recommendations are:

1. WebDebug. The source code for this proxy is available under the GPL license.
WebDebug does most of the things you would require of a basic proxy. Notes:
HTTPS secure connections not implemented, storage/caching of transferred
resources reasonably simple to modify, multi-threading supported.

http://www.cyberclip.com/webdebug/index.html

2. Amit Patel's Web Proxy project. Amit is a student who had several
proxy-related requirements, so he wrote a number of implementations, each with
a different set of features. Notes: Amit has some good notes on the
implementations he has created, including discussion of techniques. Amit put a
lot of work into transforming content as it passed through the proxy: e.g.
eliminating JavaScript that popped up windows, eliminating requests to
ad-servers, etc.

http://theory.stanford.edu/~amitp/proxy.html

3. Dmitry Rozmanov's proxy is important in that it is the only Python proxy I
have seen that implements the proprietary Microsoft NTLM proxy authorisation
directives. Also, Dmitry has a very interesting design in that he has
implemented an HTTP server from the ground up, rather than building on much of
the support already available in the Python standard library (which he was
obviously unhappy with, and I can see why).

http://www.geocities.com/rozmanov/ntlm/

4. Matt Gushee's HttpProxy.

This is a proxy that Matt seems to have written for debugging purposes, to
record the resources being transferred back and forth. It might be the most
directly related to your requirement of tracking destinations.

http://www.havenrock.com/pub/tools/httprobe-1.0a1.tar.gz

These four are just a selection of the wide range of HTTP proxies that are
available for Python. You can probably find quite a few more through Google and
in the Vaults of Parnassus. Hopefully other people will reply to this message
with other open-source Python proxies that they know of or like.

While it is likely that no individual proxy will support all of the features
that you might require, I think you will rapidly understand how these proxies
work, and will be writing your own or modifying someone else's in no time at
all.
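
To give a feel for how little code the core idea needs, here is a minimal
sketch of a proxy that records destinations. It is not taken from any of the
proxies listed above; it assumes a present-day Python 3 (3.7 or later) and uses
only the standard library. The names LOG_FILE, destination_of,
record_destination, LoggingProxy and serve are made up for illustration. It
relies on the fact that a browser configured to use a proxy sends the full URL
in the request line, so the destination host can be read straight from it. It
handles plain HTTP GET requests only (no HTTPS/CONNECT, no POST), passes on
only the Content-Type header, and lets urllib follow redirects itself, so
treat it as a starting point rather than something to deploy.

```python
import datetime
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlsplit

LOG_FILE = "destinations.txt"  # hypothetical log file name


def destination_of(request_target):
    """Return the host named in a proxy request target, or None.

    A proxied request line looks like
    "GET http://www.python.org/doc/ HTTP/1.1", so the target is a full URL.
    """
    return urlsplit(request_target).hostname


def record_destination(host, log_path=LOG_FILE):
    """Append one timestamped destination per line to a text file."""
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a") as log:
        log.write(f"{stamp} {host}\n")


class LoggingProxy(BaseHTTPRequestHandler):
    """Forward plain HTTP GET requests and log where they were going."""

    def do_GET(self):
        host = destination_of(self.path)
        # Only forward http:// URLs; this also stops urllib from being
        # asked to open file:// or ftp:// targets.
        if host is None or urlsplit(self.path).scheme != "http":
            self.send_error(400, "Expected an absolute http:// URL")
            return
        record_destination(host)
        try:
            with urllib.request.urlopen(self.path, timeout=10) as upstream:
                status = upstream.status
                content_type = upstream.headers.get(
                    "Content-Type", "application/octet-stream")
                body = upstream.read()
        except urllib.error.HTTPError as err:
            # The origin server answered with an error status; pass it on.
            self.send_error(err.code)
            return
        except (urllib.error.URLError, OSError):
            self.send_error(502, "Could not reach destination")
            return
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the proxy on localhost until interrupted (port is an assumption)."""
    ThreadingHTTPServer(("127.0.0.1", port), LoggingProxy).serve_forever()
```

Calling serve() and pointing a browser's HTTP proxy setting at 127.0.0.1:8080
should append one line per request to destinations.txt. Opening the log file
for every request is simple rather than fast; a busier proxy would want to keep
the file open or hand the work to the logging module.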

One of these days I'm going to gather a list of Python proxies into a single
page, with a feature comparison matrix.

Good luck, 
--
alan kennedy
-----------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan:              http://xhaus.com/mailto/alan



