Making a socket connection via a proxy server
Alan Kennedy
alanmk at hotmail.com
Fri Jul 30 13:28:09 EDT 2004
[Fuzzyman]
> In a nutshell - the question I'm asking is, how do I make a socket
> connection go via a proxy server ?
> All our internet traffic has to go through a proxy-server at location
> 'dav-serv:8080' and I need to make a socket connection through it.
> I am hacking "Tiny HTTP Proxy" by SUZUKI Hisao to make an http proxy
> that modifies URLs. I haven't got very far - having started from zero
> knowledge of the 'hyper text transfer protocol'.
>
> It looks like the Tiny HTTP Proxy (using BaseHTTPServer as its
> foundation) intercepts all requests to local addresses and then
> re-implements the request (whether it is CONNECT, GET, PUT or
> whatever). It logs everything that goes through it - I will simply
> edit it to amend the URL that is being asked for.
Yes, that is exactly what the proxy should do. It relays requests
between client and server. However, there is one vital detail you're
probably missing that is preventing you from chaining client + proxy*N
+ server together.
When sending an HTTP GET request to a server, a client sends a request
line containing a URI without a server component. This is because the
socket connection to the server is already formed, therefore the
server connection details do not need to be repeated. So a standard
GET will look like this:
    GET /index.html HTTP/1.1
However, it's different when a client connects to a proxy, because the
socket no longer connects directly to the server, but to the proxy
instead. The proxy still needs to know to which server it should send
the request. So the correct format for sending requests to a proxy is
to use the "absoluteURI" form, which includes the server details, e.g.
    GET http://www.python.org:80/index.html HTTP/1.1
Any proxy that receives such a request now knows that the server to
forward to is "www.python.org:80". It will open a connection to
www.python.org:80, and send it a GET request for the URI.
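
To make that concrete, here is a minimal sketch of how a proxy might pull the target server out of an absoluteURI request line. The function name target_from_request_line is made up for illustration, and falling back to port 80 when the URI names no port is my assumption for plain http. It is written with the Python 3 import; on Python 2 the same function lives in the urlparse module.

```python
from urllib.parse import urlsplit  # Python 2: from urlparse import urlsplit


def target_from_request_line(request_line):
    """Return the (host, port) named by an absoluteURI request line.

    Hypothetical helper, not part of TinyHTTPProxy.
    """
    method, uri, version = request_line.split()
    parts = urlsplit(uri)
    if not parts.netloc:
        # A request like "GET /index.html HTTP/1.1" carries no server details.
        raise ValueError("not an absolute URI: %r" % uri)
    # Assumption: default to port 80 when the URI does not give one.
    return parts.hostname, parts.port or 80


print(target_from_request_line(
    "GET http://www.python.org:80/index.html HTTP/1.1"))
# ('www.python.org', 80)
```

Given the plain "GET /index.html HTTP/1.1" form, the same function raises ValueError, which is the situation a proxy is in when a client forgets the absoluteURI form.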
Since you want your proxy to forward to another proxy, i.e. your proxy
is a client from your external-access-proxy's point of view, you
should also use the absoluteURI form when making requests from your
python proxy to your external proxy.
> It looks like the CONNECT and GET requests are just implemented using
> simple socket commands. (I say simple because there isn't a lot of
> code - I'm not familiar with the actual behaviour of sockets, but it
> doesn't look too complicated).
>
> What I need to do is rewrite the soc.connect(host_port) line in the
> following example so that it connects *via* my proxy-server. (which it
> doesn't by default).
>
> I think the current format of host_port is a tuple : (host_domain,
> port_no)
>
> Below is a summary of the GET command (I've inlined all the method
> calls - this example starts from the do_GET method) :
>
> soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> soc.connect(host_port)
What is the value of host_port at this point? It *should* be the
address of your external access proxy, i.e. dav-serv:8080
> soc.send("%s %s %s\r\n" % (
>     self.command,
>     urlparse.urlunparse(('', '', path, params, query, '')),
>     self.request_version))
And you're not sending an absoluteURI: this should be amended to
contain the server details of the server that is finally going to
service the request. For the python.org example above, this code would be
    soc.send("%s %s %s\r\n" % (
        self.command,
        urlparse.urlunparse(('http', 'www.python.org:80', path, params,
                             query, '')),
        self.request_version))
though of course, these values should be made available to you by
TinyHTTPProxy. Taking a brief look at the code, these values should be
available through the variables "scm" and "netloc". So your outgoing
connection code from TinyHTTPProxy should look something like this
    soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    soc.connect( ('dav-serv', 8080) )
    soc.send("%s %s %s\r\n" % (
        self.command,
        urlparse.urlunparse((scm, netloc, path, params, query, '')),
        self.request_version))
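
If you would rather keep this out of the handler method, the same idea can be sketched as two small standalone functions. This is only an illustration under my own assumptions: the names proxy_request_line and send_request are invented, the headers argument (a list of (name, value) pairs) is not something TinyHTTPProxy hands you in that shape, and I use sendall rather than send so that a partial write cannot truncate the request. Python 3 imports are shown; on Python 2 urlunparse comes from the urlparse module.

```python
import socket
from urllib.parse import urlunparse  # Python 2: from urlparse import urlunparse


def proxy_request_line(command, scm, netloc, path, params, query, version):
    """Build a request line in absoluteURI form, as an upstream proxy expects."""
    uri = urlunparse((scm, netloc, path, params, query, ''))
    return "%s %s %s\r\n" % (command, uri, version)


def send_request(soc, request_line, headers=()):
    """Send the request line, the headers and the blank line that ends them.

    soc must already be connected to the upstream proxy.
    headers is an iterable of (name, value) pairs (hypothetical shape).
    """
    head = request_line
    for name, value in headers:
        head += "%s: %s\r\n" % (name, value)
    head += "\r\n"
    # sendall keeps writing until the whole buffer has gone out;
    # a bare send may return after writing only part of it.
    soc.sendall(head.encode("latin-1"))


# Usage sketch (left commented out because it needs your proxy to be reachable):
# soc = socket.create_connection(('dav-serv', 8080))
# line = proxy_request_line("GET", "http", "www.python.org:80",
#                           "/index.html", "", "", "HTTP/1.1")
# send_request(soc, line, [("Host", "www.python.org:80")])
```

The only difference from talking to the origin server directly is where the socket points and the scheme and netloc in the request line; everything after the request line is relayed as before.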
HTH,
--
alan kennedy
------------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan: http://xhaus.com/contact/alan