[Web-SIG] REMOTE_ADDR and proxys

Tue Sep 30 00:47:34 CEST 2014

[Alan]
>> I disagree. I think it is the role of the server/gateway to represent the
>> actual incoming HTTP request as accurately as possible.

[Robert]
> So I agree with you

OK, so we agree :-)

[Robert]
> but in a multi-tier deployment architecture:

Then why disagree? ;-)

[Robert]
> Client -> LB -> Front-end-cache -> HTTPd -> WSGI -> application, which
> 'request' do app developers need represented? They want the client
> request, which is 3 network hops away: it's entirely reasonable (and
> supported by RFC2616 and RFC7230 etc) for the internal structure of
> such a deployment to extend things in such a way that normal
> guarantees are suspended (e.g. caching, source addresses etc).

So what do you include and what do you exclude?

1. It's quite possible that the client is behind some kind of egress proxy
or firewall, which may or may not add an X-Forwarded-For header. Should this
be included?

2. What if your frontend LB is not configured to set an X-Forwarded-For
header? What if it is? What if the configuration differs across the
multiple LBs in your ingress path, so that you get conflicting results
depending on which path the request took?

3. What if there is a cache miss on your frontend cache? Will the caching
proxy add a header?

4. What if the proxy added a non-standard X-Forwarded-Ip header?
 - If it does, can you do a reverse DNS lookup to find the host that the
address resolves to?
 - If yes, under which DNS authority?

5. Is the order of X-Forwarded-For headers guaranteed? Is it
trustworthy? Will every proxy in the chain declare itself?
 - Answers: no, no, and no.

Each of the above questions has multiple answers, each of which is arguably
valid, depending on your point of view.
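
To make the ambiguity concrete, here is a minimal sketch in Python. The
function names and the trusted_hops parameter are hypothetical, invented for
illustration; nothing here comes from WSGI or any library. It only shows
that the "client address" you end up with depends entirely on how many hops
you decide to trust, and that decision cannot be read from the request.

```python
def candidate_client_ips(remote_addr, x_forwarded_for):
    """List every address that could be 'the client', nearest hop first.

    remote_addr is the TCP peer seen by the origin server; x_forwarded_for
    is the raw header value (possibly empty, possibly forged by the client).
    """
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # The header is appended to left-to-right, so the nearest proxy's
    # entry is last; reverse it to walk outward from the origin server.
    return [remote_addr] + hops[::-1]


def pick_client_ip(remote_addr, x_forwarded_for, trusted_hops):
    """Pick a 'client' address, ASSUMING trusted_hops proxies are yours.

    The assumption is the whole point: a different value gives a
    different, equally defensible answer.
    """
    candidates = candidate_client_ips(remote_addr, x_forwarded_for)
    index = min(trusted_hops, len(candidates) - 1)
    return candidates[index]


# Same request, three different "client" addresses:
xff = "203.0.113.7, 198.51.100.2, 10.0.0.9"
print(pick_client_ip("10.0.0.5", xff, 0))  # 10.0.0.5
print(pick_client_ip("10.0.0.5", xff, 1))  # 10.0.0.9
print(pick_client_ip("10.0.0.5", xff, 2))  # 198.51.100.2
```

Note that the leftmost entries may have been supplied by the client itself,
so no choice of trusted_hops makes them trustworthy.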

The problem is that HTTP proxies are just too easy to write, and every
author of a proxy will make slightly different decisions on what should be
forwarded and what should not. Every configurable proxy can and will be
configured differently, according to the requirements of the folks
operating it.

http://proxies.xhaus.com

[Robert]
> which 'request' do app developers need represented?

The request that arrives into the origin server, exactly as it arrived,
unmodified. That way they can apply their own heuristics to processing the
request, knowing that it has not been interfered with.

> They want the client request, which is 3 network hops away

In your example, it's 3 hops away. I can easily paint you a thousand
different scenarios, each of which is a different number of hops away.

[Collin]
> So it sounds like it should be the responsibility of a middleware to
> renormalize the environment?

In order for that to be the case, you have to strictly define what
"normalization" means.

I believe that it is not possible to fully specify "normalization", and
that any attempt to do so is futile.

If you want to attempt it for the specific scenarios that your particular
application has to deal with, then by all means code your version of
"normalization" into your application. Or write some middleware to do it.
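
For what it's worth, such deployment-specific middleware could look
something like the sketch below. The class name, the trusted_proxies
argument and the "myapp.original_remote_addr" key are all hypothetical names
chosen for this example. It assumes exactly one reverse proxy that you
operate, at known addresses, which appends to X-Forwarded-For; change that
assumption and the code has to change with it.

```python
class TrustedProxyMiddleware:
    """Rewrite REMOTE_ADDR, but only for requests from known proxies.

    This encodes ONE deployment's idea of "normalization"; it is not a
    general solution.
    """

    def __init__(self, app, trusted_proxies):
        self.app = app
        self.trusted_proxies = frozenset(trusted_proxies)

    def __call__(self, environ, start_response):
        peer = environ.get("REMOTE_ADDR", "")
        forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
        hops = [h.strip() for h in forwarded.split(",") if h.strip()]
        if peer in self.trusted_proxies and hops:
            # Keep the address the server actually saw, so the
            # application can still inspect the unmodified value.
            environ["myapp.original_remote_addr"] = peer
            # Trust only the last entry: the one our own proxy appended.
            environ["REMOTE_ADDR"] = hops[-1]
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Trivial WSGI application that echoes REMOTE_ADDR."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["REMOTE_ADDR"].encode("ascii")]


# Wrap the application; 10.0.0.5 is assumed to be our own reverse proxy.
application = TrustedProxyMiddleware(demo_app, {"10.0.0.5"})
```

A request arriving from any other peer passes through untouched, so a
client that forges the header directly gains nothing.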

But trying to make "normalization" a part of a WSGI-style specification is
impossible.

Alan.

On Mon, Sep 29, 2014 at 10:14 PM, Collin Anderson <cmawebsite at gmail.com> wrote:

> Thanks guys. So it sounds like it should be the responsibility of a
> middleware to renormalize the environment?
>
> On Wed, Sep 24, 2014 at 4:51 PM, Robert Collins <robertc at robertcollins.net> wrote:
>
>> On 25 September 2014 07:16, Alan Kennedy <alan at xhaus.com> wrote:
>> > [Collin]
>> >> It seems to me, it is the role of the server/gateway, not the
>> >> application/framework to determine the "correct" client ip address and
>> >> correctly account for the situation of being behind a known proxy.
>> >
>> > I disagree. I think it is the role of the server/gateway to represent the
>> > actual incoming HTTP request as accurately as possible.
>>
>> So I agree with you, but in a multi-tier deployment architecture:
>>
>> Client -> LB -> Front-end-cache -> HTTPd -> WSGI -> application, which
>> 'request' do app developers need represented? They want the client
>> request, which is 3 network hops away: it's entirely reasonable (and
>> supported by RFC2616 and RFC7230 etc) for the internal structure of
>> such a deployment to extend things in such a way that normal
>> guarantees are suspended (e.g. caching, source addresses etc).
>>
>> > If the application knows about remote proxies and local reverse proxies,
>> > then it can take action accordingly.
>> >
>> > But the server should not attempt any magic: it is up to the application to
>> > interpret the request in whatever way it sees fit.
>> ...
>> > If you want the magic rewriting functionality to be isolated from the
>> > application, then it could easily be implemented as middleware.
>>
>> So middleware is an application to the layer above and a server to the
>> layer below: how then is that not the server taking care of the
>> rewriting? Perhaps we're stuck on a definitional thing where by server
>> you are thinking only the code implied by e.g. serve_forever ?
>>
>> -Rob
>>
>
>