[issue22928] HTTP header injection in urllib2/urllib/httplib/http.client

Demian Brecht report at bugs.python.org
Tue Feb 17 06:04:05 CET 2015


Demian Brecht added the comment:

> But it is not natural to do things like this (based on headers sent by Firefox)

Good point.

> Otherwise, retaining the one_value.encode('latin-1') call is confusing when later on it rejects non-ASCII-encoded characters.

I’m a little torn on this one, given one of the SHOULD clauses in RFC 7230 about recipients treating headers with non-ASCII characters as opaque data. However, I’ve read of a number of cases where users rely on latin-1 in practice (and it /is/ only a SHOULD clause), so I think it’s likely better to err on the side of caution and allow the latin-1 charset at least.
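To make the charset trade-off concrete, here is a minimal sketch using only str.encode. The helper name encodes_as and the sample values are made up for illustration; this is not the code in the patch.

```python
def encodes_as(value, charset):
    """Return True if `value` can be encoded with `charset` (illustrative helper)."""
    try:
        value.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


latin1_value = "caf\u00e9"       # e-acute exists in latin-1 but not in ASCII
non_latin1_value = "\u20ac100"   # the euro sign is outside latin-1

# Allowing latin-1 accepts the first value; an ASCII-only check would reject it.
print(encodes_as(latin1_value, "latin-1"))
print(encodes_as(latin1_value, "ascii"))
# Values that need utf-8 would still be rejected by a latin-1 base implementation.
print(encodes_as(non_latin1_value, "latin-1"))
```

A client that needs utf-8 could encode the value to bytes itself before handing it over, which matches the idea of leaving that to the client layer.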

As for utf-8 though, I think that once we start getting into the realm of other application protocols, that’s something that should be added by the client implementation and not something that should be changed in the base HTTP implementation.

The odd part of the API now, though, is the fact that it’s variadic. I really have no strong opinion on whether elements should be tab or space delimited, and the RFC doesn’t seem to lean either way. I think I’m still leaning towards space delimiting to give users the ability to write in either form (putheader('Authorization', 'Bearer', 'token') or putheader('Authorization', 'Bearer token')). As another minor argument for it, it’s also likely a little nicer for logging. I think that optimally, the API would take a single value as you’d suggested, but I’d be concerned about the extent of backwards compatibility issues if that were to be done.
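A rough sketch of what space delimiting would mean for the two call forms. The function join_header_values is hypothetical and is not the actual http.client.putheader implementation; the CR/LF check is an assumption added here to reflect the injection concern in this issue.

```python
def join_header_values(*values, sep=b" "):
    """Hypothetical helper: encode each value as latin-1 and join with `sep`."""
    encoded = []
    for value in values:
        if isinstance(value, str):
            # Raises UnicodeEncodeError for characters outside latin-1.
            value = value.encode("latin-1")
        # Assumed check: refuse raw line breaks so a value cannot start a new header.
        if b"\r" in value or b"\n" in value:
            raise ValueError("header value must not contain CR or LF")
        encoded.append(value)
    return sep.join(encoded)


# With space delimiting, both call forms produce the same bytes.
print(join_header_values("Bearer", "token"))
print(join_header_values("Bearer token"))
```

Passing sep=b"\t" to the same sketch shows the tab-delimited alternative, which would make the two forms differ on the wire.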

I’ll try to get some time tomorrow to make those changes, so it still leaves time for further debate :)

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue22928>
_______________________________________

