[issue27716] http.client truncates UTF-8 encoded headers

Sat Sep 17 21:57:01 EDT 2016

Martin Panter added the comment:

Thanks to the fix for Issue 22233, the response is now parsed more sensibly, and the body can be read. Instead of the header being truncated there, the 0x85 byte is now decoded as Latin-1 along with the rest of the value:

>>> print(ascii(resp.getheader("Link")[:100]))
'<http://www.babla.cn/\xe8\x8b\xb1\xe8\xaf\xad-\xe6\xb3\xa2\xe5\x85\xb0\xe8\xaf\xad/>; rel="alternate"; hreflang="zh-Hans", <http://cs.bab.la/slov'

Here is a patch to document how to get the original bytes back (by “encoding” to Latin-1). Other than that, I don’t think there is much left to do for this bug.
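
As a rough illustration of that round trip (a sketch, not the text of the patch): the string below is a shortened copy of the getheader() output above, and the variable names are only for illustration. Decoding the recovered bytes as UTF-8 is an assumption about what this particular server sent; other servers may use a different encoding.

```python
# Header value as returned by getheader(): each original byte was decoded
# as one Latin-1 character (shortened copy of the output shown above).
decoded = '<http://www.babla.cn/\xe8\x8b\xb1\xe8\xaf\xad-\xe6\xb3\xa2\xe5\x85\xb0\xe8\xaf\xad/>; rel="alternate"; hreflang="zh-Hans"'

# "Encoding" back to Latin-1 is lossless and recovers the bytes from the wire.
raw = decoded.encode("latin-1")

# Assumption: this server sent UTF-8, so decode the raw bytes accordingly.
text = raw.decode("utf-8")
print(ascii(text))
```

Latin-1 maps every byte value 0x00-0xFF to the code point of the same number, which is why the encode step cannot fail on a value that http.client decoded this way.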

----------
keywords: +patch
Added file: http://bugs.python.org/file44733/header-decoding.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue27716>
_______________________________________