urllib.unquote and unicode
Walter Dörwald
walter at livinglogic.de
Thu Dec 21 07:12:08 EST 2006
Martin v. Löwis wrote:
> Duncan Booth wrote:
>> The way that uri encoding is supposed to work is that first the input
>> string in unicode is encoded to UTF-8 and then each byte which is not in
>> the permitted range for characters is encoded as % followed by two hex
>> characters.
>
> Can you back up this claim ("is supposed to work") by reference to
> a specification (ideally, chapter and verse)?
>
> In URIs, it is entirely unspecified what the encoding is of non-ASCII
> characters, and whether % escapes denote characters in the first place.
http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.1
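
That section of the HTML 4.01 notes recommends that user agents
represent non-ASCII characters in URI attribute values as UTF-8 and
then escape the resulting bytes with %HH. A rough sketch of that
convention follows. It assumes Python 3 (the urllib.unquote under
discussion is the Python 2 one), and the names SAFE, percent_encode
and percent_decode are made up for illustration; they are not part of
urllib or any other library. The decoder also assumes well-formed
input and UTF-8 content, which is exactly the assumption the URI
syntax itself does not guarantee.

```python
# Bytes left unescaped in this sketch: ASCII letters, digits and "-._~".
# The choice of safe set is an assumption made for illustration.
SAFE = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)


def percent_encode(text):
    """Encode text as UTF-8, then escape every byte not in SAFE as %HH."""
    out = []
    for byte in text.encode("utf-8"):
        if byte in SAFE:
            out.append(chr(byte))
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


def percent_decode(escaped):
    """Turn %HH escapes back into bytes, then decode the bytes as UTF-8.

    Assumes every "%" is followed by two hex digits and that the
    escaped bytes really are UTF-8.
    """
    raw = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == "%" and i + 2 < len(escaped):
            # Two hex digits after "%" give one raw byte.
            raw.append(int(escaped[i + 1:i + 3], 16))
            i += 3
        else:
            raw.extend(escaped[i].encode("utf-8"))
            i += 1
    return raw.decode("utf-8")


print(percent_encode("D\u00f6rwald"))
print(percent_decode("D%C3%B6rwald"))
```

The decode step is where the ambiguity shows up: the same escapes
decoded as Latin-1 instead of UTF-8 would give a different string, so
the result depends on a convention both sides have to agree on.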
Servus,
Walter