[New-bugs-announce] [issue2637] urllib.quote() escapes characters unnecessarily and contrary to docs
Tim Lesher
report at bugs.python.org
Tue Apr 15 17:09:11 CEST 2008
New submission from Tim Lesher <tlesher at gmail.com>:
The urllib.quote docstring implies that it quotes only characters in RFC
2396's "reserved" set.
However, urllib.quote currently escapes all characters except those in
an "always_safe" list, which consists of alphanumerics and three
punctuation characters, "_.-".
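
The behavior described above can be checked with a short script. This is a sketch, not part of the original report: the variable names are illustrative, and the import fallback is an assumption added so the same lines run on Python 3, where the function lives in urllib.parse.

```python
# Import works on Python 2 (as in this report) and on Python 3.
try:
    from urllib import quote
except ImportError:
    from urllib.parse import quote

# Punctuation outside the "_.-" always-safe list.
# "~" is left out because newer Python releases no longer escape it.
marks = "!*'()"

escaped = quote(marks)            # default: each character is percent-escaped
kept = quote(marks, safe=marks)   # workaround: name the characters as safe

print(escaped)   # %21%2A%27%28%29
print(kept)      # !*'()
```

Passing the characters through the existing safe argument avoids the escaping without any change to the library.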
This behavior is contrary to the RFC, which defines "unreserved"
characters as alphanumerics plus the "mark" characters "-_.!~*'()".
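
For comparison, here is a minimal sketch of quoting that leaves the RFC's "unreserved" set alone. The names UNRESERVED and quote_unreserved are hypothetical and are not part of urllib; the default safe="/" is chosen to mirror urllib.quote, and the code assumes Python 3 text strings encoded as UTF-8.

```python
import string

# RFC 2396, section 2.3: unreserved = alphanum | mark
ALPHANUM = string.ascii_letters + string.digits
MARK = "-_.!~*'()"
UNRESERVED = frozenset(ALPHANUM + MARK)


def quote_unreserved(text, safe="/"):
    """Hypothetical helper: percent-escape every byte that is neither
    in the RFC 2396 unreserved set nor in the caller's safe string."""
    keep = UNRESERVED | frozenset(safe)
    out = []
    for byte in bytearray(text.encode("utf-8")):
        ch = chr(byte)
        # Bytes above 0x7F never match an ASCII character in keep,
        # so they are always escaped.
        out.append(ch if ch in keep else "%%%02X" % byte)
    return "".join(out)


print(quote_unreserved("it's (ok)!"))   # it's%20(ok)!
print(quote_unreserved("a/b?c"))        # a/b%3Fc
```

Only the space and the "?" are escaped here; the mark characters pass through unchanged.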
The RFC also says:
    Unreserved characters can be escaped without changing the semantics
    of the URI, but this should not be done unless the URI is being used
    in a context that does not allow the unescaped character to appear.
This seems to imply that "always_safe" should correspond to the RFC's
"unreserved" set of "alphanum" | "mark".
----------
components: Library (Lib)
messages: 65518
nosy: tlesher
severity: normal
status: open
title: urllib.quote() escapes characters unnecessarily and contrary to docs
type: behavior
versions: Python 2.5
__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2637>
__________________________________