[issue3300] urllib.quote and unquote - Unicode issues

Wed Aug 13 17:09:00 CEST 2008

Matt Giuca <matt.giuca at gmail.com> added the comment:

> I'm OK with replace for unquote() ...
> For quote() I think strict is better 

There's just an odd inconsistency there, but it's only a tiny "gotcha";
and I agree with all your other arguments. I'll change unquote back to
errors='replace'.
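
For reference, the asymmetry can be sketched as follows. This assumes a current Python 3, where these functions live in urllib.parse and take encoding/errors keyword arguments. The variable names are only for illustration.

```python
from urllib.parse import quote, unquote

# unquote with errors='replace': a byte sequence that is not valid UTF-8
# decodes to U+FFFD (the replacement character) instead of raising.
decoded = unquote('%FF', encoding='utf-8', errors='replace')

# quote with errors='strict': a character the target encoding cannot
# represent raises UnicodeEncodeError.
try:
    quote('\u20ac', encoding='ascii', errors='strict')
    strict_raised = False
except UnicodeEncodeError:
    strict_raised = True

# With errors='replace', quote substitutes '?' and then percent-encodes it.
replaced = quote('\u20ac', encoding='ascii', errors='replace')
```

So the "gotcha" is that unquote is lenient on bad input by default while quote is strict.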

> This means we have a useful analogy:
> quote(s, e) == quote(s.encode(e)).

That's exactly true, yes.
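
A quick check of that analogy, assuming a current Python 3 where quote accepts both str and bytes (the sample string is arbitrary):

```python
from urllib.parse import quote

s = 'caf\u00e9'  # arbitrary sample text with one non-ASCII character

# quote(s, e) should match quoting the already-encoded bytes.
for e in ('utf-8', 'latin-1'):
    assert quote(s, encoding=e) == quote(s.encode(e))

utf8_form = quote(s, encoding='utf-8')      # 'caf%C3%A9'
latin1_form = quote(s, encoding='latin-1')  # 'caf%E9'
```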

> Now that you've spent so much time with this patch, can't you think
> of a faster way of doing this?

Well, firstly, you could replace Quoter (the class) with a "quoter"
function nested inside quote. Would calling a nested function
be faster than a method call?

> I wonder if mapping a defaultdict wouldn't work.

That is a good idea. Then, the "function" (as I describe above) would be
just the inside of what currently is the except block, and that would be
the default_factory of the defaultdict. I think that should speed things up.
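
A rough sketch of the idea, not the patch itself. One wrinkle: default_factory is called with no arguments, so it cannot see which key was missing. The sketch therefore uses __missing__ on a dict subclass, which does receive the key. The names QuoterSketch, quote_bytes_sketch and ALWAYS_SAFE are hypothetical, and the safe set is an assumption.

```python
import string

# Assumed set of bytes that never need quoting (illustrative only).
ALWAYS_SAFE = frozenset(
    (string.ascii_letters + string.digits + '_.-').encode('ascii'))


class QuoterSketch(dict):
    """Maps a byte value (int) to its quoted form, caching each result."""

    def __init__(self, safe):
        super().__init__()
        self.safe = frozenset(safe)

    def __missing__(self, b):
        # Runs only on a cache miss: this is the body of the old except block.
        res = chr(b) if b in self.safe else '%{:02X}'.format(b)
        self[b] = res
        return res


def quote_bytes_sketch(bs, safe=b'/'):
    quoter = QuoterSketch(ALWAYS_SAFE | frozenset(safe))
    # Cache hits are plain dict lookups with no Python-level call.
    return ''.join(map(quoter.__getitem__, bs))
```

For example, quote_bytes_sketch(b'a b/c') gives 'a%20b/c'.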

I'm very hazy about what is faster in the bytecode world of Python, and
wary of making a change and proclaiming "this is faster!" without doing
proper speed tests (which is why I think this optimisation could be
delayed until at least after the core interface changes are made). But
I'll have a go at that change tomorrow.
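
The kind of speed test I have in mind could look something like this. It is only a sketch: the two quoters are simplified stand-ins (hypothetical names, a made-up safe set), not the code in the patch, and the numbers will vary by machine and Python version.

```python
import timeit

SAFE = frozenset(b'abcdefghijklmnopqrstuvwxyz')  # illustrative safe set


def make_try_except_quoter():
    """Nested function with a cache filled from an except block."""
    cache = {}

    def quoter(b):
        try:
            return cache[b]
        except KeyError:
            res = cache[b] = chr(b) if b in SAFE else '%{:02X}'.format(b)
            return res
    return quoter


class MissingQuoter(dict):
    """dict subclass that fills the cache via __missing__."""

    def __missing__(self, b):
        res = self[b] = chr(b) if b in SAFE else '%{:02X}'.format(b)
        return res


data = bytes(range(256)) * 4  # every byte value, repeated


def run_try_except():
    return ''.join(map(make_try_except_quoter(), data))


def run_missing():
    return ''.join(map(MissingQuoter().__getitem__, data))


# Both variants must agree before their timings mean anything.
assert run_try_except() == run_missing()

t_try = min(timeit.repeat(run_try_except, number=200, repeat=3))
t_missing = min(timeit.repeat(run_missing, number=200, repeat=3))
print('try/except: {:.4f}s  __missing__: {:.4f}s'.format(t_try, t_missing))
```

Taking the minimum of several repeats reduces noise from other processes.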

(I won't be able to work on this for up to 24 hours).

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3300>
_______________________________________