[Jython-checkins] jython: Make urllib.unquote be linear in performance, not quadratic.

jim.baker jython-checkins at python.org
Thu Dec 10 17:25:00 EST 2015


https://hg.python.org/jython/rev/23f2c16b9dc7
changeset:   7825:23f2c16b9dc7
user:        Jim Baker <jim.baker at rackspace.com>
date:        Thu Dec 10 15:24:48 2015 -0700
summary:
  Make urllib.unquote be linear in performance, not quadratic.

Removes in-place string concatenation from the implementation of
urllib.unquote; the pieces are instead appended to a list and joined
once at the end. Because of mixed unicode/str issues, this implementation
avoids the higher-level StringIO.StringIO, which would otherwise manage
this buffering.

See discussion on Jython users mailing list
(http://sourceforge.net/p/jython/mailman/message/34685791/)
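
For illustration only, the list-buffer technique described above can be
sketched as a standalone function. This is not the Jython code: the names
unquote_buffered and _HEX_TO_CHR are hypothetical, the sketch handles only
plain text strings, and it omits the separate unicode/str branches of the
actual patch. It assumes a runtime where each += on a string copies the
whole accumulated value, which is what makes repeated concatenation
quadratic; appending to a list and joining once copies each piece only once.

```python
# Hypothetical lookup table in the spirit of urllib's _hextochr:
# every two-digit hex string (upper, lower, or mixed case) -> character.
_HEX_DIGITS = '0123456789ABCDEFabcdef'
_HEX_TO_CHR = dict((a + b, chr(int(a + b, 16)))
                   for a in _HEX_DIGITS for b in _HEX_DIGITS)


def unquote_buffered(s):
    """Decode %xx escapes, buffering pieces in a list (illustrative sketch)."""
    res = s.split('%')
    # Fast path: no '%' in the input, so there is nothing to decode.
    if len(res) == 1:
        return s
    buf = [res[0]]
    for item in res[1:]:
        try:
            # The lookup raises KeyError before anything is appended,
            # so a failed decode never leaves a partial entry in buf.
            decoded = _HEX_TO_CHR[item[:2]]
            buf.append(decoded)
            buf.append(item[2:])
        except KeyError:
            # Not a valid escape: put the literal '%' back unchanged.
            buf.append('%')
            buf.append(item)
    # A single join copies each piece once, keeping the total work linear.
    return ''.join(buf)
```

Invalid or truncated escapes such as '%zz' or a trailing '%' pass through
unchanged, matching the KeyError branch in the diff below.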

files:
  Lib/urllib.py |  17 +++++++++++------
  1 files changed, 11 insertions(+), 6 deletions(-)


diff --git a/Lib/urllib.py b/Lib/urllib.py
--- a/Lib/urllib.py
+++ b/Lib/urllib.py
@@ -1205,15 +1205,20 @@
     # fastpath
     if len(res) == 1:
         return s
-    s = res[0]
+    buf = [res[0]]
+    is_unicode = isinstance(s, unicode)
     for item in res[1:]:
         try:
-            s += _hextochr[item[:2]] + item[2:]
+            if is_unicode:
+                buf.append(unichr(int(item[:2], 16)))
+                buf.append(item[2:])
+            else:
+                buf.append(_hextochr[item[:2]])
+                buf.append(item[2:])
         except KeyError:
-            s += '%' + item
-        except UnicodeDecodeError:
-            s += unichr(int(item[:2], 16)) + item[2:]
-    return s
+            buf.append('%')
+            buf.append(item)
+    return ''.join(buf)
 
 def unquote_plus(s):
     """unquote('%7e/abc+def') -> '~/abc def'"""

-- 
Repository URL: https://hg.python.org/jython

