[Python-Dev] talking about performance...

Andrew Kuchling akuchlin@mems-exchange.org
Sun, 18 Jun 2000 15:20:20 -0400


On Sun, Jun 18, 2000 at 09:06:45PM +0200, Fredrik Lundh wrote:
>so in other words, something in unicode land isn't
>as efficient as it should...

The relevant bit of findstring() in unicodeobject.c:

    if (direction < 0) {
        for (; end >= start; end--)
            if (Py_UNICODE_MATCH(self, end, substring))
                return end;
    } else {
        for (; start <= end; start++)
            if (Py_UNICODE_MATCH(self, start, substring))
                return start;
    }

And Py_UNICODE_MATCH is defined as:

#define Py_UNICODE_MATCH(string, offset, substring)\
    (!memcmp((string)->str + (offset), (substring)->str,\
             (substring)->length*sizeof(Py_UNICODE)))

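Proposed patch, which compares the first character inline so that
memcmp() is only called at offsets where it already matches: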

Index: unicodeobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/unicodeobject.c,v
retrieving revision 2.26
diff -u -r2.26 unicodeobject.c
--- unicodeobject.c	2000/06/14 09:18:32	2.26
+++ unicodeobject.c	2000/06/18 19:18:01
@@ -2168,11 +2168,13 @@
 
     if (direction < 0) {
         for (; end >= start; end--)
-            if (Py_UNICODE_MATCH(self, end, substring))
+            if (*(self->str + end) == *(substring->str) &&
+                Py_UNICODE_MATCH(self, end, substring))
                 return end;
     } else {
         for (; start <= end; start++)
-            if (Py_UNICODE_MATCH(self, start, substring))
+            if (*(self->str + start) == *(substring->str) &&
+		Py_UNICODE_MATCH(self, start, substring))
                 return start;
     }
 

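For anyone who wants to try the idea outside the interpreter, here is a minimal standalone sketch of the same first-character check. It assumes plain buffers rather than PyUnicodeObject; the names uchar16, match_at and find_sub are made up for illustration and are not part of unicodeobject.c. It also assumes a non-empty substring, since it reads sub[0].

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Py_UNICODE; not the real CPython typedef. */
typedef unsigned short uchar16;

/* memcmp-based comparison, mirroring Py_UNICODE_MATCH on raw buffers. */
static int match_at(const uchar16 *text, ptrdiff_t offset,
                    const uchar16 *sub, ptrdiff_t sub_len)
{
    return memcmp(text + offset, sub,
                  (size_t)sub_len * sizeof(uchar16)) == 0;
}

/* Return the index of the first (direction >= 0) or last (direction < 0)
   occurrence of sub in text, or -1 if there is none. */
static ptrdiff_t find_sub(const uchar16 *text, ptrdiff_t text_len,
                          const uchar16 *sub, ptrdiff_t sub_len,
                          int direction)
{
    ptrdiff_t start = 0;
    ptrdiff_t end = text_len - sub_len;  /* last offset where sub can fit */

    assert(sub_len > 0);  /* sub[0] is read below, so sub must be non-empty */

    if (direction < 0) {
        for (; end >= start; end--)
            /* Inline first-character test; memcmp runs only if it passes. */
            if (text[end] == sub[0] && match_at(text, end, sub, sub_len))
                return end;
    } else {
        for (; start <= end; start++)
            if (text[start] == sub[0] && match_at(text, start, sub, sub_len))
                return start;
    }
    return -1;
}
```

The saving should come from skipping the memcmp() call at every offset where the first character differs, which is most of them for typical text. How much that buys will depend on the compiler and libc, so it is worth timing before and after.
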
--amk