[Numpy-discussion] String sort

Charles R Harris charlesr.harris at gmail.com
Mon Feb 11 10:08:49 EST 2008


On Feb 11, 2008 4:06 AM, Francesc Altet <faltet at carabos.com> wrote:

> On Monday 11 February 2008, Francesc Altet wrote:
> > On Monday 11 February 2008, Charles R Harris wrote:
>
> Mmm, comparing my new strncmp with the one that you have implemented in
> SVN, I've found a difference that can account for part of the
> difference in performance.  With your version of strncmp in SVN
> (compare_string), these are my timings on the Opteron server:
>
<snip>

>
> In [17]: np.random.seed(1)
>
> In [18]: a = np.random.rand(10000).astype('S8')
>
> In [19]: %timeit a.copy().sort()
> 100 loops, best of 3: 3.86 ms per loop
>
> In [20]: %timeit newqsort(a.copy())
> 100 loops, best of 3: 3.44 ms per loop
>
> which gives times that are 5% worse.  Try my version and tell me if it
> does better:
>
> static inline int
> opt_strncmp(char *a, char *b, size_t n)
> {
>    size_t i;
>    unsigned char c, d;
>    for (i = 0; i < n; i++) {
>        c = a[i]; d = b[i];
>        if (c != d) return c - d;
>    }
>    return 0;
> }
>

I didn't notice any speed difference. And while returning the difference of
two unsigned numbers should work with modular arithmetic when it is cast to
integer, I thought the explicit return based on a compare was clearer and
safer. Comparisons always work.
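
To make the two return styles concrete, here is a minimal sketch of both. The names cmp_by_difference and cmp_by_compare are made up for illustration, and neither function is the compare_string code in SVN. The sketch assumes int is wider than unsigned char, as on all common platforms. In that case both operands are promoted to int before the subtraction, so the difference lies between -255 and 255 and carries the right sign.

```c
#include <stddef.h>

/* Difference-based return, in the style of the quoted opt_strncmp.
 * Hypothetical name, for illustration only. */
static int
cmp_by_difference(const char *a, const char *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        unsigned char c = (unsigned char)a[i];
        unsigned char d = (unsigned char)b[i];
        /* c and d are promoted to int here (assuming int is wider
         * than unsigned char), so the result is in [-255, 255]. */
        if (c != d) return c - d;
    }
    return 0;
}

/* Comparison-based return: the result is always -1, 0, or 1 and does
 * not depend on how the subtraction is typed.
 * Hypothetical name, for illustration only. */
static int
cmp_by_compare(const char *a, const char *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        unsigned char c = (unsigned char)a[i];
        unsigned char d = (unsigned char)b[i];
        if (c != d) return (c < d) ? -1 : 1;
    }
    return 0;
}
```

Both variants agree in sign for any pair of inputs, which is all a sort needs. The comparison form simply keeps that property if the element type is ever widened to something where a subtraction could wrap.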

I've attached my working _sortmodule.c.src file so you can fool with these
different changes on your machines also. This is on top of current svn.

Chuck
-------------- next part --------------
A non-text attachment was scrubbed...
Name: _sortmodule.c.src
Type: application/x-wais-source
Size: 19053 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20080211/2e04f916/attachment.src>