[Numpy-discussion] String sort

Francesc Altet faltet at carabos.com
Tue Feb 12 03:58:44 EST 2008


A Monday 11 February 2008, Charles R Harris escrigué:
> On Feb 11, 2008 1:15 PM, Francesc Altet <faltet at carabos.com> wrote:
> > Here are the results of running it in several platforms:
> >
> > 1) My laptop: Ubuntu 7.1 (gcc 4.1.3, Pentium 4 @ 2 GHz)
> > Benchmark with 1000000 strings of size 15
> > C qsort with C style compare: 2.450000
> > C qsort with Python style compare: 2.440000
> > NumPy newqsort: 0.650000
>
> Wow, what a difference.

Yeah.  This is why I got so excited initially.  It's unfortunate that 
most of the speed-up in newqsort in this case is probably due to a 
possible flaw in qsort implementation for the combination 
Ubuntu/Pentium4.  On the positive side, it is nice to see that other 
distros/processors have a decent performance on system qsort ;-)

> > * I'd definitely keep memcpy by default.  From my timings, it looks
> > like the best option for all platforms.
>
> OK. Was that just for the copies, or was it for the swaps also? I ran
> a version of swap using memcpy on my machine and the sort was about
> half as fast for 8 character strings.

No, only for the copies.  For the swaps, this memcpy-based version:

#define swap_string(s1, s2, len) { \
    memcpy((vp), (s2), (len));	    \
    memcpy((s2), (s1), (len));	    \
    memcpy((s1), (vp), (len));	    \
  }

performs extremely bad on my systems:

Pentium4/Ubuntu 7.10:
    * newqsort with the loop version of swap_string:  0.65 s
    * newqsort with the memcpy version of swap_string: 9.14 s

Opteron/SuSe LE 10.3:
    * newqsort with the loop version of swap_string:  0.59 s
    * newqsort with the memcpy version of swap_string: 8.71 s

So, it seems that the nice newqsort performance is extremely dependent 
on the loop version of swap_string.

> > I hope the benchmark will behave well in your platform too (i.e.
> > newqsort will perform the best ;)
>
> I'll check it out when I get home. As I say, it was running about 10%
> slower on my machine, but if it does better on most platforms it is
> probably the way to go. We can always change it in the future when
> everyone is running on quantum computers.

Quantum computers?  Oh, I can't wait for mine ;-)

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"



More information about the NumPy-Discussion mailing list