[Numpy-discussion] String sort
Francesc Altet
faltet at carabos.com
Sat Feb 9 08:40:56 EST 2008
A Friday 08 February 2008, Charles R Harris escrigué:
> On Feb 8, 2008 8:58 AM, Francesc Altet <faltet at carabos.com> wrote:
> > A Friday 08 February 2008, Charles R Harris escrigué:
> > > > Also, in the context of my work in indexing, and because of the
> > > > slowness of the current implementation in NumPy, I've ended
> > > > with an implementation of the quicksort method for 1-D array
> > > > strings. For moderately large arrays, it is about 2.5x-3x
> > > > faster than the (supposedly) mergesort version in NumPy, not
> > > > only due to the quicksort, but also because I've implemented a
> > > > couple of macros for efficient string swapping and copy. If
> > > > this is of interest for NumPy developers, tell me and I will
> > > > provide the code.
> > >
> > > I have some code for this too and was going to merge it. Send
> > > yours along and I'll get to it this weekend.
> >
> > Ok, great. I'm attaching it. Tell me if you need some
> > clarification on the code.
>
> I ran a few timing tests. On my machine strncmp is about 100x faster
> than opt_strncmp, but sSWAP (with some fixes), is about 10x faster
> then using the memcpy in a recent compiler. Does this match with your
> experience.
Well, I've run some more exhaustive tests on my laptop (Pentium 4 @ 2
GHz, Ubuntu 7.10, gcc 4.1.3, using -O3 optlevel) with the next sa1
array:
numpy.random.seed(1)
nelem = 10000
a = numpy.random.rand(nelem)
sa1 = a.astype('S16')
And I've chosen the next benchmark:
/* start: the start of data for sa1 array
ss: the length of the string type (16)
num: the number of elements in sa1 (10000)
*/
int sort_S(char *start, int ss, npy_intp num)
{
char *pl = start;
int a = 0;
npy_intp i, j;
for (i=0; i<num; i++)
for (j=0; j<num; j++)
a += opt_strncmp(pl+i*ss, pl+j*ss, ss);
return a;
}
So, when I choose the next implementation of opt_strncmp:
static int inline opt_strncmp1(char *a, char *b, intp n) {
intp i;
for (i=0; i<n; i++) {
if (a[i] > b[i]) return i+1;
if (a[i] < b[i]) return -(i+1);
}
return 0;
}
I get a time of 1.70 s. When using the next implementation:
static int inline opt_strncmp2(char *a, char *b, intp n) {
intp i;
for (i=0; i<n; i++) {
if (a[i] != b[i])
return (((unsigned char *)a)[i] - ((unsigned char *)b)[i]);
}
return 0;
}
I get a time of 1.24 s. The original strncmp gives me 2.05 seconds with
this setup. So, in short and using my laptop:
original strncmp: 2.05 s
opt_stncmp1: 1.70 s ; speed-up: 20%
opt_stncmp2: 1.24 s ; speed-up: 65%
Just in case, I've also run the benchmark in an Opteron server (2 GHz,
SuSe 10.1, gcc 4.2.1, using -O3 optlevel). Here are the results:
original strncmp: 1.25 s
opt_stncmp1: 1.61 s ; speed-up: -30%
opt_stncmp2: 1.03 s ; speed-up: 20%
So, I was wrong: opt_stncmp2 is quite more efficient than opt_stncmp1,
and definitely better than the original strncmp. So, I would say that
using opt_stncmp2 is always the best bet. I'm not certain why are you
seeing that original strncmp is 100x faster than opt_strncmpX :-/
Cheers,
--
>0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
"-"
More information about the NumPy-Discussion
mailing list