Match beginning of two strings

Andrew Dalke adalke at mindspring.com
Fri Aug 22 03:28:11 EDT 2003


Mañungo:
> First, I'm not a python programmer... but I think is better test int.
> Something like that:
>
> int *a = (int *) ...first string...;
> int *b = (int *) ...second string...;

Sure.  That's an old trick.  However, your code assumes your
strings are word aligned.  (Perhaps for Python they are, but
in general you cannot make that assumption about C char *s.)

You also have an off-by-one/off-by-four error.  Suppose the
two strings are "A", that is, an 'A' followed by a NUL.  Then

for (i=0; i<n; i+=4)
  if (*a++ != *b++)
    break;

will compare the four bytes and increment the counters.
You then do

char *aa = (char *) a;

and work with the characters *after* the first four characters
tested in the int compare.

Finally, you assume ints are 4 bytes long, which is
not universal.
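
For what it's worth, here is a rough sketch of how the int trick
could sidestep all three problems.  It is not the code from the
original post: the name common_prefix_len and the explicit length
argument n are made up for illustration, and it assumes both buffers
hold at least n readable bytes and that unsigned int has no padding
bits (true on mainstream platforms).  memcpy takes care of alignment,
sizeof replaces the hard-coded 4, and the index is only advanced after
a word has compared equal, so the byte loop restarts at the beginning
of the word that differed.

```c
#include <stddef.h>
#include <string.h>

/* Length of the common prefix of a and b, looking at no more than
   n bytes.  Both buffers must hold at least n readable bytes. */
size_t common_prefix_len(const char *a, const char *b, size_t n)
{
    size_t i = 0;
    unsigned int wa, wb;

    /* Word pass: runs only while a full word remains in both buffers. */
    while (n - i >= sizeof wa) {
        memcpy(&wa, a + i, sizeof wa);  /* no alignment assumption */
        memcpy(&wb, b + i, sizeof wb);
        if (wa != wb)
            break;          /* i still marks the start of this word */
        i += sizeof wa;
    }

    /* Byte pass: finishes the differing word, or the short tail. */
    while (i < n && a[i] == b[i])
        i++;
    return i;
}
```

For the two "A" strings above, calling it with n = 2 never enters the
word pass on a machine with 4-byte ints, so nothing past the NUL is
read.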

There's something to be said for simplicity.  I also
wonder if modern optimizing compilers could figure
out some of this automatically, but I don't wonder
enough to find out.
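
By "simplicity" I mean something like the plain byte loop below.
This is only a sketch for NUL-terminated C strings, and the name
common_prefix_simple is made up for illustration.  It has no
alignment or word-size assumptions to get wrong.

```c
#include <stddef.h>

/* Length of the common prefix of two NUL-terminated strings. */
size_t common_prefix_simple(const char *a, const char *b)
{
    size_t i = 0;

    /* Stop at the end of a, or at the first differing byte.  If b
       ends first, its NUL differs from a[i] and the loop stops. */
    while (a[i] != '\0' && a[i] == b[i])
        i++;
    return i;
}
```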

                    Andrew
                    dalke at dalkescientific.com





