Fastest way to calculate leading whitespace

Stefan Behnel stefan_ml at behnel.de
Mon May 10 03:25:43 EDT 2010


Stefan Behnel, 10.05.2010 08:54:
> dasacc22, 08.05.2010 19:19:
>> This is a simple question. I'm looking for the fastest way to
>> calculate the leading whitespace (as a string, ie ' ').
>
> Here is an (untested) Cython 0.13 solution:
>
>     from cpython.unicode cimport Py_UNICODE_ISSPACE
>
>     def leading_whitespace(unicode ustring):
>         cdef Py_ssize_t i
>         cdef Py_UNICODE uchar
>
>         for i, uchar in enumerate(ustring):
>             if not Py_UNICODE_ISSPACE(uchar):
>                 return ustring[:i]
>         return ustring
>
> Cython compiles this to the obvious C code, so this should be impossible
> to beat in plain Python code.

... and it is. For a simple string like

     u = u"   abcdefg" + u"fsdf"*20

timeit gives me this for "s=u.lstrip(); u[:-len(s)]":

1000000 loops, best of 3: 0.404 usec per loop

and this for "leading_whitespace(u)":

10000000 loops, best of 3: 0.0901 usec per loop
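
For anyone who wants to repeat the plain Python measurement, a rough sketch using the timeit module follows. The loop count and the repeat count are my own assumptions, the `globals` argument needs Python 3.5 or later, and absolute numbers will differ by machine and Python version.

```python
import timeit

# Same test string as above: three leading spaces, then non-whitespace text.
u = u"   abcdefg" + u"fsdf" * 20

# Assumed settings: 3 repeats of 100000 loops each; adjust as needed.
number = 100000
timer = timeit.Timer("s = u.lstrip(); u[:-len(s)]", globals={"u": u})

# Take the best (smallest) total over the repeats and convert to usec per loop.
best = min(timer.repeat(repeat=3, number=number)) / number
print("%d loops, best of 3: %.3g usec per loop" % (number, best * 1e6))
```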

It's closer for the extreme case of an all-whitespace string like " "*60,
where I get this for the lstrip variant:

1000000 loops, best of 3: 0.277 usec per loop

and this for the Cython code:

10000000 loops, best of 3: 0.177 usec per loop

But I doubt that this is the main use case of the OP.
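
One caveat about the lstrip variant timed above: for an all-whitespace string, s is empty, so u[:-len(s)] becomes u[:-0], which is the empty string rather than the whole input. A minimal plain Python sketch that avoids this by slicing from the front (the function name is only illustrative, and this is not the code that was timed):

```python
def leading_whitespace_py(s):
    """Return the leading whitespace of s as a string (plain Python)."""
    # len(s) - len(s.lstrip()) is the number of leading whitespace characters.
    # Slicing from the front avoids the s[:-0] == '' pitfall.
    return s[:len(s) - len(s.lstrip())]


print(repr(leading_whitespace_py(u"   abcdefg")))
print(leading_whitespace_py(u" " * 60) == u" " * 60)
```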

Stefan



