Extract all words that begin with x

Terry Reedy tjreedy at udel.edu
Tue May 11 22:21:08 EDT 2010


On 5/11/2010 6:01 PM, Bryan wrote:
> Tycho Andersen wrote:
>> Terry Reedy wrote:
>>>   ... word[0:1] does the same thing. All Python programmers should learn to
>>> use slicing to extract a  char from a string that might be empty.
>>> The method call of .startswith() will be slower, I am sure.
>>
>> Why? Isn't slicing just sugar for a method call?
>
> Yes, but finding the method doesn't require looking it up by name at
> run-time, and startswith is built to work for startings of any length.
>
> Let's timeit:
>
> # -----
> from timeit import Timer
> from random import choice
> from string import ascii_lowercase as letters
>
> strs = [''.join([choice(letters) for _ in range(5)])
>          for _ in range(5000)]
>
> way1 = "[s for s in strs if s.startswith('a')]"
> way2 = "[s for s in strs if s[:1] == 'a']"
>
> assert eval(way1) == eval(way2)
>
> for way in [way1, way2]:
>      t = Timer(way, 'from __main__ import strs')
>      print(way, ' took: ', t.timeit(1000))
>
> # -----
>
> On my particular box, I get:
>
> [s for s in strs if s.startswith('a')]  took:  5.43566498797
> [s for s in strs if s[:1] == 'a']  took:  3.20704924968
>
> So Terry Reedy was right: startswith() is slower. I would,
> nevertheless, use startswith(). Later, if users want my program to run
> faster and my profiling shows a lot of the run-time is spent finding
> words that start with 'a', I might switch.

Thank you for that timing report.

My main point is that there are two ways to fetch a char, the difference 
being the error return -- exception IndexError versus error value ''. 
This is an example of out-of-band versus in-band error/exception 
signaling, which programmers, especially of Python, should understand.
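
A minimal sketch of that distinction, for readers who want to see it run. The function names and the sample word list are hypothetical, chosen only for illustration; the indexing and slicing behavior is standard Python.

```python
def first_char_indexed(word):
    """Out-of-band signaling: an empty string raises IndexError."""
    return word[0]


def first_char_sliced(word):
    """In-band signaling: an empty string yields the error value ''."""
    return word[0:1]


# Hypothetical sample data, including an empty string.
words = ["xylophone", "", "apple", "x"]

# With slicing, the empty string simply fails the comparison.
x_words_sliced = [w for w in words if first_char_sliced(w) == "x"]

# With indexing, the empty string must be handled as an exception.
x_words_indexed = []
for w in words:
    try:
        if first_char_indexed(w) == "x":
            x_words_indexed.append(w)
    except IndexError:
        pass  # empty string: no first char to compare

print(x_words_sliced)
print(x_words_indexed)
```

Both approaches select the same words; the difference is only in where the empty-string case has to be dealt with.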

The fact that, in Python, built-in syntax tends to be faster than method
calls was secondary, though good to know on occasion.

.startswith and .endswith are methods that wrap the special cases of
slicing at one end and comparing to a value. They are not necessary, and
save no keystrokes, but Guido obviously thought they added enough to
more than balance the slight expansion of the language. They were added
after I learned Python and I thought the tradeoff to be a toss-up, but I
will consider using the methods when writing didactic code meant to be
read by others.
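
As a rough sketch of the slice-and-compare equivalents, assuming plain string arguments only: the helper names starts_with and ends_with are hypothetical, and this is not how the built-in methods are implemented.

```python
def starts_with(s, prefix):
    """Slice-and-compare equivalent of s.startswith(prefix)."""
    return s[:len(prefix)] == prefix


def ends_with(s, suffix):
    """Slice-and-compare equivalent of s.endswith(suffix).

    Uses an explicit start index rather than s[-len(suffix):],
    because s[-0:] would be the whole string when suffix is empty.
    """
    return s[len(s) - len(suffix):] == suffix


# Hypothetical test cases, including empty strings and an
# affix longer than the string itself.
cases = [("xray", "x"), ("xray", "ray"), ("", "x"), ("x", ""), ("a", "xa")]
for s, affix in cases:
    assert starts_with(s, affix) == s.startswith(affix)
    assert ends_with(s, affix) == s.endswith(affix)
```

The empty-suffix case is the one place where the naive negative-index slice goes wrong, which is perhaps a small argument in favor of the methods.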

Terry Jan Reedy



