feature request: a better str.endswith

Raymond Hettinger vze4rx4y at verizon.net
Sat Jul 19 19:23:25 EDT 2003


[Michele Simionato]
> > >>I often feel the need to extend  the string method ".endswith" to tuple
> > >>arguments, in such a way to automatically check for multiple endings.
> > >>For instance, here is a typical use case:
> > >>
> > >>if filename.endswith(('.jpg','.jpeg','.gif','.png')):
> > >>    print "This is a valid image file"

[Jp]
> > >     extensions = ('.jpg', '.jpeg', '.gif', '.png')
> > >     if filter(filename.endswith, extensions):
> > >         print "This is a valid image file"
> > >
> > >   Jp


[Irmen]
> > Using filter Michele's original statement becomes:
> >
> > if filter(filename.endswith, ('.jpg','.jpeg','.gif','.png')):
> >      print "This is a valid image file"
> >
> > IMHO this is simple enough to not require a change to the
> > .endswith method...

[Michele]
> I haven't thought of "filter". It is true, it works, but is it really
> readable? I had to think to understand what it is doing.
> My (implicit) rationale for
>
> filename.endswith(('.jpg','.jpeg','.gif','.png'))
>
> was that it works exactly as "isinstance", so it is quite
> obvious what it is doing. I am asking just for a convenience,
> which has already a precedent in the language and respects
> the Principle of Least Surprise.

I prefer that this feature not be added.  Convenience functions
like this one rarely pay for themselves because:

   -- The use case is not that common (after all, endswith() isn't even
       used that often).

   -- It complicates the heck out of the C code.

   -- Checking for optional arguments results in a slight slowdown
       for the normal case.

   -- It is easy to implement a readable version in only two or three
       lines of pure Python.

   -- It is harder to read because it requires background knowledge
      of how endswith() handles a tuple (quick: does it take any
      iterable or just a tuple? How about a subclass of tuple? Is it
      like min() and max() in that *args works just as well as an
      argument tuple? Which Python version implemented it? etc.).

  -- It is a pain to keep the language consistent.  Change endswith()
      and you should change startswith().  Change the string object and
      you should also change the unicode object and UserString and
      perhaps mmap.  Update the docs for each and add test cases for
      each (including weird cases with zero-length tuples and such).

  -- The use case above encroaches on scanning patterns that are
      already efficiently implemented by the re module.

  -- Worst of all, it increases the sum total of the Python language to be
      learned without providing much in return.

  -- In general, the language can be kept more compact, efficient, and
     maintainable by not trying to vectorize everything (the recent addition
     of __builtin__.sum() is a rare exception that is worth it).  It is
     better to use a general purpose vectorizing function (like map, filter,
     or reduce).  This particular case is best implemented in terms of the
     some() predicate documented in the examples for the new itertools module
     (though any() might have been a better name for it):

         some(filename.endswith, ('.jpg','.jpeg','.gif','.png'))

     The implementation of some() is better than the filter version because
     it provides an "early-out" upon the first successful hit.
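
For illustration, here is a minimal sketch of the early-out behavior
described above.  It is written as a plain loop and is not necessarily
the wording of the recipe in the itertools documentation.  The names
IMAGE_EXTENSIONS, counting_endswith, and IMAGE_RE are hypothetical
helpers invented for this example; the regular expression is one
possible way to express the same check with the re module.

```python
import re


def some(pred, seq):
    """Return True as soon as pred(x) is true for any x in seq."""
    for x in seq:
        if pred(x):
            return True   # early-out: remaining items are never tested
    return False


# Suffixes taken from the thread; the constant name is an assumption.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.gif', '.png')

# Hypothetical helper that records every suffix actually tested,
# so the early-out can be observed.
calls = []


def counting_endswith(filename):
    def check(suffix):
        calls.append(suffix)
        return filename.endswith(suffix)
    return check


# some() stops at the first hit ...
found = some(counting_endswith('photo.jpg'), IMAGE_EXTENSIONS)
tested_by_some = list(calls)

# ... whereas building the full list of matches (which is what the
# filter version does) tests every suffix.
del calls[:]
check = counting_endswith('photo.jpg')
matches = [s for s in IMAGE_EXTENSIONS if check(s)]
tested_by_list = list(calls)

# One possible re-based equivalent (an assumption, not from the thread).
IMAGE_RE = re.compile(r'\.(?:jpg|jpeg|gif|png)\Z')
is_image = IMAGE_RE.search('photo.jpg') is not None
```

With 'photo.jpg', some() tests only the first suffix, while the
list-building version tests all four even though the answer is already
known after the first.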


Raymond Hettinger