new string function suggestion

Mon Jun 13 04:36:33 EDT 2005

Andy wrote:
> What do people think of this?
> 
> 'prefixed string'.lchop('prefix') == 'ed string'
> 'string with suffix'.rchop('suffix') == 'string with '
> 'prefix and suffix'.chop('prefix', 'suffix') == ' and '
> 
> The names are analogous to strip, rstrip, and lstrip.  But the functionality 
> is basically this:
> 
> def lchop(self, prefix):
>   assert self.startswith(prefix)
>   return self[len(prefix):]
> 
> def rchop(self, suffix):
>   assert self.endswith(suffix)
>   return self[:-len(suffix)]
> 
> def chop(self, prefix, suffix):
>   assert self.startswith(prefix)
>   assert self.endswith(suffix)
>   return self[len(prefix):-len(suffix)]
> 
> The assert can be a raise of an appropriate exception instead.  I find this 
> to be a very common need,

I'm not sure whether I should be surprised or not. I've never felt the 
need for such a gadget. I don't even recall seeing such a gadget in 
other languages. One normally either maintains a cursor (index or 
pointer) without chopping up the original text, or splits the whole text 
up into tokens. AFAICT, most simple needs in Python are satisfied by 
str.split or re.split. For the special case of file paths, see 
os.path.split*. There are also various 3rd party parsing modules -- look 
in PyPI.
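
To make that concrete, here is a minimal sketch of the three standard
tools mentioned above. The sample string and the variable names are made
up for illustration; only str.split, re.split and os.path.split come
from the library.

```python
import os.path
import re

line = "html/docs/index.html"  # made-up sample input

# str.split with maxsplit=1 separates the first component from the rest.
head, rest = line.split("/", 1)

# re.split can break on several delimiters at once (here "/" and ".").
tokens = re.split(r"[/.]", line)

# os.path.split separates the directory part from the final component.
directory, filename = os.path.split(line)
```

After the first call, head holds the leading component, which you can
compare against whatever prefixes you care about without slicing by
hand.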

> and often newbies assume that the 
> strip/lstrip/rstrip family behaves like this, but of course they don't.
> 
> I get tired of writing stuff like:
> 
> if path.startswith('html/'):
>   path = path[len('html/'):]
> elif path.startswith('text/'):
>   path = path[len('text/'):]
> 

So create a function (example below) and put it along with others in a 
module called (say) andyutils.py ...

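def chop_known_prefixes(path, prefixes):
    """Strip the first matching prefix from path, if any."""
    for prefix in prefixes:
        if path.startswith(prefix):
            # Only the first match is removed.
            return path[len(prefix):]
    # No known prefix: return path unchanged.
    return path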

By the way, what do you do if path doesn't start with one of the "known"
prefixes?
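
If the answer is "that's an error", one possible variant raises instead
of handing the path back untouched. This is only a sketch: the name
chop_known_prefixes_strict and the choice of ValueError are my
assumptions, not anything in the library.

```python
def chop_known_prefixes_strict(path, prefixes):
    """Hypothetical strict variant: fail loudly if no prefix matches."""
    for prefix in prefixes:
        if path.startswith(prefix):
            return path[len(prefix):]
    # ValueError is an assumed choice; use whatever suits your code.
    raise ValueError("%r starts with none of %r" % (path, list(prefixes)))
```

Whether silent pass-through or an exception is right depends on where
the paths come from, which is why I'm asking.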

> It just gets tedious, and there is duplication.  Instead I could just write:
> 
> try:
>   path = path.lchop('html/')
>   path = path.lchop('text/')
> except SomeException:
>   pass
> 

In the event that path is (say) 'html/text/blahblah...', the try/except
version produces 'blahblah...'; the original tedious stuff produces
'text/blahblah...'.
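
The difference is easy to check. The sketch below uses a free function
lchop as a stand-in for the proposed method, raising ValueError in place
of the unspecified SomeException; both of those are assumptions made
for illustration.

```python
def lchop(s, prefix):
    """Hypothetical stand-in for the proposed str.lchop."""
    if not s.startswith(prefix):
        raise ValueError("%r does not start with %r" % (s, prefix))
    return s[len(prefix):]

def chop_with_try(path):
    # The proposed style: the second call runs whenever the first succeeds.
    try:
        path = lchop(path, 'html/')
        path = lchop(path, 'text/')
    except ValueError:
        pass
    return path

def chop_with_if(path):
    # The original if/elif style: at most one prefix is removed.
    if path.startswith('html/'):
        path = path[len('html/'):]
    elif path.startswith('text/'):
        path = path[len('text/'):]
    return path

sample = 'html/text/blahblah'
try_result = chop_with_try(sample)
if_result = chop_with_if(sample)
```

Note also that with this stand-in, a path starting with 'text/' alone is
left untouched by the try version, because the first lchop raises before
the second is ever attempted. So the two forms are not interchangeable.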

> Does anyone else find this to be a common need?  Has this been suggested 
> before?

You can answer that last question yourself by googling comp.lang.python ...