strip() using strings instead of chars

Fri Jul 11 13:18:53 EDT 2008

Christoph Zwerschke <cito at online.de> wrote:

> In Python programs, you will quite frequently find code like the
> following for removing a certain prefix from a string:
> 
> if url.startswith('http://'):
>      url = url[7:]

If I came across this code I'd want to know why they weren't using 
urlparse.urlsplit()...
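
A minimal sketch of that approach, assuming Python 3, where the function lives in urllib.parse (in Python 2 it is urlparse.urlsplit). The host example.com is a placeholder for illustration only:

```python
from urllib.parse import urlsplit  # Python 2: from urlparse import urlsplit

# example.com is a placeholder host used only for illustration.
parts = urlsplit('http://example.com/index.html')

scheme = parts.scheme  # the part before '://'
host = parts.netloc    # the network location
path = parts.path      # everything after the host, up to any query string

# The same call handles other schemes without any character counting.
secure = urlsplit('https://example.com/index.html')
```

Nothing here depends on the length of the scheme, so switching between http and https needs no other change.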

> 
> Similarly for stripping suffixes:
> 
> if filename.endswith('.html'):
>      filename = filename[:-5]

... and I'd want to know why os.path.splitext() wasn't appropriate here.
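
Roughly what that could look like. The helper name strip_extension is made up for this example, and whether it fits depends on what counts as an extension in your case:

```python
import os.path

def strip_extension(filename, ext='.html'):
    """Return filename without ext if it ends with it, else unchanged.

    strip_extension is a hypothetical helper, not a standard function.
    """
    root, found = os.path.splitext(filename)
    # splitext only splits off the last dotted component, so compare it
    # against the extension we actually want to remove.
    return root if found == ext else filename
```

Note that os.path.splitext only looks at the final extension, so a name like archive.tar.gz splits into archive.tar and .gz.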

> 
> My problem with this is that it's cumbersome and error-prone to count
> the number of chars of the prefix or suffix. If you want to change it
> from 'http://' to 'https://', you must not forget to change the 7 to 8.
> If you write len('http://') instead of the 7, you see this is actually
> a DRY problem.
> 
> Things get even worse if you have several prefixes to consider:
> 
> if url.startswith('http://'):
>      url = url[7:]
> elif url.startswith('https://'):
>      url = url[8:]
> 
> You can't make use of url.startswith(('http://', 'https://')) here.
> 
No, you can't, so you definitely want to be parsing the URL properly. I 
can't actually think of a use for stripping off the scheme without either 
saving it somewhere or doing further parsing of the URL.
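
For strings where no parser applies, the counting problem from the quoted text can still be avoided by taking the length from the prefix itself. This is only a sketch; strip_prefix is a hypothetical helper, not something in the standard library:

```python
def strip_prefix(text, prefixes):
    """Remove the first matching prefix from text, if any.

    Hypothetical helper: prefixes may be a single string or a tuple of
    strings, mirroring what str.startswith() accepts.
    """
    if isinstance(prefixes, str):
        prefixes = (prefixes,)
    for prefix in prefixes:
        if text.startswith(prefix):
            # The slice length comes from the prefix, so nothing has to
            # be kept in sync by hand.
            return text[len(prefix):]
    return text
```

Python 3.9 and later also provide str.removeprefix() and str.removesuffix() for the single-prefix case.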