pretty strange behavior of "strip"

MRAB google at mrabarnett.plus.com
Fri Dec 5 10:59:10 EST 2008


rdmurray at bitdance.com wrote:
> On Thu, 4 Dec 2008 at 20:54, Terry Reedy wrote:
>>>  'toc.html'
>>>  >>> test[4].strip('.html')
>>>  'oc'
>>>
>>>  Can't figure out what is going on, really.
>>
>> What I can't figure out is why, when people cannot figure out what is
>> going on with a function (or method, in this case), they do not look
>> it up in the docs. (If you are an exception and did, what confused you?)
>> Can you enlighten me?
> 
> I'm a little embarrassed to admit this, since I've been using python for
> many years, but until I read these posts I did not understand how strip
> used its string argument, and I _have_ read the docs.  I can't tell you
> what confused the OP, but I can tell you what confused me.
> 
> I have often wished that in 'split' I could specify a _set_ of characters
> on which the string would be split, in the same way the default list
> of whitespace characters causes a split where any one (or more) of
> them appears.  But instead the string argument is a multi-character
> separator.  (Which is sometimes useful and I wouldn't want to lose the
> ability to specify a multi-character separator!)
> 
> My first experience in using the string argument was with split, so when I
> ended up using it with strip, by analogy I assumed that the string passed
> to strip would also be a multi-character string, and thus stripped only
> if the whole string appeared exactly.  Reading the documentation did
> not trigger me to reconsider that assumption.  I guess I'm just lucky that
> I haven't run into any bugs (but I think I've used the string argument
> to strip only once or twice in my career).
> 
> It would be lovely if both the split and strip methods would have a
> second string argument that would use the string in the opposite sense
> (as a set for split, as a sequence match for strip).
> 
> In the meantime the docs could be clarified by replacing:
> 
>     the characters in the string will be stripped
> 
> with
> 
>     all occurrences of any of the characters in the string will be
>     stripped
> 
> --RDM
> 
> PS: the OP might want to look at the os.path.splitext function.
>
If I had thought about it early enough I could have suggested that in 
Python 3 split() and strip() should accept either a string or a set of 
strings. It's still possible to extend split() in the future, but 
changing the behaviour of strip() with a string argument would break 
existing code, something which might have been OK as part of changes in 
Python 3. Unfortunately I don't have access to the time machine! :-)
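
For reference, here is a minimal sketch of the behaviour discussed in this
thread. It shows strip() treating its argument as a set of characters, an
exact-suffix alternative, the os.path.splitext approach RDM mentions, and
re.split as one way to split on a set of characters. The remove_suffix
helper and the sample strings are illustrative assumptions, not anything
from the standard library or from the posts above.

```python
import os.path
import re

name = 'toc.html'

# strip() treats its argument as a set of characters, not as a suffix:
# any of '.', 'h', 't', 'm', 'l' is removed from both ends.
stripped = name.strip('.html')
print(stripped)                     # 'oc'


def remove_suffix(text, suffix):
    """Hypothetical helper: drop suffix only if text ends with it exactly."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


base = remove_suffix(name, '.html')
print(base)                         # 'toc'

# For file names, os.path.splitext separates the extension from the root.
root, ext = os.path.splitext(name)
print(root, ext)                    # 'toc' '.html'

# str.split() takes a multi-character separator; to split on any one
# (or more) of a set of characters, a regular expression class works.
fields = re.split(r'[,;\s]+', 'a, b;c  d')
print(fields)                       # ['a', 'b', 'c', 'd']
```

The helper checks endswith() before slicing, so a string that does not end
with the suffix is returned unchanged, which is the "sequence match" sense
that RDM wished strip() offered.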


