[Python-Dev] UserString

Alex Martelli aleax at aleax.it
Mon Feb 21 08:06:37 CET 2005


On 2005 Feb 21, at 04:42, Guido van Rossum wrote:

>>>> Oh, bah. That's not what basestring was for. I can't blame you or 
>>>> your
>>>> client, but my *intention* was that basestring would *only* be the
>>>> base of the two *real* built-in string types (str and unicode).
>>
>> I think all this just reinforces the notion that LBYL is
>> a bad idea!
>
> In this case, perhaps; but in general? (And I think there's a
> legitimate desire to sometimes special-case string-like things, e.g.
> consider a function that takes either a stream or a filename
> argument.)
>
> Anyway, can you explain why LBYL is bad?

In the general case, it's bad because of a combination of issues.  It 
may violate "once, and only once!" -- the operations one needs to check 
may basically duplicate the operations one then wants to perform.  Apart 
from wasted effort, it may happen that the situation changes between 
the look and the leap (on an external file, or due perhaps to threading 
or other reentrancy).  It's often hard in the look to cover exactly the 
set of prereq's you need for the leap -- e.g. I've often seen code such 
as
     if i < len(foo):
         foo[i] = 24
which breaks for i<-len(foo); the first time this happens the guard's 
changed to 0<=i<len(foo) which then stops the code from working 
w/negative index; finally it stabilizes to the correct check, 
-len(foo)<=i<len(foo), but even then it's just duplicating the same 
check that Python performs again when you then use foo[i]... just 
cluttering code.  The intermediate Pythonista who's learned to code 
"try: foo[i]=24 // except IndexError: pass" is much better off than the 
one who's still striving to LBYL as he had (e.g.) when using C.
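
A minimal sketch of the guard's evolution described above, next to the try/except form. The function names are hypothetical, chosen only for illustration; `foo` and the value 24 follow the snippet in the text.

```python
def set_lbyl_naive(foo, i, value):
    # First attempt at the guard: misses i < -len(foo), so the
    # assignment can still raise IndexError.
    if i < len(foo):
        foo[i] = value


def set_lbyl_nonneg(foo, i, value):
    # Second attempt: safe, but silently ignores valid negative indices.
    if 0 <= i < len(foo):
        foo[i] = value


def set_lbyl_full(foo, i, value):
    # The check that finally stabilizes; it repeats the bounds test
    # that foo[i] performs anyway.
    if -len(foo) <= i < len(foo):
        foo[i] = value


def set_eafp(foo, i, value):
    # Just leap, and handle the one failure we expect.
    try:
        foo[i] = value
    except IndexError:
        pass
```

The last two behave the same for lists; the try/except version simply does not restate the bounds rule.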

Etc -- this is all very general and generic.

I had convinced myself that strings were a special case worth singling 
out, via isinstance and basestring, just as (say) dictionaries are 
singled out quite differently by methods such as get... I may well have 
been too superficial in this conclusion.
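
To make the comparison concrete, here is a rough sketch of the two kinds of singling out. It is written for Python 3, where `str` stands in for `basestring` (on Python 2 the check would be `isinstance(obj, basestring)`); the helper names are hypothetical.

```python
def lookup_lbyl(d, key, default=None):
    # Look before you leap: test membership, then index.
    if key in d:
        return d[key]
    return default


def lookup_eafp(d, key, default=None):
    # Leap, and catch the expected failure.
    try:
        return d[key]
    except KeyError:
        return default


def lookup_get(d, key, default=None):
    # Dictionaries offer a method that makes both idioms unnecessary.
    return d.get(key, default)


def is_string(obj):
    # Strings, by contrast, get singled out by a type check.
    # Assumption: str plays the role of basestring here.
    return isinstance(obj, str)
```

Note that a string-like class which does not inherit from the built-in string type fails `is_string`, which is the kind of case the thread is about.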

>> Then you would be able to test whether something is sequence-like
>> by the presence of __getitem__ or __iter__ methods, without
>> getting tripped up by strings.
>
> There would be other ways to get out of this dilemma; we could
> introduce a char type, for example. Also, strings might be
> recognizable by other means, e.g. the presence of a lower() method or
> some other characteristic method that doesn't apply to sequences in
> general.

Sure, there would be many possibilities.
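
As one illustration of the lower() suggestion, a sketch of a flatten routine that treats anything with a callable `lower` attribute as an atom rather than a sequence. The names `looks_like_string` and `flatten` are hypothetical, and the code assumes Python 3 (where strings do have `__iter__`).

```python
def looks_like_string(obj):
    # Duck-typed test: no isinstance, just a characteristic method.
    return callable(getattr(obj, "lower", None))


def flatten(items):
    # Yield leaves of a nested structure, never descending into
    # string-like objects.
    for item in items:
        if looks_like_string(item) or not hasattr(item, "__iter__"):
            yield item
        else:
            yield from flatten(item)
```

For example, `list(flatten(["ab", [1, ["cd", 2]], (3,)]))` gives `["ab", 1, "cd", 2, 3]`. The test is only a heuristic: any unrelated class that happens to define `lower` would also be treated as a string.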

> (To Alex: leaving transform() out of the string interface seems to me
> the simplest solution.)

I guess you mean translate.  Yes, that would probably be simplest.


Alex


