[Python-Dev] String coercion

Ka-Ping Yee ping@lfw.org
Mon, 10 Jul 2000 02:23:55 -0700 (PDT)


On Sat, 8 Jul 2000, Paul Prescod wrote:
> Ping says:
> > As it stands, with both 8-bit strings and Unicode strings, i think
> > this would result in too much hidden complexity -- thinking about this
> > can wait until we have only one string type. 
> 
> I don't see any hidden complexity. Python has features specifically
> designed to allow this sort of thing on a per-type basis.

Well... what happens when the other operand is a Unicode string?
Do you also do automatic coercion when adding something to a Unicode
string?  When you add a Unicode string to an arbitrary object, how do
you convert the other object into a Unicode string?  When you add an
8-bit string and a Unicode string together, what do you get?

I don't doubt that you could come up with consistent rules.  I just
suspect that when you do, they will amount to more hidden behaviour
than i'm comfortable with.

Of course you could also just use Itpl.py :) or a built-in version
of same (Am i half-serious?  Half-kidding?  Well, let's just throw
it out there...).

Instead of:

    print PaulsString("abcdef")+5
    print open+PaulsString("ghijkl")

with Itpl.py it would just be:

    printpl("abcdef${5}")
    printpl("${open}ghijkl")
    
A built-in Itpl-like operator might almost be justifiable, actually...
i mean, we already have

    "%(open)sghijkl" % globals()

Well, i don't know.  Perhaps it looks too frighteningly like Perl.
Anyway, the rules as implemented (see http://www.lfw.org/python/
for the actual Itpl.py module) are:

    1. $$ becomes $
    2. ${ } around any expression evaluates the expression
    3. $ followed by identifier, followed by zero or more of:
           a. .identifier
           b. [identifier]
           c. (identifier)
       evaluates the expression
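
The three rules above can be sketched in a few lines.  This is not
the actual Itpl.py code, only a minimal illustration in present-day
Python 3 under some stated assumptions: the function name
interpolate() and its explicit namespace argument are hypothetical,
expressions inside ${ } may not themselves contain a closing brace,
and values are stringified with str().

```python
import re

_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"

# One alternative per rule; order matters so that "$$" wins first.
_TOKEN = re.compile(
    r"\$\$"                            # rule 1: $$ becomes $
    r"|\$\{(?P<braced>[^}]*)\}"        # rule 2: ${ expression }
    r"|\$(?P<chain>" + _IDENT +        # rule 3: $identifier, then any of
    r"(?:\." + _IDENT +                #   a. .identifier
    r"|\[" + _IDENT + r"\]"            #   b. [identifier]
    r"|\(" + _IDENT + r"\))*)"         #   c. (identifier)
)


def interpolate(template, namespace):
    """Return template with $-expressions evaluated in namespace.

    Hypothetical helper: eval() is used for brevity, so the template
    must come from a trusted source.
    """
    def substitute(match):
        if match.group(0) == "$$":
            return "$"
        expression = match.group("braced")
        if expression is None:
            expression = match.group("chain")
        # Builtins such as open stay visible through the empty globals.
        return str(eval(expression, {}, namespace))

    return _TOKEN.sub(substitute, template)


print(interpolate("abcdef${5}", {}))
print(interpolate("total: $$$price", {"price": 12}))
```

A lone $ that is not followed by an identifier, a brace or another $
is left untouched by this sketch.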

What i'm getting at with this approach is that you are clear
from the start that the goal is a string: you have this string
thing, and you're going to insert some stringified expressions
and objects into it.  I think it's clearer & less error-prone for
interpolation to be its own operation, rather than overloading +.
It also means, for example, that you could start with a Unicode
string with $s in it and be assured of ending up with a Unicode
string.
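
For comparison, here is a rough sketch of what a coercing string type
along the lines of PaulsString might look like.  The class name
CoercingString and the choice of str() for the conversion are
assumptions made for illustration, and the code is present-day
Python 3 rather than anything from the original proposal.  It shows
one kind of surprise that overloading + can produce: the result
depends on where the string sits in a chain of additions.

```python
class CoercingString(str):
    """Hypothetical stand-in for PaulsString: + stringifies the other side."""

    def __add__(self, other):
        # string + anything: convert the right operand with str().
        return CoercingString(str.__add__(self, str(other)))

    def __radd__(self, other):
        # anything + string: convert the left operand with str().
        return CoercingString(str(other) + str(self))


left = CoercingString("n=") + 1 + 2    # ("n=" + 1) + 2, digits concatenated
right = 1 + 2 + CoercingString("=n")   # (1 + 2) + "=n", numbers added first

print(left)    # n=12
print(right)   # 3=n
```

With an explicit interpolation step the arithmetic is evaluated
inside the expression and only the final value is stringified, so
the order of operations is visible in the template itself.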


-- ?!ng