[Python-Dev] str() for interpreter output

Ka-Ping Yee ping@lfw.org
Sun, 9 Apr 2000 03:33:00 -0700 (PDT)


On Sun, 9 Apr 2000, Tim Peters wrote:
> You later say (echoing Donn Cave)
> 
> > repr() is for the human, not for the machine
> 
> but that contradicts the docs and the design.  What you mean <wink> to say
> is "the thing that the interactive prompt uses by default *should* be for
> the human, not for the machine" -- which repr() is not.

No, what I said is what I said.

Let's try this again:

    repr() is not for the machine.

The documentation for __repr__ says:

    __repr__(self)  Called by the repr() built-in function and by
    string conversions (reverse quotes) to compute the "official"
    string representation of an object.  This should normally look
    like a valid Python expression that can be used to recreate an
    object with the same value.

It only suggests that the output "normally look like a valid
Python expression".  It doesn't require it, and certainly doesn't
imply that __repr__ should be the standard way to turn an object
into a platform-independent serialization.
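
To illustrate that "normally" (a minimal sketch, assuming a present-day
Python 3 interpreter rather than the one under discussion; the class
name Opaque is made up for the example): built-in values usually
round-trip through repr(), while an instance of a class that defines no
__repr__ does not.

```python
# A built-in value: repr() reads like an expression that rebuilds it.
value = [1, 'two']
rebuilt = eval(repr(value))

# A class without __repr__ falls back to the default "<... object at 0x...>"
# form, which is not a valid Python expression.
class Opaque:  # hypothetical class, for illustration only
    pass

default_repr = repr(Opaque())
try:
    eval(default_repr)
    default_is_expression = True
except SyntaxError:
    default_is_expression = False
```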

> This is way oversold:  str() also supplies "[" for lists, "(" for tuples,
> "{" for dicts, and "<" for instances of classes that don't override __str__.
> The only difference between repr() and str() in this listing of faux terror
> <wink> is when they're applied to strings.

Right, and that is exactly the one thing that breaks everything.
Strings are the most dangerous things to display raw: they can look
like anything, and so break all the rules in one fell swoop.
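
A small sketch of the hazard (assuming a present-day Python 3; the
variable names are only for illustration): a string displayed raw can
be indistinguishable from a different object altogether.

```python
text = '[1, 2]'   # a string that happens to look like a list
real = [1, 2]     # an actual list

# Displayed with str(), the two cannot be told apart.
same_under_str = (str(text) == str(real))

# Displayed with repr(), the quotes give the string away.
differ_under_repr = (repr(text) != repr(real))
```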

> > Granted, repr() cannot always produce an exact reconstruction of an
> > object.  repr() is not a serialization mechanism!
> 
> To the contrary, many classes and types implement repr() for that very
> purpose.  It's not universal but doesn't need to be.

If they want to, that's fine.  In general, however,

    repr() is not for the machine.

If you are using repr(), it's because you are expecting a human to
look at the thing at some point.

> > We have 'pickle' for that.
> 
> pickles are unreadable by humans; that's why repr() is often preferred.

Precisely.  You just said it yourself: repr() is for humans.  That
is why repr() cannot be mandated as a serialization mechanism.

There are two goals at odds here: readability and serialization.
You can't have both, so you must prioritize.  Pickles are more
about serialization than about readability; repr is more about
readability than about serialization.

repr() is the interpreter's way of communicating with the human.
It makes sense that e.g. the repr() of a string that you see
printed by the interpreter looks just like what you would type
in to produce the same string, because the interpreter and the
human should speak and understand the same language as much as
possible.

> >     >>> a = '\\'
> >     >>> b = '\''
> 
> I'd actually like to use euroquotes for str(string) -- don't throw the
> Latin-1 away with your outrage <wink>.

And no, even if you argue that we need to have something else,
whatever you want to call it, it's not called 'str'.  'str' is
"coerce to string".  If you coerce an object into the type it's
already in, it must not change.  So, if x is a string, then
str(x) must == x.
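
As a sketch of that requirement (assuming a present-day Python 3; the
sample string is arbitrary): str() leaves a string alone, while repr()
adds the quoting and escaping needed to type it back in.

```python
x = "it's one \\ backslash"   # contains a quote and a single backslash

unchanged = (str(x) == x)     # coercing a string to a string changes nothing
quoted = repr(x)              # adds quotes and doubles the backslash
recovered = eval(quoted)      # typing the repr back in yields the original
```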

> Whatever, examples with backslashes
> are non-starters, since newbies can't make any sense out of their doubling
> under repr() today either (if it's not a FAQ, it should be -- I've certainly
> had to explain it often enough!).

It may not be easy, but at least it's *consistent*.  Eventually,
you can't avoid the problem of escaping characters, and you just
have to learn how that works, and that's that.  Introducing yet
a different way of escaping things won't help.

Or, to put it another way: to write Python, it is required that
you understand how to read and write escaped strings.  Either
you learn just that, or you learn that plus another, different
way to read escaped-strings-as-printed-by-the-interpreter.  The
second case clearly requires you to learn and remember more.

> Nobody ever promised that eval(str(x)) == x -- if they want that, they
> should use repr() or backticks.  Today they get
> 
> >>> a
> '\\'
> 
> and scream "Huh?! I thought that was only supposed to be ONE backslash!".

You have to understand this at some point.  You can't get around it.
Changing the way the interpreter prints things won't save anyone the
trouble of learning it.
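
The backslash case, spelled out as a sketch (assuming a present-day
Python 3): the doubling is only in the notation, and it is the same
notation going in and coming out.

```python
a = '\\'           # two characters typed between the quotes

length = len(a)    # the string itself holds a single backslash
shown = repr(a)    # what the prompt echoes: quote, two backslashes, quote
back = eval(shown) # reading the echoed form gives the same string again
```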

> Or someone in Europe tries to look at a list of strings, or a simple dict
> keyed by names, and gets back a god-awful mish-mash of octal backslash
> escapes (and str() can't be used today to stop that either, since str()
> "isn't passed down").

This seems a pretty sensible complaint to me.  I don't use characters
beyond 0x7f often, but I can empathize with the hassle.  As you
suggested, this could be solved by having the built-in container
types do something nicer with str(), such as repr without escaping
characters beyond 0x7f.  (However, characters below 0x20 are definitely
dangerous to the terminal, and would have to be escaped regardless.)
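
One way such a "nicer" display could look, as a rough sketch only: the
helper name lenient_repr and the choice of \xNN escapes are assumptions
for illustration, not anything proposed in this thread. It quotes like
repr(), escapes characters below 0x20, and passes everything else
through, including characters beyond 0x7f.

```python
def lenient_repr(s):
    """Hypothetical helper: quote a string, escaping only what must be."""
    out = []
    for ch in s:
        if ch in ("\\", "'"):
            out.append("\\" + ch)            # keep the result unambiguous
        elif ord(ch) < 0x20:
            out.append("\\x%02x" % ord(ch))  # control characters stay escaped
        else:
            out.append(ch)                   # beyond 0x7f: shown as-is
    return "'" + "".join(out) + "'"

names = ['caf\xe9', 'tab\there']   # \xe9 is e-acute in Latin-1
listing = '[' + ', '.join(lenient_repr(n) for n in names) + ']'
```

The output is still something eval() can read back, so the display gets
friendlier without giving up the round trip.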

> Not at all.  "Tim's snot-removal algorithm" didn't remove anything
> ("removal" is an adjective I don't believe I've seen applied to it before).

Well, if you "special-case the snot OUT of strings", then you're
removing snot, aren't you?  :)

> What I want *most*, though, is for ssctsoos() to get passed down (from
> container to containee), and for it to be the default action.

Getting it passed down as str() seems okay to me.  Making it
the default action, in my (naturally) subjective opinion, is
Right Out if it means that

    eval(what_the_interpreter_prints_for(x)) == x
    
no longer holds for objects composed of the basic built-in types.
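
The property can be checked directly, as a sketch (assuming a
present-day Python 3, with repr() standing in for what the interpreter
prints; the function name echo_round_trips is made up for the example):

```python
def echo_round_trips(x):
    """True if evaluating the echoed form of x gives back an equal object."""
    return eval(repr(x)) == x

# Objects composed only of basic built-in types.
samples = [
    'a \\ and a \' in one string',
    [1, 2.5, 'three'],
    ('nested', ['list', {'key': None}]),
    {'name': 'caf\xe9', 1: (True, False)},
]
results = [echo_round_trips(x) for x in samples]
```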


-- ?!ng