[Tutor] REPL format

Steven D'Aprano steve at pearwood.info
Sun Apr 26 10:38:39 CEST 2015


On Sat, Apr 25, 2015 at 05:38:58PM -0700, Danny Yoo wrote:
> On Sat, Apr 25, 2015 at 4:38 PM, Jim Mooney <cybervigilante at gmail.com> wrote:
> > I'm curious why, when I read and decode a binary file from the net in one
> > fell swoop, the REPL prints it between parentheses, line by line but with
> > no commas, like a defective tuple.
> 
> 
> The REPL is trying to be nice here.  What you're seeing is a
> representation that's using a little-known Python syntactic feature:
> string literals can be spread across lines.  See:
> 
>     https://docs.python.org/2/reference/lexical_analysis.html#string-literal-concatenation
> 
> At the program's parse time, the Python compiler will join adjacent
> string literals automatically.

It's not specifically across lines. Any two adjacent string literals 
will be concatenated, even if they have different string delimiters:

    '"You can' "'t do that!" '" he said.'

is arguably somewhat nicer than either of these alternatives:

    '"You can\'t do that!" he said.'
    "\"You can't do that!\" he said."


(Well, to be completely honest, I don't think that *any* of the 
solutions to the problem of using both sorts of quotation marks in the 
same string is nice, but it's good to have options.)
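
A quick way to convince yourself that the three spellings above are the 
same string is to compare them. This is only a sketch; the variable 
names are illustrative and not part of the original example:

```python
# Three spellings of the same sentence. The names are illustrative only.
implicit = '"You can' "'t do that!" '" he said.'
escaped_single = '"You can\'t do that!" he said.'
escaped_double = "\"You can't do that!\" he said."

print(implicit)
# Chained comparison: True only if all three are equal.
print(implicit == escaped_single == escaped_double)
```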

Where automatic concatenation of string literals really comes into 
its own is when you have a long template string or error message, for 
example, which won't fit on a single line:

    message = "Doh, a deer, a female deer; Ray, a drop of golden sun; Me, a name I call myself; Far, a long long way to run; So, a needle pulling thread; La, a note to follow So; Tea, a drink with jam and bread."


We could use a triple quoted string:

    message = """Doh, a deer, a female deer;
                 Ray, a drop of golden sun;
                 Me, a name I call myself;
                 Far, a long long way to run;
                 So, a needle pulling thread;
                 La, a note to follow So;
                 Tea, a drink with jam and bread."""

But that string not only includes the line breaks; every line from the 
second onwards also carries its leading indentation. Alternatively, we 
can break the message up into substrings and, using a surrounding pair 
of round brackets to continue the expression over multiple lines, join 
them with explicit concatenation:

    message = ("Doh, a deer, a female deer;" +
               " Ray, a drop of golden sun;" + 
               " Me, a name I call myself;" +
               " Far, a long long way to run;" +
               " So, a needle pulling thread;" +
               " La, a note to follow So;" +
               " Tea, a drink with jam and bread."
               )

but that has the disadvantage that the concatenation may be performed at 
runtime. (In fact, recent versions of CPython perform it at compile 
time, but that is not guaranteed by the language. Past versions did not; 
future versions may not, and other implementations like PyPy, 
IronPython, Jython and others may not either.)
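
One way to see the difference, without relying on any optimiser, is to 
look at what the parser produces. The following is a rough sketch that 
assumes Python 3.8 or later (where string literals appear as 
ast.Constant nodes); the short sample strings are made up for 
illustration:

```python
import ast

# Parse only: nothing here is executed, and no optimiser is involved.
implicit = ast.parse('"Doh, " "a deer"', mode="eval").body
explicit = ast.parse('"Doh, " + "a deer"', mode="eval").body

# Adjacent literals have already been joined into one constant.
print(type(implicit).__name__, repr(implicit.value))

# The + version is still an operation with two separate operands.
print(type(explicit).__name__)
```

With implicit concatenation the parser hands over a single string; with 
+ there is still an addition left for something later to perform.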

To my mind, the least unpleasant version is this one, using implicit 
concatenation:

    message = (
        "Doh, a deer, a female deer; Ray, a drop of golden sun;"
        " Me, a name I call myself; Far, a long long way to run;"
        " So, a needle pulling thread; La, a note to follow So;"
        " Tea, a drink with jam and bread."
        )
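
To make the difference from the triple-quoted version concrete, here is 
a small sketch using just the first two lines of the song. Printing the 
repr() shows the newline and indentation that the triple-quoted string 
carries along and that the implicit version does not:

```python
# Triple-quoted: the line break and the indentation are part of the string.
triple = """Doh, a deer, a female deer;
            Ray, a drop of golden sun;"""

# Implicit concatenation: only what is inside the quotes ends up in the string.
implicit = (
    "Doh, a deer, a female deer;"
    " Ray, a drop of golden sun;"
    )

print(repr(triple))
print(repr(implicit))
```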


> It's a cute-but-nasty trick that some other languages do, such as C++.
> 
> 
> I would strongly discourage not using it yourself in your own
> programs: it's the source of a very common mistake.  Here's an
> example:

I agree that this is a feature which can be abused and misused, but I 
disagree with Danny about avoiding it. I think that it works very well 
in the above example.

 
> So that's why I don't like this feature: makes it really hard to catch
> mistakes when one is passing string literals as arguments and forgets
> the comma.  Especially nasty when the function being called uses
> optional arguments.

I acknowledge that this scenario can and does happen in real life, but I 
don't think it is common. It only occurs when you have two consecutive 
arguments which are both string literals, and you forget the comma 
between them. How often does that happen? It cannot occur if one or more 
of the arguments are variables:

    greeting = "Hello "
    function(greeting "world!")  # Syntax error.
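
Both cases can be checked with a short sketch. The list of names below 
is a hypothetical example of the forgotten-comma mistake, not the one 
from Danny's post, and compile() is used so that the syntax error can 
be caught and reported rather than stopping the script:

```python
# Two adjacent literals with a forgotten comma: two items, not three.
names = ["alpha", "beta" "gamma"]
print(names)

# A variable next to a literal is rejected before the code ever runs.
source = 'function(greeting "world!")'
raised = False
try:
    compile(source, "<example>", "eval")
except SyntaxError as err:
    raised = True
    print("SyntaxError:", err.msg)
```

The first mistake passes silently, which is Danny's point; the second 
cannot get past the compiler at all.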




-- 
Steve

