[Tutor] Surprised that print("a" "b") gives "ab"

Steven D'Aprano steve at pearwood.info
Sun Mar 6 00:24:04 EST 2016


On Sat, Mar 05, 2016 at 10:25:51PM -0600, boB Stepp wrote:
> I stumbled into finding out that
> 
> >>> "a" "b"
> 'ab'
> 
> or
> 
> >>> print("a""b")
> ab
> 
> Note that there is not even a space required between the two strings.

This is called "implicit concatenation", as opposed to *explicit* 
concatenation, "a" + "b".
 
You're not alone in disliking this; many others do too, and not just 
in Python. E.g. it's also controversial in the D language:

https://issues.dlang.org/show_bug.cgi?id=3827



> I find this is surprising to me as "a" "b" is much less readable to me
> than "a" + "b" .  And since Python is all about easily readable code,
> why was this feature implemented?  Is there a use case where it is
> more desirable to not have a string concatenation operator explicitly
> used?  

I don't know that I would agree that Python is ALL about easily readable 
code. Surely there are other factors too, even if they are not weighted 
as heavily.

There are a few reasons for implicit concatenation. On their own, I 
don't think any of them would be enough to justify the feature, but 
taken all together, I think they were.

Firstly, I think the feature was copied straight out of C/C++ where it 
does have a rationale:

http://stackoverflow.com/questions/2504536/why-allow-concatenation-of-string-literals

In the early days of Python, concatenation of string literals using + 
would take place at runtime, which meant that it could be quite 
inefficient. Something like this:

s = "ab" + "cd" + "ef" + "gh" + "ij" + "kl"

would involve making, then destroying, ten temporary strings (plus an 
eleventh, the final result for s), for a total runtime cost proportional 
to the number of characters SQUARED. Using implicit concatenation:

s = "ab" "cd" "ef" "gh" "ij" "kl"

meant that the interpreter could do the work more efficiently at 
compile-time, generating a single string.

(These days, the standard CPython interpreter can do the same with the + 
operator, although other implementations may not be so smart.)
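A minimal sketch of the compile-time point, assuming Python 3.8 or later 
and using only the standard ast module. The helper name top_node is made 
up for this illustration. It shows that adjacent literals are already a 
single string when the parser is done, while + is still an operator at 
that stage; whether the + form is later folded into one constant is up 
to the implementation.

```python
import ast

def top_node(expr):
    """Return the AST node for a single expression (hypothetical helper)."""
    return ast.parse(expr, mode="eval").body

implicit = top_node('"ab" "cd" "ef"')
explicit = top_node('"ab" + "cd" + "ef"')

# Adjacent literals are merged by the parser into one constant node.
print(type(implicit).__name__, repr(implicit.value))

# Explicit + is still a binary operation in the syntax tree; any
# folding into a single constant happens in a later step, if at all.
print(type(explicit).__name__)
```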

Of course, if you're writing "ab" "cd" "ef" "gh" "ij" "kl" instead of 
the more sensible "abcdefghijkl" then you deserve to be smacked, but 
there is a good use for these: mixing quote marks.

Suppose I need to build a string with just one kind of quote mark. Then 
I can use the other quote mark as the delimiter:

s = 'the " double quote mark'

What if I need both kinds? I can use escaping:

s = 'the " double quote mark and the \' single quote mark'

but escaping starts to get ugly. I could use *triple* quotes:

s = """the " double quote mark and the ' single quote mark"""

but what happens if I need to include *triple* quotes of *both* kinds in 
the same string? Whatever you do, things are getting messy.

But, we can do this:

s = 'the " double quote mark ' "and the ' single quote mark"


Whether you consider that an improvement or not is probably a matter of 
your own personal taste.
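As a quick check on the quoting options above, here is a small sketch 
comparing the three spellings. The variable names are invented for the 
example. The last line also covers the messy case of wanting both kinds 
of triple quote in one string, by giving each piece whichever delimiter 
doesn't clash with its contents.

```python
# Three spellings of the same text: escaping, triple quotes, and
# implicit concatenation of two differently quoted literals.
escaped = 'the " double quote mark and the \' single quote mark'
triple = """the " double quote mark and the ' single quote mark"""
implicit = 'the " double quote mark ' "and the ' single quote mark"

print(escaped == triple == implicit)

# Both kinds of triple quote in one string, with no escaping: each
# piece uses the delimiter that does not appear inside it.
both = "'''" ' and ' '"""'
print(both)
```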


These days, I think that most people would agree that implicit 
concatenation is kept for backwards compatibility, and that if Python 
were designed from scratch today, for better or worse it probably 
wouldn't have that feature.



-- 
Steve

