[Tutor] Surprised that print("a" "b") gives "ab"
Steven D'Aprano
steve at pearwood.info
Sun Mar 6 00:24:04 EST 2016
On Sat, Mar 05, 2016 at 10:25:51PM -0600, boB Stepp wrote:
> I stumbled into finding out that
>
> >>> "a" "b"
> 'ab'
>
> or
>
> >>> print("a""b")
> ab
>
> Note that there is not even a space required between the two strings.
This is called "implicit concatenation", as opposed to *explicit*
concatenation, "a" + "b".
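As a rough illustration of the difference (a minimal sketch; the variable names are just placeholders), the two spellings give the same result for plain literals, but implicit concatenation joins the adjacent literals before any operator is applied:

```python
# Both spellings build the same string from two literals.
implicit = "a" "b"
explicit = "a" + "b"
assert implicit == explicit == "ab"

# Adjacent literals are joined before any operator is applied,
# so the two forms differ once another operator is involved.
repeated_implicit = "a" "b" * 2    # same as ("ab") * 2
repeated_explicit = "a" + "b" * 2  # same as "a" + ("b" * 2)

print(repeated_implicit)
print(repeated_explicit)
```

Note that this only works for string literals sitting next to each other; putting a variable name next to a literal is a syntax error.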
You're not alone in disliking this; many others do too, and not just
in Python. For example, it's also controversial in the D language:
https://issues.dlang.org/show_bug.cgi?id=3827
> I find this is surprising to me as "a" "b" is much less readable to me
> than "a" + "b" . And since Python is all about easily readable code,
> why was this feature implemented? Is there a use case where it is
> more desirable to not have a string concatenation operator explicitly
> used?
I don't know that I would agree that Python is ALL about easily readable
code. Surely there are other factors too, even if they are not weighted
as heavily.
There are a few reasons for implicit concatenation. On their own, I
don't think any of them would be enough to justify the feature, but
taken all together, I think they were.
Firstly, I think the feature was copied straight out of C/C++ where it
does have a rationale:
http://stackoverflow.com/questions/2504536/why-allow-concatenation-of-string-literals
In the early days of Python, concatenation of string literals using +
would take place at runtime, which meant that it could be quite
inefficient. Something like this:
    s = "ab" + "cd" + "ef" + "gh" + "ij" + "kl"
would involve making, then destroying, ten temporary strings (plus an
eleventh, the final result for s), for a total runtime cost proportional
to the number of characters SQUARED. Using implicit concatenation:
    s = "ab" "cd" "ef" "gh" "ij" "kl"
meant that the interpreter could do the work more efficiently at
compile-time, generating a single string.
(These days, the standard CPython interpreter can do the same with the +
operator, although other implementations may not be so smart.)
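One way to see this for yourself is sketched below. It assumes CPython 3.8 or later, where the ast module represents string literals as Constant nodes; the "<example>" filename is just a placeholder. Implicit concatenation is already a single string by the time the source has been parsed, while the + version is parsed as an addition that CPython then folds into one constant when it compiles the code:

```python
import ast

# Implicit concatenation: the parser already produces one string constant.
implicit = ast.parse('"ab" "cd" "ef"', mode="eval").body
print(type(implicit).__name__, repr(implicit.value))

# Explicit concatenation: the parser produces a binary "+" operation.
explicit = ast.parse('"ab" + "cd" + "ef"', mode="eval").body
print(type(explicit).__name__)

# CPython folds the "+" of literals at compile time, so the finished
# string appears among the code object's constants.
code = compile('s = "ab" + "cd" + "ef"', "<example>", "exec")
print("abcdef" in code.co_consts)
```

The last check is specific to CPython's optimizer; another implementation is free to do the addition at runtime instead.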
Of course, if you're writing "ab" "cd" "ef" "gh" "ij" "kl" instead of
the more sensible "abcdefghijkl" then you deserve to be smacked, but
there is a good use for these: mixing quote marks.
Suppose I need to build a string with just one kind of quote mark. Then
I can use the other quote mark as the delimiter:
    s = 'the " double quote mark'
What if I need both kinds? I can use escaping:
    s = 'the " double quote mark and the \' single quote mark'
but escaping starts to get ugly. I could use *triple* quotes:
    s = """the " double quote mark and the ' single quote mark"""
but what happens if I need to include *triple* quotes of *both* kinds in
the same string? Whatever you do, things are getting messy.
But, we can do this:
    s = 'the " double quote mark ' "and the ' single quote mark"
Whether you consider that an improvement or not is probably a matter of
your own personal taste.
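As a quick check (a small sketch; the variable names are only for illustration), the escaped, triple-quoted and implicitly concatenated spellings all produce exactly the same string, so the choice between them really is only about how the source reads:

```python
# Three spellings of one string containing both kinds of quote mark.
escaped = 'the " double quote mark and the \' single quote mark'
triple = """the " double quote mark and the ' single quote mark"""
implicit = 'the " double quote mark ' "and the ' single quote mark"

# All three are the same string once the source has been compiled.
print(escaped == triple == implicit)
print(implicit)
```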
These days, I think that most people would agree that implicit
concatenation is kept for backwards compatibility, and that if Python
were designed from scratch today, for better or worse it probably
wouldn't have that feature.
--
Steve