[Python-ideas] Implicit string literal concatenation considered harmful?

Gregory P. Smith greg at krypto.org
Tue May 14 18:36:54 CEST 2013


On Sat, May 11, 2013 at 3:19 PM, Ron Adam <ron3200 at gmail.com> wrote:

> Greg, I meant to send my reply earlier to the list.
>
> On 05/11/2013 12:39 AM, Greg Ewing wrote:
>
>>> Also, doesn't this imply that ... is now an operator in some contexts,
>>> but a literal in others?
>
> Could its use as a literal be deprecated?  I haven't seen it used
> that way except in examples.
>
>> It would have different meanings in different contexts, yes.
>>
>> But I wouldn't think of it as an operator, more as a token
>> indicating string continuation, in the same way that the
>> backslash indicates line continuation.
>
> Yep, it would be a token that the tokenizer would handle.  So it would be
> handled before anything else just as the line continuation '\' is.   After
> the file is tokenized, it is removed and won't interfere with anything else.
>
> It could be limited to strings, or expanded to include numbers and
> possibly other literals.
>
>     a = "a long text line "...
>         "that is continued "...
>         "on several lines."
>
>     pi =  3.1415926535...
>             8979323846...
>             2643383279
>
> You can't do this with a line continuation '\'.
>
>
> Another option would be to have dedented multi-line string tokens |""" and
> |'''.   Not too different than r""" or b""".
>
>     s = |"""Multi line string
>         |
>         |paragraph 1
>         |
>         |paragraph 2
>         |"""
>
>     a = |"""\
>         |a long text line \
>         |that is continued \
>         |on several lines.\
>         |"""
>
> The rule for this is, for strings that start with |""" or |''', each
> following line needs to be preceded by whitespace + '|', until the
> closing quote is reached.  The tokenizer would just find and remove them as
> it comes across them.  Any '|' on a line after the first '|' would be
> unaffected, so they don't need to be escaped.
>
>
+1 to adding something like that.  I loathe code that calls textwrap.dedent
on constants: it wastes memory and adds runtime overhead.
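
To make the objection concrete, here is a minimal sketch of the pattern in
question.  The function name `usage` and the help text are invented for
illustration; only textwrap.dedent itself is real.

```python
import textwrap


def usage():
    # The constant stored with the function keeps all of its leading
    # indentation; dedent() then scans and copies the text on every call.
    return textwrap.dedent("""\
        Usage: prog [options]

          -h  show help
        """)


print(usage())
```

The usual workaround is to hoist the dedent call to module level so it runs
once at import time, but the indented literal is still what gets compiled
into the module.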

I was just writing up a response to suggest adding auto-dedented multi-line
strings to take care of one of the major use cases.  I went with a naive
d""" approach, but I also like your | idea here.  It might, though, tempt too
many people to line up the opening | with the following |s, which
isn't necessary at all and is actively harmful for code style: it forces
tedious reindentation whenever a refactoring changes the length of the
lhs before the opening |""".
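
For anyone who wants to see what the quoted | rule would produce, it can be
roughly emulated at runtime today.  This is only a sketch under assumptions:
`strip_margin` is a hypothetical helper name, not part of the proposal or the
standard library, and raising ValueError for a line without the marker is my
guess at how "needs to be preceded" would be enforced.  The proposal itself
would do this work in the tokenizer, with no runtime cost.

```python
def strip_margin(text):
    """Emulate the proposed |-prefixed string: drop the leading
    whitespace and the first '|' from every line after the first."""
    first, *rest = text.split("\n")
    out = [first]
    for line in rest:
        stripped = line.lstrip(" \t")
        if not stripped.startswith("|"):
            # Assumption: a continuation line without the marker is an error.
            raise ValueError(f"continuation line lacks '|': {line!r}")
        # Only the first '|' is removed; any later '|' is ordinary text.
        out.append(stripped[1:])
    return "\n".join(out)


s = strip_margin("""Multi line string
        |
        |paragraph 1
        |
        |paragraph 2
        |""")
print(s)
```

Note that the result does not depend on how far the | column is indented,
which is why lining it up with the opening quote is unnecessary.  This
emulation cannot reproduce the backslash-continued form from the quoted
example, since backslash-newline pairs are consumed when the literal is
compiled, before the helper ever sees the text.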

-gps


> It's a very explicit syntax. It's very obvious what is part of the string
> and what isn't.  Something like this would end the endless debate on
> dedents.  That alone might be worth it.   ;-)
>
> I know the | is also a binary 'or' operator, but its use for that is in a
> different context, so I don't think it would be a problem.
>
> Both of these options would be implemented in the tokenizer and are really
> just tools for formatting source code rather than actual additions or
> changes to the language.
>
> Cheers,
>    Ron