[Python-Dev] eval and triple quoted strings

Nick Coghlan ncoghlan at gmail.com
Sat Jun 15 07:18:01 CEST 2013


On 15 June 2013 14:08, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>>
>> Not a bug. The same is done for file input -- CRLF is changed to LF before
>> tokenizing.
>
>
> I'm not convinced it's reasonable behaviour to re-scan the
> string as though it's being read from a file. It's a Python
> string, so it's already been through whatever line-ending
> transformation is appropriate to get it into memory.

No, that's not the way the Python compiler works. The transformation
Guido is talking about is the way the tokenizer identifies "NEWLINE"
tokens:

>>> import tokenize
>>> list(tokenize.tokenize((l for l in (b"""'\r\n'""", b"")).__next__))[2]
TokenInfo(type=4 (NEWLINE), string='\r\n', start=(1, 1), end=(1, 3),
line="'\r\n'")

This long predates universal newlines mode - it's part of the
compilation process, not part of the IO system. The compiler then sees
the NEWLINE token in the tokenizer output, and inserts a "\n" into the
triple-quoted string.
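
To see the visible effect from the eval() side, here is a minimal
sketch, assuming CPython 3 (the variable names are only for
illustration). It compares source text that contains a real CR LF pair
inside a triple-quoted literal with source text that spells the same
pair as backslash escapes:

```python
# Source text whose triple-quoted literal contains a real CR LF pair.
# The non-raw outer literal turns \r\n into the two actual characters.
source_with_crlf = "'''a\r\nb'''"
value_from_crlf = eval(source_with_crlf)

# Same literal, but the raw outer literal keeps the backslashes, so the
# source text contains escape sequences rather than a line ending.
source_with_escapes = r"'''a\r\nb'''"
value_from_escapes = eval(source_with_escapes)

print(repr(value_from_crlf))     # line ending normalised to a single "\n"
print(repr(value_from_escapes))  # escapes still produce "\r\n"
```

Only the first case goes through line-ending handling, since only there
does the source text itself contain a line ending. If you need a literal
CR in the result, spell it as an escape in the source.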

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

