tokenize.untokenize adding line continuation characters

Rotwang sg552 at hotmail.co.uk
Mon Jan 16 17:42:43 EST 2017


Here's something odd I've found with the tokenize module: tokenizing 'if x:\n    y' and then untokenizing the result adds '\\\n' to the end. Attempting to tokenize the result again fails because of the backslash continuation with nothing other than a newline after it. On the other hand, if the original string ends with a newline then it works fine. Can anyone explain why this happens?

I'm using Python 3.4.3 on Windows 8. Copy-pasted from IPython:


import tokenize, io

tuple(tokenize.tokenize(io.BytesIO('if x:\n    y'.encode()).readline))
Out[2]: 
(TokenInfo(type=56 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), line=''),
 TokenInfo(type=1 (NAME), string='if', start=(1, 0), end=(1, 2), line='if x:\n'),
 TokenInfo(type=1 (NAME), string='x', start=(1, 3), end=(1, 4), line='if x:\n'),
 TokenInfo(type=52 (OP), string=':', start=(1, 4), end=(1, 5), line='if x:\n'),
 TokenInfo(type=4 (NEWLINE), string='\n', start=(1, 5), end=(1, 6), line='if x:\n'),
 TokenInfo(type=5 (INDENT), string='    ', start=(2, 0), end=(2, 4), line='    y'),
 TokenInfo(type=1 (NAME), string='y', start=(2, 4), end=(2, 5), line='    y'),
 TokenInfo(type=6 (DEDENT), string='', start=(3, 0), end=(3, 0), line=''),
 TokenInfo(type=0 (ENDMARKER), string='', start=(3, 0), end=(3, 0), line=''))

tokenize.untokenize(_).decode()
Out[3]: 'if x:\n    y\\\n'

tuple(tokenize.tokenize(io.BytesIO(_.encode()).readline))
---------------------------------------------------------------------------
TokenError                                Traceback (most recent call last)
<ipython-input-4-6bd8f83c1114> in <module>()
----> 1 tuple(tokenize.tokenize(io.BytesIO(_.encode()).readline))

C:\Program Files\Python34\lib\tokenize.py in _tokenize(readline, encoding)
    558         else:                                  # continued statement
    559             if not line:
--> 560                 raise TokenError("EOF in multi-line statement", (lnum, 0))
    561             continued = 0
    562 

TokenError: ('EOF in multi-line statement', (3, 0))
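
For comparison, here is a minimal sketch of the trailing-newline case mentioned above, written as a standalone script rather than an IPython session. The helper names roundtrip and ensure_trailing_newline are made up for illustration and are not part of the tokenize module. The sketch assumes that appending a newline before tokenizing is acceptable for the source in question.

```python
import io
import tokenize


def tokenize_text(source):
    """Return the list of tokens for a source string (hypothetical helper)."""
    return list(tokenize.tokenize(io.BytesIO(source.encode("utf-8")).readline))


def roundtrip(source):
    """Tokenize source, untokenize the tokens, and return the resulting text."""
    # untokenize returns bytes here because the first token is ENCODING.
    return tokenize.untokenize(tokenize_text(source)).decode("utf-8")


def ensure_trailing_newline(source):
    """Hypothetical helper: append a newline if source does not end with one."""
    return source if source.endswith("\n") else source + "\n"


source = ensure_trailing_newline("if x:\n    y")
result = roundtrip(source)
print(repr(result))

# Tokenizing the round-tripped text again should not raise TokenError,
# and should give the same token types and strings as the first pass.
first_pass = [(tok.type, tok.string) for tok in tokenize_text(source)]
second_pass = [(tok.type, tok.string) for tok in tokenize_text(result)]
print(first_pass == second_pass)
```

This only sidesteps the problem by normalizing the input; it does not explain why untokenize emits the backslash continuation when the final newline is missing.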


