Not found in the documentation

Thu Apr 29 02:54:43 EDT 2021

On Wednesday, 28 April 2021 at 17:36:32 UTC+2, Chris Angelico wrote:

> In what sense of the word "token" are you asking? The parser? You can 
> play around with the low-level tokenizer with the aptly-named 
> tokenizer module. 

That was a good suggestion. The PLR (Python Language Reference) doesn't mention the tokenize module; it should, since that would go very well with the conversational style it has.

# --------------
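from io import BytesIO
from tokenize import tokenize

# Note: inside this non-raw string literal the backslash-newline is consumed
# by Python itself, so the tokenizer receives the single line '42 not in [42]'.
s = """42 not\
 in [42]"""
# tokenize() expects a readline callable that returns bytes.
g = tokenize(BytesIO(s.encode('utf-8')).readline)
print(*g, sep='\n')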
# --------------

outputting:

......................................
TokenInfo(type=57 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), line='')
TokenInfo(type=2 (NUMBER), string='42', start=(1, 0), end=(1, 2), line='42 not in [42]')
TokenInfo(type=1 (NAME), string='not', start=(1, 3), end=(1, 6), line='42 not in [42]')
TokenInfo(type=1 (NAME), string='in', start=(1, 7), end=(1, 9), line='42 not in [42]')
TokenInfo(type=53 (OP), string='[', start=(1, 10), end=(1, 11), line='42 not in [42]')
TokenInfo(type=2 (NUMBER), string='42', start=(1, 11), end=(1, 13), line='42 not in [42]')
TokenInfo(type=53 (OP), string=']', start=(1, 13), end=(1, 14), line='42 not in [42]')
TokenInfo(type=4 (NEWLINE), string='', start=(1, 14), end=(1, 15), line='')
TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')
......................................
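
One caveat about the test above: the backslash sits inside an ordinary (non-raw) triple-quoted string, so Python drops the backslash-newline pair before the tokenizer runs. That is why the line field shows '42 not in [42]' on a single line. Below is a minimal sketch of a variant in which the tokenizer sees the continuation itself, assuming a raw string is an acceptable way to write the test source. The helper name token_strings is my own, purely for illustration, and is not part of the tokenize module.

```python
from io import BytesIO
from token import tok_name
from tokenize import tokenize

# Raw string: both the backslash and the newline reach the tokenizer.
src = r"""42 not\
 in [42]"""


def token_strings(source):
    """Return (type name, string) pairs for every token in source.

    Hypothetical helper, for illustration only.
    """
    readline = BytesIO(source.encode('utf-8')).readline
    return [(tok_name[tok.type], tok.string) for tok in tokenize(readline)]


pairs = token_strings(src)
# Keep only the NAME tokens to see how "not in" was split.
names = [string for kind, string in pairs if kind == 'NAME']
print(*pairs, sep='\n')
print(names)

# Check that the compiler accepts a continuation between the two keywords.
result = eval(src)
print(result)
```

With the continuation really present in the source, "not" and "in" should still come out as two separate NAME tokens, and the expression should evaluate without a syntax error.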

So "not in" is not a single token (the tokenizer emits "not" and "in" as two separate NAME tokens), and the docs are wrong when they say:

......................................
using a backslash). A backslash is illegal elsewhere on a line outside a string literal.
......................................