[Tutor] tokenize problem on string literal?

C or L Smith smiles at worksmail.net
Mon Oct 26 05:56:24 CET 2009


C or L Smith wrote:
> Am I misunderstanding a tokenize parse rule or is this an error:
> 
> ###
> def tok(s):
>     import tokenize
>     from StringIO import StringIO
>     t = StringIO(s).readline
>     for ti in tokenize.generate_tokens(t):
>         print ti
> tok("'''quote: \''''")
> ###
> 

Note to self: you are misunderstanding what sort of string the above creates.
Since this isn't a raw string, the \' just becomes a literal quote, so the
string handed to tokenize looks like this:

'''quote: ''''

which is what tokenize is telling you.
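
A quick way to convince yourself is to compare the literal against the string
it collapses to. This is only a sketch, written for Python 3 (an assumption;
the code above is Python 2), and the variable names are just for illustration:

```python
# The literal exactly as it was passed to tok(): not a raw string.
s = "'''quote: \''''"

# What tokenize actually receives: the backslash is gone.
collapsed = "'''quote: ''''"

# The same text written as a raw string keeps the backslash.
raw = r"'''quote: \''''"

print(s == collapsed)    # True
print("\\" in s)         # False
print("\\" in raw)       # True
print(len(s), len(raw))  # 14 15
```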

You have to be careful when using the editor to create examples. When each of
the lines below is typed into the editor, foo is colored as shown (yellow for
string, black for non-string):

color of foo    text entered
------------    ------------
yellow          '''afoo
yellow          '''a\'''foo
black           '''a\''''foo
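
The coloring in the table can be cross-checked with tokenize itself. This is a
sketch for Python 3 (an assumption; the quoted code is Python 2), and
foo_is_outside_string is a hypothetical helper name. The first two lines never
close their triple-quoted string, which tokenize reports as an error; the exact
exception and message seem to vary between Python versions, so the helper just
treats any tokenizing error as "foo is still inside the string":

```python
import io
import tokenize

def foo_is_outside_string(source):
    """Return True if tokenize sees foo as a NAME token (black in the table)."""
    readline = io.StringIO(source).readline
    try:
        for tok in tokenize.generate_tokens(readline):
            if tok.type == tokenize.NAME and tok.string == "foo":
                return True
    except (tokenize.TokenError, SyntaxError):
        # For these inputs this means the string was never closed,
        # so foo was swallowed by the string (yellow in the table).
        pass
    return False

# Raw strings so the backslashes reach tokenize unchanged.
results = [foo_is_outside_string(line)
           for line in (r"'''afoo", r"'''a\'''foo", r"'''a\''''foo")]
print(results)  # [False, False, True]
```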

If you want to quote that now for testing, make it a raw string:

>>> r"'''a\''''foo"
"'''a\\''''foo"

The raw string, or equivalently the *bottom* form with the doubled backslash,
is what you should be using to test your quote matcher or tokenize's behavior.
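
For instance, the original test can be rerun with a raw string so that tokenize
actually sees the backslash. A minimal sketch, assuming Python 3 (io.StringIO
in place of the StringIO module); token_names is a hypothetical helper that
returns the tokens instead of printing them:

```python
import io
import tokenize

def token_names(source):
    """Return (type name, text) pairs for every token in source."""
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]

# Raw string: the backslash survives, so the escaped quote stays inside
# the triple-quoted literal and the literal is closed properly.
tokens = token_names(r"'''quote: \''''")
print(tokens[0])
```

With the raw string the first token should be a single STRING covering the
whole literal, which is the case the original test was meant to exercise.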

sigh :-)
/c


