[Python-Dev] Indentation oddness...

Sat May 30 02:08:46 CEST 2009

Consider the code:

code = "def  Foo():\n\n    pass\n\n  "

This code is malformed in that the final indentation (2 spaces) does not agree with the previous indentation of the pass statement (4 spaces).  Or maybe it's just fine, if you take the docs' statement that blank lines should be ignored to be true.  So let's look at the different ways I can consume this code.

If I use compile to compile this:

compile(code, 'foo', 'single')

I get an IndentationError: unindent does not match any outer indentation level

But if I put this in a file:

f = open('indenttest.py', 'w')
f.write(code)
f.close()
import indenttest

It imports just fine.
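For anyone who wants to repeat the import experiment without leaving indenttest.py lying around, here is a rough sketch. The helper name try_import and the module name indenttest_sketch are made up for illustration, and it is written for a current Python 3 rather than the interpreter I used above, so the outcome may well differ between versions. It only reports whether the import raised.

```python
import importlib
import os
import sys
import tempfile

code = "def  Foo():\n\n    pass\n\n  "

def try_import(source, name='indenttest_sketch'):
    """Write source to a temporary module and import it.

    Returns 'ok' or the name of the SyntaxError subclass raised.
    The helper and module names are hypothetical, for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, name + '.py')
        with open(path, 'w') as f:
            f.write(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module(name)
        except SyntaxError as exc:  # IndentationError is a subclass
            return type(exc).__name__
        finally:
            # Undo the side effects so repeated runs start clean.
            sys.path.remove(tmp)
            sys.modules.pop(name, None)
        return 'ok'

import_result = try_import(code)
print(import_result)
```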

If I run it through the tokenize module it also tokenizes just fine:

>>> import tokenize
>>> from cStringIO import StringIO
>>> tokenize.tokenize(StringIO(code).readline)
1,0-1,3:        NAME    'def'
1,5-1,8:        NAME    'Foo'
1,8-1,9:        OP      '('
1,9-1,10:       OP      ')'
1,10-1,11:      OP      ':'
1,11-1,12:      NEWLINE '\n'
2,0-2,1:        NL      '\n'
3,0-3,4:        INDENT  '    '
3,4-3,8:        NAME    'pass'
3,8-3,9:        NEWLINE '\n'
4,0-4,1:        NL      '\n'
5,0-5,0:        DEDENT  ''
5,0-5,0:        ENDMARKER       ''

And if it fails anywhere, it would seem tokenization is where it should fail, especially given that tokenize.py seems to report this error on other occasions.
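The session above uses the old tokenize.tokenize(readline) form and cStringIO. A rough equivalent for a current Python 3, assuming io.StringIO and tokenize.generate_tokens as stand-ins, might look like the sketch below. The helper token_names is hypothetical. Newer tokenizers may or may not accept the trailing two spaces, so the sketch collects whatever error is raised instead of assuming one outcome.

```python
import io
import tokenize

code = "def  Foo():\n\n    pass\n\n  "

def token_names(source):
    """Return (names, error).

    names is the list of token type names seen before any failure;
    error is the exception raised by the tokenizer, or None.
    """
    names = []
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            names.append(tokenize.tok_name[tok.type])
    except (SyntaxError, tokenize.TokenError) as exc:
        return names, exc
    return names, None

names, error = token_names(code)
print(names)
print(error)
```

If error comes back as None and the list ends in DEDENT and ENDMARKER, the tokenizer treated the trailing whitespace-only line the same way as in the output above.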

And stranger still, if I add a trailing newline then it will even compile fine:

compile(code + '\n', 'foo', 'single')

Which seems strange because in either case all of the trailing lines are blank and, as such, should basically be ignored according to the documentation.
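To put the two compile calls side by side, a small sketch like the following tabulates what compile does with and without the trailing newline. The helper try_compile is made up for illustration, and the 'exec' rows are my addition, on the assumption that 'exec' is the mode closest to what import uses. I am not claiming any particular result here, since it may depend on the Python version.

```python
code = "def  Foo():\n\n    pass\n\n  "

def try_compile(source, mode):
    """Return 'ok' or the name of the exception compile() raised."""
    try:
        compile(source, 'foo', mode)
    except SyntaxError as exc:  # IndentationError is a subclass
        return type(exc).__name__
    return 'ok'

results = {}
for label, source in (('as given', code), ('plus newline', code + '\n')):
    for mode in ('single', 'exec'):
        results[(label, mode)] = try_compile(source, mode)

for key in sorted(results):
    print(key, results[key])
```

Any row where 'as given' and 'plus newline' disagree for the same mode reproduces the inconsistency described above.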

Is there some strange reason why compile rejects what everything else agrees is perfectly valid code?