[issue10114] compile() doesn't support the PEP 383 (surrogates)

STINNER Victor report at bugs.python.org
Sat Oct 16 01:17:37 CEST 2010


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

> I do not see what filesystem encodings, or any other encoding 
> to bytes should really have to do with the [code.co_filename].

The co_filename attribute is used to display the traceback: Python opens the related file, reads the source code line and displays it. On Windows, co_filename is used directly because Windows accepts Unicode filenames. On other OSes, the filename has to be encoded to the filesystem encoding.
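
To make the lookup concrete, here is a minimal sketch of how a source line can be recovered from co_filename. It uses linecache, which is what the pure-Python traceback module relies on; the interpreter's built-in traceback printing may read the file differently. The file name example.py and its contents are made up for the illustration.

```python
import linecache
import os
import tempfile

# Hypothetical script: the second line fails with a NameError.
SOURCE = "x = 1\ny = undefined_name\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write(SOURCE)

    # The filename given to compile() is stored in code.co_filename.
    code = compile(SOURCE, path, "exec")

    try:
        exec(code, {})
    except NameError as exc:
        # Walk to the innermost traceback entry (the compiled script).
        tb = exc.__traceback__
        while tb.tb_next is not None:
            tb = tb.tb_next
        filename = tb.tb_frame.f_code.co_filename
        lineno = tb.tb_lineno
        # The file has to be opened by name to fetch the line to display.
        source_line = linecache.getline(filename, lineno).strip()

print(filename == path)   # True
print(source_line)        # y = undefined_name
```

If the file cannot be opened under that name, linecache.getline() returns an empty string and the traceback simply lacks the source line.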

If your filesystem encoding is 'ascii' (e.g. C locale) and co_filename is a non-ASCII filename (e.g. 'test_é.py'), encoding co_filename will raise a UnicodeEncodeError. You can test this simply with os.fsencode():

$ ./python 
Python 3.2a3+ (py3k:85551:85553M, Oct 16 2010, 00:54:03) 
>>> import sys; sys.getfilesystemencoding()
'utf-8'
>>> import os; os.fsencode('é')
b'\xc3\xa9'

$ LANG= ./python 
Python 3.2a3+ (py3k:85551:85553M, Oct 16 2010, 00:54:03) 
>>> import sys; sys.getfilesystemencoding()
'ascii'
>>> import os; os.fsencode('\xe9')
...
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' ...

In other words, co_filename should be encodable to the filesystem encoding: os.fsencode(co_filename) should not raise an error.
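
This condition can be sketched as a small check. The helper name is_fs_encodable is hypothetical, not an existing API. It assumes the filename is encoded with the surrogateescape error handler from PEP 383, and it takes the encoding as a parameter so that the 'ascii' case can be reproduced without changing the locale.

```python
import sys


def is_fs_encodable(filename, encoding=None):
    """Return True if filename can be encoded to the given encoding.

    Hypothetical helper. When encoding is None, the current filesystem
    encoding is used. The surrogateescape handler is an assumption
    based on PEP 383: it lets lone surrogates (U+DC80..U+DCFF) encode
    back to the undecodable bytes they stand for.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        filename.encode(encoding, "surrogateescape")
    except UnicodeEncodeError:
        return False
    return True


code = compile("x = 1", "test_\xe9.py", "exec")

print(is_fs_encodable(code.co_filename, "utf-8"))  # True
print(is_fs_encodable(code.co_filename, "ascii"))  # False
# A surrogate-escaped byte encodes even with 'ascii':
print(is_fs_encodable("test_\udce9.py", "ascii"))  # True
```

The last call shows why surrogates matter here: a filename decoded from undecodable bytes still round-trips through an 'ascii' filesystem encoding, while a real 'é' does not.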

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10114>
_______________________________________

