[Python-Dev] Use surrogates for Python import

Victor Stinner victor.stinner at haypocalc.com
Wed Apr 14 02:49:00 CEST 2010


Hi,

Python 3 refuses to start with LANG=C if it is installed in a non-ASCII 
directory. I wrote a patch that fixes this issue, but it changes a lot of code 
and I would like your opinion.

The main part changes import to use surrogateescape everywhere (find_module, 
load_source_module, null importer, zip importer, etc.). Other files have to be 
adapted:

 - traceback.c (tb_printinternal),
 - ast.c (ast_error_finish),
 - compile.c (compiler_error),
 - bltinmodule.c (builtin_compile),
 - _warnings.c (show_warning),
 - tokenizer.c (fp_setreadl),
 - ...

It also fixes calculate_path(): sys.path may also contain directories using 
surrogates.
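
For readers unfamiliar with the error handler, here is a minimal sketch of 
what surrogateescape (PEP 383) does to a path that cannot be decoded as ASCII. 
The directory name is made up for illustration, and this is not code from the 
patch:

```python
# Bytes of a non-ASCII directory name, as the OS would hand them to Python.
# The path itself is a hypothetical example.
raw = "/opt/pyth\u00f6n/lib".encode("utf-8")

# With LANG=C the locale encoding is ASCII, so strict decoding fails.
try:
    raw.decode("ascii")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN
# instead of raising an error.
path = raw.decode("ascii", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = path.encode("ascii", "surrogateescape")

print(strict_ok)
print(ascii(path))
print(restored == raw)
```

The decoded string contains lone surrogates, so it cannot be encoded to UTF-8 
in strict mode. That is why the code that prints or re-encodes filenames 
(tracebacks, warnings, compiler errors) has to be adapted as well.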

Should I continue to work on this, or is it a bad idea to use surrogates in 
filenames used in Python imports?

--

Related issue: #8242. The patch attached to this issue is a work in progress; 
I plan to clean it up and split it into small patches. It contains extra 
changes to fix many modules and tests when the current directory is not 
decodable.

-- 
Victor Stinner
http://www.haypocalc.com/

