[issue36202] Calling Py_DecodeLocale() before _PyPreConfig_Write() can produce mojibake

STINNER Victor report at bugs.python.org
Tue Mar 5 20:22:29 EST 2019


STINNER Victor <vstinner at redhat.com> added the comment:

The "vim" editor embeds Python. It sets the Python home by calling Py_SetPythonHome() with the following code:
---
	    size_t len = mbstowcs(NULL, (char *)p_py3home, 0) + 1;

	    /* The string must not change later, make a copy in static memory. */
	    py_home_buf = (wchar_t *)alloc(len * sizeof(wchar_t));
	    if (py_home_buf != NULL && mbstowcs(
			    py_home_buf, (char *)p_py3home, len) != (size_t)-1)
		Py_SetPythonHome(py_home_buf);
---
ref: https://github.com/vim/vim/blob/14816ad6e58336773443f5ee2e4aa9e384af65d2/src/if_python3.c#L874-L887

mbstowcs() uses the current LC_CTYPE locale. Python can select a filesystem encoding that differs from the LC_CTYPE encoding, depending on PEP 538 and PEP 540. So when Python encodes the Python home back to bytes to access files on the filesystem, the result can differ from the original bytes (mojibake) and the file access can fail.
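
To illustrate the locale dependency, here is a minimal sketch in plain C, independent of both vim and CPython. The helper names decode_in_locale() and copy_string() are hypothetical and exist only for this example. It assumes that mbstowcs() accepts a NULL destination to compute the required length, which is the same assumption the vim code above makes.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: duplicate a string using only standard C. */
static char *
copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}

/* Hypothetical helper (not part of vim or CPython): decode `bytes` with
 * mbstowcs() while LC_CTYPE is temporarily set to `locale_name`, then
 * restore the previous LC_CTYPE locale.
 *
 * Returns a malloc()ed wide string, or NULL if the locale is unavailable,
 * the bytes cannot be decoded in that locale, or memory runs out. */
static wchar_t *
decode_in_locale(const char *locale_name, const char *bytes)
{
    /* setlocale() may reuse its return buffer, so keep a private copy. */
    const char *current = setlocale(LC_CTYPE, NULL);
    char *saved = (current != NULL) ? copy_string(current) : NULL;
    if (saved == NULL) {
        return NULL;
    }

    wchar_t *result = NULL;
    if (setlocale(LC_CTYPE, locale_name) != NULL) {
        /* First pass: count wide characters (assumes NULL is accepted). */
        size_t len = mbstowcs(NULL, bytes, 0);
        if (len != (size_t)-1) {
            result = malloc((len + 1) * sizeof(wchar_t));
            if (result != NULL) {
                /* Second pass: convert, including the terminating null. */
                size_t written = mbstowcs(result, bytes, len + 1);
                assert(written == len);
                (void)written;
            }
        }
    }

    setlocale(LC_CTYPE, saved);
    free(saved);
    return result;
}
```

Calling decode_in_locale() on the same non-ASCII bytes (for example a UTF-8 encoded path) with "C" and then with a UTF-8 locale installed on the system can fail in one case or yield different wide characters, depending on the platform. Pure ASCII paths decode identically, which is why the problem is easy to miss.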

The code should be written like this (pseudo-code):
---
_Py_PreInitialize();

_PyCoreConfig config;
config.home = Py_DecodeLocale(p_py3home, NULL);  /* second argument: optional size_t *size output */
if (config.home == NULL) { /* ERROR */ }

_PyInitError err = _Py_InitializeFromConfig(&config);
if (_Py_INIT_FAILED(err)) {
    _PyCoreConfig_Clear(&config);
    _Py_ExitInitError(err);
}
---

The vim case has been discussed at:
https://discuss.python.org/t/adding-char-based-apis-for-unix/916/8

----------
nosy: +inada.naoki

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36202>
_______________________________________

