[Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

Michael Urman murman at gmail.com
Tue May 10 15:34:38 CEST 2011


On Tue, May 10, 2011 at 03:03, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> If GetProcAddress() expects a byte string encoded to the ANSI code page,
> my patch is correct because the function used the UTF-8 encoding, not
> the ANSI code page. We can maybe use GetProcAddressW() to pass a Unicode
> string. I don't know which encoding is used by GetProcAddressW()...

While I can find references to a GetProcAddressW, most of them seem to
agree it doesn't exist. "My kernel32.dll only exports GetProcAddress."
This suggests to me it accepts a null-terminated bytestring instead of
specifically an ANSI string. What data ends up in the export table is
likely similar to the linux filesystem case, only with less likelihood
of the environment telling you its encoding.

> I already patched _PyImport_GetDynLoadFunc() for Windows: the path is
> now a Unicode object instead of a byte string encoded to the filesystem
> encoding. _PyImport_GetDynLoadWindows() uses GetFullPathNameW() and
> LoadLibraryExW(). The work to be fully Unicode compliant (for the path
> field, not for the name) is not completly done... but I have a pending
> patch, see:
> http://bugs.python.org/issue11619
>
> But this patch is huge and creates many functions. I am not sure that we
> need it, I will work on this later.

I'm comfortable with the idea of requiring UTF-8 encoding for the
initmodule entry points of modules named with non-ASCII identifiers,
especially if there is nothing which works consistently today. I've
only seen pure-ASCII library names in all my C++ work, so I feel it
borders on YAGNI (but I like it in theory).

As an alternate approach, one article I read suggested to use ordinals
instead of names if you wanted to use non-ASCII names. Python could
certainly try to load by ordinal on Windows, and fall back to loading
by name. I don't have a clue what the rate of false positives would
be.

-- 
Michael Urman


More information about the Python-Dev mailing list