[Python-Dev] Support for "wide" Unicode characters
Guido van Rossum
guido@digicool.com
Mon, 02 Jul 2001 11:29:39 -0400
> Greg Ewing wrote:
> >
> > > It so happened that the Unicode support was written to make it very
> > > easy to change the compile-time code unit size
> >
> > What about extension modules that deal with Unicode strings?
> > Will they have to be recompiled too? If so, is there anything
> > to detect an attempt to import an extension module with an
> > incompatible Unicode character width?
>
> That's a good question !
>
> The answer is: yes, extensions which use Unicode will have to
> be recompiled for narrow and wide builds of Python. The question
> is however, how to detect cases where the user imports an
> extension built for narrow Python into a wide build and
> vice versa.
>
> The standard way of looking at the API level won't help. We'd
> need some form of introspection API at the C level... hmm,
> perhaps looking at the sys module will do the trick for us ?!
>
> In any case, this is certainly going to cause trouble one
> of these days...

Here are some alternative ways to deal with this:

(1) Use the preprocessor to rename all the Unicode APIs to get "Wide"
    appended to their name in wide mode. This makes any use of a
    Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE
    fail with a link-time error. (Which should cause an ImportError
    for shared libraries.)

(2) Ditto but only rename the PyModule_Init function. This is much
    less work but more coarse: a module that doesn't use any Unicode
    APIs (and I expect these will be a large majority) still would not
    be accepted.

(3) Change the interpretation of PYTHON_API_VERSION so that a low bit
    of '1' means wide Unicode. Then you only get a warning (followed
    by a core dump when actually trying to use Unicode).

I mentioned (1) and (3) in an earlier post.

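
To make (1) and (3) concrete, here is a rough sketch in plain C. It does not use the real Python headers: DEMO_UNICODE_SIZE, demo_unicode, DemoUnicode_Length, DEMO_API_BASE and the helper functions are all made-up stand-ins, and the way the version number is packed is only one possible reading of (3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Py_UNICODE_SIZE: 2 = narrow, 4 = wide. */
#ifndef DEMO_UNICODE_SIZE
#define DEMO_UNICODE_SIZE 4
#endif

/* Two-step stringification so the argument is macro-expanded first. */
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

/* Option (1): in wide mode the preprocessor appends "Wide" to the API
   name, so narrow and wide builds export different linker symbols. */
#if DEMO_UNICODE_SIZE == 4
typedef uint32_t demo_unicode;
#define DemoUnicode_Length DemoUnicode_LengthWide
#else
typedef uint16_t demo_unicode;
#endif

/* Compiled as DemoUnicode_LengthWide in a wide build and as
   DemoUnicode_Length in a narrow one. An extension built for the other
   width would reference a symbol that this build does not define. */
size_t DemoUnicode_Length(const demo_unicode *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0)
        n++;
    return n;
}

/* Returns the symbol name the linker actually sees for the call above. */
const char *demo_linker_symbol(void)
{
    return DEMO_STR(DemoUnicode_Length);
}

/* Option (3): reserve the low bit of the API version for "wide".
   DEMO_API_BASE is an arbitrary illustrative number. */
#define DEMO_API_BASE 1010
#define DEMO_API_VERSION \
    ((DEMO_API_BASE << 1) | (DEMO_UNICODE_SIZE == 4 ? 1 : 0))

/* Nonzero if the given version number was produced by a wide build. */
int demo_api_is_wide(int version)
{
    return version & 1;
}

/* Nonzero if an extension's version matches the interpreter's. With
   option (3) a mismatch would only trigger a warning at import time. */
int demo_api_matches(int extension_version, int interpreter_version)
{
    return extension_version == interpreter_version;
}
```

With (1) the mismatch is caught by the linker before any code runs, whereas with (3) the check is a runtime comparison that the importing side is free to ignore.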
--Guido van Rossum (home page: http://www.python.org/~guido/)