[Cython] Py_UNICODE* string support

Stefan Behnel stefan_ml at behnel.de
Mon Mar 4 18:58:34 CET 2013


Nikita Nemkin, 04.03.2013 18:39:
> On Mon, 04 Mar 2013 01:56:59 +0600, Stefan Behnel wrote:
>> As one little nit-pick, may I ask you to rename the new name references
>> to "unicode" into "py_unicode" in your code? For example, "is_unicode",
>> "get_unicode_const", "unicode_const_index", etc. Given that Py_UNICODE is
>> no longer the native equivalent of Python's unicode type in Py3.3, I'd
>> like to avoid confusion in the code. The name "unicode" is much more
>> likely to refer to the builtin Python type than to a native C type when
>> it appears in Cython's sources.
> 
> Actually, "py_unicode" is even more likely to be mistaken for Python-level
> unicode. There are already pairs of methods like
> get_string_const (C-level) + get_py_string_const (Py-level).

Agreed.


> I suggest one of "py_unicode_ptr", "py_unicode_str", "wstring", "wide_string",
> "ustring", "unicode_string" to unambiguously refer to Py_UNICODE* variables
> and constants. Take your pick.

I think "pyunicode_ptr" or even just "pyunicode" makes it quite clear what
it's about and specifically that "pyunicode" is actually a type name, not a
"py_something". Even "pyunicode_array" would work, although it might
suggest that we know more at compile time than we do, such as the length.

I'll let you choose between these three, although I slightly prefer them
in the order given above.


>> Oh, and yet another thing: could you write up some documentation for this
>> in docs/src/tutorial/strings.rst ? Basically a Windows/wchar_t related
>> section, that also warns about the inefficiency in Py3.3, so that users
>> don't accidentally assume it's efficient for anything that needs to be
>> portable.
> 
> Sure, I'm writing the docs now.

Nice.
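
For the docs, the Py3.3 inefficiency can be made concrete with a minimal
sketch, assuming CPython 3.3 or later: the interpreter stores each string
at 1, 2 or 4 bytes per character depending on its widest code point, while
wchar_t has one fixed width per platform, so handing out a Py_UNICODE*
generally means building a converted copy. The variable names below are
illustrative only and do not come from the patch.

```python
import ctypes
import sys

# Three strings of equal length whose widest code point differs.
ascii_text = "a" * 1000             # all code points below 128
bmp_text = "\u20ac" * 1000          # EURO SIGN, needs 2 bytes per character
astral_text = "\U0001F600" * 1000   # outside the BMP, needs 4 bytes per character

# Under the flexible string representation, the object size grows with
# the widest code point in the string, not only with its length.
sizes = [sys.getsizeof(s) for s in (ascii_text, bmp_text, astral_text)]

# wchar_t has a single fixed width per platform (commonly 2 bytes on
# Windows and 4 bytes on most Unix-like systems).
wchar_size = ctypes.sizeof(ctypes.c_wchar)

print("string sizes in bytes:", sizes)
print("sizeof(wchar_t):", wchar_size)
```

The absolute numbers depend on the interpreter build, so only their
ordering is meaningful here.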

Stefan


