[Pythonmac-SIG] CF module oddity
Ronald Oussoren
oussoren@cistron.nl
Tue May 6 21:06:43 EDT 2003
On Tuesday, May 6, 2003, at 21:50 Europe/Amsterdam, Jack Jansen wrote:
>
> On Tuesday, May 6, 2003, at 18:27 Europe/Amsterdam, Ronald Oussoren
> wrote:
>>> CFStringCreateWithCharacters expects a unicode string. The Python
>>> format specifier for unicode strings accepts "normal" strings, and
>>> interprets them as a binary data stream containing UTF-16 unicode
>>> data.
>>
>> Very useful :-( Is this documented anywhere? The documentation in
>> the section "Extracting Parameters in Extension Functions" of
>> "Extending and Embedding the Python Interpreter" does not mention
>> this misfeature.
>
> I'm not sure whether it's documented. If it isn't, please file a bug
> report.
>
> And, about this being a misfeature: in some cases it definitely is,
> but in others it's definitely a feature. It really depends on whether
> you want to just pass raw data through (in which case it's a feature)
> or whether the data is interpreted (think filenames and such), in
> which case you'd much rather have the 8-bit string converted to
> unicode with the current default encoding.
The reason I think this is a misfeature is that it behaves completely
differently from the default unicode conversion:
    unicode(val) is equivalent to val.decode('ascii')
    PyArg_Parse("u", ...) is equivalent to val.decode('utf-16')
(both if isinstance(val, str)).
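To make the difference concrete, here is a minimal sketch of the two decodings applied to the same bytes. It is written for a current Python 3 interpreter, where the 8-bit str discussed here corresponds to bytes. The sample value is made up for illustration, and the little-endian byte order is an assumption (plain 'utf-16' without a BOM falls back to the platform's native order).

```python
# Sketch only: the 8-bit str of this thread is modelled as Python 3 bytes.
val = b"abcd"  # hypothetical sample value

# Default-style conversion: every byte becomes one character.
as_ascii = val.decode("ascii")

# Raw UTF-16 interpretation: every pair of bytes becomes one character.
# Byte order is assumed to be little-endian here.
as_utf16 = val.decode("utf-16-le")

print(len(as_ascii), repr(as_ascii))                   # 4 'abcd'
print(len(as_utf16), [hex(ord(c)) for c in as_utf16])  # 2 ['0x6261', '0x6463']
```

The same four bytes give four ASCII characters in one case and two unrelated characters in the other, which is why the text handed to CFStringCreateWithCharacters looks wrong.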
I'll file a bug report.
Ronald