[Python-Dev] unicode/string asymmetries
M.-A. Lemburg
mal@lemburg.com
Thu, 10 Jan 2002 09:49:32 +0100
Jack Jansen wrote:
>
> Recently, "M.-A. Lemburg" <mal@lemburg.com> said:
> > How about this: we add a wchar_t codec to Python and the "eu#" parser
> > marker. Then you could write:
> >
> > wchar_t *value = NULL;
> > int len = 0;
> > if (!PyArg_ParseTuple(tuple, "eu#", "wchar_t", &value, &len))
> >     return NULL;
>
> I like it! Even though I have to do the memory management myself (and
> have to think of the error case) it at least looks reasonable.
Good :-)
> I'm
> assuming here that if I pass a StringObject it will be unicode-encoded
> using the default encoding, and that unicode value will then be
> converted to wchar_t and put in value, right? Or, in other words,
> passing "a.out" will do the same as passing u"a.out"...
Yes.
> One minor misgiving is that this call will *always* copy the string,
> even if the internal coding of unicode objects is wchar_t. That's a
> bit of a nuisance, but we can try to fix that later.
Copying will always take place (either into a preallocated buffer
or into one which the PyArg_ParseTuple() API allocates), but
that's the price you pay for the simplicity of the approach.
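
The two buffer modes can be sketched in plain C, outside the Python
C API. The helper copy_to_wide() below is hypothetical: it only mirrors
the way the existing "es#" marker treats its buffer argument, and it
assumes the proposed "eu#" marker (which does not exist yet) would work
the same way. To stay short it handles ASCII input only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, NOT part of the Python C API.
 *
 * Buffer convention, modelled on the "es#" format code:
 *   *buf == NULL : allocate a new buffer; the caller must free() it.
 *   *buf != NULL : use *buf as a preallocated buffer whose capacity
 *                  (in wchar_t units, terminator included) is *len.
 *
 * On success, *len is set to the number of characters copied
 * (terminator excluded) and 1 is returned. On failure 0 is returned,
 * matching the true/false convention of PyArg_ParseTuple().
 */
int
copy_to_wide(const char *src, wchar_t **buf, int *len)
{
    size_t n, i;

    assert(src != NULL && buf != NULL && len != NULL);
    n = strlen(src);

    if (*buf == NULL) {
        /* Mode 1: allocate the buffer on behalf of the caller. */
        *buf = malloc((n + 1) * sizeof(wchar_t));
        if (*buf == NULL)
            return 0;
    }
    else if (*len < 0 || (size_t)*len < n + 1) {
        /* Mode 2: the preallocated buffer is too small. */
        return 0;
    }

    /* Either way the data is copied; it is never shared. */
    for (i = 0; i < n; i++)
        (*buf)[i] = (wchar_t)(unsigned char)src[i];
    (*buf)[n] = L'\0';
    *len = (int)n;
    return 1;
}
```

In the first mode the caller passes a NULL pointer and later frees the
result; in the second the caller passes its own buffer together with
its capacity and nothing has to be freed. In the failure case of the
first mode nothing has been allocated, so the error path needs no
cleanup.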
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/