PyArg_ParseTuple and Unicode

M.-A. Lemburg mal at lemburg.com
Mon Oct 22 15:29:31 EDT 2001


Scottie wrote:
> 
> I am confused here, but this is not uncommon.  I am trying to get an
> extension to handle both normal strings and unicode.  From my initial
> reading of the PyArg_ParseTuple document, I thought the following
> would work:
> 
> if( PyArg_ParseTuple(arg, "...u#...", ...) ) {
>     ...Unicode ops...
> }
> else if( PyArg_ParseTuple(arg, "...s#...", ...) ) {
>     ...plain string ops...
> }
> 
> My understanding of "u" and "u#" was that they would fail on non-
> unicode input (while "s" and "s#" pass both along).  The behavior I
> see is different: "u#" gives me a pointer to the base of a vanilla
> string, but divides the length by two.

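"u#" behaves just like "s#" in this case: it does not check for a
Unicode object, but uses the read buffer interface and assumes that
the object in question returns native Unicode (Py_UNICODE) data
through this interface. The byte count reported by the buffer is
then divided by sizeof(Py_UNICODE) to get a length in characters,
which is why you see half the length on a build where Py_UNICODE
is two bytes wide.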

In your situation, it's better to use the "O" parser marker,
which returns the object itself, and then switch on the object
type using PyString_Check(obj) / PyUnicode_Check(obj).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




