[C++-sig] Re: Automatic PyUnicode to 'const char*'

Lijun Qin qinlj at solidshare.com
Sat Aug 2 02:10:30 CEST 2003


> To: c++-sig at python.org
> From: David Abrahams <dave at boost-consulting.com>
> Date: Thu, 31 Jul 2003 12:18:34 -0400
> Subject: [C++-sig] Re: Automatic PyUnicode to 'const char*'
> Reply-To: c++-sig at python.org
>
> Stefan Seefeld <seefeld at sympatico.ca> writes:
>
> > David Abrahams wrote:
> >> "Lijun Qin" <qinlj at solidshare.com> writes:
> >>
> >>>Hi all,
> >>>
> >>>I'm using Boost.Python to wrap the WTL (Windows Template Library), using
> >>>VC 7.1, and porting some old code that previously used win32ui.pyd.
> >>>Basically it is easy to do, though gccxml failed to parse the ATL/WTL
> >>>code, so I cannot use Pyste.
> >>>But there is one problem: does anybody know how to automatically convert
> >>>PyUnicode to 'const char *'? Without this, I have to change a lot of code
> >>>lines to explicitly use the str() function.
> >> I'm not an expert in it, but I thought Unicode used 16- or 32-bit
> >> wchar_t characters.  How would you convert it to char const*?
> >
> > Unicode allows different encodings, some (such as UTF-8) with variably
> > sized character representations. This means that conversions usually
> > can't be done on-the-fly without helper constructs, i.e. some memory
> > management is needed.
> >
> > As an example, I'm doing such conversions in an XML library. The C
> > implementation uses 'xmlChar *', which is just an alias for 'char *',
> > but really holds utf-8 encoded text.
> > I convert it to various string types (the public API is parametrized
> > for the string type):
>
> To do that kind of narrowing, at the moment, the only way would be to
> write a thin wrapper function:
>
>       void f(char const*);
>
>       void f_thin_wrapper(object unicode)
>       {
>           str narrowed(unicode);
>           f(extract<char const*>(narrowed));
>       }
>
> we may have some support for doing this automatically in the version
> of Boost.Python which integrates with Luabind, but that's a ways off
> yet.
>
> -- 
> Dave Abrahams
> Boost Consulting
> www.boost-consulting.com
>

Hi, all:

I have found a way to do this, by registering a converter function:

// Lvalue converter: returns a pointer to the encoded bytes, or 0 if obj is
// not a unicode object (0 means "not convertible by this converter").
inline void* convert_to_cstring_from_unicode(PyObject* obj)
{
    if (!PyUnicode_Check(obj)) return 0;
    PyObject* str = PyUnicode_AsEncodedString(obj, "mbcs", NULL);
    if (!str) throw_error_already_set();

    // We must release str before we return to Python; guard here.
    static leak_gard _gard;

    // Keep the encoded string alive until the call completes.
    unicode_str_map[obj] = str;
    return PyString_AsString(str);
}

converter::registry::insert(convert_to_cstring_from_unicode,
                            type_id<char>());

But the problem is that the PyString object must be freed when the call is
completed. I currently do this by applying a custom call policy to the
methods that take 'const char*' parameters, but if there are recursive
calls into the same function, it becomes much more complex.
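
To illustrate the recursion problem, here is a minimal sketch in plain C++
(no Boost.Python or Python headers) of one way to keep the temporaries per
call rather than in a single map: each wrapped call opens a frame, every
converted string is stored in the innermost frame, and closing the frame
releases them. The names TempFrames, CallScope and nested_call are
hypothetical, and std::string stands in for the encoded PyString. It is a
model of the ownership idea only, not the code I actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical holder for per-call temporaries. std::deque is used so that
// pointers to stored strings stay valid when more strings or frames are added.
class TempFrames {
public:
    void push() { frames_.emplace_back(); }
    void pop() { frames_.pop_back(); }

    // Store an encoded string in the innermost frame and return its bytes.
    const char* keep(std::string encoded) {
        assert(!frames_.empty());
        frames_.back().push_back(std::move(encoded));
        return frames_.back().back().c_str();
    }

    std::size_t depth() const { return frames_.size(); }

private:
    std::deque<std::deque<std::string> > frames_;
};

// RAII guard: one frame per wrapped call, released even on exceptions.
class CallScope {
public:
    explicit CallScope(TempFrames& frames) : frames_(frames) { frames_.push(); }
    ~CallScope() { frames_.pop(); }
    CallScope(const CallScope&) = delete;
    CallScope& operator=(const CallScope&) = delete;

private:
    TempFrames& frames_;
};

// Hypothetical wrapped function that converts one argument and then
// re-enters itself. Returns the deepest frame count that was reached.
inline std::size_t nested_call(TempFrames& frames, int depth) {
    CallScope scope(frames);
    const std::string expected = "level-" + std::to_string(depth);
    const char* arg = frames.keep(expected);

    std::size_t deepest = frames.depth();
    if (depth > 0) deepest = nested_call(frames, depth - 1);

    // The outer call's argument must survive the inner calls.
    assert(expected == arg);
    return deepest;
}
```

With a single map keyed by the unicode object, an inner call that converts
the same object would overwrite or release the string the outer call is
still using; with one frame per call, the inner frame is popped first and
the outer pointer stays valid.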

While doing this, I found that if the call policies could be applied before
the argument conversion step, in the context of the call policy object
(a call policy object is always attached to a method, right?), the problem
could be solved much more easily and safely. We would be able to replace the
PyUnicode object with a PyString object (on the Windows platform, always MBCS
encoded), perhaps saving the original args tuple in the call policy object,
then convert the args to C++ (this will succeed because the arg is now a
PyString) and call the C++ function. When postcall(args, result) is called,
we can just release the newly allocated args tuple and replace it with the
original one. This would give us more control over how the args are processed.
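
The proposed flow can be sketched in plain C++ as follows. This is only a
model of the idea, not Boost.Python's interface: Arg, Args, NarrowingPolicy,
encode_placeholder, target_length and call_with_policy are all hypothetical
names, std::u32string stands in for PyUnicode, std::string for PyString, and
the encoder is a placeholder for the MBCS step (non-ASCII becomes '?').

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one Python argument: unicode or byte string.
struct Arg {
    bool is_unicode;
    std::u32string wide;   // used when is_unicode is true
    std::string narrow;    // used when is_unicode is false
};
using Args = std::vector<Arg>;

// Placeholder for the MBCS encoding step: non-ASCII characters become '?'.
inline std::string encode_placeholder(const std::u32string& text) {
    std::string out;
    for (char32_t c : text) out.push_back(c < 128 ? static_cast<char>(c) : '?');
    return out;
}

// Policy that runs before argument conversion and swaps unicode arguments
// for narrowed ones, keeping the originals so postcall can restore them.
class NarrowingPolicy {
public:
    void precall(Args& args) {
        saved_.push_back(args);  // one saved tuple per active call
        for (Arg& a : args) {
            if (a.is_unicode) {
                a.narrow = encode_placeholder(a.wide);
                a.wide.clear();
                a.is_unicode = false;
            }
        }
    }

    void postcall(Args& args) {
        args = std::move(saved_.back());  // drop the narrowed copy
        saved_.pop_back();
    }

private:
    std::vector<Args> saved_;  // a stack, so recursive calls nest correctly
};

// Hypothetical target taking 'const char*'.
inline std::size_t target_length(const char* s) { return std::strlen(s); }

// Hypothetical dispatcher: precall, convert and call, then postcall.
inline std::size_t call_with_policy(NarrowingPolicy& policy, Args& args) {
    policy.precall(args);
    std::size_t result = target_length(args.at(0).narrow.c_str());
    policy.postcall(args);
    return result;
}
```

Because the saved tuples form a stack inside the policy object, a recursive
call into the same function pushes and pops its own entry without touching
the outer one. A real version would also need to run postcall when the
target throws.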

Lijun Qin
http://www.solidshare.com






