Converting from PyUnicodeObject to char * without calling C API

MRAB google at mrabarnett.plus.com
Tue Jan 13 10:35:48 EST 2009


skip at pobox.com wrote:
> I'm trying to convert Python's gdbinit file to Python 3.  One of the things
> it does is print filenames and function names when displaying stack frames.
> This worked fine in Python 2 because the type of such objects is
> PyStringObject which uses NUL-terminated strings under the covers.  For
> example:
> 
>     set $__fn = (char *)((PyStringObject *)co->co_filename)->ob_sval
> 
> In Python 3 co->co_filename is a PyUnicodeObject pointer with the raw data
> encoded as either UCS2 or UCS4.  This presents problems when displaying
> strings:
> 
>     (gdb) set $__f = (PyUnicodeObject *)(co->co_filename)
>     (gdb) p *$__f->str@$__f->length
>     $14 = {47, 85, 115, 101, 114, 115, 47, 115, 107, 105, 112, 47, 115, 114, 99, 47, 112, 121, 116, 104, 111, 110, 47, 112, 121, 51, 107, 45, 116, 47, 76, 105, 98, 47, 95, 119, 101, 97, 107, 114, 101, 102, 115, 101, 116, 46, 112, 121}
> 
>     (gdb) p *(char *)($__f->str)@$__f->length
>     $15 = "/\000U\000s\000e\000r\000s\000/\000s\000k\000i\000p\000/\000s\000r\000c\000/\000p\000y\000t\000h\000o\000n\000/\000p"
> 
>     (gdb) p *(char *)($__f->str)@($__f->length*2)
>     $16 = "/\000U\000s\000e\000r\000s\000/\000s\000k\000i\000p\000/\000s\000r\000c\000/\000p\000y\000t\000h\000o\000n\000/\000p\000y\0003\000k\000-\000t\000/\000L\000i\000b\000/\000_\000w\000e\000a\000k\000r\000e\000f\000s\000e\000t\000.\000p\000y"
> 
> 
> I'd like to get rid of those NULs when displaying names.  Making it more
> difficult, I'd like to do it without calling any C API functions.  If at all
> possible the user-defined commands should work even if there is no process
> available (e.g., when working with core files).
> 
Should you be using "char *" when the elements aren't chars? Is there a
wide-character type of some sort?
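
To illustrate why the "char *" view shows those NULs, here is a rough
Python sketch of the data layout. It is not a gdb command, and it assumes
a little-endian UCS2 build, which is what the "/\000U\000" pattern in the
quoted output suggests. The names units, raw and filename are made up for
the illustration; the numbers are copied from the $14 output above.

```python
import struct

# Code units exactly as gdb printed them in $14 (one entry per character).
units = [47, 85, 115, 101, 114, 115, 47, 115, 107, 105, 112, 47, 115, 114,
         99, 47, 112, 121, 116, 104, 111, 110, 47, 112, 121, 51, 107, 45,
         116, 47, 76, 105, 98, 47, 95, 119, 101, 97, 107, 114, 101, 102,
         115, 101, 116, 46, 112, 121]

# Assumption: each code unit is 2 bytes, stored little-endian (UCS2 build).
# Packing them reproduces the byte view that "(char *)" gives in gdb.
raw = struct.pack("<%dH" % len(units), *units)

# Every ASCII character is followed by a zero byte, so the byte count is
# twice the character count. That is why $15 stopped halfway through.
assert len(raw) == 2 * len(units)
assert raw[:4] == b"/\x00U\x00"

# Treating the data as 16-bit units instead of bytes removes the NULs.
filename = raw.decode("utf-16-le")
print(filename)

# For pure ASCII names, taking each code unit as a character works too.
assert filename == "".join(chr(u) for u in units)
```

On a UCS4 build the same idea would apply with 4-byte units ("I" instead
of "H" in the format string, and "utf-32-le" for the decode), again
assuming little-endian storage.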


