Converting from PyUnicodeObject to char * without calling C API

skip at pobox.com
Tue Jan 13 10:28:29 EST 2009


I'm trying to convert Python's gdbinit file to Python 3.  One of the things
it does is print filenames and function names when displaying stack frames.
This worked fine in Python 2 because such objects are of type
PyStringObject, which uses NUL-terminated strings under the covers.  For
example:

    set $__fn = (char *)((PyStringObject *)co->co_filename)->ob_sval

In Python 3, co->co_filename is a PyUnicodeObject pointer with the raw data
encoded as either UCS2 or UCS4, depending on how the interpreter was built.
This presents problems when displaying strings:

    (gdb) set $__f = (PyUnicodeObject *)(co->co_filename)
    (gdb) p *$__f->str@$__f->length
    $14 = {47, 85, 115, 101, 114, 115, 47, 115, 107, 105, 112, 47, 115, 114, 99, 47, 112, 121, 116, 104, 111, 110, 47, 112, 121, 51, 107, 45, 116, 47, 76, 105, 98, 47, 95, 119, 101, 97, 107, 114, 101, 102, 115, 101, 116, 46, 112, 121}

    (gdb) p *(char *)($__f->str)@$__f->length
    $15 = "/\000U\000s\000e\000r\000s\000/\000s\000k\000i\000p\000/\000s\000r\000c\000/\000p\000y\000t\000h\000o\000n\000/\000p"

    (gdb) p *(char *)($__f->str)@($__f->length*2)
    $16 = "/\000U\000s\000e\000r\000s\000/\000s\000k\000i\000p\000/\000s\000r\000c\000/\000p\000y\000t\000h\000o\000n\000/\000p\000y\0003\000k\000-\000t\000/\000L\000i\000b\000/\000_\000w\000e\000a\000k\000r\000e\000f\000s\000e\000t\000.\000p\000y"
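
To illustrate what those three outputs have in common, here is a minimal
Python sketch.  It is not part of gdbinit and it doesn't solve the gdb side of
the problem.  The helper names (units_to_text, raw_to_text) are made up for
this example, and it assumes a little-endian UCS2 build with no surrogate
pairs in the string.  It just shows that the integers in $14 are the code
units I want, and that the NULs in $15 and $16 are the high bytes of those
same units:

```python
def units_to_text(units):
    """Join a sequence of code units (ints) into a str.

    Assumption: every unit is a complete code point, i.e. there are
    no surrogate pairs (true for ASCII paths like the one above).
    """
    return "".join(chr(u) for u in units)


def raw_to_text(raw, unit_size=2, byteorder="little"):
    """Decode raw bytes as fixed-width code units, then join them.

    Assumptions: unit_size is 2 for a UCS2 build (4 for UCS4) and the
    machine is little-endian, as the gdb output above suggests.
    """
    units = [
        int.from_bytes(raw[i:i + unit_size], byteorder)
        for i in range(0, len(raw), unit_size)
    ]
    return units_to_text(units)


if __name__ == "__main__":
    # First six code units from $14.
    print(units_to_text([47, 85, 115, 101, 114, 115]))
    # The same characters as they appear in the char * view ($15/$16).
    print(raw_to_text(b"/\x00U\x00s\x00e\x00r\x00s\x00"))
```

Both calls print the same prefix of the filename.  What I'm after is the
equivalent of the first one, done entirely inside gdb.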


I'd like to get rid of those NULs when displaying names.  To make things
more difficult, I'd like to do it without calling any C API functions.  If at
all possible, the user-defined commands should work even when there is no
live process available (e.g., when working with core files).

Any suggestions?

-- 
Skip Montanaro - skip at pobox.com - http://smontanaro.dyndns.org/

