[Tutor] how can I use unicode in ctypes?

Albert-Jan Roskam fomcl at yahoo.com
Thu Dec 6 20:39:46 CET 2012


> Hi,
>  
> I am using ctypes to get and set data, which sometimes includes unicode data. I
> am looking for a clean way to encode and decode basestrings.
> The code below illustrates the problem.
>  
> import ctypes
>
> s = u'\u0627\u0644\u0633\u0644\u0627\u0645 \u0639\u0644\u064a\u0643\u0645'
> good = ctypes.c_char_p(s.encode("utf-8"))
> bad = ctypes.c_char_p(s)
> print good, bad
> # prints:
> c_char_p('\xd8\xa7\xd9\x84\xd8\xb3\xd9\x84\xd8\xa7\xd9\x85 \xd8\xb9\xd9\x84\xd9\x8a\xd9\x83\xd9\x85')
> c_char_p('?????? ?????')
>
> I find it ugly to encode and decode the strings everywhere in my code. Moreover,
> the strings are usually contained in dictionaries, which would make it even
> uglier/ more cluttered.
> So I wrote a @transcode decorator:
> http://pastecode.org/index.php/view/29608996 ... only to discover that brrrrrr,
> this is so complicated! (it works though).
> Is there a simpler solution?

Hmmm, I simply used c_wchar_p instead of c_char_p, and that seems to work. I thought the C prototype "const char *s" corresponded to c_char_p only (c_wchar_p corresponds to wchar_t *, NUL-terminated; see http://docs.python.org/2/library/ctypes.html). Weird.
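
A caveat worth checking before relying on this: c_wchar_p hands the C side a wchar_t * buffer, so a function that really is declared with "const char *s" would presumably receive wide-character data rather than UTF-8 bytes, even though the value round-trips fine on the Python side. A minimal sketch of the two types side by side (the names wide and narrow are only illustrative, and no C library is called here):

```python
import ctypes

s = u'\u0627\u0644\u0633\u0644\u0627\u0645'

# c_wchar_p wraps a wchar_t * buffer: it accepts the unicode string as-is.
wide = ctypes.c_wchar_p(s)

# c_char_p wraps a char * buffer: it needs bytes, so encode explicitly.
narrow = ctypes.c_char_p(s.encode("utf-8"))

# Both round-trip on the Python side...
assert wide.value == s
assert narrow.value.decode("utf-8") == s

# ...but the underlying data differs: 6 code points vs. 12 UTF-8 bytes.
assert len(wide.value) == 6
assert len(narrow.value) == 12
```

So which type is right depends on what the C function actually expects, not on which one prints nicely in Python.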

<START cognitive_dissonance_reduction> 

Well, at least I learnt something from that juicy decorator code I wrote: http://pastecode.org/index.php/view/29608996

</END  cognitive_dissonance_reduction> ;-)

import ctypes
s = u'\u0627\u0644\u0633\u0644\u0627\u0645'
v = ctypes.c_wchar_p(s)
print v  # prints c_wchar_p(u'\u0627\u0644\u0633\u0644\u0627\u0645')
print repr(v.value)  # prints u'\u0627\u0644\u0633\u0644\u0627\u0645'
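
For the case where the C function does need char * and the strings live in dictionaries, something lighter than a decorator might do. The sketch below uses a hypothetical helper, encode_values (not part of ctypes, and not the code from the pastecode link), that encodes only the text values of a dict in one place; UTF-8 is an assumption about what the C side expects:

```python
import ctypes


def encode_values(d, encoding="utf-8"):
    """Hypothetical helper: copy d, encoding text values to bytes."""
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    return dict(
        (key, value.encode(encoding) if isinstance(value, text_type) else value)
        for key, value in d.items()
    )


record = {"greeting": u'\u0627\u0644\u0633\u0644\u0627\u0645', "count": 3}
encoded = encode_values(record)

# The encoded value can be passed to c_char_p directly.
p = ctypes.c_char_p(encoded["greeting"])

# Decoding on the way back recovers the original text.
assert p.value.decode("utf-8") == record["greeting"]
# Non-text values are left untouched.
assert encoded["count"] == 3
```

This keeps the encode/decode calls at the boundary with the C library instead of scattering them through the rest of the code.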


