Chardet, file, ... and the Flexible String Representation

Mon Sep 9 15:27:55 EDT 2013

On Mon, Sep 9, 2013, at 15:03, Ian Kelly wrote:
> Do you mean that it breaks when overwriting Python string object buffers,
> or when overwriting arbitrary C strings either received from C code or
> created with create_unicode_buffer?
> 
> If the former, I think that is to be expected since ctypes ultimately can't
> know what is the actual type of the pointer it was handed -- much as in C,
> that's up to the programmer to get right. I also think it's very bad
> practice to be overwriting those anyway, since Python strings are supposed
> to be immutable.
> 
> If the latter, that sounds like a bug in ctypes to me.

I was talking about writing to the buffer object from Python, i.e. with
slice assignment:

>>> from ctypes import create_unicode_buffer
>>> s = 'Test \U00010000'
>>> len(s)
6
>>> buf = create_unicode_buffer(32)
>>> buf[:6] = s
TypeError: one character unicode string expected
>>> buf[:7] = s
ValueError: Can only assign sequence of same size
>>> buf[:7] = 'Test \ud800\udc00'
>>> buf[:7]
'Test \U00010000' # len = 6

Assigning with .value works, however, which may be a viable workaround
for most situations. The "one character unicode string expected" message
is a bit cryptic.
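
A minimal sketch of the .value workaround, for anyone who wants to try it.
It assumes the errors above come from a build where wchar_t is 16 bits wide
(as on Windows), so that a non-BMP character needs two buffer elements; on a
build with a 32-bit wchar_t the slice assignment should not fail in the
first place. The helper wchar_units is a hypothetical name used only for
illustration and is not part of ctypes.

```python
from ctypes import c_wchar, create_unicode_buffer, sizeof


def wchar_units(text):
    """Return how many wchar_t elements are needed to hold text.

    Hypothetical helper, not part of ctypes.
    """
    if sizeof(c_wchar) == 2:
        # 16-bit wchar_t: each non-BMP character takes a surrogate pair,
        # so count UTF-16 code units instead of code points.
        return len(text.encode('utf-16-le')) // 2
    # 32-bit wchar_t: one element per code point.
    return len(text)


s = 'Test \U00010000'
buf = create_unicode_buffer(32)

# Assigning through .value lets ctypes do the conversion to wchar_t,
# so the caller does not have to match the slice length by hand.
buf.value = s

# Reading .value back gives the original string.
round_tripped = buf.value

# Number of buffer elements the string occupies on this platform.
units = wchar_units(s)
```

If slice assignment is really needed, units is the slice length to aim for,
but as the session above shows, the right-hand side then has to be spelled
with explicit surrogates on a 16-bit build, so .value is the simpler route.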