[I18n-sig] Re: [Python-Dev] Pre-PEP: Python Character Model
Paul Prescod
paulp@ActiveState.com
Tue, 06 Feb 2001 16:07:47 -0800
"Martin v. Loewis" wrote:
>
> ...
>
> I'm certainly using characters > 128. In UTF-8, they would become
> multi-byte. I'm not certain whether this would cause a problem; you
> did not give all implementation details of your approach, so it is
> hard to say.
I think this is specified properly in the PEP, but I know it is way too
much to learn in one day, so I'm not blaming you. I'm just pointing out
that it isn't as underspecified as it seems:
Python already has a rule that allows the automatic conversion
of characters up to 255 into their C equivalents. Once the Python
character type is expanded, characters outside of that range should
trigger an exception (just as converting a large long integer to a
C int triggers an exception).
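
To make that rule concrete, here is a minimal sketch in present-day Python. It is only an illustration of the behaviour described above, not text from the PEP and not how the C-level conversion would actually be implemented. The helper name narrow_to_bytes and the choice of ValueError are assumptions made for the example.

```python
def narrow_to_bytes(text):
    """Hypothetical helper: narrow each character to a single byte.

    Characters with ordinals 0..255 map directly to the byte with the
    same value; anything larger raises, in the same spirit as an
    out-of-range integer failing to convert to a C int.
    """
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if code > 255:
            # Assumption: ValueError stands in for whatever exception
            # the real conversion would raise.
            raise ValueError(
                "character %r (ordinal %d) does not fit in a C char"
                % (ch, code)
            )
        out.append(code)
    return bytes(out)


if __name__ == "__main__":
    # Characters above 127 but at most 255 stay one byte each.
    print(narrow_to_bytes("caf\xe9"))
    try:
        narrow_to_bytes("\u20ac")  # ordinal 8364, outside the range
    except ValueError as exc:
        print("refused:", exc)
```

Under this reading, characters above 127 never become multi-byte on their way to C: each one either fits in a single byte or the conversion fails outright.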
> For example, f.write would use the s# conversion (since the file was
> opened in binary). What exactly would that do?
Answer above.
> If your change would be to *just* widen the internal representation of
> characters, it would do PyString_AS_STRING/PyString_GET_SIZE, so it
> would return a pointer to the internal representation.
Is it a requirement that PyString_AS_STRING return a pointer to the
internal representation instead of a narrowed equivalent?
Paul Prescod