[I18n-sig] Re: [Python-Dev] Pre-PEP: Python Character Model

Paul Prescod paulp@ActiveState.com
Tue, 06 Feb 2001 16:07:47 -0800


"Martin v. Loewis" wrote:
> 
> ...
>
> I'm certainly using characters > 128. In UTF-8, they would become
> multi-byte. I'm not certain whether this would cause a problem; you
> did not give all implementation details of your approach, so it is
> hard to say.

I think this is specified properly in the PEP, but I know it is way too
much to learn in one day, so I'm not blaming you. I'm just pointing out
that it isn't as underspecified as it seems:

    Python already has a rule that allows the automatic conversion
    of characters up to 255 into their C equivalents. Once the Python
    character type is expanded, characters outside of that range should
    trigger an exception (just as converting a large long integer to a
    C int triggers an exception).
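
As a rough illustration of that rule (not text from the PEP, and not
how the C conversion code is actually written), the narrowing could be
sketched in Python as below. The function name narrow_to_c_chars is
hypothetical, the choice of ValueError is an assumption, and a modern
str/bytes pair stands in for the wide string and the C char buffer.

```python
def narrow_to_c_chars(text):
    """Hypothetical helper: narrow a wide string to one byte per character.

    Characters with ordinals 0..255 map directly to the byte of the same
    value. Anything larger raises, in the same spirit as converting a
    too-large long integer to a C int. The exception type is an assumption.
    """
    out = bytearray()
    for index, ch in enumerate(text):
        code = ord(ch)
        if code > 255:
            raise ValueError(
                "character %r at index %d does not fit in a C char"
                % (ch, index)
            )
        out.append(code)
    return bytes(out)


# Characters above 128 but at most 255 narrow without trouble.
narrowed = narrow_to_c_chars("caf\xe9")

# A character above 255 (here U+20AC) triggers the exception.
try:
    narrow_to_c_chars("\u20ac")
    overflowed = False
except ValueError:
    overflowed = True
```

Under this sketch a character above 128 stays a single byte rather than
becoming multi-byte, since no UTF-8 encoding is involved in the
narrowing.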

> For example, f.write would use the s# conversion (since the file was
> opened in binary). What exactly would that do?

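Answered above: under that rule, the s# conversion would yield the
narrowed C characters, or raise an exception if any character is
outside the 0-255 range.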

> If your change would be to *just* widen the internal representation of
> characters, it would do PyString_AS_STRING/PyString_GET_SIZE, so it
> would return a pointer to the internal representation. 

Is it a requirement that PyString_AS_STRING return a pointer to the
internal representation instead of a narrowed equivalent?

 Paul Prescod