flaming vs accuracy [was Re: Performance of int/long in Python 3]

Chris Angelico rosuav at gmail.com
Thu Mar 28 12:16:55 EDT 2013


On Fri, Mar 29, 2013 at 3:01 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 3/28/2013 10:38 AM, Chris Angelico wrote:
>
>> PEP 393 strings have two optimizations, or kinda three:
>>
>> 1a) ASCII-only strings
>> 1b) Latin1-only strings
>> 2) BMP-only strings
>> 3) Everything else
>>
>> Options 1a and 1b are almost identical - I'm not sure what the detail
>> is, but there's something flagging those strings that fit inside seven
>> bits. (Something to do with optimizing encodings later?)
>
>
> Yes. 'Encoding' an ascii-only string to any ascii-compatible encoding
> amounts to a simple copy of the internal bytes. I do not know if *all* the
> codecs for such encodings are 393-aware, but I do know that the utf-8 and
> latin-1 group are. This is one operation that 3.3+ does much faster than
> 3.2-

Thanks Terry. So that's not so much a representation difference as a
flag that costs little or nothing to retain and can speed up a later
encode to any ASCII-compatible encoding. Sounds like a useful tweak to
the basics of the flexible string representation, without being
particularly germane to jmf's complaints.
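
For anyone who wants to poke at this, here's a rough sketch of how the
categories above show up from Python code. It leans on sys.getsizeof,
so it assumes CPython 3.3+ and measures an implementation detail, not
anything the language promises. The helper name bytes_per_char and the
sample characters are my own picks for illustration.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate per-code-point storage from how sys.getsizeof grows.

    CPython-specific: other implementations may report something else.
    Both strings are built fresh, so neither is a cached singleton.
    """
    small = ch * 2
    large = ch * (n + 2)
    return (sys.getsizeof(large) - sys.getsizeof(small)) // n

# One sample character per category from the list above.
samples = {
    "ascii": "a",            # 1a) fits in seven bits
    "latin1": "\u00e9",      # 1b) fits in eight bits
    "bmp": "\u20ac",         # 2) fits in sixteen bits
    "astral": "\U0001f600",  # 3) everything else
}
widths = {name: bytes_per_char(ch) for name, ch in samples.items()}
print(widths)

# 1a and 1b both store one byte per character; what differs is the
# fixed header, where the ASCII-only form gets a more compact struct.
ascii_size = sys.getsizeof(samples["ascii"] * 2)
latin1_size = sys.getsizeof(samples["latin1"] * 2)
print(ascii_size, latin1_size)

# For an ASCII-only string, every ASCII-compatible codec has to produce
# the same bytes, which is why the encode can be a plain copy.
text = "hello world"
encoded = {codec: text.encode(codec) for codec in ("ascii", "latin-1", "utf-8")}
encoded_same = len(set(encoded.values())) == 1
print(encoded_same)
```

The last check holds on any Python 3, since it only says the encoded
bytes agree. Whether a given codec actually takes the fast copy path is
up to the implementation, and this sketch doesn't measure that.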

ChrisA


