flaming vs accuracy [was Re: Performance of int/long in Python 3]

Thu Mar 28 13:48:33 EDT 2013

On 28 Mar, 17:33, Ian Kelly <ian.g.ke... at gmail.com> wrote:
> On Thu, Mar 28, 2013 at 7:34 AM, jmfauth <wxjmfa... at gmail.com> wrote:
> > The flexible string representation takes the problem from the
> > other side: it attempts to work with the characters by using
> > their representations, and it can only fail...
>
> This is false.  As I've pointed out to you before, the FSR does not
> divide characters up by representation.  It divides them up by
> codepoint -- more specifically, by the *bit-width* of the codepoint.
> We call the internal format of the string "ASCII" or "Latin-1" or
> "UCS-2" for conciseness and a point of reference, but fundamentally
> all of the FSR formats are simply byte arrays of *codepoints* -- you
> know, those things you keep harping on.  The major optimization
> performed by the FSR is to consistently truncate the leading zero
> bytes from each codepoint when it is possible to do so safely.  But
> regardless of the extent to which this truncation is applied, the
> string is *always* internally just an array of codepoints, and the
> same algorithms apply for all representations.
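
As a rough illustration of the quoted point, the per-codepoint width can
be observed from Python itself. This is only a sketch, assuming CPython
3.3 or later (where PEP 393 applies); the helper name
bytes_per_codepoint is made up for this example, and other
implementations may store strings differently.

```python
import sys

def bytes_per_codepoint(ch, n=1000):
    """Storage growth, in bytes, when one more copy of ch is appended.

    Assumption: CPython 3.3+ (PEP 393); other implementations may differ.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]  # a, é, €, and a non-BMP character
widths = {ch: bytes_per_codepoint(ch) for ch in samples}

# The storage width follows the largest codepoint in the string ...
for ch, width in widths.items():
    print("U+%04X -> %d byte(s) per codepoint" % (ord(ch), width))

# ... but length and indexing are defined on codepoints in every case.
s = "a\u20ac\U0001F600"
codepoints = [ord(c) for c in s]
print(len(s), [hex(cp) for cp in codepoints])
```

Under that assumption, "a" and "é" should take 1 byte per codepoint,
"€" 2 bytes, and the non-BMP character 4 bytes, while len() and
indexing give the same codepoint-level answers for all of them.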

-----

You know, we can discuss this ad nauseam. What is important
is Unicode.

You have turned Python back into an ASCII-oriented product.

If Python had implemented Unicode correctly, there would
be no difference in using an "a", "é", "€" or any other character,
which is what the narrow builds did.

I may be practically the only one who talks about this,
but I can assure you it has been noticed.

Now, it's time to prepare the asparagus, the "jambon cru"
and a good bottle of dry white wine.

jmf