[Python-Dev] PEP 393 Summer of Code Project

Stefan Behnel stefan_ml at behnel.de
Tue Aug 23 11:32:44 CEST 2011


"Martin v. Löwis", 23.08.2011 10:55:
>> - “The UTF-8 decoding fast path for ASCII only characters was removed
>>    and replaced with a memcpy if the entire string is ASCII.”
>>    The fast path would still be useful for mostly-ASCII strings, which
>>    are extremely common (unless UTF-8 has become a no-op?).
>
> Is it really extremely common to have strings that are mostly-ASCII but
> not completely ASCII?

Maybe not as "extremely common" as pure ASCII strings, but at least for
Western European languages, "mostly ASCII" strings are very common indeed.
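
To make the "mostly ASCII" point concrete, here is a rough sketch that measures how many UTF-8 bytes of a text are plain ASCII. The helper names and the German sample sentence are illustrative assumptions, not part of the PEP 393 code or of this thread.

```python
def ascii_byte_fraction(text: str) -> float:
    """Return the share of UTF-8 encoded bytes that are plain ASCII (< 0x80)."""
    data = text.encode("utf-8")
    if not data:
        # An empty string is trivially all-ASCII.
        return 1.0
    ascii_bytes = sum(1 for b in data if b < 0x80)
    return ascii_bytes / len(data)


def is_pure_ascii(text: str) -> bool:
    """True only if every code point is ASCII, i.e. the case a plain copy can handle."""
    return all(ord(ch) < 0x80 for ch in text)


# Hypothetical sample: a short German sentence with a few umlauts and an eszett.
sample = "Die Größe der Datei beträgt fünf Megabyte."

pure = is_pure_ascii(sample)
fraction = ascii_byte_fraction(sample)
```

For a sentence like this, a whole-string ASCII check fails because of a handful of characters, even though the large majority of the bytes are ASCII. That is the case where a per-run ASCII fast path in the decoder could still help.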

Stefan


