[Python-Dev] Unicode 8.0 and 3.5
Steven D'Aprano
steve at pearwood.info
Fri Jun 19 05:33:12 CEST 2015
On Fri, Jun 19, 2015 at 01:55:07AM +0100, MRAB wrote:
> On 2015-06-19 00:56, Steven D'Aprano wrote:
> >At the very least, there is a change to the casefolding algorithm.
> >Cherokee was classified as unicameral but is now considered bicameral
> >(two cases, like English). Unusually, case-folding Cherokee maps to
> >uppercase rather than lowercase.
> >
> Doesn't the case-folding just depend on the data and the algorithm
> remains the same?
That depends on what algorithm str.casefold uses :-)
The Unicode 8.0 migration notes specifically mention case folding as
something that people migrating to Unicode 8 will need to take care
with, and also say:
"This mapping also has consequences on identifiers, as described in the
changes to UAX #31, Unicode Identifier and Pattern Syntax."
http://unicode.org/versions/Unicode8.0.0/#Migration
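
A quick way to see what a given interpreter does is to fold one Cherokee
letter pair and compare the results. This is only a sketch: the helper
name cherokee_folding_report is made up for illustration, and the answer
depends on which Unicode database the Python build was compiled against
(reported by unicodedata.unidata_version), not on anything specific to
str.casefold's code.

```python
import unicodedata

CHEROKEE_CAPITAL_A = "\u13a0"  # CHEROKEE LETTER A (the pre-8.0 letter)
CHEROKEE_SMALL_A = "\uab70"    # CHEROKEE SMALL LETTER A (new in Unicode 8.0)


def cherokee_folding_report():
    """Hypothetical helper: summarise how this build folds Cherokee A."""
    # Major version of the Unicode data bundled with this interpreter.
    major = int(unicodedata.unidata_version.split(".")[0])
    return {
        "unidata_major": major,
        # With Unicode 8.0+ data, the small letter should fold to the
        # capital one (the reverse of the usual direction).
        "small_folds_to_capital":
            CHEROKEE_SMALL_A.casefold() == CHEROKEE_CAPITAL_A,
        # Caseless matching: both letters should have the same folded form.
        "pair_matches_caselessly":
            CHEROKEE_SMALL_A.casefold() == CHEROKEE_CAPITAL_A.casefold(),
    }


print(unicodedata.unidata_version)
print(cherokee_folding_report())
```

On a build with pre-8.0 data the small letter is unassigned, so I would
expect both comparisons to come out False there.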
--
Steve