[Python-Dev] Unicode 8.0 and 3.5

Steven D'Aprano steve at pearwood.info
Fri Jun 19 05:33:12 CEST 2015


On Fri, Jun 19, 2015 at 01:55:07AM +0100, MRAB wrote:
> On 2015-06-19 00:56, Steven D'Aprano wrote:

> >At the very least, there is a change to the casefolding algorithm.
> >Cherokee was classified as unicameral but is now considered bicameral
> >(two cases, like English). Unusually, case-folding Cherokee maps to
> >uppercase rather than lowercase.
> >
> Doesn't the case-folding just depend on the data and the algorithm
> remains the same?

That depends on what algorithm str.casefold uses :-)
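
As a rough way to check what a given interpreter actually does, here is
a minimal sketch. It assumes the build's unicodedata tables are at
Unicode 8.0 or later; on older tables the Cherokee small letters are
unassigned and nothing gets folded. The helper names (unicode_at_least,
cherokee_folds_to_upper) are made up for illustration only.

```python
import unicodedata

# CHEROKEE LETTER A (the pre-existing letter, now treated as uppercase)
# and CHEROKEE SMALL LETTER A (new in Unicode 8.0).
UPPER_A = "\u13a0"
SMALL_A = "\uab70"


def unicode_at_least(major):
    """Return True if this build's Unicode database is at least `major`.x."""
    return int(unicodedata.unidata_version.split(".")[0]) >= major


def cherokee_folds_to_upper():
    """Return True if casefold() maps both Cherokee cases to the uppercase letter."""
    return SMALL_A.casefold() == UPPER_A and UPPER_A.casefold() == UPPER_A


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    # For comparison: Latin letters fold to lowercase.
    print("'A'.casefold() == 'a':", "A".casefold() == "a")
    if unicode_at_least(8):
        print("Cherokee folds to uppercase:", cherokee_folds_to_upper())
    else:
        print("Tables predate Unicode 8.0; Cherokee has no case pairs here.")
```

If the call to str.casefold is identical in both cases and only the
result differs between table versions, that would suggest the
difference lives in the data rather than in the algorithm.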

The Unicode 8.0 migration notes specifically mention case folding as
something that people migrating to Unicode 8 will need to take care
with. They also say:

"This mapping also has consequences on identifiers, as described in the 
changes to UAX #31, Unicode Identifier and Pattern Syntax."

http://unicode.org/versions/Unicode8.0.0/#Migration


-- 
Steve

