python 3.3 repr

Ned Batchelder ned at nedbatchelder.com
Fri Nov 15 10:08:23 EST 2013


On Friday, November 15, 2013 9:43:17 AM UTC-5, Robin Becker wrote:
> Things went wrong when utf8 was not adopted as the standard encoding thus 
> requiring two string types, it would have been easier to have a len function to 
> count bytes as before and a glyphlen to count glyphs. Now as I understand it we 
> have a complicated mess under the hood for unicode objects so they have a 
> variable representation to approximate an 8 bit representation when suitable etc 
> etc etc.
> 

Dealing with bytes and Unicode is complicated, and the 2->3 transition is not easy, but let's please not spread the misunderstanding that the Flexible String Representation is somehow at fault.  However you store Unicode code points, they are different from bytes, and having to deal with both is complex.  You can't make the dichotomy go away; you can only choose where you want to think about it.
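
To make the distinction concrete, here is a small Python 3 sketch. The sample string is only an illustration chosen for this example, not something from the thread. It shows that the number of code points is a property of the text, while the number of bytes depends on which encoding you pick:

```python
# Illustrative sample text (an assumption for this sketch): "naïve €5"
text = "na\u00efve \u20ac5"

# len() on a str counts code points, however the interpreter stores them.
code_points = len(text)

# Byte counts only exist once an encoding has been chosen.
utf8_len = len(text.encode("utf-8"))
utf16_len = len(text.encode("utf-16-le"))
utf32_len = len(text.encode("utf-32-le"))

print(code_points, utf8_len, utf16_len, utf32_len)  # 8 11 16 32
```

The same eight code points occupy 11, 16, or 32 bytes depending on the encoding, so a single len() cannot answer both questions no matter which internal representation is used.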

--Ned.

> -- 
> Robin Becker



