python 3.3 repr

Robin Becker robin at reportlab.com
Fri Nov 15 09:43:17 EST 2013


..........
> I'm still stuck on Python 2, and while I can understand the controversy ("It breaks my Python 2 code!"), this seems like the right thing to have done.  In Python 2, unicode is an add-on.  One of the big design drivers in Python 3 was to make unicode the standard.
>
> The idea behind repr() is to provide a "just plain text" representation of an object.  In P2, "just plain text" means ascii, so escaping non-ascii characters makes sense.  In P3, "just plain text" means unicode, so escaping non-ascii characters no longer makes sense.
>

Unfortunately the word 'printable' got into the definition of repr; it's clear 
that printable is not the same as unicode, at least as far as the print 
function is concerned. In my opinion it would have been better to keep the old 
behaviour, as that would have eased compatibility.
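
A minimal sketch of the difference, assuming Python 3.3 or later; the sample 
string and the variable names are only illustrative. repr() keeps anything 
str.isprintable() accepts, while ascii() gives the old escaped form:

```python
s = "caf\u00e9 \u20ac\x07"   # e-acute, euro sign, BEL control character

r = repr(s)    # non-ASCII characters are kept; only the BEL is escaped
a = ascii(s)   # everything outside ASCII is escaped, as Python 2's repr() did

# The repr() result is only "printable" if the output encoding can represent it.
try:
    r.encode("ascii")
    ascii_safe = True
except UnicodeEncodeError:
    ascii_safe = False

print(a)
print("repr() output is ASCII-safe:", ascii_safe)
```

Code that must behave the same under both versions can call ascii() on 
Python 3 wherever it relied on the escaped repr() output on Python 2.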

The python gods don't count that sort of thing as important enough, so we get 
the mess that is the python2/3 split. ReportLab has to support both, so it's a 
real issue; in addition, swapping the str/unicode pair for bytes/str doesn't 
help one's mental models either :(

Things went wrong when utf8 was not adopted as the standard encoding, thus 
requiring two string types. It would have been easier to have a len function to 
count bytes as before and a glyphlen to count glyphs. Now, as I understand it, 
we have a complicated mess under the hood for unicode objects, so that they have 
a variable representation to approximate an 8 bit representation when suitable, 
and so on.
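
A rough sketch of the three counts, assuming Python 3.3 or later. glyphlen 
here is a hypothetical helper, not an existing function; it only skips 
combining marks, which is an approximation of a real glyph count. The 
sys.getsizeof comparison relies on CPython's internal string layout and is 
not guaranteed by the language:

```python
import sys
import unicodedata


def glyphlen(text):
    """Hypothetical helper: count code points that are not combining marks.

    Only a rough stand-in for a glyph count; real grapheme clusters
    need full Unicode text segmentation.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


s = "cafe\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

code_points = len(s)                  # 5 code points
utf8_bytes = len(s.encode("utf-8"))   # 6 bytes once encoded as UTF-8
glyphs = glyphlen(s)                  # 4 after skipping the combining mark

print(code_points, utf8_bytes, glyphs)

# CPython 3.3+ only: storage per character depends on the widest
# code point in the string (1, 2 or 4 bytes).
sizes = [sys.getsizeof(ch * 100) for ch in ("a", "\u20ac", "\U0001F600")]
print(sizes)
```

On CPython the three sizes should come out in increasing order, which is the 
variable representation at work.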

> Some of us have been doing this long enough to remember when "just plain text" meant only a single case of the alphabet (and a subset of ascii punctuation).  On an ASR-33, your C program would print like:
>
> MAIN() \(
> 	PRINTF("HELLO, ASCII WORLD");
> \)
>
> because ASR-33's didn't have curly braces (or lower case).
>
> Having P3's repr() escape non-ascii characters today makes about as much sense as expecting P2's repr() to escape curly braces (and vertical bars, and a few others) because not every terminal can print those.
>
.....
I can certainly remember those days, how we cried and laughed when 8 bits became 
popular.
-- 
Robin Becker



