Re: Cult-like behaviour [was Re: Kindness]

Chris Angelico rosuav at gmail.com
Mon Jul 16 11:23:47 EDT 2018


On Tue, Jul 17, 2018 at 12:54 AM, Gene Heskett <gheskett at shentel.net> wrote:
> On Monday 16 July 2018 10:24:28 Marko Rauhamaa wrote:
>
>> Antoon Pardon <antoon.pardon at vub.be>:
>> > I really don't understand why the author of that article didn't just
>> > copy his python2 program but used sys.stdin.buffer and
>> > sys.stdout.buffer instead of plain sys.stdin and stdout.
>>
>> Yes, it would be nice if you could simply restrict yourself to bytes
>> everywhere when your application needed it. Unfortunately, quite many
>> facilities demand text, and you will need to ponder carefully at each
>> such place how you deal with encoding/decoding exceptions.
>>
>> Plus the bytes syntax is really ugly. I wish Python3 had reserved
>> '...' for byte strings and "..." for UTF-32 strings.
>
> From a lurker, that does sound useful. The next PEP maybe?

Definitely not. Mainly, this is a massive break of backward
compatibility; but even aside from that, I am NOT a fan of having
multiple string types with sneakily different meanings. Let's look at
a few other languages where the quote type changes the meaning:

* C and its direct derivatives: "string", 'c' 'h' 'a' 'r'. Completely
different (a character is an integer).

* JavaScript: "string", 'string', `formatted string`. Annoyingly
similar. Hard to do anything else though.

* SQL: 'string', "identifier", `broken MySQL identifier`. Constantly
tripping people up.

* Bourne shell: 'literal string', "interpolated string". Periodically
annoys people who "use quoted strings like this!"

All of them cause confusion, frequently. I think the C example causes
the least confusion, due to it being so completely different (you
can't write 'Hello, world' because that's too long for a character
constant, and C's static typing system is strong enough to catch most
bugs of this nature fairly quickly); all the others cause frequent
problems. In
JavaScript's case, I kinda feel for the ECMAScript people in that they
wanted to add the feature but didn't really have any good options (JS
doesn't have string prefixes the way Python does), but it still causes
confusion; a backtick string in JS can span multiple lines, but the
others can't, so sometimes backticks are used even without
interpolation, and it's confusing. With SQL's different quoting types,
MySQL decided to go and violate the standard by making double quotes
into strings, but that just introduced even MORE confusion, rather
than solving anything. And shell scripting... well, if anyone truly
understands all the quoting and interpolation rules in bash, I would
be terrified of how many marbles that person has lost.

Python 2 had backticks meaning something completely different from
string literals. Starting with 3.0, backticks are explicitly excluded
from syntax, and the only way a quote character changes the string is
that three of them means you can span lines. Let's keep it simple.
Prefixes are there for a reason.
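
To make that concrete, here's a quick sketch of how it plays out in
Python 3. The variable names are just for illustration; the point is
that the quote character never changes the type, only the prefix does:

```python
# Single and double quotes produce exactly the same str object type.
single = 'hello'
double = "hello"
same_text = (single == double)          # the quote character is irrelevant

# The b prefix gives bytes, whichever quote character you use.
as_bytes = b'hello'
same_bytes = (as_bytes == b"hello")

# The r prefix keeps backslashes literal instead of treating them as escapes.
raw = r'C:\new'

# Tripling the quote character only adds the ability to span lines.
multi = """line one
line two"""

print(type(single).__name__, type(as_bytes).__name__)
print(same_text, same_bytes, len(raw), multi.count("\n"))
```

Swap every ' for " above (or vice versa) and nothing changes.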

ChrisA


