Cult-like behaviour [was Re: Kindness]

Marko Rauhamaa marko at pacujo.net
Sat Jul 14 18:15:24 EDT 2018


Chris Angelico <rosuav at gmail.com>:

> On Sun, Jul 15, 2018 at 5:54 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
>> True enough. Modern-day protocols as well as Linux file formats and
>> commands intentionally blur the line between strings and bytes. The
>> software in question deals with all of the above. It is virtually
>> impossible to keep track of what is "really" text and what is "really"
>> binary. In the end, the Gordian Knot was sliced by using Python3's
>> strings for everything and restricting oneself to Latin-1 codepoints
>> (almost) everywhere.
>
> [...] By recommending and preferring eight-bit text strings, you're
> saying "Chinese text doesn't matter". And by stipulating Latin-1,
> you're also saying "Russian text doesn't matter" and "Thai text
> doesn't matter" and "Hebrew text doesn't matter" and more. You are
> declaring that YOUR culture is the only one that matters. When I see
> behaviour like that in a Twitch stream that I moderate, I smack it
> with a banhammer, because that is utterly unacceptable. Why should we
> tolerate it in programming?

I'm not saying that at all. What I'm saying is that I'm using Python3
strings as holders for bytes. Since every byte value (0 to 255) maps to
a Unicode code point (U+0000 to U+00FF, the Latin-1 range), a Python3
string can hold any sequence of bytes.
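
A minimal sketch of that round trip, assuming the Latin-1 codec is used
for the conversion in both directions. The helper names bytes_to_str and
str_to_bytes are made up for illustration; they are not from the
software discussed in this thread.

```python
def bytes_to_str(data: bytes) -> str:
    # Latin-1 maps byte value N to code point U+00NN, so decoding never fails.
    return data.decode("latin-1")


def str_to_bytes(text: str) -> bytes:
    # Works only while every code point is <= U+00FF; otherwise it raises
    # UnicodeEncodeError.
    return text.encode("latin-1")


# All 256 possible byte values survive the trip through a str unchanged.
raw = bytes(range(256))
text = bytes_to_str(raw)
round_tripped = str_to_bytes(text)

# A code point outside the Latin-1 range cannot be turned back into a byte.
try:
    str_to_bytes("\u4e2d")
    outside_range_rejected = False
except UnicodeEncodeError:
    outside_range_rejected = True
```

The restriction to Latin-1 code points is what keeps the conversion
lossless: a string used this way is a byte container, not a claim about
what language the content is written in.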

Couldn't you use bytes objects everywhere for the same purpose?

Yes and no.

Yes, but it would be ugly as hell and would involve changing a large
percentage of the source code.

No, as a large number of Python3 facilities require str objects as
arguments. Consider urllib.request.urlopen(), for example, which
takes its URL as a str object (or a Request object), not as bytes.
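
As a rough illustration, assuming the URL arrives as bytes from some
lower layer, it can be decoded to str before it reaches urllib. The
helper name request_from_bytes and the example.com URL are placeholders;
the sketch only builds a Request object and does not touch the network.

```python
import urllib.request


def request_from_bytes(raw_url: bytes) -> urllib.request.Request:
    # Hypothetical helper: decode the bytes to a str so that urllib
    # receives the argument type it works with.
    url = raw_url.decode("latin-1")
    return urllib.request.Request(url)


# Placeholder URL; constructing a Request performs no I/O.
req = request_from_bytes(b"http://example.com/index.html")
```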


Marko


