How to waste computer memory?

Marko Rauhamaa marko at pacujo.net
Fri Mar 18 17:28:48 EDT 2016


Chris Angelico <rosuav at gmail.com>:

> The problem is not Python's Unicode strings, then. The problem is the
> notion that path names are text. If they're text, they should be
> exclusively text (although, for low-level efficiency, they're more
> likely to be defined as "valid UTF-8 sequences" rather than "sequences
> of Unicode codepoints"); since they're not, they are fundamentally
> bytes. But that's not a problem with Python - it's a problem with the
> file system.

The file system does not have a problem. Python has a problem because it
tries to present pathnames as Unicode strings, which isn't always
possible.
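
A minimal sketch of why this isn't always possible, assuming a POSIX
file name whose bytes are not valid UTF-8 (raw_name is a made-up
example, not taken from any real system). Strict decoding fails. The
"surrogateescape" error handler, which os.fsdecode() uses on POSIX
systems, carries the offending byte through as a lone surrogate so the
original bytes can be recovered:

```python
# Hypothetical file name: "café.txt" encoded as Latin-1, which is not valid UTF-8.
raw_name = b"caf\xe9.txt"

# Strict decoding shows why the name cannot always be a proper Unicode string.
try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# The surrogateescape handler maps the undecodable byte 0xE9 to U+DCE9.
as_text = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_trip = as_text.encode("utf-8", "surrogateescape")

print(strict_ok)               # False
print(ascii(as_text))          # 'caf\udce9.txt'
print(round_trip == raw_name)  # True
```

The resulting string is not well-formed Unicode text (it contains a
lone surrogate), which is the price of pretending the name is text.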

The standard input and output are even more problematic because Python
very strongly "wishes" them to be Unicode streams.
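
The same tension can be sketched with an in-memory stand-in for a
standard stream. The sample bytes and errors="strict" are assumptions
for illustration; the handler Python actually picks for the standard
streams depends on the version and the locale:

```python
import io

# Stand-in for a byte stream such as sys.stdin.buffer; 0xFF is never valid UTF-8.
raw = io.BytesIO(b"ok\n\xff\n")

# A text layer similar to the one Python puts on top of the standard streams.
text = io.TextIOWrapper(raw, encoding="utf-8", errors="strict")

try:
    text.read()
    decoded = True
except UnicodeDecodeError:
    decoded = False

# Going underneath the text layer always works: bytes are just bytes.
raw.seek(0)
data = raw.read()

print(decoded)  # False
print(data)     # b'ok\n\xff\n'
```

In a real program the byte-level objects would be sys.stdin.buffer and
sys.stdout.buffer.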

If I were to start a new OS today, I would very much like to placate the
likes of Python. Unfortunately, the sins of our forefathers cannot be
wished away.

Anyway, Python is careful not to paint itself into a corner. It gives you
everything you need to break the abstraction and go low-level. It even
offers regular expressions and ASCII literal syntax for bytes objects! By
contrast, Guile 2.x is trying to emulate Python's progressive approach
but doesn't offer such amenities. Thus, Python's

   b'hi!'

is

   #vu8(104 105 33)

or

   (use-modules (rnrs bytevectors))
   (string->utf8 "hi!")

in Guile.
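
On the Python side, a short sketch of both amenities. The sample data
is made up for illustration:

```python
import re

# The ASCII literal and the explicit byte values are equal bytes objects.
literal = b"hi!"
explicit = bytes([104, 105, 33])

# A bytes pattern matches directly against bytes; no decoding is needed,
# so the stray non-UTF-8 byte 0xFF in the data is harmless.
data = b"hi! \xff bye!"
words = re.findall(rb"[a-z]+!", data)

print(literal == explicit)  # True
print(words)                # [b'hi!', b'bye!']
```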


Marko


