[Python-ideas] Python 3000 TIOBE -3%

Joao S. O. Bueno jsbueno at python.org.br
Sun Feb 12 22:32:53 CET 2012


On 11 February 2012 21:24, Paul Moore <p.f.moore at gmail.com> wrote:

> What I *don't* know is what those funny bits of
> mojibake I see in the text editor are.
>

So, do yourself and us, "the rest of the world", a favor, and open the
file in binary mode.

Also, I'd suggest that you, and anyone else being picky about encoding, read
http://www.joelonsoftware.com/articles/Unicode.html so you can finally get
it into your mind that *** ASCII is not text ***.

It used to be text back when getting non-[A-Za-z] text to you meant having
someone record a file on a tape, pack it in their luggage, and take a
plane "overseas" to the U.S.A. That is not the case anymore, and that,
as far as I understand, is the reasoning behind Python 3 defaulting to Unicode.

Anyone can work "ignoring text" and treat bytes as bytes by opening a file
in binary mode. You can use "os.linesep" instead of a hard-coded "\n" to
deal with line breaks. (Of course you might accidentally break a line
inside a multi-byte character in some encoding, since you prefer to ignore
encodings altogether, but that should be rare.)
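
To make that concrete, here is a minimal sketch of the binary-mode approach,
assuming the data happens to be UTF-8. The helper names (read_lines_as_bytes,
hard_wrap, is_valid_utf8) are made up for illustration. Note that os.linesep
is a str, so it has to be encoded before it can be compared with bytes.

```python
import os
import tempfile

# os.linesep is a str ("\n" or "\r\n"); encode it to use it with bytes.
LINESEP = os.linesep.encode("ascii")


def read_lines_as_bytes(path):
    """Read a file in binary mode and split it on the platform line separator."""
    with open(path, "rb") as f:
        data = f.read()
    return data.split(LINESEP)


def hard_wrap(data, width):
    """Cut bytes into fixed-width chunks, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def is_valid_utf8(chunk):
    """Return True if the chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Write two lines as raw bytes; "João" contains one two-byte character.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write("João".encode("utf-8") + LINESEP + b"Paul")
    path = tmp.name

lines = read_lines_as_bytes(path)
os.remove(path)

# Splitting on the line separator is safe here: no decoding was needed.
print(lines)

# Wrapping at a fixed byte width is not: width 3 cuts "ã" in half.
chunks = hard_wrap(lines[0], 3)
print(chunks, [is_valid_utf8(c) for c in chunks])
```

The split on the line separator never looks inside the characters, which is
why it works without knowing the encoding. The fixed-width cut is the "rare"
case mentioned above: both halves of the broken character end up as byte
sequences that no longer decode.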

  js
 -><-