[Python-Dev] PEP 540: Add a new UTF-8 mode (v2)

Chris Barker - NOAA Federal chris.barker at noaa.gov
Thu Dec 7 21:10:36 EST 2017


I’m a bit confused:

File names and the like are one thing, and the CONTENTS of files are quite
another.

I get that there is theoretically a “default” encoding for the contents of
text files, but that is SO likely to be wrong as to be ignorable.

open() already defaults to utf-8. Which is a fine default if you are going
to have one, but it seems a bad idea to have it default to surrogateescape
EVER, regardless of the locale or anything else.

If the file is binary, or a different encoding, or simply broken, it’s much
better to get an encoding error as soon as possible.
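
To make the difference concrete, here is a minimal sketch of reading the same broken file with the strict error handler and with surrogateescape. The file name, the sample bytes, and the read_text helper are made up for illustration; the encoding and error handler are passed explicitly so the result does not depend on the locale.

```python
import os
import tempfile

# Hypothetical sample data: 0xff can never appear in valid UTF-8.
raw = b"caf\xff"


def read_text(path, errors):
    """Read a file as UTF-8 text using the given error handler."""
    with open(path, "r", encoding="utf-8", errors=errors) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "broken.txt")
    with open(path, "wb") as f:
        f.write(raw)

    # Strict decoding fails as soon as the bad byte is read.
    try:
        read_text(path, "strict")
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    # surrogateescape succeeds and maps the bad byte to a lone surrogate.
    escaped = read_text(path, "surrogateescape")
```

With surrogateescape the read succeeds silently, and the problem only shows up later, when the string is encoded again without the same error handler.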

Why does this have anything to do with the PEP?

Perhaps it is the issue of reading a filename from the system, writing it to a
file, then reading it back in again?

I actually do that a lot — but mostly so I can pass that file to another
system, so I really don’t want broken encoding in it anyway.
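
A rough sketch of that round trip, assuming a file name that contains a byte that is not valid UTF-8 (the name below is invented). The decode is written out explicitly rather than going through os.listdir() or os.fsdecode(), so the example does not depend on the platform's filesystem encoding.

```python
# Hypothetical undecodable file name, as raw bytes from the OS.
name_bytes = b"report-\xff.txt"

# What a str file name looks like when the bad byte is escaped.
name = name_bytes.decode("utf-8", "surrogateescape")

# Writing that name into a strict UTF-8 text file fails...
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# ...while surrogateescape restores the original bytes, which are
# still not valid UTF-8 for whatever system reads the file next.
restored = name.encode("utf-8", "surrogateescape")
```

So the escaped name can be carried through Python unchanged, but the file that ends up holding it is still not clean UTF-8.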

-CHB



On Dec 7, 2017, at 5:53 PM, Glenn Linderman <v+python at g.nevcal.com> wrote:

On 12/7/2017 5:45 PM, Jonathan Goble wrote:

On Thu, Dec 7, 2017 at 8:38 PM Glenn Linderman <v+python at g.nevcal.com>
wrote:

> If it were to be changed, one could add a text-mode option in 3.7, say "t"
> in the mode string, and a PendingDeprecationWarning for open calls without
> the specification of either t or b in the mode string.
>

"t" is already supported in open()'s mode argument [1] as a way to
explicitly request text mode, though it's essentially ignored right now
since text is the default anyway. So since the option is already present,
the only thing needed at this stage for your plan would be to begin
deprecating not using it.
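
As a small illustration of that point, the sketch below opens the same file with "r" and with "rt" and checks that both give the same kind of text-mode object. The file name and contents are placeholders.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello")

    # "r" and "rt" both request text mode; "t" only makes it explicit.
    with open(path, "r", encoding="utf-8") as implicit_text, \
            open(path, "rt", encoding="utf-8") as explicit_text:
        same_type = type(implicit_text) is type(explicit_text)
        is_text = isinstance(explicit_text, io.TextIOWrapper)
        same_contents = implicit_text.read() == explicit_text.read()

    # "b" is the mode that changes behaviour: it returns bytes.
    with open(path, "rb") as binary:
        binary_data = binary.read()
```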

*goes back to lurking*

[1] https://docs.python.org/3/library/functions.html#open


Thanks for briefly de-lurking.

So then for PEP 540... use surrogateescape immediately for t mode.

Then, when the user encounters an encoding error, there would be three
solutions: switch to t mode, explicitly switch to surrogateescape, or fix
the locale.
