[Python-ideas] Fix default encodings on Windows

Steve Dower steve.dower at python.org
Thu Aug 11 11:31:46 EDT 2016


Unless someone else does the implementation, I'd rather add a utf8-readsig encoding that initially only skips a utf8 BOM. Notably, you always get the same encoding; it just sometimes skips the first three bytes.
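
As a rough illustration of the "skip the BOM, always decode as UTF-8" behaviour described above, here is a minimal sketch. The function name decode_utf8_skip_bom is hypothetical and is not the proposed utf8-readsig codec itself. Python's existing "utf-8-sig" codec already skips an optional BOM when decoding, so the sketch simply spells that step out by hand.

```python
import codecs


def decode_utf8_skip_bom(data: bytes) -> str:
    """Decode bytes as UTF-8, dropping a leading UTF-8 BOM if present.

    Hypothetical helper for illustration only; the result is always
    decoded as UTF-8, and at most the first three bytes are skipped.
    """
    if data.startswith(codecs.BOM_UTF8):  # BOM_UTF8 == b"\xef\xbb\xbf"
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF8 + "héllo".encode("utf-8")
    without_bom = "héllo".encode("utf-8")
    # Both inputs decode to the same text.
    print(decode_utf8_skip_bom(with_bom) == decode_utf8_skip_bom(without_bom))
```

An empty file decodes to an empty string here, so no assumption is made about the file having any content.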

I think we can change this later to detect and switch to utf16 without it being disastrous, though we've made it this far without it and frankly there are good reasons to "encourage" utf8 over utf16.
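
For comparison, a later "detect and switch to utf16" step might look roughly like the sketch below. This is an assumption about what such detection could do, not a description of any planned implementation; sniff_encoding and decode_with_bom_detection are hypothetical names. Without a BOM the data is still treated as UTF-8.

```python
import codecs
from typing import Tuple


def sniff_encoding(head: bytes) -> Tuple[str, int]:
    """Guess (encoding name, BOM length) from the first bytes of a file.

    Hypothetical helper. UTF-32 is not handled: a UTF-32-LE BOM begins
    with the same two bytes as the UTF-16-LE BOM and would be
    misreported as UTF-16-LE.
    """
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8", len(codecs.BOM_UTF8)
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le", len(codecs.BOM_UTF16_LE)
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be", len(codecs.BOM_UTF16_BE)
    # No BOM: fall back to UTF-8 rather than guessing further.
    return "utf-8", 0


def decode_with_bom_detection(data: bytes) -> str:
    """Decode data, switching to UTF-16 only when a UTF-16 BOM is present."""
    encoding, skip = sniff_encoding(data[:4])
    return data[skip:].decode(encoding)
```

The fallback branch keeps the "encourage utf8" default: only an explicit UTF-16 BOM changes the encoding.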

My big concern is the console... I think that change is inevitably going to break someone, but I need to map out the possibilities first to figure out just how bad it'll be.


-----Original Message-----
From: "Random832" <random832 at fastmail.com>
Sent: 8/11/2016 7:54
To: "python-ideas at python.org" <python-ideas at python.org>
Subject: Re: [Python-ideas] Fix default encodings on Windows

On Thu, Aug 11, 2016, at 10:25, Steven D'Aprano wrote:
> > Interesting. Are you assuming that a text file cannot be empty?
> 
> Hmmm... not consciously, but I guess I was.
> 
> If the file is empty, how do you know it's text?

Heh. That's the *other* thing that Notepad does wrong in the opinion of
people coming from the Unix world - a Windows text file does not need to
end with a [CR]LF, and normally will not.

> But we're getting off topic here. In context of Steve's suggestion, we 
> should only autodetect UTF-8. In other words, if there's a UTF-8 BOM, 
> skip it, otherwise treat the file as UTF-8.

I think there's still room for UTF-16. It's two of the four encodings
supported by Notepad, after all.