Python usage numbers

Sat Feb 11 21:36:52 EST 2012

On Feb 11, 8:23 pm, Steven D'Aprano <steve+comp.lang.pyt... at pearwood.info> wrote:
> On Sun, 12 Feb 2012 12:28:30 +1100, Chris Angelico wrote:
> > On Sun, Feb 12, 2012 at 12:21 PM, Eric Snow
> > <ericsnowcurren... at gmail.com> wrote:
> >> However, in at
> >> least one current thread (on python-ideas) and at a variety of times in
> >> the past, _some_ people have found Unicode in Python 3 to make more
> >> work.
>
> > If Unicode in Python is causing you more work, isn't it most likely that
> > the issue would have come up anyway?
>
> The argument being made is that in Python 2, if you try to read a file
> that contains Unicode characters encoded with some unknown codec, you
> don't have to think about it. Sure, you get moji-bake rubbish in your
> database, but that's the fault of people who insist on not being
> American. Or who spell Zoe with an umlaut.

That's not the worst of it... I have many times had a block of text
that was valid ASCII except for some intermixed Unicode white-space.
Who the hell would even consider inserting Unicode white-space!!!
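
For what it's worth, stray Unicode white-space in otherwise-ASCII text
is easy enough to hunt down. A minimal Python 3 sketch, assuming the
text has already been decoded to a str. The function names are made up
for illustration, and zero-width characters such as U+200B are not
caught, because str.isspace() does not count them as white-space:

```python
import unicodedata


def find_unicode_whitespace(text):
    """Return (index, character name) for each non-ASCII white-space character."""
    return [
        (i, unicodedata.name(ch, "UNKNOWN"))  # control characters have no name
        for i, ch in enumerate(text)
        if ch.isspace() and ord(ch) > 127
    ]


def normalize_whitespace(text):
    """Replace each non-ASCII white-space character with a plain ASCII space."""
    return "".join(" " if ch.isspace() and ord(ch) > 127 else ch for ch in text)


sample = "foo\u00a0bar\u2003baz"
print(find_unicode_whitespace(sample))
print(normalize_whitespace(sample))
```

Replacing everything with a single space is a blunt choice; it throws
away the difference between, say, a no-break space and an em space.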

> "I have a file containing text. I can open it in an editor and see it's
> nearly all ASCII text, except for a few weird and bizarre characters like
> £ © ± or ö. In Python 2, I can read that file fine. In Python 3 I get an
> error. What should I do that requires no thought?"
>
> Obvious answers:

The most obvious answer would be to read the file WITHOUT worrying
about asinine encoding.
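
A rough sketch of what "not worrying" could look like in Python 3,
assuming you only need to get the bytes in and back out unchanged.
The helper names are hypothetical; the open() arguments are standard.
None of these tells you what the non-ASCII bytes actually mean:

```python
def read_raw(path):
    """Read the file as bytes; no decoding happens at all."""
    with open(path, "rb") as f:
        return f.read()


def read_latin1(path):
    """Decode as Latin-1, which maps every byte to a code point and never fails.

    The result is wrong (mojibake) if the file was not really Latin-1.
    """
    with open(path, encoding="latin-1") as f:
        return f.read()


def read_ascii_lossless(path):
    """Decode as ASCII, smuggling non-ASCII bytes through as lone surrogates.

    Encoding the result with the same codec and error handler
    reproduces the original bytes.
    """
    with open(path, encoding="ascii", errors="surrogateescape") as f:
        return f.read()
```

Text mode still translates line endings on read, so only read_raw()
is guaranteed to hand back the file byte for byte.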