[Numpy-discussion] using loadtxt to load a text file in to a numpy array

Chris Barker chris.barker at noaa.gov
Thu Jan 23 20:09:28 EST 2014


On Thu, Jan 23, 2014 at 4:02 PM, Oscar Benjamin
<oscar.j.benjamin at gmail.com> wrote:

> On 23 January 2014 21:51, Chris Barker <chris.barker at noaa.gov> wrote:
> >
> > However, I would prefer latin-1 -- that way you might get garbage for the
> > non-ascii parts, but it wouldn't raise an exception and it round-trips
> > through encoding/decoding. And you would have a somewhat more useful subset
> > -- including the latin-language characters and symbols like the degree
> > symbol, etc.
>
> Exceptions and error messages are a good thing! Garbage is not!!!  :)
>

In principle, I agree with you, but sometimes practicality beats purity.

In py2 there is a lot of implicit encoding/decoding going on, using the
system encoding. That is ascii on a lot of systems. The result is that
there is a lot of code out there that folks have ported to use unicode, but
missed a few corners. If that code is only tested with ascii, it all seems
to be working, but then out in the wild someone puts another character in
there and presto -- a crash.
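
A rough sketch of that failure mode, written for Python 3, where the decode
has to be spelled out explicitly (in py2 the equivalent step happened
implicitly). The helper name decode_ascii and the sample byte strings are
made up for illustration:

```python
# Sketch only: an explicit ascii decode standing in for the implicit
# decode that py2 performed with the system encoding.
def decode_ascii(raw):
    """Decode bytes as ASCII; return (text, error), one of which is None."""
    try:
        return raw.decode("ascii"), None
    except UnicodeDecodeError as exc:
        return None, exc


# Hypothetical inputs: what the tests covered vs. what shows up in the wild.
tested_input = b"temperature: 20 deg"   # pure ASCII, decodes fine
wild_input = b"temperature: 20\xb0"     # 0xB0 is the degree sign in latin-1

ok_text, ok_err = decode_ascii(tested_input)
bad_text, bad_err = decode_ascii(wild_input)

print(ok_text, ok_err)
print(bad_text, type(bad_err).__name__)
```

The ASCII-only input passes, so a test suite built from such inputs never
sees the UnicodeDecodeError that the second input triggers.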

Also, there are places where the inability to encode silently swallows a
message -- for instance, if an Exception is raised with a unicode message,
it will get silently dropped when it comes time to display on the terminal.
I spent quite a while banging my head against that one recently when I tried
to update some code to read unicode files. I would have been MUCH happier
with a bit of garbage in the message than having it dropped (or having it
raise an encoding error in the middle of the error...)

I think this is a bad thing.

The advantage of latin-1 is that while you might get something that
doesn't print right, it won't crash, and it won't contaminate the data, so
comparisons, etc., will still work. Kind of like using utf-8 in an old-style
C char array -- you can still pass it around and compare it, even if the
bytes don't mean what you think they do.
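
A minimal sketch of that round-trip property, assuming Python 3; the
variable names and the sample string are only for illustration:

```python
# latin-1 maps each of the 256 byte values to the code point with the
# same number, so decoding arbitrary bytes never raises.
raw = bytes(range(256))
text = raw.decode("latin-1")
round_tripped = text.encode("latin-1")   # gives back the original bytes

# UTF-8 data decoded as latin-1 comes out garbled, but nothing is lost.
utf8_bytes = "20\u00b0C".encode("utf-8")   # "20", degree sign, "C" as UTF-8
garbled = utf8_bytes.decode("latin-1")     # two characters where one was meant
restored = garbled.encode("latin-1")       # the original UTF-8 bytes again

print(round_tripped == raw)
print(repr(garbled))
print(restored == utf8_bytes)
```

The garbled string does not display as intended, but it compares equal to
any other copy of the same bytes decoded the same way, and encoding it back
to latin-1 recovers the data untouched.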

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov