[Numpy-discussion] using loadtxt to load a text file in to a numpy array

Pauli Virtanen pav at iki.fi
Fri Jan 17 05:59:27 EST 2014


Julian Taylor <jtaylor.debian <at> googlemail.com> writes:
[clip]
> - inconvenience in dealing with strings in python 3.
> 
> bytes are not strings in python3 which means ascii data is either a byte
> array which can be inconvenient to deal with or 4 byte unicode which
> wastes space.
>
> A proposal to fix this would be to add a one or two byte dtype with a specific
> encoding that behaves similar to bytes but converts to string when outputting
> to python for comparisons etc.
>
> For backward compatibility we *cannot* change S. Maybe we could change
> the meaning of 'a' but it would be safer to add a new dtype, possibly
> 'S' can be deprecated in favor of 'B' when we have a specific encoding dtype.
> 
> The main issue is probably: is it worth it and who does the work?

I don't think this is a good idea: the bytes vs. unicode separation in
Python 3 exists for a good reason. If unicode is not needed, why not just
use the bytes data type throughout the program?

(Also, assuming that ASCII is in general good for text-format data is
quite US-centric.)
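
As a rough illustration of the trade-off under discussion (a minimal
sketch, not anything from the proposal itself; the sample values and
variable names are made up for the example), keeping data as bytes and
decoding only where text is actually needed could look like this:

```python
import numpy as np

# Hypothetical sample data: the same two words stored as bytes and as unicode.
raw = np.array([b"alpha", b"beta"], dtype="S5")   # 1 byte per character
text = np.array(["alpha", "beta"], dtype="U5")    # 4 bytes per character

bytes_itemsize = raw.dtype.itemsize     # storage per element for 'S5'
unicode_itemsize = text.dtype.itemsize  # storage per element for 'U5'

# Staying in bytes throughout: compare against bytes literals, not str.
mask = raw == b"beta"

# Decode explicitly only at the point where real text is required.
decoded = np.char.decode(raw, "ascii")

print(bytes_itemsize, unicode_itemsize, mask, decoded)
```

The explicit decode step is where the encoding gets named, which is the
point of the bytes/unicode separation.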

Christopher Barker wrote:
>
> How do you spell the dtype that 'S' gives you?
>

'S' is bytes.

dtype='S', dtype=bytes, and dtype=np.bytes_ are all equivalent.
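
A quick way to check the equivalence, as a small sketch (the array
contents and variable names are arbitrary examples):

```python
import numpy as np

# Three spellings of the same byte-string dtype.
dt_char = np.dtype("S")
dt_builtin = np.dtype(bytes)
dt_numpy = np.dtype(np.bytes_)
all_equal = dt_char == dt_builtin == dt_numpy

# When creating an array, the item size is inferred from the longest element.
arr = np.array([b"abc", b"de"], dtype=bytes)

print(all_equal, arr.dtype, type(arr[0]))
```

Elements taken out of such an array are np.bytes_ scalars, which are a
subclass of the built-in bytes type.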

-- 
Pauli Virtanen



