[Numpy-discussion] Bytes vs. Unicode in Python3
Francesc Alted
faltet at pytables.org
Fri Nov 27 11:04:57 EST 2009
On Friday 27 November 2009 16:41:04 Pauli Virtanen wrote:
> > > I think so. However, I think S is probably closest to bytes... and
> > > maybe S can be reused for bytes... I'm not sure though.
> >
> > That could be a good idea because that would ensure compatibility with
> > existing NumPy scripts (i.e. old 'string' dtypes are mapped to 'bytes',
> > as they should be). The only thing I don't like is that 'S' seems
> > to be the initial letter for 'string', which is actually 'unicode' in
> > Python 3 :-/ But, for the sake of compatibility, we can probably live
> > with that.
>
> Well, we can "deprecate" 'S' (i.e. never show it in repr, always only 'B'
> or 'U').
Well, deprecating 'S' seems a sensible option too. But why only avoid
showing it in repr? Why not issue a DeprecationWarning too?
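
A minimal sketch of what issuing such a warning could look like, using only
the standard warnings module. The helper normalize_typecode and the 'S' -> 'B'
mapping are hypothetical, taken from the proposal in this thread; they are not
part of any released NumPy API.

```python
import warnings

# Hypothetical mapping from the thread: the legacy 'S' letter becomes 'B'.
# This is an assumption for illustration, not NumPy's actual behaviour.
_LEGACY_TYPECODES = {"S": "B"}


def normalize_typecode(code):
    """Return the typecode with a legacy leading letter replaced.

    A DeprecationWarning is emitted whenever a legacy letter is seen.
    """
    letter, size = code[:1], code[1:]
    if letter in _LEGACY_TYPECODES:
        replacement = _LEGACY_TYPECODES[letter]
        warnings.warn(
            "typecode %r is deprecated; use %r instead" % (letter, replacement),
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this helper
        )
        return replacement + size
    return code


if __name__ == "__main__":
    # DeprecationWarning is hidden by default in many contexts, so enable it.
    warnings.simplefilter("always")
    print(normalize_typecode("S10"))
    print(normalize_typecode("U10"))
```

Old scripts keep working under this scheme, and the warning tells users
which spelling to migrate to.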
> > > Also, what will a bytes dtype mean within a py2 program context? Does
> > > it matter if the bytes dtype just fails somehow if used in a py2
> > > program?
> >
> > Mmh, I'm of the opinion that the new 'bytes' type should be available
> > only with NumPy for Python 3. Would that be possible?
>
> I don't see a problem in making a bytes_ scalar type available for
> Python2. In fact, it would be useful for making upgrading to Py3 easier.
I think introducing a bytes_ scalar type could be somewhat confusing for Python
2 users. But if the 'S' typecode is also to be deprecated in NumPy for
Python 2, then it makes perfect sense to introduce bytes_ there too.
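
To illustrate why a bytes_ name on Python 2 could ease porting, here is a
rough sketch in plain Python. The names bytes_ and to_bytes are hypothetical
stand-ins defined only for this example, not NumPy objects. The sketch relies
on the builtin bytes being an alias of str on Python 2.6+ and a distinct type
on Python 3.

```python
# Hypothetical stand-in for the proposed scalar type (not NumPy's object).
# On Python 2.6+ `bytes` is an alias of `str`; on Python 3 it is a real type.
bytes_ = bytes


def to_bytes(value, encoding="ascii"):
    """Return value as bytes_, encoding text input when necessary."""
    if isinstance(value, bytes_):
        return value  # already raw bytes, nothing to do
    return value.encode(encoding)  # text (unicode) input: encode it


if __name__ == "__main__":
    # The same source runs unchanged on Python 2.6+ and Python 3.
    print(to_bytes(b"abc") == to_bytes(u"abc"))
```

Code written against the bytes_ name behaves the same on both interpreters,
which is the upgrade path Pauli describes.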
--
Francesc Alted