[Numpy-discussion] Extent of unicode types in numpy
Gerard Vermeulen
gerard.vermeulen at grenoble.cnrs.fr
Wed Feb 8 01:30:02 EST 2006
On Wed, 08 Feb 2006 01:41:18 -0700
Travis Oliphant <oliphant.travis at ieee.org> wrote:
> >Well, probably I've overlooked something, but I really think that this
> >would be a nice thing to do.
> >
> >
> There are details in the scalar-array conversions (getitem and setitem)
> that would have to be implemented, but it is possible. The UCS4 -->
> UTF-16 encoding is one of the easiest. It's done in unicodeobject.h in
> Python, but I'm not sure it's exposed other than going through the
> interpreter.
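For reference, the UCS4 --> UTF-16 step mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the standard surrogate-pair arithmetic, not the code in unicodeobject.h; the function name ucs4_to_utf16_units is hypothetical.

```python
def ucs4_to_utf16_units(code_point):
    """Return the UTF-16 code units (as ints) for one UCS4 code point.

    Hypothetical helper for illustration only; it is not part of
    Python or numpy.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x10000:
        # Characters in the Basic Multilingual Plane fit in one unit.
        return [code_point]
    # Everything else is split into a high and a low surrogate,
    # each carrying 10 bits of the 20-bit offset.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


units = ucs4_to_utf16_units(0x1D11E)  # MUSICAL SYMBOL G CLEF
```

The result can be checked against the interpreter's own codec by comparing with chr(0x1D11E).encode("utf-16-be").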
>
> Does this seem like a solution that everyone can live with?
>
Yes.
The only point that worries me a little is that some problems are limited
by memory or memory bandwidth, and for those cases UCS2 arrays are better
than UCS4 arrays.
I have run into memory problems before, and I don't know whether that will
happen with unicode strings. Time will tell.
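To put a rough number on the memory concern, here is a small sketch. It assumes a numpy build that stores unicode arrays with 4 bytes per character (UCS4), and the 2-bytes-per-character figure is simply computed by hand for comparison; the sample data is made up.

```python
import numpy as np

# Three short strings; numpy pads every element to the longest one.
words = ["numpy", "unicode", "array"]
arr = np.array(words)

# Characters per element, assuming 4 bytes per character in storage.
chars_per_item = arr.dtype.itemsize // 4

# Actual storage used by the array.
ucs4_bytes = arr.nbytes

# What the same array would need at 2 bytes per character (UCS2).
ucs2_bytes = arr.size * chars_per_item * 2

print(arr.dtype, ucs4_bytes, ucs2_bytes)
```

So for text that stays within the Basic Multilingual Plane, a UCS2 layout would need half the memory and half the memory traffic of a UCS4 one.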
Gerard