[Python-ideas] What is happening with array.array('u') in Python 4?

Stefan Behnel stefan_ml at behnel.de
Fri May 8 14:50:36 CEST 2015


Jonathan Slenders schrieb am 08.05.2015 um 14:16:
> What will happen to array.array('u') in Python 4? It is deprecated right
> now.
> I remember reading about mutable strings somewhere, but I forgot, and I
> can't find the discussion.
> 
> In any case, I need to have a mutable character array, for efficient
> manipulations. (Not a byte array.)
> And I need to be able to use the "re" module to search through it.
> array.array('u') works great in Python 3.

Well, for some value of "great" and "works". The problems are that 1) 'u'
has a platform-dependent size of 16 or 32 bits and 2) it does not match the
internal representation of Unicode strings. It will thus use surrogate
pairs on some platforms and not on others, and converting between Unicode
strings and arrays requires an encoding/decoding step. Also, the "re" module
does not currently seem to support searching in Unicode arrays (anything
else would have been very surprising).
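
As a rough illustration of the encoding/decoding step (a sketch only, not
something proposed in this thread): one standard-library way to get a
fixed-width mutable character buffer is a bytearray holding UTF-32-LE data,
which is decoded back to str whenever "re" has to search it. The helper
names to_buffer, set_char and to_text are made up for this example, and the
decode on every search is exactly the conversion cost described above.

```python
import re

# Assumption for this sketch: UTF-32-LE, so every code point occupies
# exactly 4 bytes regardless of platform (no surrogate pairs).
WIDTH = 4


def to_buffer(text):
    """Encode a str into a mutable, fixed-width byte buffer (hypothetical helper)."""
    return bytearray(text.encode("utf-32-le"))


def set_char(buf, index, ch):
    """Overwrite the code point at position `index` in place (hypothetical helper)."""
    buf[index * WIDTH:(index + 1) * WIDTH] = ch.encode("utf-32-le")


def to_text(buf):
    """Decode the buffer back to str; this is the step needed before using re."""
    return buf.decode("utf-32-le")


buf = to_buffer("hello w\U0001F600rld")  # includes a non-BMP character
set_char(buf, 7, "o")                    # replace the emoji with "o"
text = to_text(buf)
match = re.search(r"w\w+", text)         # re searches the decoded str
```

Here set_char is O(1) because each character has a fixed slot, but every
search pays for a full decode of the buffer.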

ISTM that your best bet is currently to look for a suitable module on PyPI
that implements mutable character arrays. I'm sure you're not the only one
who needs something like that. The usual suspect would be NumPy, but there
may be smaller and simpler tools available.

Stefan
