[Numpy-discussion] Empty strings not empty?

Matthew Brett matthew.brett at gmail.com
Wed Dec 30 14:00:50 EST 2009


Hi.

> It isn't empty:
>
> In [3]: array(['\x00']).dtype
> Out[3]: dtype('|S1')
>
> In [4]: array(['\x00']).tostring()
> Out[4]: '\x00'
>
> In [5]: array(['\x00'])[0]
> Out[5]: ''

No, but my problem is that an empty string is not empty either, so
you can't distinguish between an empty string and a string made up
entirely of zero bytes:

In [11]: np.array('') == '\x00\x00\x00'
Out[11]: array(True, dtype=bool)
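
For anyone reproducing this on a current setup, here is a minimal sketch
of the same check. It assumes Python 3 and a recent NumPy, so the strings
are written as bytes literals and tobytes() stands in for the tostring()
used in this thread; the variable names are only illustrative.

```python
import numpy as np

# An "empty" bytes string still occupies one byte in the array.
empty = np.array(b'')
empty_itemsize = empty.dtype.itemsize   # storage per element, in bytes
empty_raw = empty.tobytes()             # the bytes actually held in memory

# Comparison ignores trailing NUL bytes, so these compare equal.
same_as_nuls = bool(empty == b'\x00\x00\x00')

print(empty.dtype, empty_itemsize, repr(empty_raw), same_as_nuls)
```

The point is that the stored byte and the compared value disagree: the
buffer holds a single zero byte, yet the comparison treats it the same as
any run of zero bytes.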

> Looks like a printing problem to me, something in __repr__ for the string
> array. It seems that trailing zeros are trimmed off.
>
> In [11]: array(['a\x00\x00'])
> Out[11]:
> array(['a'],
>       dtype='|S3')
>
> In [12]: array(['a\x00b'])
> Out[12]:
> array(['a\x00b'],
>       dtype='|S3')

I don't think it's a printing problem. I think the trailing zeros are
stripped both in the string comparisons and in printing, even though
they are present in memory. That is, a.tostring() is right, and the
__repr__ and comparisons are, at least to me, confusing.

In [2]: a = np.array('a\x00\x00\x00')

In [3]: a
Out[3]:
array('a',
      dtype='|S4')

In [5]: a == 'a'
Out[5]: array(True, dtype=bool)

In [7]: a == 'a\x00\x00\x00'
Out[7]: array(True, dtype=bool)
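
If the trailing zeros matter, one possible workaround is to compare the
raw buffer rather than the array element. The sketch below assumes
Python 3 and a recent NumPy (bytes literals, tobytes() rather than
tostring()); raw_equal is a hypothetical helper written for this
illustration, not something NumPy provides.

```python
import numpy as np

def raw_equal(arr, raw):
    """Hypothetical helper: compare the array's memory to raw, byte for byte."""
    return arr.tobytes() == raw

a = np.array(b'a\x00\x00\x00')

stored = a.tobytes()        # all four bytes are still in memory
element = a.item()          # element access drops the trailing NULs

# Element-wise comparison cannot tell the two strings apart...
eq_short = bool(a == b'a')
eq_padded = bool(a == b'a\x00\x00\x00')

# ...but comparing the raw buffer can.
raw_short = raw_equal(a, b'a')
raw_padded = raw_equal(a, b'a\x00\x00\x00')

print(repr(stored), repr(element), eq_short, eq_padded, raw_short, raw_padded)
```

This only works when the caller knows the full padded width of the
field, since the buffer always contains dtype.itemsize bytes per element.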

See you,

Matthew


