[Numpy-discussion] string arrays - accessing data from C++

Christopher Barker Chris.Barker at noaa.gov
Fri Sep 18 13:08:24 EDT 2009


Jaroslav Hajek wrote:

> Does PyArrayObject::data point to a single contiguous char[] buffer,
> like with the old Numeric char arrays, with
> PyArrayObject::descr->elsize being the maximum length?

yes.

> string lengths determined 

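c-style null padding: each element is padded with null bytes out to elsize, so a string that fills all elsize bytes has no terminating null. From C++, bound the read by elsize (strnlen-style) rather than relying on strlen.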

> Finally, is there any way to create an array in NumPy (from within the
> interpreter) that would have type == PyArray_CHAR?

I think this will get you what you want:

a = np.empty((3,4), dtype=np.character)
or
a = np.empty((3,4), dtype='c')
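
A small sketch of how dtype='c' differs from an ordinary string dtype, in case it helps. This assumes a reasonably recent NumPy on Python 3; the variable names are only for illustration, and the printed values may be displayed slightly differently across versions:

```python
import numpy as np

# dtype='c' gives one-byte elements (a string dtype of length 1).
a = np.empty((3, 4), dtype='c')
print(a.dtype.char, a.itemsize, a.shape)

# With dtype='c', a string is split into single characters.
chars = np.array('abc', dtype='c')
print(chars.shape, chars.tolist())

# With a plain string dtype, the whole string is one fixed-width element.
whole = np.array(b'abc')
print(whole.shape, whole.itemsize)
```

So a 'c' array has one character per element, while an 'S' array has one whole (padded) string per element.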

You can learn a lot by experimenting at the command line (even better, 
ipython):

In [27]: a = np.array(('this', 'that','a longer string','s'))

In [28]: a
Out[28]:
array(['this', 'that', 'a longer string', 's'],
       dtype='|S15')


you can see that it is a dtype of '|S15', so each element can be up to 
15 bytes.

# which you can also find this way:

In [30]: a.itemsize
Out[30]: 15

and, for a contiguous block, like this:

In [31]: a.strides
Out[31]: (15,)

# now to look at the bytes themselves:

In [37]: b = a.view(dtype=np.uint8).reshape((4,-1))

In [38]: b
Out[38]:
array([[116, 104, 105, 115,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0],
        [116, 104,  97, 116,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0],
        [ 97,  32, 108, 111, 110, 103, 101, 114,  32, 115, 116, 114, 105,
         110, 103],
        [115,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0]], dtype=uint8)


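so you can see that the shorter strings are null-padded out to 15 bytes. Note that the 15-character string fills its slot completely and has no trailing null.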
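
The byte layout above suggests how code on the C++ side could walk the buffer. Here is a rough sketch of that logic in plain Python (no NumPy needed). read_fixed_width is a hypothetical helper, not a NumPy function; buf stands in for PyArrayObject::data and elsize for descr->elsize, assuming a contiguous 1-d array:

```python
# Hypothetical helper, not part of NumPy: mimics what C++ code might do
# with a contiguous buffer of fixed-width, null-padded records.
def read_fixed_width(buf, elsize):
    """Split a flat buffer of fixed-width records into byte strings."""
    if elsize <= 0 or len(buf) % elsize:
        raise ValueError("buffer length must be a multiple of elsize")
    out = []
    for offset in range(0, len(buf), elsize):
        record = buf[offset:offset + elsize]
        end = record.find(b"\0")  # first null byte, or -1 if the record is full
        out.append(record if end == -1 else record[:end])
    return out

# Rebuild the same layout as the session above: four records of 15 bytes.
words = [b"this", b"that", b"a longer string", b"s"]
elsize = max(len(w) for w in words)
buf = b"".join(w.ljust(elsize, b"\0") for w in words)

print(read_fixed_width(buf, elsize))
```

The point is that the length of each string is bounded by elsize: a record with no null byte is simply elsize bytes long. This sketch stops at the first null, the way strnlen would. NumPy itself strips only trailing nulls, so the two differ for strings with embedded nulls.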

I find it very cool that you can get at virtually all the c-level info 
for an array from python.

HTH,

-Chris





-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


