[Numpy-discussion] Trim a numpy array in numpy.

Derek Homeier derek at astro.physik.uni-goettingen.de
Tue Aug 16 17:39:26 EDT 2011


Hi Hongchun,

On 16 Aug 2011, at 23:19, Hongchun Jin wrote:

> I have a question regarding how to trim a string array in numpy. 
> 
> >>> import numpy as np
> >>> x = np.array(['aaa.hdf', 'bbb.hdf', 'ccc.hdf', 'ddd.hdf'])
> 
> I would like to trim a certain part of each element in the array, for example '.hdf', giving me ['aaa', 'bbb', 'ccc', 'ddd']. Of course, I can do this with a loop. However, my actual dataset has more than one million elements in such an array, so I am wondering whether there is a faster and better way to do it, like the STRMID function in IDL. I tried to google it, but could not find any discussion about it. Thanks.
> 
For a case like the one above, if all your strings really have the same length and you want to truncate them to a fixed length, you could simply do

x.astype('|S3')
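
As a minimal sketch of that truncation, assuming Python 3 (where the array above holds unicode strings and '|S3' would give back bytes), casting to the unicode dtype 'U3' keeps the result as ordinary strings. The name stems is only for illustration:

```python
import numpy as np

x = np.array(['aaa.hdf', 'bbb.hdf', 'ccc.hdf', 'ddd.hdf'])

# Casting to a shorter fixed-width string dtype truncates every element
# to its first 3 characters, without an explicit Python loop.
stems = x.astype('U3')

print(stems)        # ['aaa' 'bbb' 'ccc' 'ddd']
print(stems.dtype)  # <U3
```

This only works because every name has the same three-character stem; anything longer would be cut off silently.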

For more complex cases, like trimming regex patterns, I can't think of a numpy solution right now; coding the loop in Cython might be a better bet there...
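
Short of Cython, two plain-Python options may be worth trying first; this is only a sketch, and the sample names and the pattern are made up for illustration. np.char.replace removes a literal substring element-wise (every occurrence, not just a trailing one), and a compiled regex in a list comprehension handles real patterns. Whether either beats a simple loop on a million elements likely depends on the NumPy version, so it is worth timing on your data.

```python
import re
import numpy as np

# Hypothetical file names with varying stem lengths and suffixes.
x = np.array(['aaa.hdf', 'b.hdf', 'cccc.h5', 'ddd.hdf'])

# Literal substring: removed wherever it occurs in each element.
no_hdf = np.char.replace(x, '.hdf', '')

# Regex: compile once, then strip a trailing '.hdf' or '.h5' per element.
suffix = re.compile(r'\.(hdf|h5)$')
stems = np.array([suffix.sub('', name) for name in x])

print(no_hdf.tolist())  # ['aaa', 'b', 'cccc.h5', 'ddd']
print(stems.tolist())   # ['aaa', 'b', 'cccc', 'ddd']
```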

Cheers,
					Derek



