[Numpy-discussion] dtype from |S10 to object in array?

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Sep 22 08:14:34 EDT 2010


On Wed, Sep 22, 2010 at 2:04 AM, keekychen.shared
<keekychen.shared at gmail.com> wrote:
>  Dear All,
>
> Please see the code below:
>
> import scipy
> import numpy as np
>
> x = np.zeros((2,),dtype=('i4,f4,a10'))
> x[:] = [(1,2.,'Hello'),(2,3.,"World")]
>
> y = x['f2']
> #array(['Hello', 'World'],
> #      dtype='|S10')
>
> x['f2'] = y
> x
> #array([(1, 2.0, 'Hello'), (2, 3.0, 'World')],
> #      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '|S10')])
>
> y = y.astype('object')
> y
> #array([Hello, World], dtype=object)
>
>
> x['f2'] = y
> x
> #array([(1, 2.0, 'HellWorld'), (2, 3.0, '\x00\x00\x00\x00\x00\x00\x18')],
> #      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '|S10')])
>
> ## Here is the problem: the dtype of the f2 column has not changed, and the
> ## data is not what I wanted.
> ----------------------------------------------------------------------------
>
> Here is why I need this:
> suppose I have a data source (CSV, an SQL database, or whatever) that
> looks like this:
>
> 1, 2.0, 'Hello'
> 2, 3.0, 'World'
> 3, 2.0, 'other string'
>
> I want to read these rows into a numpy array and process its columns.
> Processing the float and int columns is no problem, but the string
> column is.
> After reading the manual I found that the object dtype can store
> variable-length strings, so I want to extract the string column into a
> new array, process it, store it back into the numpy "matrix", and then
> write that back to the data source.
>
> May I know how I can do that? I do not care about performance for now.


I *guess* that you can do this by changing only the dtype of the string
column to object:

dt = np.dtype([('f0', '<i4'), ('f1', '<f4'), ('f2', object)])
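
If the data already sits in an array with the '|S10' field, one way to
move it into such a dtype is to allocate a new array and copy it over
field by field. This is only a sketch, assuming Python 3 (where the S10
field yields bytes that need decoding); the name x_obj is made up for
illustration.

```python
import numpy as np

# Original fixed-width layout from the question.
x = np.zeros((2,), dtype='i4,f4,S10')
x[:] = [(1, 2.0, b'Hello'), (2, 3.0, b'World')]

# Target layout: same fields, but 'f2' holds arbitrary Python objects.
dt = np.dtype([('f0', '<i4'), ('f1', '<f4'), ('f2', object)])

# x_obj is a hypothetical name for the converted copy.
x_obj = np.zeros(x.shape, dtype=dt)
for name in x.dtype.names:
    if name == 'f2':
        # Decode the fixed-width bytes into ordinary Python strings.
        x_obj[name] = [s.decode('ascii') for s in x[name].tolist()]
    else:
        x_obj[name] = x[name]

print(x_obj)
```

After the copy, the 'f2' field is no longer limited to 10 bytes per entry.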


>>> x = np.zeros((2,),dtype=('i4,f4,O'))
>>> x[:] = [(1,2.,'Hello'),(2,3.,"World")]
>>> x
array([(1, 2.0, 'Hello'), (2, 3.0, 'World')],
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '|O4')])
>>> x['f2'] = 'hellooooooooooooooooo'
>>> x
array([(1, 2.0, 'hellooooooooooooooooo'), (2, 3.0, 'hellooooooooooooooooo')],
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '|O4')])
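
With the object field in place, the round trip you describe (extract the
string column, process it, store it back) could look roughly like the
following. It is a sketch under Python 3, and the processing step
(upper-casing and appending '!') is just a stand-in for whatever you
actually need to do.

```python
import numpy as np

# Structured dtype whose third field holds Python objects
# instead of fixed-width 10-byte strings.
dt = np.dtype([('f0', '<i4'), ('f1', '<f4'), ('f2', object)])

x = np.zeros((3,), dtype=dt)
x[:] = [(1, 2.0, 'Hello'), (2, 3.0, 'World'), (3, 2.0, 'other string')]

# Pull the string column out as an independent object array.
y = x['f2'].copy()

# Placeholder processing: any per-string operation would do here.
processed = np.array([s.upper() + '!' for s in y], dtype=object)

# Store the processed column back; no truncation, since the field is object.
x['f2'] = processed

# Plain Python tuples, ready to be written back to a CSV file or database.
rows = x.tolist()
print(rows)
```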


Josef

>
>
> Thanks for any hints
>
> Rgs,
>
> KC


