[SciPy-User] How to specify the dtype in IO

Wed Apr 21 16:20:03 EDT 2010

Leon Sit wrote:
> Hi All
>
> I have a csv file with columns of the following dtype
>
> int8, int8, int8, float, float, ............
>
> and the column size is arbitrary. Is there a way to specify the dtype
> in genfromtxt() such that it knows the first three columns are int and
> the rest are float?
>
>   

There may be a simpler method, but it looks like a combination of
'dtype=None' and appropriate values in 'converters' would work.

For example, suppose this is "data.txt":
-----
index,category,ratio
1, 10, 1.234
2, 21, 4.553
3, 17, 5.113
4, 22, 2.220
-----

To create a structured array with dtypes int8, int32 and float:

In [24]: a = np.genfromtxt('data.txt', delimiter=',', names=True,
   ....:                   dtype=None, converters={0: lambda s: np.int8(s)})

In [25]: a.dtype
Out[25]: dtype([('index', '|i1'), ('category', '<i4'), ('ratio', '<f8')])

In [26]: for row in a:
   ....:     print row
   ....:    
   ....:    
(1, 10, 1.234)
(2, 21, 4.5529999999999999)
(3, 17, 5.1130000000000004)
(4, 22, 2.2200000000000002)

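For your data, you could set dtype=None and provide int8 converters like
the one above for columns 0, 1, and 2; genfromtxt then infers the types
of the remaining columns from their contents.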
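
If the number of columns is not known in advance, an alternative to the
converters approach is to count the columns first and pass an explicit
dtype. The following is only a sketch under some assumptions: the file
has no header row, the first line is a data line, and the delimiter is a
comma. The function name load_int8_then_float and the field names f0,
f1, ... are made up for illustration.

```python
import io

import numpy as np


def load_int8_then_float(source, n_int=3, delimiter=","):
    """Read delimited text: first n_int columns as int8, the rest as float64.

    `source` is an open text file or any file-like object with .read().
    Assumes there is no header row and that the first line is a data line.
    """
    text = source.read()
    # Count the columns from the first line of the file.
    n_cols = len(text.splitlines()[0].split(delimiter))
    # Build one (name, type) pair per column; names f0, f1, ... are arbitrary.
    dtype = [("f%d" % i, np.int8 if i < n_int else np.float64)
             for i in range(n_cols)]
    return np.genfromtxt(io.StringIO(text), delimiter=delimiter, dtype=dtype)


# Made-up sample with three integer columns followed by two float columns.
sample = io.StringIO("1,10,3,1.234,0.5\n2,21,4,4.553,1.5\n")
a = load_int8_then_float(sample)
print(a.dtype)
```

For a real file you would pass open('data.csv') instead of the StringIO
object. This reads the whole file into memory once to count the columns,
which is fine for small files but may not be what you want for large ones.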

Warren

> Thanks