[SciPy-User] How to specify the dtype in IO

Skipper Seabold jsseabold at gmail.com
Wed Apr 21 17:15:06 EDT 2010


On Wed, Apr 21, 2010 at 2:32 PM, Leon Sit <wing1127aishi at gmail.com> wrote:
> Hi All
>
> I have a csv file with columns of the following dtype
>
> int8, int8, int8, float, float, ............
>
> and the number of columns is arbitrary. Is there a way to specify the
> dtype in genfromtxt() such that it knows the first three columns are
> int and the rest are float?
>

I don't think that genfromtxt can handle a number of columns that
varies from row to row.  I am assuming that you don't have missing
values but rather rows with differing numbers of (delimited) columns.
If I had to do something like this and had to use genfromtxt, I guess
I would iterate through the file twice: once to get the max number of
columns, and once to insert delimiters for the missing data.
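
A rough sketch of that two-pass idea, in case it helps. The helper name
pad_rows and the sample rows are made up for illustration, and the
"i8"/"f8" type codes are just one way to spell int and float. The padded
fields come back as missing values, which genfromtxt fills with nan for
float columns by default.

```python
import numpy as np

def pad_rows(lines, delimiter=","):
    """Pad every row with trailing delimiters up to the widest row.

    Hypothetical helper, not part of NumPy.
    """
    # First pass: drop blank lines and count the fields in each row.
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    widths = [row.count(delimiter) + 1 for row in rows]
    n = max(widths)
    # Second pass: append one delimiter per missing field.
    padded = [row + delimiter * (n - w) for row, w in zip(rows, widths)]
    return padded, n

# Made-up ragged data: three int columns, then a varying number of floats.
ragged = ["1,2,3,0.5,1.5", "4,5,6,2.5", "7,8,9,3.5,4.5,5.5"]

padded, n = pad_rows(ragged)
# Three integer fields followed by (n - 3) float fields.
dt = ",".join(["i8"] * 3 + ["f8"] * (n - 3))
# genfromtxt accepts a list of strings as well as a file name.
data = np.genfromtxt(padded, delimiter=",", dtype=dt)
```

With a real file you would pass the open file object (or its lines) to
pad_rows instead of the list above.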

I usually know my number of columns n (which might vary across
datasets/files but not across rows).  Then I do something like

dt = ['int']*3 + ['float']*(n-3)
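
For the case where n is not known ahead of time but is the same for
every row, here is a minimal sketch that reads n off the first line and
then builds the dtype the same way. The function name load_mixed and the
sample data are hypothetical, and I use a comma-separated dtype string
("i8" for int, "f8" for float) rather than a list; treat it as one
possible approach, not the only one.

```python
import io
import numpy as np

# Made-up sample standing in for the CSV file: three integer columns
# followed by an arbitrary number of float columns.
sample = "1,2,3,0.5,1.5\n4,5,6,2.5,3.5\n"

def load_mixed(source, n_int=3, delimiter=","):
    """Read a delimited file whose first n_int columns are integers.

    Hypothetical helper; `source` must be a seekable text file object.
    """
    # Peek at the first row only to count the columns, then rewind.
    first = source.readline()
    n = len(first.split(delimiter))
    source.seek(0)
    # n_int integer fields followed by (n - n_int) float fields.
    dt = ",".join(["i8"] * n_int + ["f8"] * (n - n_int))
    return np.genfromtxt(source, delimiter=delimiter, dtype=dt)

data = load_mixed(io.StringIO(sample))
```

The result is a structured array with one record per row; the fields get
the default names f0, f1, ... so the integer columns are data["f0"]
through data["f2"].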

You might be better off not using genfromtxt in this case(?) unless
someone has a better option than iterating through the file twice...

Skipper


