[SciPy-User] How to specify the dtype in IO
Skipper Seabold
jsseabold at gmail.com
Wed Apr 21 17:15:06 EDT 2010
On Wed, Apr 21, 2010 at 2:32 PM, Leon Sit <wing1127aishi at gmail.com> wrote:
> Hi All
>
> I have a csv file with columns of the following dtype
>
> int8, int8, int8, float, float, ............
>
> and the number of columns is arbitrary. Is there a way to specify the
> dtype in genfromtxt() such that it knows the first three columns are
> int and the rest are float?
>
I don't think that genfromtxt can handle rows with differing numbers of
columns. I am assuming that you don't have missing values but rather
rows with different numbers of delimited fields. If I had to do
something like this and had to use genfromtxt, I guess I would iterate
through the file twice: once to get the maximum number of columns, and
once to insert delimiters for the missing data.
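A minimal sketch of that two-pass idea, using only the standard library. The helper name pad_rows and the sample rows are made up for illustration, and it assumes a plain comma-delimited file with no quoted fields that contain the delimiter:

```python
def pad_rows(lines, delimiter=","):
    """Pad every row with trailing delimiters so all rows have the same number of fields."""
    rows = [line.rstrip("\n") for line in lines]
    # First pass: find the number of fields in the widest row.
    width = max(row.count(delimiter) + 1 for row in rows)
    # Second pass: append empty fields to the shorter rows.
    return [row + delimiter * (width - (row.count(delimiter) + 1)) for row in rows]


# Hypothetical ragged input: three leading integers, then a varying number of floats.
ragged = ["1,2,3,0.5\n", "4,5,6,1.5,2.5,3.5\n", "7,8,9\n"]
padded = pad_rows(ragged)
```

The padded rows could then be handed to genfromtxt, which treats the empty fields as missing values.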
I usually know my number of columns n (which might vary across
datasets/files but not across rows). Then I do something like
dt = ['int']*3 + ['float']*(n-3)
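For example, a rough sketch of how such a dtype could be passed to genfromtxt. The sample data is invented, and I've swapped in the type codes "i1" (int8, as in the original question) and "f8", joined into a comma-separated string, which is one form of dtype that genfromtxt accepts. It assumes every row has the same number of columns:

```python
import io

import numpy as np

# Hypothetical sample data: three integer columns followed by float columns.
sample = "1,2,3,0.5,1.5\n4,5,6,2.5,3.5\n"

# Count the columns from the first line (assumes all rows have the same length).
n = len(sample.splitlines()[0].split(","))

# Build a comma-separated dtype string: int8 for the first three columns, float64 for the rest.
dt = ",".join(["i1"] * 3 + ["f8"] * (n - 3))

# The result is a 1-D structured array with default field names f0, f1, ...
data = np.genfromtxt(io.StringIO(sample), delimiter=",", dtype=dt)
```

Columns are then available by field name, e.g. data['f0'] for the first integer column.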
You might be better off not using genfromtxt in this case(?) unless
someone has a better option than iterating through the file twice...
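If you do skip genfromtxt, one possible alternative (just a sketch, not a tested recipe for your file) is to parse the rows by hand with the csv module. The function name parse_ragged and the sample data are hypothetical, and it assumes every row has at least the three integer fields and no missing values:

```python
import csv
import io


def parse_ragged(stream, n_int=3):
    """Return one (ints, floats) pair per row: the first n_int fields as int, the rest as float."""
    parsed = []
    for fields in csv.reader(stream):
        if not fields:
            continue  # skip blank lines
        ints = [int(f) for f in fields[:n_int]]
        floats = [float(f) for f in fields[n_int:]]
        parsed.append((ints, floats))
    return parsed


# Hypothetical ragged input with a different number of floats per row.
sample = io.StringIO("1,2,3,0.5\n4,5,6,1.5,2.5,3.5\n")
rows = parse_ragged(sample)
```

This reads the file only once, at the cost of getting Python lists back rather than a single array.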
Skipper