[Numpy-discussion] Problem with importing csv into datetime64
Grové
grove.steyn at gmail.com
Wed Sep 28 12:15:44 EDT 2011
Hi,
I am trying out the latest version of numpy 2.0 dev:
np.__version__
Out[44]: '2.0.0.dev-aded70c'
I am trying to import CSV data that looks like this:
date,system,pumping,rgt,agt,sps,eskom_import,temperature,wind,pressure,weather
2007-01-01 00:30,481.9,,,,,481.9,15,SW,1040,Fine
2007-01-01 01:00,471.9,,,,,471.9,15,SW,1040,Fine
2007-01-01 01:30,455.9,,,,,455.9,,,,
etc.
by using the following code:
# Converters: column 0 becomes a minute-resolution datetime64; empty numeric
# fields become 0.
convertdict = {0: lambda s: np.datetime64(s, 'm'),
               1: lambda s: float(s or 0), 2: lambda s: float(s or 0),
               3: lambda s: float(s or 0), 4: lambda s: float(s or 0),
               5: lambda s: float(s or 0), 6: lambda s: float(s or 0),
               7: lambda s: float(s or 0), 8: str, 9: str, 10: str}
dt = [('date', np.datetime64), ('system', float), ('pumping', float),
      ('rgt', float), ('agt', float), ('sps', float),
      ('eskom_import', float), ('temperature', float), ('wind', str),
      ('pressure', float), ('weather', str)]
# Read all 11 columns (0 through 10).
a = np.recfromcsv(fp, dtype=dt, converters=convertdict, usecols=range(11),
                  names=True)
The dtype it generates for a.date is 'object':
array([2007-01-01T00:30+0200, 2007-01-01T01:00+0200, 2007-01-01T01:30+0200,
..., 2007-12-31T23:00+0200, 2007-12-31T23:30+0200,
2008-01-01T00:00+0200], dtype=object)
But I need it to be datetime64, like in this example (but including hours and
minutes):
array(['2011-07-11', '2011-07-12', '2011-07-13', '2011-07-14',
'2011-07-15', '2011-07-16', '2011-07-17'], dtype='datetime64[D]')
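For illustration, here is a minimal sketch of the kind of array being asked for. It assumes a recent NumPy release in which a list of ISO 8601 strings can be passed to np.array with dtype='datetime64[m]'. The names stamps and dates are made up for this example, and the timestamps are the first three from the CSV sample above.

```python
import numpy as np

# Timestamps copied from the CSV sample; the space between date and time
# is replaced by the ISO 8601 "T" separator before parsing.
stamps = ["2007-01-01 00:30", "2007-01-01 01:00", "2007-01-01 01:30"]
dates = np.array([s.replace(" ", "T") for s in stamps], dtype="datetime64[m]")

print(dates.dtype)           # the dtype carries the minute unit
print(dates[1] - dates[0])   # subtracting two entries gives a timedelta64
```

Giving the unit explicitly ('m' for minutes) is what keeps the hours and minutes; a day unit ('D') would drop them.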
It seems that the CSV import creates an object dtype for 'date' rather than a
datetime64 dtype. Any ideas on how to fix this?
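One possible workaround is sketched below. It is not a confirmed fix for recfromcsv itself: it reads the file with the standard-library csv module and fills a structured array whose date field has an explicit unit ('datetime64[m]'). The three-column csv_text is a cut-down, hypothetical subset of the sample above, and the names rows, dt and rec are chosen only for this example.

```python
import csv
import io

import numpy as np

# A cut-down, three-column version of the CSV sample (hypothetical subset).
csv_text = """date,system,temperature
2007-01-01 00:30,481.9,15
2007-01-01 01:00,471.9,15
2007-01-01 01:30,455.9,
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Structured dtype with an explicit unit on the datetime64 field.
dt = np.dtype([("date", "datetime64[m]"),
               ("system", float),
               ("temperature", float)])
rec = np.zeros(len(rows), dtype=dt)

# Fill field by field; empty numeric cells become 0, as in the converters above.
rec["date"] = np.array([r["date"].replace(" ", "T") for r in rows],
                       dtype="datetime64[m]")
rec["system"] = [float(r["system"] or 0) for r in rows]
rec["temperature"] = [float(r["temperature"] or 0) for r in rows]

print(rec["date"].dtype)
```

Filling the array field by field avoids relying on how the CSV reader infers or casts the date column, at the cost of holding the parsed rows in memory first.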
Grové