[SciPy-user] converting 01-24h logger data for timeseries.date_array

Mon Jun 16 18:16:35 EDT 2008

Hello,
I have an array with date and time and data values.
I would like to read it into a timeseries. Therefore I would like to 
read the time information in the data file into a list of dates in order 
to create a timeseries.date_array.

Unfortunately, many logging devices write their timestamps in a 1-24 hour 
format (see A below).

How can I reformat this into a format (B below) that is accepted by 
datetime.datetime?

Example:

* How to read in such an array:

### A: original data ###

DATE; VALUE (tab separated)
03.08.99	23:50:00	10.
03.08.99	24:00:00	11.
04.08.99	00:10:00	10.5

### B: needed data for datetime ###

DATE; VALUE (tab separated)
03.08.99	23:50:00	10.
04.08.99	00:00:00	11.
04.08.99	00:10:00	10.5

###

### Code I use to read the data:
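import datetime
import numpy
# ts is the timeseries package; input_file and nodata_string_input are defined elsewhere
data_in = numpy.loadtxt(input_file, dtype=numpy.str_, skiprows=1)
dates_list = ["%s %s" % (d[0], d[1]) for d in data_in]
# %y matches the two-digit years shown in the sample data above
dates_dt = [datetime.datetime.strptime(d, "%d.%m.%y %H:%M:%S") for d in dates_list]
date_arr = ts.date_array(dates_dt, freq='minute')
# column 2 holds the values in the three-column sample (date, time, value)
series_in = ts.time_series(data_in[:, 2].astype(numpy.float_), date_arr,
                           mask=(data_in[:, 2] == nodata_string_input), freq='minute')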

=> Now, if I read in the raw data A, the row
"03.08.99	24:00:00	11."
is placed before all other values of the day 03.08.99 because it gets 
parsed as hour 0 of that day.
In reality it is hour 0 of day 04.08.99. Therefore I have to reformat the 
raw data into the form shown in B.
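
To make the goal concrete, here is a minimal sketch of the kind of 
conversion I have in mind. It assumes two-digit years as in the sample 
and parses the time by hand instead of with %H, so that hour 24 can roll 
over into the next day. The name parse_logger_timestamp is only a 
hypothetical helper, not part of any library.

```python
import datetime


def parse_logger_timestamp(date_str, time_str):
    """Parse 'DD.MM.YY' plus 'HH:MM:SS', where the hour may be 24."""
    # Parse only the date; the time part is handled separately below.
    day = datetime.datetime.strptime(date_str, "%d.%m.%y")
    hours, minutes, seconds = (int(part) for part in time_str.split(":"))
    # timedelta accepts hours=24, so 24:00:00 becomes 00:00:00 of the next day.
    return day + datetime.timedelta(hours=hours, minutes=minutes, seconds=seconds)


# The three timestamps from sample A above.
rows = [
    ("03.08.99", "23:50:00"),
    ("03.08.99", "24:00:00"),
    ("04.08.99", "00:10:00"),
]
dates_dt = [parse_logger_timestamp(d, t) for d, t in rows]
```

A list built this way could then be passed on in place of the dates_dt 
list from my reading code.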
My current workaround is to open the file in a spreadsheet application 
and save it as ASCII again. But I would prefer a Python-only solution 
because there are data sets that do not even fit in a spreadsheet due to 
their length.
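
What I imagine as a replacement for the spreadsheet step is something 
that rewrites the file line by line, so the whole data set never has to 
be held in memory. The following is only a sketch under assumptions: the 
fields are tab separated, the years have two digits, and convert_lines 
is a hypothetical name. An in-memory sample stands in for the real file.

```python
import datetime
import io


def convert_lines(lines):
    """Yield lines with 24:xx:xx timestamps moved to 00:xx:xx of the next day."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Lines without a time column (such as the header) pass through unchanged.
        if len(fields) >= 2 and fields[1].startswith("24:"):
            day = datetime.datetime.strptime(fields[0], "%d.%m.%y")
            next_day = day + datetime.timedelta(days=1)
            fields[0] = next_day.strftime("%d.%m.%y")
            fields[1] = "00:" + fields[1][3:]
        yield "\t".join(fields) + "\n"


# Sample A from above, used here instead of a real input file.
sample = io.StringIO(
    "DATE; VALUE (tab separated)\n"
    "03.08.99\t23:50:00\t10.\n"
    "03.08.99\t24:00:00\t11.\n"
    "04.08.99\t00:10:00\t10.5\n"
)
converted = list(convert_lines(sample))
```

With real data, the same generator could be fed an open input file and 
its output written with writelines() to a second file, so that only one 
line is processed at a time.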

So far, I could not find a way to do this efficiently. I would really 
appreciate it if someone could point me in the right direction on how to 
achieve this.

Kind regards,
Timmie