[Numpy-discussion] `missing` argument in genfromtxt only a string?

Tim Michelsen timmichelsen at gmx-topmail.de
Tue Sep 15 16:30:35 EDT 2009


> I actually figured out a workaround with converters, since my missing
> values are " ","  ","   " ie., irregular number of spaces and the
> values aren't stripped of white spaces.  I just define {# : lambda s:
> float(s.strip() or 0)}, and I have a loop build all of the converters,
> but then I have to go through and drop the ones that are supposed to
> be strings or dates, which is still pretty tedious, since I have a
> number of datasets that are like this, but they all contain different
> data in different orders and there's no (computer) logical order to it
> that I've discovered yet.
Glad that you brought this up.

I posted a similar question recently:
http://thread.gmane.org/gmane.comp.python.numeric.general/32511
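
For reference, a minimal sketch of the converter workaround quoted above, not the original poster's actual code. The names blank_to_zero and build_converters and the five-column sample row are made up for illustration. The resulting dictionary is the kind of mapping you could pass as the converters argument of np.genfromtxt, and the skip set is where the string and date columns go so they do not have to be dropped by hand afterwards.

```python
def blank_to_zero(field):
    """Convert a field to float, treating empty or whitespace-only text as 0."""
    if isinstance(field, bytes):  # some NumPy versions hand converters bytes
        field = field.decode()
    return float(field.strip() or 0)


def build_converters(n_columns, skip=()):
    """Map every column index to blank_to_zero, except the columns in `skip`."""
    return {i: blank_to_zero for i in range(n_columns) if i not in skip}


# Hypothetical layout: column 0 is a date and column 3 is a string label.
converters = build_converters(5, skip={0, 3})

# Apply the converters to one already-split row to show the effect.
row = ["2009-09-15", " 1.5", "   ", "site A", " "]
parsed = [converters.get(i, str.strip)(value) for i, value in enumerate(row)]
```

Here parsed keeps the date and label as text and turns the blank fields into 0.0. Whether 0 is a sensible fill value depends on the data; a NaN or a masked value may be a better choice.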


>>> All of the missing values in the second observation are now -1.  Also,
>>> I'm having trouble defining a converter for my dates.
I had developed a lot of timeseries code before Pierre created the
marvelous tsfromtxt after the numpy 1.3 upgrade.
Now you no longer need the np.loadtxt => ts.time_series detour. I only have
to adapt the old code some day...
I was actually thinking of creating a converter library. When you work
with measurement logger data, hardly any of it complies with the inputs
python/numpy expect.
And they all think they do it for a reason.
As an example, many count the hours of the day from 1-24 instead of 0-23,
just to indicate that the values refer to the _end_ of the averaging
interval...
You can get around this. But it gets more difficult when the data is in
15 min. time steps:
0:00:00
0:15:00
[...]
23:45:00
24:00:00
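
A rough sketch of how such end-of-interval stamps could be handled with the standard datetime module. The function names and the "%Y-%m-%d" date format are assumptions for illustration only; real logger files will differ. The idea is to add the time as a timedelta, so that 24:00:00 rolls over to midnight of the next day, and then to shift back by one step if you want the start of the averaging interval.

```python
from datetime import datetime, timedelta


def parse_logger_stamp(date_text, time_text):
    """Parse a logger time stamp, accepting 24:00:00 as midnight of the next day."""
    hours, minutes, seconds = (int(part) for part in time_text.strip().split(":"))
    day = datetime.strptime(date_text.strip(), "%Y-%m-%d")  # assumed date format
    # Adding a timedelta lets hour 24 roll over to 00:00:00 of the following day.
    return day + timedelta(hours=hours, minutes=minutes, seconds=seconds)


def interval_start(end_stamp, step=timedelta(minutes=15)):
    """Shift an end-of-interval stamp back to the start of its averaging interval."""
    return end_stamp - step


end = parse_logger_stamp("2009-09-15", "24:00:00")
start = interval_start(end)
```

The same helper covers the 1-24 hour convention: with step=timedelta(hours=1), hour 1 maps back to 00:00 of the same day.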

So each data set is individual in this sense. That really bothers me at
times.
We should all be thankful for genfromtxt and its derivatives!



