[SciPy-Dev] np.savetxt: apply patch in enhancement ticket 1079 to add headers?

Stefan stefan.czesla at hs.uni-hamburg.de
Wed Jun 2 07:21:06 EDT 2010


> > If the header is given as a plain string (such as envisaged in
> > ticket 1079), the user has to take care of the correct formatting;
> > in particular, the user has to supply the comment character(s) and
> > the newline formatting. This might be counterintuitive, because many
> > users will at first try to supply their header(s) without specifying
> > those formatting characters. The result will be a file not readable
> > with numpy.loadtxt, and the error might not be detected right away.
> 
> I'm not sure I understand why I would want to specify a comment
> character for writing a csv file (unless of course I had some comments
> to add).

We are possibly talking about different things. In our approach to using
numpy.savetxt, comments (preceding the actual data) and a header are
essentially the same, as in the following example. Basically, we want to
add some lines of additional information at the top of the file written
with numpy.savetxt, and still be able to recover the data with
numpy.loadtxt (which would then ignore the 'header'; that may not be
your intention, or is it?).

#Now comes the data
#column1 [kg] column2 [apple]
1  2
3  5
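
A minimal sketch of that round trip, assuming the header is written by
hand before numpy.savetxt is called on the same file handle. io.StringIO
only stands in for a real file here; this is an illustration, not the
patch from ticket 1079.

```python
import io

import numpy as np

data = np.array([[1, 2], [3, 5]])

# The header already carries the comment character and the newlines,
# which is exactly the formatting burden placed on the user.
header = "#Now comes the data\n#column1 [kg] column2 [apple]\n"

buf = io.StringIO()              # stands in for a file opened for writing
buf.write(header)                # write the header lines ourselves
np.savetxt(buf, data, fmt="%d")  # then let savetxt append the data rows

buf.seek(0)
# '#' is the default comment character of loadtxt, so the header is skipped.
recovered = np.loadtxt(buf)
```

If the '#' were forgotten in the header string, numpy.loadtxt would try
to parse the header lines as data.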


> 
> Also note that since that patch was written, savetxt takes a user
> supplied newline keyword, so you can just append that to the header
> string.
>
True, we were not aware of this, but this does not help much for the
comment/header. 
> >
> > As numpy.loadtxt has a default comment character ('#'), the same may be
> > implemented for numpy.savetxt. In this case, numpy.savetxt would get two
> > additional keywords (e.g. header, comment(character)), which bloats the
> > interface, but potentially provides more safety.
> >
> 
> FWIW, I ended up rolling my own using the most recent pre-Python 3
> changes for savetxt that accepts a list of names instead of one string
> or if the provided array has the attribute dtype.names (non-nested rec
> or structured arrays) it uses those.  Whatever is done I think the
> support for structured arrays is nice, and I think having this
> functionality is a no-brainer.  I need it quite often.
> 
Although we have not been using record arrays very often, we see their
advantages and agree that it should be possible to use them as you
described.
We also thought about a solution using the __str__ method of a 'header
object'. In this vein, an arbitrary header object (including a plain string)
providing a __str__ method could be handed to numpy.savetxt,
which would use it to write the header.
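
A rough sketch of the idea. The class CommentedHeader and the wrapper
savetxt_with_header are hypothetical names made up for illustration;
neither is part of numpy or of the patch in ticket 1079.

```python
import io

import numpy as np


class CommentedHeader:
    """Hypothetical header object: stores plain lines, formats them in str()."""

    def __init__(self, lines, comment="#", newline="\n"):
        self.lines = list(lines)
        self.comment = comment
        self.newline = newline

    def __str__(self):
        # Prefix every line with the comment character and terminate it.
        return "".join(self.comment + line + self.newline for line in self.lines)


def savetxt_with_header(fh, data, header="", **kwargs):
    """Hypothetical wrapper: write str(header), then delegate to numpy.savetxt."""
    fh.write(str(header))
    np.savetxt(fh, data, **kwargs)


buf = io.StringIO()  # stands in for a file opened for writing
header = CommentedHeader(["Now comes the data", "column1 [kg] column2 [apple]"])
savetxt_with_header(buf, np.array([[1, 2], [3, 5]]), header=header, fmt="%d")
written = buf.getvalue()

buf.seek(0)
recovered = np.loadtxt(buf)  # header lines start with '#', so they are skipped
```

Since str() of a plain string is the string itself, the same wrapper
would also accept a preformatted plain-string header.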

> Skipper
> 






