[SciPy-dev] SciPy Sprint results

Charles R Harris charlesr.harris at gmail.com
Fri Dec 21 11:03:18 EST 2007


On Dec 21, 2007 1:20 AM, Charles R Harris <charlesr.harris at gmail.com> wrote:

>
>
> On Dec 19, 2007 11:28 PM, Robert Kern <robert.kern at gmail.com> wrote:
>
> > Charles R Harris wrote:
> > >     On Dec 19, 2007 12:52 PM, Travis E. Oliphant
> > >     <oliphant at enthought.com> wrote:
> >
> > >     >   * NumPy will get a standard binary file format (.npy/.npz) for
> > >     > arrays/groups_of_arrays.
> > >
> > > Will this new binary format contain endianness/type data? I am a bit
> > > concerned that we don't yet have a reliable way to distinguish extended
> > > precision floats from the coming quad precision, as they both tend to
> > > occupy 128 bits on 64-bit machines. Perhaps extended precision should
> > > just be dropped at some point, especially as it is not particularly
> > > portable between architectures/compilers.
> >
> > It uses the dtype.descr to describe the dtype of the array. You can see
> > the implementation here:
> >
> >  http://svn.scipy.org/svn/numpy/branches/lib_for_io/format.py
> >
> > If it has holes, I would like to fix them. Can you point me to some
> > documentation on the different quad precision formats? Doesn't IEEE-854
> > standardize this?
> >
> > There has been some discussion about whether to continue with this format
> > or attempt to read and write a very tiny subset of HDF5, so don't get too
> > attached to the format until it hits the trunk. I'll drop some warnings
> > into the code to that effect.
>
>
> Here is a PDF version of the DRAFT Standard for Floating-Point Arithmetic,
> IEEE P754 <http://www.validlab.com/754R/drafts/archive/2007-10-05.pdf>.
> Table 2 on page 16 gives a good summary of the proposed formats. I believe
> P754 is the latest step in the revision of the 754 standard, since the 754r
> effort concluded last year. Note that the proposed standard also adds
> decimal floats, with decimal digits packed three to every 10 bits. Here is a
> bit more from Intel
> <http://www.intel.com/technology/itj/2007/v11i1/s2-decimal/1-sidebar.htm>,
> who are apparently working on implementations.
>
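
To make the quoted point about dtype.descr concrete, here is a minimal
sketch (using the current NumPy API, not the lib_for_io branch itself)
showing what the header ends up recording, and why width alone is ambiguous
for long doubles:

    import numpy as np

    # dtype.descr is what the proposed .npy header records for an array's
    # type: byte order, kind, and item size.
    print(np.dtype('>f8').descr)          # [('', '>f8')]  big-endian double
    print(np.dtype('<i4').descr)          # [('', '<i4')]  little-endian int32

    # The ambiguity raised above: on a typical x86-64 build the long double
    # is 80-bit extended precision padded to 16 bytes, yet descr reports it
    # by width only, exactly as a true IEEE binary128 would appear.
    print(np.dtype(np.longdouble).descr)  # e.g. [('', '<f16')]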

The actual _use_ of the floating types will depend on the C compilers. Quads
will probably show up as long doubles, displacing extended precision
doubles. What will happen to BLAS, LAPACK, and all those bits? Quad
precision support will likely be added soon enough; there are already quad
versions of BLAS out there. Here is an interesting presentation
<http://www.csm.ornl.gov/workshops/SOS11/presentations/j_dongarra.pdf>
somewhat related to such things. Hey, maybe we should rewrite numpy in
FORTRAN ;)
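
As a rough check on what a given platform's long double really is, something
like the following sketch (using numpy.finfo, which already exists) would
distinguish extended precision from a true quad even when both occupy 16
bytes:

    import numpy as np

    # Probe the platform's long double: storage width alone (often 16 bytes
    # on 64-bit machines) cannot tell x86 extended precision from IEEE quad,
    # but the mantissa bit count can.
    info = np.finfo(np.longdouble)
    print("itemsize (bytes):", np.dtype(np.longdouble).itemsize)
    print("mantissa bits:   ", info.nmant)  # typically 63 -> x86 80-bit extended,
                                            # 112 -> IEEE binary128 (quad),
                                            # 52 -> long double is plain double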

Anyway, the current identification of numbers by bit width works fine for
stepping/slicing through data and is probably influenced by that
implementation detail, but as far as numerics go, I think we need to know
what is actually *in* those bits, and it would be nice to have that method
in place early on. Maybe a short UTF-8 header with enough room to actually
be descriptive would do the trick, or we could hash longer names to get a
short identifier, although hashed values would have to be recognized, not
decoded. I think something wordy and descriptive, e.g., "big endian IEEE
binary128", would be a good starting point.

Chuck