[SciPy-User] Decimal dtype

Anne Archibald archibald at astron.nl
Tue Jul 28 10:45:22 EDT 2015


On Tue, Jul 28, 2015 at 4:20 PM Todd <toddrjen at gmail.com> wrote:

> On Tue, Jul 28, 2015 at 4:09 PM, Anne Archibald <archibald at astron.nl>
> wrote:
>
>>
>> On Tue, Jul 28, 2015 at 3:32 PM Todd <toddrjen at gmail.com> wrote:
>>
>>> Traditional base-2 floating-point numbers have a lot of well-known
>>> issues.  The python standard library has a decimal module that
>>> provides base-10 floating-point numbers, which avoid some (although
>>> not all) of these issues.
>>>
>>> Is there any possibility of numpy having one or more dtypes for base-10
>>> floating-point numbers?
>>>
>>> I understand fully if a lack of support from underlying libraries makes
>>> this infeasible at the present time.  I haven't been able to find much good
>>> information on the issue, which leads me to suspect the situation is
>>> probably not good.
>>>
>>
>> Is there a fixed-size decimal format (hardware-supported or not)?
>> Would that even be useful?
>>
>>
> IEEE 754-2008 defines 32-bit, 64-bit, and 128-bit decimal
> floating-point numbers.
>
> https://en.wikipedia.org/wiki/Decimal_floating_point#IEEE_754-2008_encoding
>

Given a reasonably-efficient library for manipulating these, it might be
useful to add them to numpy.
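
For concreteness, here is the classic rounding artifact that base-10
arithmetic avoids - a minimal stdlib-only sketch, nothing numpy-specific:

    from decimal import Decimal

    # Base-2 floats cannot represent 0.1 or 0.2 exactly, so a small
    # rounding error shows up in the sum:
    print(0.1 + 0.2)                        # 0.30000000000000004

    # Base-10 floating point represents these values exactly:
    print(Decimal('0.1') + Decimal('0.2'))  # 0.3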

>> Numpy's arrays are most useful for working with fixed-size quantities of
>> homogeneous type for which operations are fast and can be carried out
>> without going through python. None of that would appear to be true for
>> decimals, even if one used a C-level decimal library.
>>
>
> If it stuck with IEEE decimal floating-point numbers, then it would
> still be fixed-size, homogeneous data.
>
>
>> But numpy arrays can also be used to contain arbitrary python objects,
>> such as arbitrary-precision numbers, binary or decimal. They won't be all
>> that much faster than lists, but they do make most of numpy's array
>> operations available.
>>
>
> Those operations aren't vectorized, which eliminates a lot of the
> advantage.
>

Just to be clear: "vectorized" in this context means, specifically, that
the inner loops are in C. This is different from what numpy.vectorize
does (every bottom-level operation goes through the python interpreter)
and from what parallel programmers mean (actual SIMD, in which the
operation is carried out in parallel). The disadvantage of going through
python at the bottom level is probably rather modest for numbers
implemented in software - for comparison, quad precision is about fifty
times slower than long double arithmetic even without any python
overhead.
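
As a sketch of what the object-array route looks like today, using the
stdlib Decimal class in a plain object-dtype array:

    import numpy as np
    from decimal import Decimal

    # An object array holds python Decimal instances; numpy's array
    # operations are available, but each element-wise addition goes
    # through Decimal.__add__ in the interpreter rather than a C
    # inner loop.
    a = np.array([Decimal('0.1'), Decimal('0.2')], dtype=object)
    b = np.array([Decimal('0.3'), Decimal('0.4')], dtype=object)

    print(a + b)      # [Decimal('0.4') Decimal('0.6')]
    print(a.sum())    # Decimal('0.3')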

Nevertheless, given a decent fixed-width decimal library, it would
certainly be possible to store such numbers in numpy arrays. This does
not necessarily mean modifying numpy itself (I looked into adding quad
precision) - a new dtype can be added from an extension library. For
example, there is a numpy quaternion library:
https://github.com/martinling/numpy_quaternion
and a numpy half-precision library:
https://github.com/mwiebe/numpy_half
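
To give a flavour of how such an extension dtype feels in use - this
follows the quaternion package's README, so the exact API may differ
between versions:

    import numpy as np
    import quaternion  # importing registers the np.quaternion dtype

    # Quaternions then behave like any other fixed-size numpy scalar:
    q = np.quaternion(1, 2, 3, 4)
    a = np.array([np.quaternion(1, 0, 0, 0), q], dtype=np.quaternion)
    print(a * q)       # element-wise quaternion multiplication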

Anne