[SciPy-Dev] Adding logsoftmax function to scipy.special

Joshua Wilson josh.craig.wilson at gmail.com
Sun Nov 17 13:51:35 EST 2019


> I would like to implement logsoftmax(x) as x-logsumexp(x)

The function seems reasonable to add, but I am not so sure about that
implementation, as it could suffer from cancellation. Note, for
example, that because of the scaling `logsumexp` does to prevent
overflow, with that implementation you end up adding and then
subtracting the component of `x` with the largest magnitude when
computing each component of the result array; see e.g.

https://github.com/scipy/scipy/blob/master/scipy/special/_logsumexp.py#L124

Off the top of my head, it is not obvious to me that situations like
that are free of cancellation errors.

Of course, I might be wrong, but my overall point is that in order to
use a particular implementation we should have a reasonable
understanding of its theoretical properties, so I would like to see
some exploration of that before proceeding.
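
As a starting point for that exploration, here is a rough sketch of the
shift-first formulation I would want analyzed alongside x - logsumexp(x).
This is a minimal sketch only (1-D, finite inputs, hypothetical name),
not a proposed final implementation:

    import numpy as np

    def log_softmax_sketch(x):
        # Shift by the maximum once, up front; exp() then only sees
        # non-positive arguments, so it may underflow but cannot overflow.
        x = np.asarray(x, dtype=float)
        shifted = x - np.max(x)
        # Nothing large is added back and then subtracted again, which
        # is the step in x - logsumexp(x) that I worry could cancel.
        return shifted - np.log(np.sum(np.exp(shifted)))

For x = [1000, 0] this returns [0., -1000.] with no intermediate
overflow or cancellation.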

- Josh

On Fri, Nov 15, 2019 at 5:17 PM Ralf Gommers <ralf.gommers at gmail.com> wrote:
>
> Hi Takuya,
>
>
> On Tue, Nov 12, 2019 at 9:05 PM Takuya Koumura <koumura at cycentum.com> wrote:
>>
>> Hello,
>>
>> I raised a GitHub issue (#11058) and was suggested to post it to scipy-dev.
>>
>> I’m considering sending a PR to add a logsoftmax function to scipy.special. Before that, I would like to hear your opinions (partly because it’s my first time sending a PR to SciPy).
>
>
> Welcome! Thanks for proposing that. logsoftmax is fairly popular, at least in deep learning, so I think it makes sense, and we already have a bunch of other log* functions in scipy.special.
>
> I noticed that both PyTorch and Tensorflow name this function `log_softmax` rather than `logsoftmax`. The latter would be a little more consistent with other functions (although we also have `special.log_ndtr`), while the former is consistent with other implementations of the same functionality. I'd be okay with either, with a slight preference for `log_softmax`.
>
>>
>> I would like to implement logsoftmax(x) as x - logsumexp(x). Actually, special.softmax(x) is already computed as np.exp(x - logsumexp(x)), so this is trivial for anyone who has read the source code of softmax, but I think including logsoftmax as a separate function will be useful for other users. logsoftmax is more accurate for inputs that make softmax saturate: e.g. when x = [1000, 0], np.log(softmax(x)) = [0, -inf] (depending on the floating point precision), while logsoftmax(x) = [0, -1000].
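>>
>> That difference is easy to demonstrate with the existing scipy.special functions (a quick sketch; exact array reprs may vary):
>>
>>     >>> import numpy as np
>>     >>> from scipy.special import softmax, logsumexp
>>     >>> x = np.array([1000.0, 0.0])
>>     >>> np.log(softmax(x))  # exp(0 - 1000) underflows to 0, so the log is -inf
>>     array([  0., -inf])
>>     >>> x - logsumexp(x)    # stays in log space, so no underflow
>>     array([    0., -1000.])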
>>
>> I am planning to add the new function at the bottom of special/_logsumexp.py, following the softmax function, and to add some unit tests in special/tests/test_logsumexp.py. I’d appreciate any comments.
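>>
>> For example, one test along these lines (a sketch only; `logsoftmax` is the function this PR would add, so the import is hypothetical until it lands):
>>
>>     import numpy as np
>>     from numpy.testing import assert_allclose
>>     from scipy.special import logsoftmax  # hypothetical: added by this PR
>>
>>     def test_logsoftmax_saturating_input():
>>         # softmax saturates here, but the log-domain result stays finite.
>>         x = np.array([1000.0, 0.0])
>>         assert_allclose(logsoftmax(x), np.array([0.0, -1000.0]))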
>
>
> That seems like a good place.
>
> Cheers,
> Ralf
>
>>
>> Best wishes,
>> --
>> Takuya KOUMURA
>> koumura at cycentum.com

