[Numpy-discussion] Adding POWER10 (VSX4) support to the SIMD framework

Nicholai Tukanov nicholaitukanov at gmail.com
Wed Jul 21 15:37:28 EDT 2021


I would like to understand how to go about extending the SIMD framework in
order to add support for POWER10. Specifically, I would like to add the
following instructions: `lxvp` and `stxvp`, which load and store 256 bits
into and from a pair of vector registers. I believe that this could give a
decent performance boost for those on POWER machines, since it can halve the
number of loads/stores issued.
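
To make the "halve the number of loads/stores" claim concrete, here is a
minimal, portable sketch. It does not use the actual POWER10 instructions or
NumPy's SIMD framework; `op_count` and `copy_with_width` are hypothetical names
that only model how many memory operations a copy loop issues when each access
moves 128 bits (two doubles) versus 256 bits (four doubles, as a paired
load/store would).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not NumPy's SIMD API: counts the memory operations
   issued when copying n doubles with a given access width. */
typedef struct {
    size_t loads;
    size_t stores;
} op_count;

/* Doubles moved per access: 128-bit vector vs. 256-bit vector pair. */
enum { LANES_128 = 2, LANES_256 = 4 };

/* Copies n doubles from src to dst, `lanes` doubles per access.
   Assumes lanes is LANES_128 or LANES_256 (at most LANES_256). */
static op_count copy_with_width(double *dst, const double *src,
                                size_t n, size_t lanes)
{
    op_count c = {0, 0};
    size_t i = 0;

    /* Main loop: one wide load and one wide store per iteration. */
    for (; i + lanes <= n; i += lanes) {
        double tmp[LANES_256];                  /* stands in for the register(s) */
        memcpy(tmp, src + i, lanes * sizeof(double));
        c.loads++;
        memcpy(dst + i, tmp, lanes * sizeof(double));
        c.stores++;
    }

    /* Scalar tail for any elements that do not fill a full access. */
    for (; i < n; ++i) {
        dst[i] = src[i];
        c.loads++;
        c.stores++;
    }
    return c;
}
```

Under this model, copying 1024 doubles issues 512 loads and 512 stores with
128-bit accesses, and 256 of each with 256-bit accesses. Whether that
translates into a real speedup depends on the memory subsystem and the
surrounding code, which this sketch does not measure.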

Additionally, matrix engines (2-D SIMD instructions) are becoming quite
popular due to their performance improvements for deep learning and
scientific computing. Would it be beneficial to add these new advanced SIMD
instructions to the framework, or should they be left to libraries such as
OpenBLAS and MKL?

Thank you,
Nicholai Tukanov