[Numpy-discussion] ANN: xtensor and xtensor-python

Sylvain Corlay sylvain.corlay at gmail.com
Fri Nov 18 05:34:21 EST 2016


Hi all,

We are pleased to announce the release of the xtensor
<https://github.com/QuantStack/xtensor> library and its python bindings
xtensor-python <https://github.com/QuantStack/xtensor-python>, by Johan
Mabille (@JohanMabille) and Sylvain Corlay (@SylvainCorlay).

1. xtensor

xtensor <https://github.com/QuantStack/xtensor> is a C++ template library
for manipulating multi-dimensional array expressions.

It provides

   - an API following the idioms of the *C++ standard library*.
   - an extensible expression system enabling lazy broadcasting and universal
   functions.
   - python bindings for operating in place on numpy arrays from your C++
   code, thanks to Python's buffer protocol and pybind11.

More details on lazy computing with xtensor are available below.

2. xtensor-cookiecutter

Besides xtensor <https://github.com/QuantStack/xtensor> and xtensor-python
<https://github.com/QuantStack/xtensor-python>, we provide a cookiecutter
template project to help extension authors get started. The
xtensor-cookiecutter <https://github.com/QuantStack/xtensor-python> generates
a simple project for a Python C++ extension with

   - a working setup.py compiling the extension module
   - a few examples of functions making use of xtensor, exposed to Python,
   including an example of a vectorized function.
   - all the boilerplate for setting up unit tests and generating HTML
   documentation for these examples.

3. You can try it now!

You can try xtensor live on the project website
<http://quantstack.net/xtensor>. The Try it Now button is powered by

   - The Cling C++ interpreter.
   - The Jupyter notebook.
   - The Binder project.

Getting started

xtensor requires a modern C++ compiler supporting C++14. The following C++
compilers are supported:

   - On Windows platforms, Visual C++ 2015 Update 2, or more recent
   - On Unix platforms, gcc 4.9 or a recent version of Clang

Installation

xtensor and xtensor-python are header-only libraries. We provide packages
for the conda package manager.

conda install -c conda-forge xtensor         # installs xtensor
conda install -c conda-forge xtensor-python  # installs xtensor and xtensor-python

Usage

Basic Usage

Initialize a 2-D array and compute the sum of one of its rows and a 1-D
array.

#include <iostream>
#include "xtensor/xarray.hpp"
#include "xtensor/xio.hpp"

xt::xarray<double> arr1
  {{1.0, 2.0, 3.0},
   {2.0, 5.0, 7.0},
   {2.0, 5.0, 7.0}};

xt::xarray<double> arr2
  {5.0, 6.0, 7.0};

xt::xarray<double> res = xt::make_xview(arr1, 1) + arr2;  // view on the second row of arr1

std::cout << res;

Outputs:

{7, 11, 14}

Initialize a 1-D array and reshape it in place.

#include <iostream>
#include "xtensor/xarray.hpp"
#include "xtensor/xio.hpp"

xt::xarray<int> arr
  {1, 2, 3, 4, 5, 6, 7, 8, 9};

arr.reshape({3, 3});

std::cout << arr;

Outputs:

{{1, 2, 3},
 {4, 5, 6},
 {7, 8, 9}}

Lazy Broadcasting with xtensor

We can operate on arrays of different shapes or dimensions in an
elementwise fashion. Broadcasting rules of xtensor are similar to those of
numpy <http://www.numpy.org/> and libdynd <http://libdynd.org/>.

Broadcasting rules

In an operation involving two arrays of different dimensions, the array
with the fewer dimensions is broadcast across the leading dimensions of
the other.

For example, if A has shape (2, 3), and B has shape (4, 2, 3), the result
of a broadcasted operation with A and B has shape (4, 2, 3).

   (2, 3) # A
(4, 2, 3) # B
---------
(4, 2, 3) # Result

The same rule holds for scalars, which are handled as 0-D expressions. If
the matched-up dimensions of two input arrays differ, and one of them has
size 1, it is broadcast to match the size of the other. For instance, if B
has shape (4, 2, 1) in the previous example, the broadcasting happens as
follows:

   (2, 3) # A
(4, 2, 1) # B
---------
(4, 2, 3) # Result
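
As a small illustration of the size-1 rule, here is a sketch in the style
of the snippets above (shapes and values chosen only for brevity):

#include <iostream>
#include "xtensor/xarray.hpp"
#include "xtensor/xio.hpp"

// shape (2, 3)
xt::xarray<double> a
  {{1.0, 2.0, 3.0},
   {4.0, 5.0, 6.0}};

// shape (2, 1): the size-1 dimension is broadcast across the columns of a
xt::xarray<double> b
  {{10.0},
   {20.0}};

xt::xarray<double> res = a + b;  // shape (2, 3): {{11, 12, 13}, {24, 25, 26}}

std::cout << res;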

Universal functions, Laziness and Vectorization

With xtensor, if x, y and z are arrays of *broadcastable shapes*, the
return type of an expression such as x + y * sin(z) is not an array. It is
an xexpression object offering the same interface as an N-dimensional
array, but which does not hold the result. Values are only computed upon
access or when the expression is assigned to an xarray object. This makes
it possible to operate symbolically on very large arrays and to compute the
result only for the indices of interest.
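
As a minimal sketch of this laziness, following the description above:

#include "xtensor/xarray.hpp"
#include "xtensor/xmath.hpp"

xt::xarray<double> x
  {{0.0, 1.0, 2.0},
   {3.0, 4.0, 5.0}};

xt::xarray<double> y {1.0, 2.0, 3.0};
xt::xarray<double> z {0.1, 0.2, 0.3};

// f is an unevaluated xexpression; no computation happens on this line
auto f = x + y * xt::sin(z);

// a single value is computed on demand...
double v = f(1, 2);

// ...and the full result is only materialized upon assignment to an xarray
xt::xarray<double> res = f;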

We provide utilities to vectorize any scalar function (taking multiple
scalar arguments) into a function that operates on xexpressions, applying
the lazy broadcasting rules which we just described. These functions are
called *xfunction*s. They are xtensor's counterpart to numpy's universal
functions.

In xtensor, arithmetic operations (+, -, *, /) and all special functions
are *xfunction*s.
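
For the C++ side, here is a hedged sketch assuming the xt::vectorize
utility from "xtensor/xvectorize.hpp" (its Python counterpart, pyvectorize,
appears in Example 2 below):

#include "xtensor/xarray.hpp"
#include "xtensor/xvectorize.hpp"

double scalar_func(double i, double j)
{
    return i + 2.0 * j;  // any scalar function would do
}

// wrap the scalar function; calls to vec_func build lazy xfunctions
auto vec_func = xt::vectorize(scalar_func);

xt::xarray<double> a
  {{1.0, 2.0, 3.0},
   {4.0, 5.0, 6.0}};

xt::xarray<double> b {10.0, 20.0, 30.0};

// broadcasts a and b; values are computed when assigned to res
xt::xarray<double> res = vec_func(a, b);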

Iterating over xexpressions and Broadcasting Iterators

All xexpressions offer two sets of functions to retrieve iterator pairs
(and their const counterparts), as sketched after this list.

   - begin() and end() provide instances of xiterators which can be used to
   iterate over all the elements of the expression. The order in which
   elements are listed is row-major, in that the index of the last dimension
   is incremented first.
   - xbegin(shape) and xend(shape) are similar but take a *broadcasting
   shape* as an argument. Elements are iterated upon in a row-major way,
   but certain dimensions are repeated to match the provided shape as per
   the rules described above. For an expression e, e.xbegin(e.shape()) and
   e.begin() are equivalent.
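
A short sketch of both iteration modes, using the xbegin/xend API described
in the list above (the broadcasting shape (2, 2, 3) is an arbitrary choice):

#include <numeric>
#include <vector>
#include "xtensor/xarray.hpp"

xt::xarray<double> e
  {{1.0, 2.0, 3.0},
   {4.0, 5.0, 6.0}};

// row-major iteration over the 6 elements of e
double s1 = std::accumulate(e.begin(), e.end(), 0.0);  // 21

// broadcasting iteration: e is iterated as if it had shape (2, 2, 3),
// so the leading dimension is repeated and 12 values are visited
std::vector<std::size_t> shape = {2, 2, 3};
double s2 = std::accumulate(e.xbegin(shape), e.xend(shape), 0.0);  // 42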

Fixed-dimension *and* Dynamic dimension

Two container classes implementing multi-dimensional arrays are provided:
xarray and xtensor.

   - xarray can be reshaped dynamically to any number of dimensions. It is
   the container most similar to numpy arrays.
   - xtensor has a dimension set at compilation time, which enables many
   optimizations. For example, shapes and strides of xtensor instances are
   allocated on the stack instead of the heap.

xarray and xtensor containers are both xexpressions; they can be mixed in
universal functions, assigned to each other, etc.
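
A brief sketch contrasting the two containers (a minimal illustration, not
an exhaustive comparison):

#include "xtensor/xarray.hpp"
#include "xtensor/xtensor.hpp"

// dynamic number of dimensions: can be reshaped to any rank at runtime
xt::xarray<double> a
  {{1.0, 2.0},
   {3.0, 4.0}};

// number of dimensions fixed at compile time (here 2);
// shape and strides are allocated on the stack
xt::xtensor<double, 2> t
  {{10.0, 20.0},
   {30.0, 40.0}};

// both are xexpressions and can be mixed in the same lazy expression
xt::xarray<double> res = a + t;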

In addition, two access operators are provided (both are illustrated in the
sketch after this list):

   - the variadic template operator(), which can take multiple integral
   arguments or none.
   - operator[], which takes a single multi-index argument whose size can
   be determined at runtime. operator[] also supports access with braced
   initializers.
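
A sketch of the two access operators on a small 2-D array (the index values
are chosen only for illustration):

#include <vector>
#include "xtensor/xarray.hpp"

xt::xarray<double> a
  {{1.0, 2.0, 3.0},
   {4.0, 5.0, 6.0}};

// variadic operator(): one integral argument per dimension
double x = a(1, 2);               // 6.0

// operator[]: a single multi-index whose size is only known at runtime
std::vector<std::size_t> idx = {0, 1};
double y = a[idx];                // 2.0

// operator[] also accepts a braced initializer
double z = a[{1, 0}];             // 4.0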

Python bindings

The python bindings are built upon the pybind11
<https://github.com/pybind/pybind11> library, a lightweight header-only
library for creating bindings between the Python and C++ programming
languages.

Example 1: Use an algorithm of the C++ library on a numpy array in place.

C++ code

#include <numeric>                    // std::accumulate
#include "pybind11/pybind11.h"        // pybind11
#include "xtensor/xmath.hpp"          // C++ universal functions
#include "xtensor-python/pyarray.hpp" // numpy bindings
double sum_of_sines(xt::pyarray<double> &m)
{
    auto sines = xt::sin(m); // sines does not hold any value
    return std::accumulate(sines.begin(), sines.end(), 0.0);
}
PYBIND11_PLUGIN(xtensor_python_test)
{
    pybind11::module m("xtensor_python_test",
                       "Test module for xtensor python bindings");

    m.def("sum_of_sines",
          sum_of_sines,
          "Return the sum of the sines");

    return m.ptr();
}

Python Code

import numpy as np
import xtensor_python_test as xt

a = np.arange(15).reshape(3, 5)
s = xt.sum_of_sines(a)
s

Outputs

1.2853996391883833

Example 2: Create a universal function from a C++ scalar function

C++ code

#include "pybind11/pybind11.h"
#include "xtensor-python/pyvectorize.hpp"
#include <numeric>
#include <cmath>
namespace py = pybind11;
double scalar_func(double i, double j)
{
    return std::sin(i) - std::cos(j);
}
PYBIND11_PLUGIN(xtensor_python_test)
{
    py::module m("xtensor_python_test",
                 "Test module for xtensor python bindings");

    m.def("vectorized_func", xt::pyvectorize(scalar_func), "");

    return m.ptr();
}

Python Code

import numpy as np
import xtensor_python_test as xt

x = np.arange(15).reshape(3, 5)
y = [0, 1, 2, 3, 4]
z = xt.vectorized_func(x, y)
z

Outputs

[[-1.     ,  0.30116,  1.32544,  1.13111, -0.10315],
 [-1.95892, -0.81971,  1.07313,  1.97935,  1.06576],
 [-1.54402, -1.54029, -0.12042,  1.41016,  1.64425]]