[SciPy-Dev] sparse vectors / matrices / tensors

Yannick Versley yversley at gmail.com
Tue Sep 20 12:06:03 EDT 2011


I have been working quite a lot with sparse vectors and sparse matrices
(basically as feature vectors in the context of machine learning), and
have noticed that they crop up in a lot of places (e.g. the CVXOPT
library, in scikits, ...) and that people tend to either reinvent the
wheel (i.e. implement a complete sparse matrix library) or pretend that
no separate data structure is needed (i.e. always pass along pairs of
coordinate and data arrays).

The most obvious response is to point to scipy.sparse. However, I ended
up reimplementing a sparse matrix library myself because
- scipy.sparse is limited to matrices and has no vectors or order-k
  tensors
- LIL and DOK are not really efficient or convenient data structures for
  creating sparse matrices (my own library basically keeps a list of
  unordered COO items and compacts/sorts them when the matrix is
  actually used as a matrix)
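
To make the "unordered COO items, compacted on use" idea concrete, here
is a minimal pure-Python sketch. It is not the code of my library; the
class name LazyCOO and its methods are made up for illustration, and I
am assuming that duplicate coordinates should be summed when compacting.

```python
from collections import defaultdict


class LazyCOO:
    """Hypothetical order-k sparse tensor: cheap appends, compaction on demand."""

    def __init__(self, shape):
        self.shape = tuple(shape)
        self._pending = []  # unordered (index_tuple, value) items, appended in O(1)
        self._items = []    # sorted, duplicate-free items from the last compaction

    def add(self, index, value):
        """Record a contribution at `index`; no sorting happens here."""
        index = tuple(index)
        if len(index) != len(self.shape):
            raise ValueError("index has the wrong number of dimensions")
        if any(not 0 <= i < n for i, n in zip(index, self.shape)):
            raise IndexError("index out of bounds")
        self._pending.append((index, value))

    def compact(self):
        """Merge pending items: sum duplicates (an assumption), drop zeros, sort."""
        if not self._pending:
            return
        totals = defaultdict(float)
        for index, value in self._items + self._pending:
            totals[index] += value
        self._items = sorted((i, v) for i, v in totals.items() if v != 0)
        self._pending = []

    def items(self):
        """Return the sorted (index, value) pairs, compacting first if needed."""
        self.compact()
        return list(self._items)


t = LazyCOO((2, 3, 4))       # an order-3 tensor
t.add((1, 2, 3), 1.5)
t.add((0, 0, 1), 2.0)
t.add((1, 2, 3), 0.5)        # duplicate coordinate, merged on compaction
entries = t.items()          # sorting happens only here
```

The point of the design is that building the structure costs one list
append per item, and the sort is paid once, when the data is first read.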

As a result, I built yet another sparse matrix library, and I was
wondering whether
(i) there's some generic enough data structure that could be a sparse
counterpart to numpy's ndarray (i.e., good enough for 99% of the people,
99% of the time; my current guess would be that the mutable COO tensor
implementation I currently have, or something vaguely similar, might
actually fit the bill), or
(ii) it would make sense to have some conventions for standardized
access to other people's sparse matrix packages, either by defining a
minimum set of Python methods that would be useful or by defining some
kind of low-level interface (similar to Python's buffer interface).
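
As a rough illustration of the "minimum set of Python methods" option,
the sketch below defines a tiny structural interface and two unrelated
containers that satisfy it. Everything here is hypothetical: the names
SparseLike, nonzero_items, DictVector and PairVector are invented for
this example and do not exist in scipy or any other package.

```python
from typing import Iterator, Protocol, Tuple, runtime_checkable


@runtime_checkable
class SparseLike(Protocol):
    """Hypothetical minimal interface: a shape plus an iterator over nonzeros."""

    shape: Tuple[int, ...]

    def nonzero_items(self) -> Iterator[Tuple[Tuple[int, ...], float]]:
        ...


class DictVector:
    """Dictionary-backed sparse vector (one possible package's layout)."""

    def __init__(self, size, data):
        self.shape = (size,)
        self._data = dict(data)

    def nonzero_items(self):
        for i, v in self._data.items():
            yield (i,), v


class PairVector:
    """The 'pair of coordinate and data arrays' layout, wrapped in a class."""

    def __init__(self, size, coords, values):
        self.shape = (size,)
        self._coords = list(coords)
        self._values = list(values)

    def nonzero_items(self):
        for i, v in zip(self._coords, self._values):
            yield (i,), v


def dot(a: SparseLike, b: SparseLike) -> float:
    """Inner product written only against the interface, not a concrete class."""
    if a.shape != b.shape:
        raise ValueError("shape mismatch")
    lookup = {}
    for index, value in a.nonzero_items():
        lookup[index] = lookup.get(index, 0.0) + value
    return sum(value * lookup.get(index, 0.0) for index, value in b.nonzero_items())


u = DictVector(5, {0: 1.0, 3: 2.0})
v = PairVector(5, [3, 4], [4.0, 1.0])
result = dot(u, v)  # only index 3 is shared: 2.0 * 4.0
```

A low-level interface in the spirit of the buffer interface would expose
the raw index and data arrays instead, which avoids the per-item Python
overhead but ties every consumer to a particular storage layout.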

The answers to (i) and (ii) do depend on what people do with sparse
matrices, and I'd expect people who deal with PDEs to have different
needs than people who use sparse matrices for co-occurrence graphs, or
as feature matrices in machine learning, etc., so I'd like to hear from
people who have different use cases than I do.

-Yannick