[SciPy-dev] feedback on scipy.sparse
Stefan van der Walt
stefan at sun.ac.za
Fri Dec 14 07:31:03 EST 2007
Hi Nathan
On Wed, Dec 12, 2007 at 07:14:49PM -0600, Nathan Bell wrote:
> On Dec 12, 2007 2:28 AM, Stefan van der Walt <stefan at sun.ac.za> wrote:
> > > Also, feel free to respond with any other comments related to
> > > scipy.sparse
> >
> > At the moment, IIRC, functionality for different kinds of sparse
> > arrays are located in the same classes, separated with if's. I would
> > like to see the different classes pulled completely apart, so the only
> > overlap is in common functionality.
>
> Do you mean the use of _cs_matrix() to abstract the common parts of
> csr_matrix and csc_matrix? If so, I recently removed the ifs from the
> constructor and replaced them with a better solution. I think the
> present implementation is a reasonable compromise between readability
> and redundancy. In the past the two classes were completely separate,
> each consisting of a few hundred lines of code, and had a tendency to
> drift apart since edits to one didn't always make it into the other.
> Tim's refactoring fixed this without complicating the implementation
> substantially.
I think _cs_matrix is a good idea: the two classes share similar
storage. Having 'if' statements inside _cs_matrix to check which of
the two formats you are working with, however, would not be a good
idea (but I don't see any of those).
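To illustrate the point, here is a minimal sketch of how a shared base class can serve both formats without branching on the format inside its methods. The class names, the `_swap` hook, and `get` are hypothetical and written for this example; they are not taken from the scipy.sparse source.

```python
class CompressedBase:
    """Hypothetical stand-in for a shared base class such as _cs_matrix."""

    def __init__(self, shape, indptr, indices, data):
        self.shape = shape
        self.indptr = indptr    # start offsets of each major-axis slice
        self.indices = indices  # minor-axis index of each stored value
        self.data = data        # the stored values

    def _swap(self, pair):
        # Subclasses map (row, col) to (major, minor); no 'if' on the format.
        raise NotImplementedError

    def get(self, row, col):
        major, minor = self._swap((row, col))
        for k in range(self.indptr[major], self.indptr[major + 1]):
            if self.indices[k] == minor:
                return self.data[k]
        return 0  # entry is not stored, so it is an implicit zero


class RowCompressed(CompressedBase):
    def _swap(self, pair):
        return pair  # rows are the major axis


class ColCompressed(CompressedBase):
    def _swap(self, pair):
        return pair[1], pair[0]  # columns are the major axis


# The same 2x3 matrix [[1, 0, 2], [0, 3, 0]] in both layouts.
by_rows = RowCompressed((2, 3), [0, 2, 3], [0, 2, 1], [1, 2, 3])
by_cols = ColCompressed((2, 3), [0, 1, 2, 3], [0, 1, 0], [1, 3, 2])
```

The lookup logic lives once in the base class, and the only format-specific code is the one-line hook in each subclass.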
> > I'd also like to discuss the in-place memory assignment policy. When
> > do we copy on write, and when do we return views? For example, taking
> > a slice out of a lil_matrix returns a new sparse array. It is
> > *possible* to create a view, but it gets a bit tricky. If each array
> > had an "origin" property, such views could be trivially constructed,
> > but it still does not cater for slices like x[::2].
>
> That is a hard problem. Can you think of specific uses of this kind
> of functionality that merit the complexity of implementing it? For
> slices like x[::2] you could introduce a stride tuple in the views,
> but that could get ugly fast.
Say a user wants to examine the first 500 rows of his sparse matrix:
    x = build_sparse_matrix()
    print x[:500]
It seems like a waste of time to make a new allocation (there may not
even be enough memory to do so). Which reminds me, print x[:500]
will yield some description of the sparse matrix. Do we have a way to
print the elements of the sparse matrix?
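As a rough sketch of the "origin" idea for contiguous row slices, the following pure-Python toy stores each row as a dict and builds a view that shares the parent's storage. `RowListMatrix`, `RowView`, `row_view`, and `items` are names invented for this example and are not part of scipy.sparse; the row-of-dicts layout is only an assumption that loosely resembles a list-of-lists format.

```python
class RowListMatrix:
    """Toy sparse matrix: one dict per row mapping column -> value."""

    def __init__(self, nrows, ncols):
        self.shape = (nrows, ncols)
        self.rows = [{} for _ in range(nrows)]

    def __setitem__(self, key, value):
        i, j = key
        self.rows[i][j] = value

    def __getitem__(self, key):
        i, j = key
        return self.rows[i].get(j, 0)

    def row_view(self, start, stop):
        # No row data is copied; the view only remembers where it starts.
        return RowView(self, start, stop)


class RowView:
    """View of rows [start, stop) of a parent matrix, sharing its storage."""

    def __init__(self, parent, start, stop):
        self.parent = parent
        self.origin = start
        self.shape = (stop - start, parent.shape[1])

    def _check(self, i):
        if not 0 <= i < self.shape[0]:
            raise IndexError("row index out of range for this view")

    def __getitem__(self, key):
        i, j = key
        self._check(i)
        return self.parent[self.origin + i, j]

    def __setitem__(self, key, value):
        i, j = key
        self._check(i)
        self.parent[self.origin + i, j] = value  # writes through to the parent

    def items(self):
        # Yield the stored elements as (row, col, value) in view coordinates.
        for i in range(self.shape[0]):
            for j, v in sorted(self.parent.rows[self.origin + i].items()):
                yield i, j, v


x = RowListMatrix(1000, 4)
x[3, 2] = 1.5
x[600, 1] = 7.0
head = x.row_view(0, 500)
for i, j, v in head.items():
    print("(%d, %d) %s" % (i, j, v))
```

Iterating over `items()` is one way to print the stored elements rather than a summary. A single origin offset only covers contiguous ranges; a slice like x[::2] would additionally need a stride.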
Are we aiming to support striding on assignments? I.e.
    x[::2] = 5
I suspect that will not be worth the trouble, since a for loop can be
used to assign all the elements.
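The loop alternative might look like the sketch below. `assign_strided_rows` is a hypothetical helper written for this example; it assumes only that the container accepts `matrix[i, j] = value`, so a plain dict keyed by (row, col) is used here as a stand-in for a sparse matrix.

```python
def assign_strided_rows(matrix, nrows, ncols, step, value):
    """Set every element of rows 0, step, 2*step, ... to value."""
    for i in range(0, nrows, step):
        for j in range(ncols):
            matrix[i, j] = value


# A plain dict keyed by (row, col) stands in for a sparse matrix here.
x = {}
assign_strided_rows(x, nrows=5, ncols=3, step=2, value=5)
```

Note that assigning a nonzero value to whole rows makes those rows dense, so the loop stores one entry per element it touches.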
Regards
Stéfan