[SciPy-user] combine two sparse matrices

Robin robince at gmail.com
Sat May 3 11:16:10 EDT 2008


On Sat, May 3, 2008 at 12:07 PM, Robin <robince at gmail.com> wrote:
> Hi,
>
>  I was wondering what the most (memory) efficient way of combining two
>  sparse matrices would be.
>
>  I am constructing a very large sparse matrix, but due to the temporary
>  memory required to calculate the entries I am doing it in blocks, with
>  the computation of each block done in a forked child process. This
>  returns a sparse matrix of the same dimensions as the full one, but
>  with a smaller number of entries. I would like to add the entries from
>  the block result to the 'master' copy. I can be sure that there will
>  be no overlap in the position of entries (i.e. no matrix position will
>  be populated in both).
>
>  What is the most memory efficient way of combining these? I noticed +=
>  isn't implemented, but it's not clear how that would work anyway. The
>  best I have done so far is adding two lil_matrices (the block is
>  created as a lil_matrix for fancy indexing) with A = A + Apartial, but
>  as the master copy grows I think this means I will need double the
>  final memory requirement for A (to add the last block). Is there a
>  better way of doing this?
>
>  Also, what are the memory requirements for the conversions (.tocsc,
>  .tocsr etc.)? Will that mean I need double the memory anyway?

After reading some more about the different sparse matrix types, I am
now trying to keep both the master and block matrices as dok, passing
back dict(Apartial) since cPickle can't pickle dok matrices, and then
doing A.update(Apartial).
This seems to be a bit slower (which is OK), but memory use also seems
to be growing faster than I expected. It looks as though something is
not being released (I do del Apartial after the update, but I have a
feeling it might be sticking around), and I'm worried the machine will
fall over again before it's finished.
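
For reference, here is a rough sketch of the dict-based merge described
above. It is only an illustration, not a tested recipe for the real
problem: the function name merge_blocks is made up, the blocks are
assumed to arrive as plain dicts mapping (row, col) to a value, and the
master is kept as an ordinary dict (rather than a dok_matrix) until the
very end, when it is converted once. Since the blocks never overlap,
dict.update cannot overwrite anything.

```python
import numpy as np
from scipy.sparse import coo_matrix


def merge_blocks(block_dicts, shape):
    """Merge non-overlapping {(row, col): value} dicts into one CSR matrix.

    merge_blocks is a hypothetical helper; the dicts stand in for the
    dict(Apartial) objects passed back from the child processes.
    """
    master = {}
    for block in block_dicts:
        # No position occurs in two blocks, so nothing is overwritten.
        master.update(block)

    n = len(master)
    # Unpack keys and values into flat arrays for a single COO build.
    rows = np.fromiter((k[0] for k in master), dtype=np.intp, count=n)
    cols = np.fromiter((k[1] for k in master), dtype=np.intp, count=n)
    vals = np.fromiter(master.values(), dtype=np.float64, count=n)
    return coo_matrix((vals, (rows, cols)), shape=shape).tocsr()
```

Note that a dict keyed by tuples carries a fair amount of per-entry
overhead in Python, so this is likely to use more memory per stored
value than the array-based formats, which may be part of the growth
seen here.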

I'd still appreciate some advice as to the best way to do this. I
thought of directly appending to the lists in the lil_matrix, but I
would then have to sort them again and I wasn't sure if the object
array could take resizing like this.
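
To make the idea concrete, this is roughly what appending to the
lil_matrix lists would look like. It is a sketch under assumptions: the
helper name merge_lil_inplace is invented, both matrices are assumed to
have the same shape with no overlapping entries, and it relies on the
rows and data attributes each holding one Python list per matrix row.
Each touched row is rebuilt and re-sorted by column index rather than
resized in place.

```python
from scipy.sparse import lil_matrix


def merge_lil_inplace(master, block):
    """Add the entries of `block` into `master` row by row.

    Hypothetical helper: assumes both are lil_matrix objects of the
    same shape and that no (row, col) position is set in both.
    """
    for i in range(master.shape[0]):
        if not block.rows[i]:
            continue  # nothing to add for this row
        cols = master.rows[i] + block.rows[i]
        vals = master.data[i] + block.data[i]
        # Keep the column indices sorted, carrying the values along.
        order = sorted(range(len(cols)), key=cols.__getitem__)
        master.rows[i] = [cols[j] for j in order]
        master.data[i] = [vals[j] for j in order]
```

Only rows that the block actually touches are rebuilt, so the extra
memory at any moment should be on the order of one row rather than a
second copy of the whole matrix.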

Robin


