getting a submatrix of all true

John Hunter jdhunter at ace.bsd.uchicago.edu
Thu Jul 3 12:41:02 EDT 2003


>>>>> "Bengt" == Bengt Richter <bokr at oz.net> writes:


    Bengt> If I understand your original optimality definition, that
    Bengt> would be suboptimal.  I.e., how about the 16-element
    Bengt> matrix? (x's mark the corresponding zeroes)

    Bengt> Or do they have to be adjacent?

No, in general they won't be. I missed that one. Just goes to show
that my "solution" is not one, which I knew.  But it does efficiently
deliver good solutions, where good means large but not optimal.

    Bengt> Brute force seems to work on this little example. Maybe it
    Bengt> can be memoized and optimized and/or whatever to handle
    Bengt> your larger matrix fast enough?

Thanks for the example.  Unfortunately, it is too slow even for
moderate size matrices (30,10).  I've been running it for over two
hours for a 30x10 matrix and it hasn't finished.  And my data are
1000x100!

Here is a little code to generate larger matrices for testing....

    from Numeric import zeros, array, Int, put, reshape
    from RandomArray import uniform

    numRows, numCols = 30,10
    numMissing = int(0.05*numRows*numCols) # 5% missing

    X = zeros( (300,))
    ind = uniform(0, numRows*numCols, (numMissing,)).astype(Int)

    put(X,ind,1)
    X = reshape(X, (numRows,numCols))

    # turn it into a list for your code....
    X = [ [val for val in row] for row in X]
    for row in X: print row

Last night, I began to formulate the problem as a logic statement,
hoping this would give me an idea of how to proceed.  But no progress
yet.  But I have come to the conclusion that with P ones, brute force
requires 2^P combinations.  With 1000x100 with 5% missing that gives
me 2^5000.  Not good.  

Thanks for your suggestion, though.  If you have any more thoughts,
let me know.

John Hunter








More information about the Python-list mailing list