"indexed properties"...

Mon May 19 08:48:03 EDT 2008

On Mon, 19 May 2008 06:29:18 -0500
David C. Ullrich <dullrich at sprynet.com> wrote:

> Maybe you could be more specific? Various "positions" I've
> taken in all this may well be untenable, but I can't think
> of any that have anything to do with whether the data should
> be a single list instead of a list of lists.

What's 'untenable' (hey, I tried to get away with a smiley, remember)
is that a matrix is a list of rows. Suppose you do the transpose trick
with the zip(*M) routine: now it's a list of columns. Both views are
equally valid; there is no getting around the fact that you're holding an
unnatural predisposition towards seeing the matrix as a list of rows,
which it is most definitely not ...
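
To make the transpose point concrete, here is a minimal sketch. The
matrix M and the names columns and rows are made up for illustration;
the only real "routine" involved is the builtin zip.

```python
M = [[1, 2, 3],
     [4, 5, 6]]

# zip(*M) pairs up the i-th items of every row, so it yields the columns.
columns = [list(col) for col in zip(*M)]
print(columns)    # [[1, 4], [2, 5], [3, 6]]

# Transposing a second time gives the rows back: neither view is privileged.
rows = [list(row) for row in zip(*columns)]
print(rows == M)  # True
```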

I was holding the brakes for this argument because I realize it's
intuitive and also because Gabriel seems to want a list to stay a list if
he assigns something a list. But that's untenable too. Suppose you
assign a column to a list? The list is torn to shreds and placed over
multiple rows.
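
A small sketch of what I mean, assuming the plain list-of-lists
layout (M, first_row and new_col are illustrative names only). A row
is a real list object inside M, but a column never is:

```python
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

first_row = M[0]          # a real list object stored inside M
new_col = [10, 20, 30]

# There is no list object for a column; each item lands in a different row.
for row, value in zip(M, new_col):
    row[1] = value

print(M)          # [[1, 10, 3], [4, 20, 6], [7, 30, 9]]
print(first_row)  # [1, 10, 3]  (the row alias sees the change)

new_col[0] = 99
print(M[0][1])    # 10  (the assigned list did not stay part of the matrix)
```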

> (The only way I can parse this to make it relevant is to
> assume that the position you're referring to is that a
> list of lists is better than a single list. If so: First, I
> haven't said that it was. Second, saying "B is untenable"
> is not much of an answer when someone asks why you
> say A is better than B.)

Yes, it was not much of an answer but I was afraid of ending up in
this quagmire. I now see that it is unavoidable anyway if I want to
explain myself. Why couldn't you just see it the same way as me and
leave it at that without waking up all the creatures of hell :-)

> >And to address an 
> >item in a matrix costs two lookups, row and column, while an array
> >needs only one. 
> 
> The phrase "premature optimization" springs to mind.

Well, I really liked your slicing idea ...

> This is _Python_ we're talking about. Supposing you're right that
> doing two lookups _in Python_ is faster than doing one lookup
> plus the calculation col + row*width _in Python_, it can't
> make enough difference to matter. In the sort of application I
> have in mind things already happen "instantaneously".

The computation is almost certainly faster. Lookups are expensive.
However I concede the point because we're not supposed to worry about
such stuff. But it *is* a simpler format.
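
For what it's worth, this is roughly the format I have in mind, as a
sketch only. The names data, width, height and get are hypothetical,
and it says nothing about speed; it just shows the single lookup with
col + row*width, plus rows and columns obtained by slicing.

```python
width, height = 3, 2
data = [1, 2, 3,
        4, 5, 6]   # row-major: item (row, col) lives at col + row*width

def get(row, col):
    # One list lookup, with the index computed from row and column.
    return data[col + row * width]

print(get(1, 2))                  # 6
print(data[1 * width:2 * width])  # row 1: [4, 5, 6]
print(data[2::width])             # column 2: [3, 6]

# Extended slice assignment writes a column in place (lengths must match).
data[2::width] = [30, 60]
print(data)                       # [1, 2, 30, 4, 5, 60]
```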

> The point is not to improve on NumPy. Trying to improve on
> NumPy in pure Python code would be silly - if I wanted
> optimized large matrices I'd _use_ NumPy. The point is just
> to give a simple "intuitive" way to manipulate rows and
> columns in small matrices. 

Yes, me too. This is all about intuition.

> So I'm not looking ahead to the future, things are not
> scalable? The thing is not _supposed_ to scale up to
> large matrices. If a person were dealing with large
> matrices then almost all of it would need to be
> rewritten (and if a person were dealing with really
> large matrices then trying to do the thing in pure
> Python would be silly in the first place, and insisting
> on being able to write things like "m.row[0] =
> m.row[1] + m.row[2]" could very well be a totally
> wrong approach to begin with - I'd figure out the
> operations I expected to need to do and write functions
> to do them.)

The reason I am interested in this is that, ever since I wrote some
sudoku algorithms a while ago, I have been looking for ways to interact
with data according to different views. I want the data seen through one
view to update when I change them through another view. In my case
things are even more complex than they are with matrices because I
tend to view sudoku as subcases of binary cubes. Imagine a 3d 9*9*9
chessboard and try to place 81 non-threatening rooks in it. This is not
quite a solution to a sudoku but every sudoku is also a solution to this
problem. 
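
Here is a sketch of that correspondence. The grid below is a
well-known valid sudoku pattern that I use purely as test data, and
the names rooks and threatening are made up for this example. Each
filled cell (row, column, digit) becomes one rook in the cube.

```python
from itertools import combinations

# A known valid 9x9 pattern, used only as test data (an assumption of this sketch).
grid = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
        for r in range(9)]

# One rook per cell: coordinates are (row, column, digit - 1).
rooks = [(r, c, grid[r][c] - 1) for r in range(9) for c in range(9)]

def threatening(a, b):
    # Two rooks attack each other when they share an axis-parallel line,
    # i.e. when they agree in exactly two of the three coordinates.
    return sum(x == y for x, y in zip(a, b)) == 2

print(len(rooks))                                                 # 81
print(any(threatening(a, b) for a, b in combinations(rooks, 2)))  # False
```

Note that the check never looks at the 3x3 boxes, which is why a
solution to the rook problem is not necessarily a sudoku.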

One of the solution strategies I thought of was forgetting about the 3d
binary cube's content altogether and just updating row, column and file
totals (I start with a 'filled' cube and wipe away fields that are
covered by the 'rooks') to drive the optimization. Somehow this seems
possible even though I do not use the cube itself anymore. It just
exists as a figment of my imagination but still it defines the context.

I hope you understand how this was driving me crazy and why I would be
more than happy to return to a safe and sound, actually 'existing' cube,
if only there was a way to access rows, columns and files (for example
sum their elements) as shared data. In the end I realized that
everything I was doing was an abstraction anyway and if that is the
case why not use the simplest possible representation for the data and
let any matrices, rows, columns, files, cubes and so on exist somewhere
higher up in the levels of abstraction of the code.
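
As a rough sketch of that idea (a 3*3*3 cube to keep it short; cube,
line and total are hypothetical names, not an existing data type):
the data is one flat list, and rows, columns and files exist only as
index ranges computed on demand, so they all share the same cells.

```python
N = 3
cube = [1] * (N * N * N)   # flat storage; cell (x, y, z) is at x + N*y + N*N*z

def line(axis, a, b):
    """Indices of the line along `axis` (0, 1 or 2); a and b fix the other two."""
    strides = [1, N, N * N]
    step = strides.pop(axis)                 # stride of the axis we walk along
    start = a * strides[0] + b * strides[1]  # offset from the two fixed coordinates
    return range(start, start + N * step, step)

def total(axis, a, b):
    return sum(cube[i] for i in line(axis, a, b))

cube[0] = 0   # wipe cell (0, 0, 0) once, in the flat list

# All three lines through that cell see the change.
print(total(0, 0, 0), total(1, 0, 0), total(2, 0, 0))   # 2 2 2
print(total(0, 1, 0))                                   # 3 (untouched line)
```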

> Really. In one of the intended applications the matrix
> entries are going to be home-made Rationals. Just
> adding two of those guys takes a long time. It's
> still more than fast enough for the intended application,
> but [oh, never mind.

Too late :-)

> Sorry about the argumentative tone - I _would_ like
> to know which "untenable position" you're referring to...

No, it's no problem. Thanks again for bringing this up. Once I overcame
my initial resistance to bringing up all these old (for me) issues I was
more than happy to share intuitions. I hope this somehow results in a
shared data type. Even just a matrix_with_a_view would be very nice.

P.