[Numpy-discussion] copy on demand

Konrad Hinsen hinsen at cnrs-orleans.fr
Mon Jun 17 01:46:08 EDT 2002


> Konrad Hinsen <hinsen at cnrs-orleans.fr> writes:
> 
> [did you mean this to be off-list? If not, please just forward it to the
> list.]

No, I sent the mail to the list as well, but about one in three of
the mails I send to the list does not arrive there on the first try.
In this case, the copy sent to myself got lost as well, so I don't
have any copy left, sorry.

> > <rant>
> > 
> > I don't know about the others out there, but I have 30000 lines of
> > published Python code plus a lot of unpublished code (scripts), all of
> > which use NumPy arrays almost everywhere. There are also a few places
> > where views are created intentionally, which are then passed around to
> > other code and can end up anywhere. The time required to update all
> > that to new slicing semantics would be enormous, and I don't see how I
> > could justify it to myself or to my employer. I'd also have to stop
> > advertising Python as a time-efficient development tool.
> 
> I sympathize with this view. However, I think the solution to this problem
> should be a compatibility wrapper rather than a design compromise.
> 
> There are at least 2 reasons why:
> 
> 1. Numarray has quite a few incompatibilities to Numeric anyway, so even
>    without this change you'd be forced to rewrite all or most of those scripts

The question is how much effort it is to update code. If it is easy,
most people will do it sooner or later. If it is difficult, they won't.
And that will lead to a split in the user community, which I think
is highly detrimental to the further development of NumPy and Numarray.

A compatibility wrapper won't change this. Assume that I have tons of
code that I can't update because it's too much effort. Instead I use
the compatibility wrapper. When I add a line or a function to that
code, it will of course stick to the old conventions. When I add a new
module, I will also prefer the old conventions, for consistency. And
other people working with the code will pick up the old conventions as
well. At the same time, other people will use the new conventions.
There will be two parts of the community that cannot easily read each
other's code.

So unless we can reach a consensus that will guarantee that 90% of
existing code will be adapted to the new interfaces, there will be a
split.

>    (or use the wrapper), but none of the incompatibilities I'm currently aware
>    of would, in my eyes, buy one as much as introducing copy-indexing
>    semantics would. So if things get broken anyway, one might as well take

I agree, but it also comes at the highest cost. There is absolutely no
way to identify automatically the code that needs to be adapted, and
there is no run-time error message in case of failure - just a wrong
result. None of the other proposed changes is as risky as this one.
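
A minimal sketch of that failure mode, using standard-library stand-ins
rather than Numeric itself (an assumption made only to keep the example
self-contained): a memoryview slice plays the role of a view, an
array.array slice plays the role of a copy, and zero_first is a
hypothetical helper. The same call runs without complaint in both cases,
but only one of them changes the caller's data.

```python
from array import array


def zero_first(block):
    """Hypothetical helper: clears the first element of whatever it is handed."""
    block[0] = 0.0


# View semantics: a memoryview slice shares memory with the original buffer.
data = array("d", [1.0, 2.0, 3.0, 4.0])
view = memoryview(data)[1:3]
zero_first(view)
after_view = data.tolist()   # the original was modified through the view

# Copy semantics: slicing an array.array yields an independent copy.
data = array("d", [1.0, 2.0, 3.0, 4.0])
copy = data[1:3]
zero_first(copy)
after_copy = data.tolist()   # the original is untouched, and nothing complained

print(after_view)
print(after_copy)
```

Code written to rely on the first behaviour keeps running under the
second; it just stops updating the original data.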

>    this step (especially since intentional views are, on the whole, used
>    rather sparingly -- although tracking down these uses in retrospect might
>    admittedly be unpleasant).

It is not merely unpleasant; the cost is simply prohibitive.

> 2. Numarray is supposed to be incorporated into the core. Compromising the
>    consistency of core python (and code that depends on it) is in my eyes
>    worse than compromising code written for Numeric.

I don't see view behaviour as inconsistent with Python. Python has one
mutable sequence type, the list, with copy behaviour. One type is
hardly enough to establish a rule.
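
For reference, the list behaviour in question, as a short sketch with
made-up values: a slice of a list is a new list, so assigning into the
slice leaves the original alone.

```python
original = [1, 2, 3, 4]

# Slicing a list builds a new list holding the selected elements.
part = original[1:3]
part[0] = 99

print(original)  # the original list is unchanged
print(part)      # only the slice carries the new value
```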

> As a third reason I could claim that there is some hope of a much more
> widespread adoption of Numeric/numarray as an alternative to matlab etc. in
> the next couple of years, so that it might be wise to fix things now, but I'd
> understand if you'd remain unimpressed by that :)

I'd like to see any supporting evidence. I think this argument is
based on the reasoning "I would prefer it to be this way, so many
others would certainly also prefer it, so they would start using NumPy
if only these changes were made." This is not how decision processes
work in real life.

On the contrary, people might look at the history of NumPy and decide
that it is too unreliable to base a serious project on - if they
changed the interface once, they might do it again. This is a
particularly important aspect in the OpenSource universe, where there
are no contracts that promise anything. If you want people to use your
code, you have to demonstrate that it is reliable, and that applies to
both the code and the interfaces.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------



