[SciPy-dev] In-place operators and casting
Robert Cimrman
cimrman3 at ntc.zcu.cz
Thu Nov 24 06:03:39 EST 2005
Ed Schofield wrote:
> Hi all,
>
> I've been discussing the behaviour of matrix objects with Travis offline
> after I made a rather ugly patch. The problem I was trying to solve was
> the one described by Jonathan Taylor in the [Default type behaviour of
> array] thread, essentially that:
>
>
>
> >>> c = zeros(10)
> >>> c += rand(10)
> >>> c
>
> array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>
> wasn't what he wanted, and he had to spend time figuring out what was
> wrong. My idea was to turn matrices into something more user-friendly
> than arrays for users migrating from Matlab, R, etc. by redefining
> matrices' in-place operators like += to have the same upcasting behaviour
> as the regular operators like +. Then this would be possible:
>
>
>
> >>> b = matrix(zeros(10))
> >>> b
>
> matrix([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
>
> >>> b += rand(10)
> >>> b
>
> matrix([[ 0.80751041, 0.61973329, 0.70726955, 0.94220288, 0.41340826,
> 0.39087675, 0.81454443, 0.25357685, 0.06850165, 0.19652445]])
>
>
> Travis says that he doesn't think it makes sense to in-place cast an array
> (or matrix) to a different type, and that a floatzeros() function could be
> sufficient to avoid the problem above. But I think this only solves one
> instance of a more general usability problem with casting and in-place
> operators.
>
> I see two requirements for an intuitive += operator (and other <op>=
> friends) without any nasty surprises. First, 'a += b' should give the
> same result as 'a = a + b', just more efficiently if possible, and this
> shouldn't eat up the data of unsuspecting users. This isn't true at the
> moment. Second, 'a += b' shouldn't change 'a' to a different object.
> This is true at the moment:
>
>
> >>> a = ones(3)
> >>> id(a)
>
> 140648904
>
> >>> a += ones(3, complex)
> >>> id(a)
>
> 140648904
>
> Another interpretation of this second point is that the type of an array
> shouldn't change once we've declared it. I think this is what Travis is
> reluctant to sacrifice for the sake of the first point.
>
> If a cast from b.dtype to a.dtype can lose information (like in these
> examples) I don't think it's possible for a += b to satisfy both these
> requirements. The meaning of "a += b" is ambiguous: does the user want a
> safe or unsafe cast? I propose instead that we raise an exception:
>
>
> >>> a = zeros(5)
> >>> a += rand(5)
>
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> TypeError: array cannot be safely cast to required type
>
> We currently have similar examples of type-checking:
>
>
> >>> array(rand(5), int)
>
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> TypeError: array cannot be safely cast to required type
>
>
> >>> a[:] = rand(5)
>
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> TypeError: array cannot be safely cast to required type
>
> This would allow Travis to remove another red warning from his book and
> should save users some grief if they haven't read it (or know it and
> still make the mistake).
>
> In some cases the user will know that the operation will result in a
> potentially unsafe cast and will want to proceed anyway. For these cases
> I'd argue that a more explicit notation is no bad thing. Two options
> are:
>
>
> >>> a += cast[int](rand(5))
> >>> a += rand(5).astype(int)
>
>
> Another might be the 'FORCECAST' flag, but I'm not sure whether this is
> still supported.
>
> Comments?
I am not a friend of silent upcasting, since extension modules (mine, at
least :-)) often require a specific data type. I would prefer raising the
exception, with the possibility of overriding it via .astype().
r.
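
The behaviour preferred above (raise on an unsafe in-place cast, override
explicitly with .astype()) can be sketched in pure Python. This is only an
illustration under stated assumptions: TypedArray and _safe_cast are
hypothetical names invented for this example, not part of SciPy, and only
int and float element types are modelled.

```python
# Hypothetical illustration only: TypedArray is NOT a SciPy/NumPy class.
_RANK = {int: 0, float: 1}  # assumed ordering: int can be safely cast to float


def _safe_cast(src, dst):
    """Return True if converting src -> dst cannot lose information."""
    return _RANK[src] <= _RANK[dst]


class TypedArray:
    """A tiny fixed-type container whose += never changes its type."""

    def __init__(self, values, dtype):
        self.dtype = dtype
        self.values = [dtype(v) for v in values]

    def astype(self, dtype):
        # Explicit conversion requested by the caller; may truncate.
        return TypedArray(self.values, dtype)

    def __iadd__(self, other):
        # Refuse a lossy cast instead of silently truncating or upcasting.
        if not _safe_cast(other.dtype, self.dtype):
            raise TypeError("array cannot be safely cast to required type")
        self.values = [x + y for x, y in zip(self.values, other.values)]
        return self  # same object, same dtype


a = TypedArray([0, 0, 0], int)
b = TypedArray([0.5, 1.5, 2.5], float)

try:
    a += b  # float -> int is unsafe, so this raises
except TypeError as exc:
    print(exc)

a += b.astype(int)  # the explicit override: truncation is now the caller's choice
print(a.values)     # [0, 1, 2]
```

With this design the object identity and element type of 'a' never change,
and any loss of information is visible at the call site.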